To address the process route planning problem characterized by dynamic process requirements and intricate process data, a flexible process route planning method was proposed based on the deep recurrent Q-network (DRQN). Firstly, leveraging the structural advantages of the long short-term memory (LSTM) network, sequential data features were thoroughly mined to enhance the accuracy and stability of process route planning. Secondly, the strong dynamic decision-making capability of the deep Q-network (DQN) was combined with an adaptive adjustment strategy to mitigate the challenges posed by fluctuations in requirements and processing environments. Lastly, in response to frequent process changes, a "selective forgetting" mechanism was introduced to improve the response speed of process route planning under stepwise process changes. Simulation results demonstrate that the proposed method can efficiently solve the process route planning problem associated with part feature reconstruction.
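The abstract's three ingredients can be illustrated with a minimal tabular sketch: a Q-table stands in for the DRQN's LSTM-plus-Q-value network, action selection uses Boltzmann (softmax) exploration, and "selective forgetting" is modeled as resetting value estimates only for states whose process features changed. All names, sizes, and the toy update below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Q-table over (state, action) pairs; a stand-in for the DRQN value head,
# where states would be partial routes and actions candidate operations.
n_states, n_actions = 4, 3
Q = np.zeros((n_states, n_actions))

def boltzmann_action(q_row, tau=1.0):
    """Boltzmann (softmax) exploration over one state's Q-values."""
    p = np.exp((q_row - q_row.max()) / tau)  # shift for numerical stability
    p /= p.sum()
    return rng.choice(len(q_row), p=p)

def q_update(s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard Q-learning target; DQN replaces this table with a network."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

def selective_forget(changed_states):
    """'Selective forgetting': reset estimates only for states whose process
    features were reconstructed, keeping the remaining knowledge intact."""
    Q[list(changed_states), :] = 0.0

# One illustrative cycle: update a value, then forget a changed state.
q_update(0, 1, r=1.0, s_next=2)
selective_forget({2})
```

After the update, `Q[0, 1]` holds `0.1 * (1.0 + 0.9 * 0 - 0) = 0.1`, and the forget step zeroes only row 2, leaving the learned entry untouched.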
[1] Qian J H, Zhang Z J, Shi L L, et al. An assembly timing planning method based on knowledge and mixed integer linear programming[J]. Journal of Intelligent Manufacturing, 2023, 34(1): 429-453.
[2] Che Z H, Chiang T A, Lin T T. A multi-objective genetic algorithm for assembly planning and supplier selection with capacity constraints[J]. Applied Soft Computing, 2021, 101: 107030.
[3] Demir H I, Erden C. Dynamic integrated process planning, scheduling and due-date assignment using ant colony optimization[J]. Computers & Industrial Engineering, 2020, 149: 106799.
Zhang H, Wang W H, Zhang S S, et al. A novel method based on deep reinforcement learning for machining process route planning[J]. Robotics and Computer-Integrated Manufacturing, 2024, 86: 102688.
Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[10] Hausknecht M, Stone P. Deep recurrent Q-learning for partially observable MDPs[J]. AAAI Fall Symposium Series, 2015, 15(6): 29-37.
[11] Achbany Y, Fouss F, Yen L, et al. Tuning continual exploration in reinforcement learning: an optimality property of the Boltzmann strategy[J]. Neurocomputing, 2008, 71(13/14/15): 2507-2520.
[12] Rahimi-Kalahroudi A, Rajendran J, Momennejad I, et al. Replay buffer with local forgetting for adapting to local environment changes in deep model-based reinforcement learning[C]//Conference on Lifelong Learning Agents. Montreal: PMLR, 2023: 21-42.
[13] Foster J, Schoepf S, Brintrup A. Fast machine unlearning without retraining through selective synaptic dampening[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(11): 12043-12051.
Liu X J, Yi H, Ni Z H. Application of ant colony optimization algorithm in process planning optimization[J]. Journal of Intelligent Manufacturing, 2013, 24(1): 1-13.