Research On Container Stacking Optimization Algorithm Based On Deep Reinforcement Learning

Posted on: 2023-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z T Duan
Full Text: PDF
GTID: 2568306614987219
Subject: Control Science and Engineering
Abstract/Summary:
In the storage yard of a container terminal, stacking is an indispensable part of the storage process. A reasonable stacking strategy can effectively reduce container reshuffles, the waiting time of ships at the wharf and of land transport vehicles, and the operating cost of yard cranes, so it matters greatly for the efficiency of container handling and yard operation. The problem studied in this thesis is to store a set of containers sequentially in a fixed number of capacity-limited stacks, with the objective of minimizing the number of unordered stacked containers. In previous studies, simple stacking rules could hardly guarantee solution quality, and the computational efficiency of exact and heuristic search algorithms was strongly affected by the problem scale. This thesis solves the problem with an optimization algorithm based on deep reinforcement learning. The specific research work is as follows:

The container stacking process, its mixed integer programming model, and the solution process of that model are studied. On this basis, the reinforcement learning process for solving the container stacking problem is designed. Based on the principles of reinforcement learning, container stacking optimization is modeled as a reinforcement learning problem, and the basic elements of the model (environment state, action, reward, state transition, and policy) are designed for the container stacking process.

Following the deep reinforcement learning framework, a policy network is designed that consists of a feature extraction network for the yard environment and a stacking decision network. To improve learning, the feature extraction network is based on a multi-head self-attention mechanism and the stacking decision network is based on a multi-layer perceptron. The policy network can effectively extract the correlation information between different stack states in the state matrix for decision making. Considering the characteristics of different reinforcement learning algorithms and of the problem studied in this thesis, the proximal policy optimization (PPO) algorithm is chosen as the training algorithm.

Experimental results show that, on the small-scale problem (30 containers), the gap between the solutions generated by the trained stacking strategy and the optimal solutions is 17.36%. On the medium-scale problem (200 containers) and the large-scale problem (500 containers), the algorithm outperforms commonly used rules such as Best fit as well as the beam search algorithm, and its computational time does not increase significantly with the problem scale. In addition, the deep reinforcement learning algorithm can adapt to certain random changes: when the number of containers and the stack capacity (i.e., the maximum number of stacking tiers) change to a certain extent, the trained model still produces reasonably good solutions that outperform the stacking rules and the beam search algorithm, which shows strong generalization performance.

To support further research and the practical application of the stacking optimization algorithm, a container stacking optimization software is designed. Its functional and data requirements are determined through requirements analysis, and its structure is designed based on the model-view-controller (MVC) architecture pattern. Running tests show that the main functions of the software, such as engineering management, deep model training, and container stacking optimization, work properly, which lays a good foundation for its practical application.
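The reinforcement learning elements named in the abstract (a state matrix with one row per stack, an action that selects a stack, and a reward tied to unordered containers) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the thesis's actual design: the names `StackingEnv`, `greedy_rule`, and `run_episode` are hypothetical, each container is represented by an integer retrieval priority (a smaller number leaves the yard earlier), a container counts as unordered when it is placed directly on one that leaves earlier, and the reward is -1 for each such placement. The `greedy_rule` function is only a stand-in policy so the loop can run; it is neither the trained policy network nor the Best fit rule evaluated in the thesis.

```python
class StackingEnv:
    """Toy sequential container-stacking environment (illustrative only)."""

    def __init__(self, num_stacks, capacity):
        self.capacity = capacity              # maximum tiers per stack
        self.stacks = [[] for _ in range(num_stacks)]
        self.unordered = 0                    # running count of unordered containers

    def state(self):
        # State matrix: one row per stack, bottom to top, padded with 0
        # for empty slots (priorities are assumed to be >= 1).
        return [s + [0] * (self.capacity - len(s)) for s in self.stacks]

    def valid_actions(self):
        # An action is the index of a stack that still has free space.
        return [i for i, s in enumerate(self.stacks) if len(s) < self.capacity]

    def step(self, action, priority):
        stack = self.stacks[action]
        if len(stack) >= self.capacity:
            raise ValueError("stack is full")
        # Assumption: the new container is unordered if the container
        # directly below it leaves earlier (has a smaller priority number).
        penalty = 1 if stack and stack[-1] < priority else 0
        stack.append(priority)
        self.unordered += penalty
        return self.state(), -penalty         # reward is -1 per unordered container


def greedy_rule(env, priority):
    """Hypothetical stand-in policy: prefer a conflict-free stack whose top
    priority is closest to the new container; otherwise pick the stack with
    the largest top priority."""
    def key(i):
        stack = env.stacks[i]
        top = stack[-1] if stack else float("inf")
        if top >= priority:
            return (0, top - priority)        # no conflict: tightest fit first
        return (1, -top)                      # conflict: largest top first
    return min(env.valid_actions(), key=key)


def run_episode(env, arrivals, policy):
    """Place containers one by one in arrival order; return the total reward."""
    total_reward = 0
    for priority in arrivals:
        action = policy(env, priority)
        _, reward = env.step(action, priority)
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    env = StackingEnv(num_stacks=2, capacity=3)
    print(run_episode(env, [3, 1, 2, 5, 4, 6], greedy_rule), env.state())
```

In a learned setup, `policy` would be replaced by a network that reads `env.state()` and outputs a distribution over `env.valid_actions()`, and the per-step rewards would feed a training algorithm such as PPO. The negated total reward equals the number of unordered containers, which is the quantity the thesis seeks to minimize.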
Keywords/Search Tags: container stacking, environment state design, deep reinforcement learning, multi-head self-attention mechanism