Font Size: a A A

Solving The 2D Strip Packing Problem With Deep Reinforcement Learning

Posted on:2021-10-13Degree:MasterType:Thesis
Country:ChinaCandidate:L LaiFull Text:PDF
GTID:2518306191982749Subject:Applied Mathematics
Abstract/Summary:PDF Full Text Request
Two-Dimensional Strip Packing Problem(2D-SPP):given a strip with fixed width W and variable height H,and a set of rectangles{(w1,h1),(w2,h2),...,(wn,hn)},each rectangles with width wiand height hi,1?i?n,0?hi,0<wi?W,2D-SPP is to pack all rectangles into the strip without rotating and overlapping and the goal is to minimize the height of used strip.Exact methods and Heuristic algorithms are widely applied to solve 2D-SPP.In this paper,we propose a deep reinforcement learning algo-rithm to solve 2D-SPP.Deep reinforcement learning is an aspect of machine learning where an agent learns to behave in an environment,by performing certain actions and observing the rewards which it gets from those actions.Deep reinforcement learning is the product of a combination of deep learning and reinforcement learning.In this paper,we choose policy based reinforcement learning method as our algorithm,Pointer Networks as the policy function,stochastic policy as the output of our policy function and expected cumulative reward as our objective function.Pointer Networks is the model combines Seq2Seq with improved attention mechanism which works well on solving”output depends on the length of the input”problems,like 2D-SPP.During training,we apply Policy Gradient algorithm to calcu-late the gradient of objective function and Adam algorithm to update the parameters in Pointer Networks.During testing,Pointer Networks applies Beam Search algorithm to search the packing strategy.Constructive heuristic algorithm Bottom-Left-Fill packs all rectangles into the strip under the returned packing strategy.In this paper,we analyze the stability,feasibility and rationality of our algorithm through experiments.We also test the performance of our algorithm on classic 2D-SPP data set C and compare with other heuristic search algorithms.Experiments show that during solving large-scale problem,our algorithm obtains better or equivalent results with much lower time complexity and faster running speed than heuristic search algo-rithms.Our experiments show the advantage of deep reinforcement learning algorithm on solving 2D-SPP,provide a brand new solution for solving bin packing problems.
Keywords/Search Tags:2D Strip Packing Problem, Pointer Networks, Deep Reinforcement Learning, Policy Gradient
PDF Full Text Request
Related items