Solving The 2D Strip Packing Problem With Deep Reinforcement Learning

Posted on:2021-10-13

Degree:Master

Type:Thesis

Country:China

Candidate:L Lai

Full Text:PDF

GTID:2518306191982749

Subject:Applied Mathematics

Abstract/Summary:

PDF Full Text Request

Two-Dimensional Strip Packing Problem(2D-SPP):given a strip with fixed width W and variable height H,and a set of rectangles{(w₁,h₁),(w₂,h₂),...,(w_n,h_n)},each rectangles with width w_iand height h_i,1?i?n,0?h_i,0<w_i?W,2D-SPP is to pack all rectangles into the strip without rotating and overlapping and the goal is to minimize the height of used strip.Exact methods and Heuristic algorithms are widely applied to solve 2D-SPP.In this paper,we propose a deep reinforcement learning algo-rithm to solve 2D-SPP.Deep reinforcement learning is an aspect of machine learning where an agent learns to behave in an environment,by performing certain actions and observing the rewards which it gets from those actions.Deep reinforcement learning is the product of a combination of deep learning and reinforcement learning.In this paper,we choose policy based reinforcement learning method as our algorithm,Pointer Networks as the policy function,stochastic policy as the output of our policy function and expected cumulative reward as our objective function.Pointer Networks is the model combines Seq2Seq with improved attention mechanism which works well on solving�output depends on the length of the input�problems,like 2D-SPP.During training,we apply Policy Gradient algorithm to calcu-late the gradient of objective function and Adam algorithm to update the parameters in Pointer Networks.During testing,Pointer Networks applies Beam Search algorithm to search the packing strategy.Constructive heuristic algorithm Bottom-Left-Fill packs all rectangles into the strip under the returned packing strategy.In this paper,we analyze the stability,feasibility and rationality of our algorithm through experiments.We also test the performance of our algorithm on classic 2D-SPP data set C and compare with other heuristic search algorithms.Experiments show that during solving large-scale problem,our algorithm obtains better or equivalent results with much lower time complexity and faster running speed than heuristic search algo-rithms.Our experiments show the advantage of deep reinforcement learning algorithm on solving 2D-SPP,provide a brand new solution for solving bin packing problems.

Keywords/Search Tags:

2D Strip Packing Problem, Pointer Networks, Deep Reinforcement Learning, Policy Gradient

PDF Full Text Request

Related items

1	Research On Hybrid Heuristic Algorithm For Two-dimensional Strip Packing Problem
2	Deep Reinforcement Learning Based On Policy Gradient Optimization And Its Application In Agent Control
3	Research On Agent Decision-making And Control Based On Deep Reinforcement Learning
4	Deep Deterministic Policy Gradient Based On Entropy Regularization And Regular Update
5	Research On Fast Policy Gradient Algorithms Of Reinforcement Learning Based On Adaptive Learning Rate
6	Optimization On Deep Reinforcement Learning Based On Policy Gradient
7	Research On Multiagent Cooperation And Applications Based On Reinforcement Learning
8	Hybrid Heuristic Algorithm For 2D Strip Packing Problem
9	Heuristic Algorithms For 2D Strip Packing Problem
10	Research On Off-policy Reinforcement Learning Algorithm