Modeling And Optimization Of Production Scheduling With Local Perception

Posted on: 2019-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Ji
GTID: 2429330548485947
Subject: Software engineering
Abstract/Summary:
Manufacturing productivity has improved greatly with the development of the economy. The concepts of Industry 4.0 and China Manufacturing 2025 indicate that industry is entering a new era of intelligent manufacturing. On the one hand, the application of advanced manufacturing equipment enables more companies to produce complex products. On the other hand, the diversity of customized requirements also increases product complexity. To improve production efficiency and the flexibility of adding customized features to complex products, many manufacturers increase production specialization and process products at many different processing sites. Production specialization makes traditional centralized production scheduling methods unsuitable. To solve this problem, this thesis proposes a decentralized scheduling method based on distributed reinforcement learning. The main work of this thesis can be grouped into the following three parts:

(1) Policy-independent distributed reinforcement learning is studied and applied to job shop scheduling. A job shop scheduling model based on distributed reinforcement learning is designed, in which the action set represents the set of waiting jobs, the intermediate reward represents the processing duration of jobs, the remaining processing time and job arrival time represent the local state, and the policy is represented by a neural network.

(2) Under the distributed job shop scheduling model, an optimization algorithm based on policy gradient is applied to minimize makespan. Each agent in the algorithm represents a processing machine. Each local agent improves scheduling performance by observing its local state, which is fully observable. Each local policy is improved by gradient ascent with respect to makespan. In this way, the global joint policy can be optimized.

(3) The distributed job shop scheduling model is optimized. First, to generate delayed schedules, busy-waiting and advance job-arrival notification mechanisms are employed. The agents can thus, with some probability, choose the idle action even when there are waiting jobs. Second, the reward rule is improved so that the agents can optimize the average delay time during the learning process.
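The per-machine policy-gradient idea described in parts (1) and (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the three-job instance, the feature scaling, the learning rate, and all function names are hypothetical, a linear softmax policy stands in for the neural network, and a single terminal reward (negative makespan with a running baseline) replaces the intermediate rewards mentioned in the abstract. The idle action from part (3) is not modeled.

```python
import math
import random

# Hypothetical toy instance: each job is an ordered list of (machine, duration).
JOBS = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3), (0, 1)],
]
N_MACHINES = 3


def features(job, step):
    """Local state of a waiting job: next-operation duration and remaining work."""
    duration = JOBS[job][step][1]
    remaining = sum(d for _, d in JOBS[job][step:])
    return (duration / 10.0, remaining / 10.0)  # assumed scaling


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(weights, rng, greedy=False):
    """Build one schedule; return (makespan, per-machine log-prob gradients)."""
    step = [0] * len(JOBS)        # index of each job's next operation
    job_free = [0] * len(JOBS)    # time at which each job becomes available
    mach_free = [0] * N_MACHINES  # time at which each machine becomes idle
    grads = [[0.0, 0.0] for _ in range(N_MACHINES)]
    t = 0
    while any(s < len(JOBS[j]) for j, s in enumerate(step)):
        for m in range(N_MACHINES):  # one agent per machine
            if mach_free[m] > t:
                continue
            waiting = [j for j, s in enumerate(step)
                       if s < len(JOBS[j]) and JOBS[j][s][0] == m
                       and job_free[j] <= t]
            if not waiting:
                continue
            feats = [features(j, step[j]) for j in waiting]
            w = weights[m]
            probs = softmax([w[0] * f[0] + w[1] * f[1] for f in feats])
            if greedy:
                k = max(range(len(waiting)), key=probs.__getitem__)
            else:
                k = rng.choices(range(len(waiting)), weights=probs)[0]
            # Gradient of log-softmax with linear scores: f_k - sum_i p_i * f_i
            for d in range(2):
                expected = sum(p * f[d] for p, f in zip(probs, feats))
                grads[m][d] += feats[k][d] - expected
            j = waiting[k]
            end = t + JOBS[j][step[j]][1]
            mach_free[m] = job_free[j] = end
            step[j] += 1
        if any(s < len(JOBS[j]) for j, s in enumerate(step)):
            # Jump to the next completion event.
            t = min(x for x in mach_free + job_free if x > t)
    return max(job_free), grads


def train(episodes=300, lr=0.1, seed=0):
    """REINFORCE: each machine ascends its own gradient of -makespan."""
    rng = random.Random(seed)
    weights = [[0.0, 0.0] for _ in range(N_MACHINES)]
    baseline = None
    for _ in range(episodes):
        makespan, grads = run_episode(weights, rng)
        if baseline is None:
            baseline = makespan
        advantage = baseline - makespan  # positive when shorter than usual
        baseline = 0.9 * baseline + 0.1 * makespan
        for m in range(N_MACHINES):
            for d in range(2):
                weights[m][d] += lr * advantage * grads[m][d]
    return weights


if __name__ == "__main__":
    learned = train()
    print("greedy makespan:", run_episode(learned, random.Random(0), greedy=True)[0])
```

Each machine updates only its own weight vector from its own local decisions, while the shared makespan signal couples the agents, which is the sense in which local improvements are meant to improve the joint policy. On an instance this small the learned policy may not beat a simple dispatching rule; the sketch only shows the structure of the update.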
Keywords/Search Tags: job-shop scheduling, production specialization, reinforcement learning, policy gradient