Reinforcement Learning Based Maintenance Policy For Deteriorating Production Systems With Multiple Yield Levels

Posted on: 2015-01-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 1228330428984324
Subject: Systems Engineering
Abstract/Summary:
Machines used in manufacturing deteriorate through fatigue, wear and age. Operating a deteriorated machine can be very expensive, because production cost rises and product quality falls. Maintenance actions such as inspections, repairs or replacements are taken to keep machines from operating in an undesirable state. However, excessive maintenance may interrupt production and increase downtime and maintenance costs. Effective maintenance policies therefore play a significant role in manufacturing systems with deteriorating machines. Although the maintenance of deteriorating systems has been extensively investigated, product quality is seldom taken into consideration in the related literature. In a realistic production system, machine states are directly related to product quality levels, and multiple-yield quality problems may arise as the machine state deteriorates. Specifically, a machine may produce defective parts with a higher probability once it has deteriorated into a worse condition. Product quality information related to machine states can therefore be used to determine the optimal maintenance policies.
In recent years, maintenance policies for deteriorating machines in flow line systems have attracted research interest. A two-machine flow line with an intermediate buffer (2M1B system for short) is a basic unit of a modern manufacturing flow line. However, most of this research rests on strong assumptions, for example that production and maintenance each take one unit of time, and that maintenance resources are adequate and available at any time. Maintenance decisions derived under these assumptions therefore lack a realistic basis.
In view of this, this thesis investigates preventive maintenance policies for deteriorating machines in a 2M1B flow line system, building on a study of a single deteriorating machine with multiple-yield quality problems, and further analyzes the impact of limited maintenance resources on the maintenance policies. Finally, the solution algorithm for the system model is improved. The main work and results are as follows:
(1) This thesis investigates a predictive maintenance methodology for a single deteriorating machine with multiple-yield quality problems, which consists mainly of two stages. First, a continuous-time, discrete-state semi-Markov model is formulated to represent the deterioration process of the machine, and the maintenance policy corresponding to each observed state is learned via a policy-iteration-based reinforcement learning algorithm. Then, the future maintenance time is estimated by re-simulating the system model under the learned maintenance policy. The example analysis shows that the time at which the machine is taken down decreases as the total number of parts increases, and that, for a given total number of parts, it decreases monotonically as the number of defective parts increases. Moreover, as the subcycle increases, the predicted maintenance time decreases.
(2) Based on the maintenance policies for a single deteriorating machine, the maintenance policies for a 2M1B flow line system are investigated. The deterioration process of the machines in the system is modeled as a two-agent semi-Markov decision problem. A distributed multi-agent reinforcement learning algorithm, the costs-sharing-RL algorithm, is employed to solve the problem.
Specifically, to ensure that the overall system average cost rate is minimized, the learning rule must relate the local decisions made by each agent to the overall optimization goal, so that the optimal maintenance policies can be produced.
(3) Further, the thesis investigates maintenance policies for deteriorating machines in a 2M1B system with limited maintenance resources. Preventive maintenance is assumed to be imperfect because maintenance resources are insufficient, and a continuous-time, discrete-state semi-Markov model is formulated to describe the deterioration process of each machine together with the buffer. A resource-constrained distributed multi-agent reinforcement learning algorithm, the RC-costs-sharing-RL algorithm, is applied to the model. The experimental results show that the proposed RC-costs-sharing-RL algorithm outperforms other algorithms such as the sequential preventive maintenance algorithm and the independent-RL algorithm, and that it can produce the optimal maintenance policies for the system.
(4) From the perspective of practical application, this thesis proposes a heuristically accelerated multi-agent reinforcement learning algorithm, the HAMSL algorithm, for the maintenance of deteriorating machines in a 2M1B flow line system. The goal is to improve the exploration efficiency of the multi-agent reinforcement learning algorithm by means of a heuristic function while minimizing the system average cost rate. Statistical tests support the finding that the proposed HAMSL algorithm learns faster than other multi-agent reinforcement learning algorithms based on ε-greedy, neighborhood search, simulated annealing search and tabu search.
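To make the average cost rate criterion used throughout the abstract concrete, the following is a minimal sketch of a single deteriorating machine with multiple yield levels, assuming a simple fixed threshold policy instead of a learned one. It is not the thesis's model or any of its algorithms: the function name `average_cost_rate`, the three-state chain, the cost values and the time distributions are all hypothetical choices made for illustration. Production and maintenance durations are drawn from non-exponential distributions, which is the feature that makes such a model semi-Markov rather than Markov.

```python
import random

# Every number below is an illustrative assumption, not a value from the thesis.
DEFECT_PROB = [0.02, 0.10, 0.30]  # defect probability per part in states 0 (good) to 2 (worst)
DETERIORATE_PROB = 0.15           # chance the machine moves one state worse after a part
DEFECT_COST = 10.0                # cost charged for each defective part
MAINTENANCE_COST = 40.0           # cost charged for each maintenance action
WORST_STATE = len(DEFECT_PROB) - 1


def average_cost_rate(threshold, n_parts=20_000, seed=0):
    """Simulate a fixed threshold policy and return total cost / total elapsed time.

    The machine is maintained (and reset to state 0) whenever its state
    is at or above `threshold` before the next part is produced.
    """
    rng = random.Random(seed)
    state = 0
    total_cost = 0.0
    total_time = 0.0
    for _ in range(n_parts):
        if state >= threshold:
            # Maintenance duration: gamma distributed with mean 2.0 * 2.5 = 5.0 (assumed).
            total_time += rng.gammavariate(2.0, 2.5)
            total_cost += MAINTENANCE_COST
            state = 0
        # Production time for one part: uniform around 1.0 (assumed).
        total_time += rng.uniform(0.8, 1.2)
        # Yield depends on the current deterioration state.
        if rng.random() < DEFECT_PROB[state]:
            total_cost += DEFECT_COST
        # The machine may deteriorate by one state after producing a part.
        if state < WORST_STATE and rng.random() < DETERIORATE_PROB:
            state += 1
    return total_cost / total_time


if __name__ == "__main__":
    # Threshold 3 never triggers maintenance, because the worst state is 2.
    for threshold in range(4):
        print(threshold, round(average_cost_rate(threshold), 3))
```

Comparing the thresholds shows the trade-off described in the abstract: maintaining too often inflates maintenance cost and downtime, while never maintaining leaves the machine producing defective parts at the highest rate. A reinforcement learning method would search for the policy instead of enumerating thresholds as this sketch does.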
Keywords/Search Tags: 2M1B flow line, multiple-yield quality problems, semi-Markov decision process, preventive maintenance, resource constrained, reinforcement learning