Reinforcement Learning And Its Applications In MAS-based Collaborative Conceptual Design

Posted on: 2007-01-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S L Chen
GTID: 1118360215998559
Subject: Computer application technology
Abstract/Summary:
Reinforcement learning is an important research area in machine learning. A reinforcement learning system learns its action selection policy from interactions with the environment, and improves that policy through the evaluative feedback signal the environment returns for different actions. Compared with supervised learning and dynamic programming, reinforcement learning needs neither teacher signals nor a state transition model. It therefore has broad application prospects for complex optimization and decision problems. With great progress in theory and algorithms, reinforcement learning has become an effective approach for solving sequential decision problems.

With the development of product design, collaborative design involving experts from different domains and regions has become a prevalent design method. At the same time, the rapid development of computer network technology provides powerful support for collaborative design. Driven by societal requirements and information technology, collaborative design has become a research hotspot. Nevertheless, most research on collaborative design focuses on the detailed design stage, and less on conceptual design. Conceptual design is the most important and innovative stage in the product design process, so research on collaborative conceptual design theory and technology is of great significance. Applying reinforcement learning approaches to the problems in collaborative conceptual design has thus become an important research topic.

This dissertation investigates multi-step Q learning, which updates with multi-step information; the Metropolis criterion of simulated annealing, which can properly balance the exploration of new knowledge against the exploitation of the current policy that the Agent faces when selecting actions; and least squares reinforcement learning algorithms with fast convergence speed.
The MAS-based collaborative conceptual design system is constructed, and reinforcement learning is applied to the task scheduling and alternative optimization problems of that system. The objective is to deepen research on reinforcement learning in both theory and application, and to accelerate the development of collaborative conceptual design. The main contributions of this dissertation are as follows.

Firstly, a multi-step Q learning algorithm based on the Metropolis criterion is proposed. Because standard Q learning converges slowly, it is improved in two respects. One is to improve the one-step update strategy, which cannot make full use of experience information; a multi-step Q learning algorithm that updates with multi-step information is thus proposed. The other is to introduce the Metropolis criterion into multi-step Q learning, which addresses the trade-off between exploring new knowledge and exploiting the current policy that the Agent faces when selecting actions.

Secondly, the off-policy least squares Q(λ) and on-policy least squares SARSA(λ) algorithms and their recursive versions are presented. The slow convergence and inefficient use of experience in classical Q(λ) and SARSA(λ) are analyzed. A least squares approximation model of the state-action value function is then constructed from current and previous experience, and a set of linear equations is derived that is satisfied by the weight vector of the function approximator over a set of basis functions. The least squares Q(λ) and least squares SARSA(λ) algorithms follow from these equations, and their recursive versions are obtained with the recursive least squares method. In effect, least squares methods construct an empirical model of the reinforcement learning problem, which is why they can accelerate convergence.

Thirdly, an integrated model of collaborative conceptual design is established on the basis of the characteristics of the design process.
Then a hierarchical federated structure for the MAS-based collaborative conceptual design system is proposed, and the structures of the Management Agent and the Design Agent are designed. Functions such as task scheduling, conflict resolution, alternative evaluation and optimization, and intelligent design are implemented in these two classes of Agents. Belief commitment, which suits the conceptual design of complex products, is proposed. The formalization of the Agent is then described, and the Agent coordination mechanisms based on belief commitment are discussed in detail. The collaborative conceptual design system provides the foundation for applying reinforcement learning within it.

Finally, reinforcement learning based approaches are presented for the task scheduling and alternative optimization problems in the Management Agent. Task scheduling is an important problem in collaborative design, and most existing methods suffer from low efficiency and convergence to local optima. This dissertation constructs an MDP model of the task scheduling problem and proves theoretically that the scheduling problem can be solved by reinforcement learning. Scheduling algorithms based on Q learning and Q(λ) learning are presented. Current methods for solving for alternatives suffer from combinatorial explosion, so it is hard to evaluate every alternative to find the optimal one. This dissertation introduces the concept of distance between states, models alternative optimization as an MDP, and presents a Q learning based optimization algorithm. The application indicates the effectiveness of this approach.
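
The Metropolis criterion that the abstract names as a way to balance exploration and exploitation can be illustrated with a small Python sketch. This is not the dissertation's algorithm. It assumes one common formulation, in which a uniformly random candidate action replaces the greedy action with probability exp((Q(s, a_candidate) - Q(s, a_greedy)) / T). The function name `metropolis_select`, the list of Q-values and the `temperature` parameter are all hypothetical.

```python
import math
import random


def metropolis_select(q_values, temperature, rng=random):
    """Pick an action index with a Metropolis-style acceptance test.

    Assumed formulation (not taken from the dissertation): a uniformly
    random candidate replaces the greedy action with probability
    exp((Q[candidate] - Q[greedy]) / temperature).
    `temperature` must be positive.
    """
    # Index of the action with the highest current value estimate.
    greedy = max(range(len(q_values)), key=lambda a: q_values[a])
    # Uniformly random candidate action (may coincide with the greedy one).
    candidate = rng.randrange(len(q_values))
    # The candidate is never better than the greedy action, so delta <= 0
    # and the acceptance probability lies in [0, 1].
    delta = q_values[candidate] - q_values[greedy]
    accept_prob = math.exp(delta / temperature)
    return candidate if rng.random() < accept_prob else greedy
```

Under this assumption, a high temperature accepts almost any random candidate (exploration), while annealing the temperature toward zero makes the rule fall back to the greedy action (exploitation).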
Keywords/Search Tags: reinforcement learning, Q learning, temporal difference learning, Q(λ) learning, least squares, multi-agent system, conceptual design, collaborative design