
Research Of Hybrid Architecture For The Mobile Robot Agent

Posted on: 2008-07-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C H Li
GTID: 1118360242973798
Subject: Detection Technology and Automation
Abstract/Summary:
Mobile robots now work in unknown real-world environments rather than only in structured ones. When the traditional design approach based on the cognitive model is applied in such environments, its real-time performance, robustness and feasibility are severely challenged, and it cannot meet the requirements of actual operation because the natural environment is dynamic, complex and uncertain. Brooks' behaviorism changed the traditional design method by paying more attention to the adaptability and rapid response of the robot. The idea requires the robot to build its environment model through its own perception. This architecture belongs to modern reactive intelligence in artificial intelligence (AI). However, the system as a whole lacks overall management, and the real-time control system lacks autonomy and goal orientation, so it is applied only to simple tasks in unknown environments. Thus neither a purely deliberative nor a purely reactive architecture can meet the development and application needs of the robot. The control system architecture should reflect the basic characteristics of behavior while making high-level planning strategies and methods easy to carry out. The development of mobile robot architecture from deliberative, to reactive, to deliberative/reactive hybrid responds to this demand. One of the main contributions of the hybrid architecture is a template for fusing deliberation and reaction, which integrates symbolic and behavior-based functions well. The hybrid architecture has the real-time performance and adaptability of the reactive system as well as the goal orientation, optimization and autonomy of the symbol-based deliberative function. Learning and evolution functions can also be added to the architecture, giving the robot good active learning and adaptive capacity.
Hybrid architecture has become a popular topic in research on mobile robot agent architecture. Taking an autonomous mobile robot based on the Saphira single agent of the Pioneer 3 as its subject, this paper studies behavior design, behavior coordination, transitions between behaviors and other issues of the hybrid architecture. Following the general structure of intelligent control system design, it builds a hierarchical agent architecture with a reactive behavior control layer, a deliberative behavior control layer, and a supervision and management behavior control layer. In accordance with the general principles of intelligent control structure design, the structure needs an evaluation component, namely the supervision layer, whose function is to supervise and coordinate the execution of the reactive and deliberative layers. A learning function is also blended into it. In this way, the robot can learn adaptive behavior in a dynamic environment and build a prediction model for avoiding dynamic obstacles by training on sampled data. This integrated hybrid architecture can enhance the adaptability of the mobile robot in dynamic and unknown environments.
The major work can be summarized as follows.
1. A hybrid architecture for the mobile robot is designed based on the Saphira single agent. A supervision layer is added to the reaction/deliberation hybrid architecture of the Saphira agent, and supervision, coordination and learning/evolution modules are established in this layer to supervise and coordinate the execution of the reactive and deliberative layers. The layer also provides learning and behavior prediction in the unknown environment. A difference value judger is designed to coordinate the reactive and deliberative behaviors.
The judger is placed in the intersection module between reaction and deliberation and stores the difference between the robot's actual moving direction and the direction planned by the deliberative function. If the difference is no more than 90°, the hybrid plan is executed top-down: the deliberative layer triggers the reactive function, which executes the planned sub-goals. If the difference is more than 90°, execution is bottom-up: the reactive layer triggers global planning again to obtain a better path. The impact of unmodelled obstacles on the path planned by the deliberative layer is simulated with the pure reactive architecture and with the hybrid one. The comparison shows that, once the problem of interleaving reaction and deliberation is solved, the hybrid architecture performs better than the pure reactive system when the robot faces an uncertain environment.
2. For global path planning, a simple method suited to grid maps, the Steepest Descent Method (SDM), is introduced. First, the environment around the robot is sensed with a laser range finder and a grid map is built. SDM is then designed to meet the needs of the shortest path and obstacle avoidance. Using the shortest line between two points as heuristic information, the method propagates distances through free cells from the start cell, forming a gradient around it. Following the idea of greedy best-first search, the shortest path back to the start point is traced by walking downhill along the steepest descent from the goal cell. Finally, the algorithm is simulated in several different environments.
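The distance propagation and downhill trace described for SDM can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes a 4-connected grid with unit step costs and uses a plain breadth-first wavefront in place of the heuristic-guided propagation, and the names `propagate_distances` and `steepest_descent_path` are hypothetical.

```python
from collections import deque

FREE, OBSTACLE = 0, 1
# Assumption: 4-connected grid with unit step cost between adjacent free cells.
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def propagate_distances(grid, start):
    """Breadth-first wavefront: step count from `start` to each reachable free cell."""
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[start[0]][start[1]] = 0
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == FREE and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def steepest_descent_path(grid, start, goal):
    """Trace a path from `goal` downhill to `start`, returned in start-to-goal order."""
    dist = propagate_distances(grid, start)
    if dist[goal[0]][goal[1]] is None:
        return None  # goal is blocked or unreachable from start
    rows, cols = len(grid), len(grid[0])
    path = [goal]
    r, c = goal
    while (r, c) != start:
        best = None
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is not None:
                # Keep the neighbor with the smallest propagated distance.
                if best is None or dist[nr][nc] < dist[best[0]][best[1]]:
                    best = (nr, nc)
        r, c = best
        path.append(best)
    path.reverse()
    return path
```

In this sketch the distance table holds one value per grid cell, so its storage grows linearly with the number of cells.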
Judged against four evaluation criteria and supported by simulation and experimental results, SDM has the following advantages:
(1) Its time cost is low. An optimal path is usually found after a single search, so the method is suitable for real-time planning.
(2) It requires little storage, and the amount depends only on the cell size of the map. The space complexity is O(n), where n is the number of cells in the grid map.
(3) It is not sensitive to the complexity of the environment and can always find the optimal solution quickly.
(4) To meet different evaluation criteria, the algorithm can be extended by modifying how cell values are assigned, which yields different path search algorithms.
3. A reactive behavior design method is introduced, in which the reactive behavior is obtained by learning a deliberative behavior. The deliberative behavior is static local optimal path planning, learned by the Q-Learning (QL) method of reinforcement learning. Control rules are formed after learning and placed in the reactive layer, where they act as reflexes during execution. This realizes the design of the reactive layer.
(1) First, the input/output space is discretized. Then a lookup Q matrix M of size 11×192 is constructed to store each state-action pair.
(2) An action is chosen in each state according to the Boltzmann distribution. As learning progresses, the temperature T is gradually changed to alter the selection probabilities of the actions, which balances exploration and exploitation.
(3) The reinforcement signal is studied carefully in this paper and is non-uniform. Based on local optimal path planning, the signal is divided into two parts: one reflects the distance from the obstacles to the robot, and the other expresses how strongly the robot tends toward the target. Different reward and penalty values are given to the successor states reached by different actions in the same state.
In this way the convergence rate of learning is improved and the optimal action is guaranteed.
(4) The local path planning procedure of the robot is modeled as an uncertain MDP. The different actions executed in the same state are learned using the designed reinforcement signal, and the Q values are updated according to the modified Bellman formula.
(5) After QL, the state-action pair with the maximum Q value in each row is selected. The optimal control rules are formed by merging these pairs and are stored in the reactive layer, where they act as reflexes during execution.
(6) The performance of the method is tested in different environments under the control of the rules. The simulation results show that: ① the algorithm does not exhibit the 'symmetric indecision' phenomenon of conventional fuzzy control rules; ② the complexity of the environment has little effect on planning performance when a relatively short path is planned; ③ combined with the global path planning behavior of the deliberative layer, an optimal path of arbitrary length can be found in a complex environment.
(7) The algorithm is easy to extend. When the running environment changes substantially, QL continues. When the Q value of a state-action pair in the lookup table is no longer the largest in its row, only the corresponding control rule needs to be amended; the full set of control rules does not have to be redesigned.
4. A new hybrid dynamic obstacle avoidance algorithm for unknown environments is introduced. It combines rolling planning, dynamic prediction and the reactive behavior for static local optimal path planning, and effective simulation results are obtained. Its main contents are as follows.
(1) A dynamic prediction model is built. A camera is used to monitor dynamic obstacles in the workspace.
The robot collects the trajectory points of the obstacles in time and builds the prediction model based on the features of the plotted trajectory.
(a) When the dynamic obstacle moves approximately linearly, a linear regression model is fitted to the latest sampled time series values by the Ordinary Least Squares (OLS) method. The regression model is converted to an autoregressive model so that the dynamic obstacles can be avoided in time.
(b) When the dynamic obstacle moves in a nonlinear, random manner, a Radial Basis Function Neural Network (RBFNN) is used to build the prediction model. Its performance is compared with the commonly used Back Propagation Neural Network (BPNN) forecast model. The results show that the RBFNN model has higher forecast accuracy and a faster learning rate. Combined with the designed N/M data division, the model is well suited to nonlinear time series prediction.
(2) Rolling planning is combined with prediction to avoid the obstacles. A rolling window is set up within the detection range of the robot, and only the dynamic obstacles that enter the rolling window need to be considered for avoidance. Each time the robot moves forward one step, the static and dynamic obstacle information in the window is updated. The predicted locations of the dynamic obstacles are then treated as instantaneous static obstacles to be avoided. This method both reduces planning computation through real-time rolling-window planning and avoids dynamic obstacles through the prediction model. Finally, combined with the static obstacle avoidance method, it can find the optimal local path.
(3) The method is simulated together with the reactive behavior introduced in the fourth chapter. Simulation results show that the algorithm can avoid dynamic obstacles and can also find the optimal path.
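For the approximately linear case in item (1)(a), the OLS fit and one-step forecast can be sketched as follows. This is an illustration only: it assumes the obstacle samples are `(t, x, y)` tuples and fits each coordinate against time separately, it does not reproduce the conversion to an autoregressive model mentioned above, and the function names `fit_line` and `predict_position` are hypothetical.

```python
def fit_line(ts, vs):
    """Closed-form OLS fit of v = slope * t + intercept."""
    n = len(ts)
    t_mean = sum(ts) / n
    v_mean = sum(vs) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    if sxx == 0:
        raise ValueError("need at least two distinct time stamps")
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(ts, vs))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    return slope, intercept


def predict_position(samples, t_future):
    """Extrapolate an obstacle position from recent (t, x, y) samples.

    Assumption: x and y are each modeled as a linear function of time.
    """
    ts = [s[0] for s in samples]
    ax, bx = fit_line(ts, [s[1] for s in samples])
    ay, by = fit_line(ts, [s[2] for s in samples])
    return ax * t_future + bx, ay * t_future + by
```

A planner in the style of item (2) could call `predict_position` on the latest samples of each obstacle inside the rolling window and treat the returned point as a static obstacle for the next step.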
Keywords/Search Tags:Mobile robot agent, hybrid architecture, global optimal path planning, reactive behavior, dynamic obstacle avoidance