Study On Optimal Control In Communication Network Based On Reinforcement Learning

Posted on:2005-11-13

Degree:Master

Type:Thesis

Country:China

Candidate:Y B Zhang

GTID:2168360152468050

Subject:Communication and Information System

Abstract/Summary:

During the past decades, the rapid growth of Internet and mobile applications, especially real-time multimedia applications, has imposed diverse QoS (Quality of Service) requirements on communication networks. Maximizing network performance therefore requires the study of optimal control policies.

This thesis concentrates on the application of Markov decision processes (MDP) and reinforcement learning (RL) to communication network control, and investigates how to obtain optimal policies in several wireline and wireless network control environments. The concepts of MDP and RL are introduced first. Then, based on a definition of the network utility function, the dynamic admission control problem is formulated as a discrete MDP and solved with the Q-Learning algorithm. To overcome the "curse of dimensionality", the measured available bandwidth is used as the state descriptor, rather than the number of flows as in most previous work. An adaptive method is also proposed to partition the continuous bandwidth into discrete sets. This scheme is called measurement-based dynamic admission control (MBDAC). Simulation demonstrates that MBDAC improves the total long-term utility of the network while meeting given QoS constraints.

Next, we extend reinforcement learning to a continuous-state, continuous-action setting. A new learning algorithm with function approximation is proposed, and its convergence is proved. Based on this algorithm, we describe a new Active Queue Management scheme, Reinforcement Learning Gradient-Descent (RLGD). The key idea is to adjust the congestion measure and the performance measure separately and adaptively, without needing to know the rate adjustment scheme of the source sender. Its performance is evaluated and compared with REM and the PI controller using the NS-2 simulator. The results show that the RLGD scheme is much more responsive and robust to disturbance under various network conditions. The RL method is also applied to power control in ad hoc networks. Simulation demonstrates that power control can improve overall network throughput significantly, and that the performance of our scheme approaches that of the optimal algorithm.

Finally, the thesis makes an initial attempt to combine information theory and queuing theory, which appears to be an alternative and promising way to address QoS problems in communication networks. Some preliminary results are presented.
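
To make the admission control idea concrete, the following is a minimal sketch of tabular Q-Learning in which the state is a discretized measurement of available bandwidth, as the abstract describes. It is not the thesis's MBDAC implementation: the fixed bin edges (the thesis uses an adaptive partition), the reward values, the unit flow demand, and the toy arrival/departure model are all assumptions chosen only for illustration, and every function name is hypothetical.

```python
import random

REJECT, ACCEPT = 0, 1


def bandwidth_bin(available, edges):
    """Map a measured available bandwidth to a discrete state index.

    `edges` are ascending thresholds (assumed fixed here); the index is the
    number of thresholds that the measurement reaches or exceeds.
    """
    return sum(available >= e for e in edges)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice between REJECT and ACCEPT."""
    if rng.random() < epsilon:
        return rng.randrange(2)
    return REJECT if q[state][REJECT] >= q[state][ACCEPT] else ACCEPT


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-Learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def train(steps=5000, capacity=10.0, edges=(1.0, 4.0, 7.0), seed=0):
    """Learn an admission policy on a toy single-link model (assumed)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(len(edges) + 1)]
    used = 0.0
    state = bandwidth_bin(capacity - used, edges)
    for _ in range(steps):
        action = choose_action(q, state, 0.1, rng)
        reward = 0.0
        if action == ACCEPT:
            if used + 1.0 <= capacity:
                used += 1.0      # admitted flow earns utility (assumed +1)
                reward = 1.0
            else:
                reward = -5.0    # admitting without capacity (assumed penalty)
        if used > 0 and rng.random() < 0.3:
            used -= 1.0          # one active flow departs (assumed rate)
        next_state = bandwidth_bin(capacity - used, edges)
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


if __name__ == "__main__":
    for s, (q_rej, q_acc) in enumerate(train()):
        print(f"state {s}: Q(reject)={q_rej:.2f}  Q(accept)={q_acc:.2f}")
```

Because the Q-table is indexed by bandwidth bin rather than by the number of flows of each class, its size depends only on the number of bins, which is the dimensionality reduction the abstract points to.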

Keywords/Search Tags:

Communication Network, Optimal Control, Markov Decision Process, Reinforcement Learning, Dynamic Admission Control, Active Queue Management, Power Control, Information Theory, Queuing Theory

Related items

1	Learning Optimization Approach To Call Admission Control
2	The Research Of Channel Allocation Based On Markov Model In Wireless Communication Network
3	On Congestion Control For Computer Networks Based On Reinforcement Learning Theory
4	Active Queue Management Algorithm Based On Control Theory
5	On Active Queue Management Algorithms And Stabilities Based On Control Theory
6	Research On Active Queue Management Algorithm Based On Intelligent Control
7	Research Of Queue Management And Admission Control In Network QoS Control
8	Research On Several Algorithms Of Network Congestion Control Based On Control Theory
9	Active Queue Management Algorithm Based On Fuzzy Control Theory Is Studied
10	Control Theory-based Network Congestion Control