With the deep integration of mobile communication technology and the manufacturing industry, the wireless industrial Internet (WII) has become a key technology supporting future industrial development. The application scenarios created by applying advanced communication technologies to traditional manufacturing demand faster perception and more agile response from industrial systems, which makes the timeliness of information in the industrial Internet increasingly important. Studying dynamic scheduling and resource allocation strategies that guarantee information timeliness is therefore essential for improving industrial production efficiency. However, most wireless devices in the industrial Internet have limited battery capacity, and the shortage of wireless resources caused by complex production environments further restricts their ability to send update packets. Meeting the stringent timeliness requirements of industrial applications with limited communication resources thus poses many challenges. Meanwhile, the rapid development of artificial intelligence has become an important driving force for integrating emerging and traditional industries. It is of great research significance to apply artificial intelligence to the industrial Internet and to use deep reinforcement learning (DRL) to solve resource allocation problems that traditional optimization algorithms struggle to handle. In this thesis, the Age of Information (AoI) is used to quantify data timeliness. Considering the communication characteristics and performance requirements of the industrial Internet, DRL and related methods are adopted to solve timeliness-oriented resource allocation problems in the industrial Internet. The main work of this thesis is as follows.

First, it is difficult to satisfy the high-timeliness and low-energy-consumption
requirements of the industrial Internet simultaneously. This thesis therefore designs resource allocation strategies that maximize data timeliness while meeting the energy consumption requirements of the devices. A joint optimization problem of update scheduling and power allocation is formulated to minimize the average AoI under an average transmission power constraint. Because both the objective function and the constraint involve long-term time averages, this thesis uses the Lyapunov optimization method to decouple the long-term problem into per-slot subproblems and proposes a Deep Q-Network (DQN) based algorithm to solve each subproblem. Simulations verify the effectiveness of the proposed algorithm and show that it achieves the minimum average AoI while satisfying the average power constraint.

Second, because limited spectrum resources restrict sensor updates and lead to a backlog of data packets, this thesis formulates an industrial Internet resource allocation problem that minimizes the weighted sum of AoI and transmission energy consumption under limited communication resources. Because the AoI appears only implicitly in the objective function, this problem is difficult to solve with traditional optimization methods. Therefore, this thesis first models the problem as a Markov Decision Process (MDP) and proposes a Double Deep Q-Network (DDQN) based algorithm that jointly optimizes scheduling and subchannel allocation decisions to minimize the average AoI and transmission energy consumption. Simulations verify the convergence and effectiveness of the proposed algorithm and show that it significantly reduces the sensors' transmission energy consumption while maintaining near-optimal AoI
performance.

Finally, motivated by the requirements of real-time industrial applications for information freshness, energy consumption, and throughput, this thesis formulates a joint update scheduling and power allocation problem that minimizes a cost function under a long-term average throughput constraint. To solve this problem, this thesis combines Lyapunov optimization theory with DRL. First, the long-term optimization problem is transformed into a deterministic per-slot subproblem using Lyapunov optimization theory, and the long-term average throughput constraint is satisfied by guaranteeing the strong stability of the queue. Then, combining the advantages of the DDQN algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm, a novel reinforcement learning framework is proposed to directly solve the subproblem with a hybrid action space. Finally, simulations verify the effectiveness of the proposed algorithm, and the behavior of the cost function and queue length as the Lyapunov parameter varies is analyzed. The trade-off between average AoI and device energy consumption under different weight parameters is also analyzed.
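To make the AoI metric concrete, a common slotted-time formulation can be sketched as follows (this is the standard definition, not necessarily the exact notation used in the thesis; K, s_k, and T are illustrative symbols):

```latex
% AoI of sensor k at the monitor, in slotted time.
% s_k(t) = 1 if an update from sensor k is successfully delivered in slot t.
\Delta_k(t+1) =
\begin{cases}
1, & s_k(t) = 1,\\
\Delta_k(t) + 1, & s_k(t) = 0.
\end{cases}
% Long-term average AoI over K sensors, the kind of quantity minimized here:
\bar{\Delta} = \lim_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T}\frac{1}{K}\sum_{k=1}^{K}\Delta_k(t)
```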
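The Lyapunov decoupling used in the first and third problems follows the standard drift-plus-penalty pattern; a sketch for the average-power-constrained case is below (the symbols P(t), \bar{P}, Q(t), and V are illustrative, not the thesis's exact notation):

```latex
% Virtual queue tracking the constraint E[P(t)] <= \bar{P}:
Q(t+1) = \max\{\, Q(t) + P(t) - \bar{P},\; 0 \,\}
% Per-slot drift-plus-penalty subproblem: choose the schedule and P(t) to
\min \; V \cdot \Delta(t+1) \;+\; Q(t)\,\big(P(t) - \bar{P}\big)
% where V >= 0 trades off AoI against constraint satisfaction; keeping Q(t)
% strongly stable guarantees the long-term average constraint is met.
```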
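The difference between the DQN targets used in the first problem and the double estimator used in the second can be illustrated with a minimal sketch. The reward shaping (a weighted sum of AoI and energy, matching the second problem's objective) and all numeric values are illustrative, not taken from the thesis:

```python
def reward(aoi, energy, w1=1.0, w2=0.5):
    """Per-slot reward: negative weighted sum of AoI and transmission energy.

    w1, w2 are illustrative trade-off weights, not the thesis's values.
    """
    return -(w1 * aoi + w2 * energy)


def dqn_target(r, gamma, q_target_next):
    """Vanilla DQN target: both selection and evaluation of the next
    action use the target network, which tends to overestimate values."""
    return r + gamma * max(q_target_next)


def ddqn_target(r, gamma, q_online_next, q_target_next):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it, reducing overestimation bias."""
    a = max(range(len(q_online_next)), key=lambda i: q_online_next[i])
    return r + gamma * q_target_next[a]


# Toy example: AoI = 3 slots, transmission energy = 2 units.
r = reward(aoi=3, energy=2)          # -(1.0*3 + 0.5*2) = -4.0

# Hypothetical Q-values over three scheduling actions in the next state.
q_online = [1.0, 3.0, 2.0]
q_target = [0.5, 1.0, 4.0]

y_dqn = dqn_target(r, 0.9, q_target)            # -4.0 + 0.9*4.0 = -0.4
y_ddqn = ddqn_target(r, 0.9, q_online, q_target)  # -4.0 + 0.9*1.0 = -3.1
```

The DDQN target is lower here because the action the online network prefers (index 1) is valued less optimistically by the target network; this decoupling is what lets the second algorithm keep near-optimal AoI while avoiding the overestimation that can destabilize plain DQN.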