| A wireless sensor network is one of many heterogeneous networks in the sensing layer of the Internet of Things. In recent years, with the wide application of Internet of Things technology in industrial control, environmental monitoring, warehousing and logistics, and other scenarios, the data generated by the devices that access wireless sensor networks has grown geometrically. Traditional wireless sensor networks face great challenges in resource management, business processing, and quality of service, and because energy, storage, and computing resources are limited, routing optimization is particularly important. Existing wireless sensor network routing algorithms perform poorly in massive data transmission, multi-service support, and dynamic configuration and optimization. Deep reinforcement learning has powerful perception and decision-making capabilities and great potential for intelligent routing optimization. The separation of the data and control planes and the programmability of software-defined wireless sensor networks address the shortage of computing and data resources in traditional distributed wireless sensor networks, which facilitates applying deep reinforcement learning to routing optimization. This dissertation combines software-defined wireless sensor network technology with deep reinforcement learning to study the routing problem in software-defined wireless sensor networks. The main contents are as follows: 1. To meet the user’s demand for high-efficiency, low-latency routing, a routing optimization algorithm, ROA-DRL (Routing Optimization Algorithm based on Deep Reinforcement Learning), is designed on the Software Defined Wireless Sensor Network architecture; it optimizes routing while satisfying quality-of-service requirements. Using software-defined networking technology to obtain the global view information
of the wireless sensor network in real time, ROA-DRL combines the deep deterministic policy gradient algorithm with prioritized experience replay: it samples experiences from the replay pool according to their importance, so that important experiences are used in training to obtain optimal routing decisions. Simulation experiments show that ROA-DRL achieves higher throughput and lower delay than the RL-SDWSN and DCBSRP algorithms. 2. Single-path routing protocols are prone to network congestion when the data volume increases sharply. To allocate network resources reasonably, a multipath routing algorithm, MRAB-DRL (Multipath Routing Algorithm based on Deep Reinforcement Learning), is designed under the SDN architecture. MRAB-DRL obtains the global view information from the controller and builds its reward function from the remaining bandwidth of the node, the remaining energy of the node, and the delay, so that the agent iteratively updates its policy, selects the optimal action, and obtains multiple optimal routes from the source node to the destination node. The traffic is then divided into three sub-flows and transmitted in proportion to the Q value of each path. Simulation experiments show that, compared with the EER-RL and FASDN algorithms, MRAB-DRL has higher throughput, lower delay, and a longer network lifetime. Figure [18] Table [4] Reference [75]... |
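
The Q-value-proportional traffic split described for MRAB-DRL can be illustrated with a minimal sketch. The abstract does not specify how the ratio is computed, so the code below assumes strictly positive Q values and a plain normalization over the three highest-Q paths; the function name `split_traffic_by_q`, the path labels, and the numbers are hypothetical and are not taken from the dissertation.

```python
from typing import Dict


def split_traffic_by_q(q_values: Dict[str, float],
                       total_rate: float,
                       k: int = 3) -> Dict[str, float]:
    """Split total_rate over the k highest-Q paths, proportionally to Q.

    Assumption (not from the source): Q values are strictly positive,
    so each path's share is simply Q_i / sum(Q) of the total traffic.
    """
    if total_rate < 0:
        raise ValueError("total_rate must be non-negative")
    if not q_values:
        raise ValueError("at least one candidate path is required")

    # Keep the k candidate paths with the largest Q values.
    top = sorted(q_values.items(), key=lambda kv: kv[1], reverse=True)[:k]

    if any(q <= 0 for _, q in top):
        raise ValueError("this sketch assumes strictly positive Q values")

    q_sum = sum(q for _, q in top)
    # Each sub-flow carries a share of the traffic equal to its Q ratio.
    return {path: total_rate * q / q_sum for path, q in top}


if __name__ == "__main__":
    # Hypothetical Q values for four candidate source-to-destination paths.
    candidate_q = {"path_a": 5.0, "path_b": 3.0, "path_c": 2.0, "path_d": 1.0}
    for path, rate in split_traffic_by_q(candidate_q, total_rate=100.0).items():
        print(f"{path}: {rate:.1f}")
```

With these illustrative values the lowest-Q path is dropped and the remaining three paths carry the traffic in a 5:3:2 ratio. If Q values can be zero or negative, a different normalization (for example a softmax) would be needed; the abstract does not say which is used.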