The massive increase in the number of IoT terminal nodes has driven the development of satellite communication networks. Among these, the low Earth orbit (LEO) satellite communication network, characterized by wide coverage, low transmission delay, and low path loss, has become the most promising satellite communication carrier. The main problem is how to improve current communication performance and provide high-speed, stable communication services with limited resources. At present, civilian satellite Internet simulation platforms are almost never open source, the simulation frameworks they use differ, and there has been no systematic comparative study of them. The research and construction of a LEO satellite simulation system is therefore one of the main contributions of this paper. We explored the OMNeT++ and NetworkX frameworks, described in detail the functional module construction and system logic implementation under each, and evaluated and compared their performance in satellite routing tasks. On this basis, we selected NetworkX as the underlying architecture of the simulation system. For satellite routing, we first studied static routing in the LEO satellite communication network: we implemented a path-state-based shortest path algorithm and evaluated it in terms of average end-to-end delay, packet loss rate, and computation time, analyzing the deficiencies of static routing. Second, we proposed an adaptive routing algorithm based on multi-agent deep reinforcement learning for the LEO satellite Internet, aiming to reduce the packet loss rate and overall network congestion. The multi-agent system is modeled as a partially observable Markov decision process, and the concept of neighbor node confidence is introduced. The MA-DQN algorithm extends the DQN model by adding this confidence as an input to the state. The reward function design and a maximum hop limit give the routing strategy delay performance close to that of the shortest path, together with load-balancing capability. A centralized-training, distributed-execution multi-agent strategy is used to improve the scalability and speed of training without increasing communication overhead. Temporal decoupling of the data in the experience pool breaks the temporal correlation between experiences, and the dimension explosion of the multi-agent input is addressed by limiting the input dimension. We verified the performance of the MA-DQN adaptive satellite routing algorithm under different conditions. The experimental results show that the proposed strategy is suitable for LEO satellite communication systems and performs well in terms of robustness and load resistance.
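As an illustration of decoupling the experience pool from its time sequence, the following is a minimal sketch of a replay buffer with uniform random sampling, the standard mechanism used with DQN. It assumes that the decoupling described above amounts to drawing training batches at random from the pool rather than in arrival order. The names `ReplayBuffer` and `Transition`, the field layout, and the capacity and batch size are hypothetical and are not taken from the paper.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the field names are illustrative only.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size experience pool with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A bounded deque drops the oldest transition once capacity is reached.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self._buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sampling without replacement from the whole pool mixes transitions
        # from different time steps, so a batch is not a consecutive run.
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)


if __name__ == "__main__":
    pool = ReplayBuffer(capacity=1000, seed=0)
    for t in range(50):
        # Placeholder integers stand in for real observations and actions.
        pool.push(state=t, action=t % 4, reward=-1.0, next_state=t + 1, done=False)
    batch = pool.sample(8)
    print([tr.state for tr in batch])
```

In a centralized-training setup, one such pool could collect transitions from all satellite agents, with each agent's observation truncated to a fixed length before it is stored. That arrangement is also an assumption here, not a description of the paper's implementation.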