
Research On Host Penetration Test Of LAN Based On Partially Observable Markov Decision Process

Posted on: 2023-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: B W Jia
Full Text: PDF
GTID: 2558306845499694
Subject: Cyberspace security
Abstract/Summary:
As a key method for evaluating cyberspace security, penetration testing faces increasingly complex network topologies and critical assets, and the resources it consumes grow exponentially. Reducing the expert knowledge and manual experience required during a penetration test and improving its degree of automation by means of new technologies such as artificial-intelligence planning and decision-making has therefore become a focus of scholars and industry at home and abroad, and is of vital significance. This paper carries out research on the following two aspects.

First, hierarchical penetration planning based on the Partially Observable Markov Decision Process (POMDP) is proposed to address the state-space explosion that POMDPs suffer in large-scale network penetration testing. Existing POMDP-based penetration-test methods ignore the constraint that the network connections between LAN hosts impose on the state space, which degrades the overall penetration plan. This paper integrates host vulnerability exploitation and network connectivity into the state-space model. On this basis, a four-granularity hierarchical decomposition of the network topology (partition, subnet, host, and state-action) is designed, global penetration planning is achieved through a recursive mechanism, and the globally optimal POMDP solution is obtained by dynamic programming. This improves, to a certain extent, the practical operability of POMDP-based penetration testing.

Second, reinforcement-learning penetration planning based on the POMDP is proposed to improve the degree of automation of POMDP-based penetration testing. Existing reinforcement-learning approaches to automated penetration testing ignore the dynamics of the vulnerability-exploitation time window and the uncertainty of exploitation outcomes, which undermines the effective estimation of state-transition probabilities. Building on the hierarchical POMDP planning, this paper introduces reinforcement learning to construct an automated penetration method and its optimization. Double DRQN (DDRQN), an automated penetration-planning model that combines Long Short-Term Memory (LSTM) with Double DQN (DDQN), is proposed to address the instability and overestimation that arise during training. For optimization, bootstrap sampling is used to build a deep-exploration mechanism on top of the DDRQN model, so as to explore better penetration plans and provide a reference for improving the performance of automated penetration testing.

For the methods above, this paper builds a LAN penetration simulation environment with Tensorflow and generates POMDP models with the APPL toolkit for thorough experimental verification. For the hierarchical POMDP planning, the experimental results confirm the effectiveness of the hierarchical decomposition: the POMDP solver runs in about 63 seconds at a scale of 80 hosts, with an attack success rate of about 80%. For the reinforcement-learning planning, the results show that the DDRQN model runs in about 95 seconds on 145 hosts; with bootstrap sampling, the DDRQN model converges earlier when the K value is 20, and the model performs well.
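Both contributions rest on the standard POMDP belief update, in which the attacker maintains a probability distribution over the hidden network state (e.g. whether a host is vulnerable) and refines it after each action and observation. The following is a minimal illustrative sketch of that update; the two-state "patched/vulnerable" example and all transition and observation numbers are invented for illustration and are not taken from the thesis or its APPL models.

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """One Bayes-filter step for a discrete POMDP.

    b: belief over states, shape (S,)
    T: transition probabilities, T[a, s, s'] = P(s' | s, a)
    Z: observation probabilities, Z[a, s', o] = P(o | s', a)
    """
    # Predict: push the belief through the transition model.
    predicted = b @ T[a]                  # shape (S,)
    # Correct: weight each state by the likelihood of the observation.
    unnormalized = predicted * Z[a, :, o]
    return unnormalized / unnormalized.sum()

# Toy 2-state host: state 0 = "patched", state 1 = "vulnerable".
T = np.array([[[1.0, 0.0],                # one action: an exploit attempt
               [0.3, 0.7]]])
Z = np.array([[[0.9, 0.1],                # noisy scan result (o = 0 or 1)
               [0.2, 0.8]]])
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, a=0, o=1, T=T, Z=Z)
# Observing o=1 shifts the belief toward "vulnerable" (b1[1] ≈ 0.81).
```

Because the full belief grows with the joint state of every host, exact updates like this become intractable on large networks, which is precisely the state-space explosion the thesis's four-granularity decomposition is designed to contain.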
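The DDRQN model combines an LSTM (to cope with partial observability) with the Double DQN target rule (to curb overestimation). The recurrent part is omitted here, but the Double DQN target computation that distinguishes DDQN from plain DQN can be sketched as follows; the function name and the toy batch values are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def double_dqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Double DQN target values for a batch of transitions.

    The online network selects the greedy next action; the target network
    evaluates it. Decoupling selection from evaluation is what reduces the
    overestimation bias of plain DQN.
    """
    greedy_actions = np.argmax(q_online_next, axis=1)
    next_values = q_target_next[np.arange(len(rewards)), greedy_actions]
    # Terminal transitions (dones == 1) bootstrap no future value.
    return rewards + gamma * (1.0 - dones) * next_values

# Toy batch of two transitions over two actions.
q_online_next = np.array([[1.0, 2.0], [0.5, 0.1]])
q_target_next = np.array([[0.8, 1.5], [0.4, 0.2]])
targets = double_dqn_targets(q_online_next, q_target_next,
                             rewards=np.array([1.0, 0.0]),
                             dones=np.array([0.0, 1.0]))
# First transition: 1.0 + 0.99 * 1.5 = 2.485; second is terminal: 0.0
```

The bootstrap-sampling exploration the thesis adds on top of this would train K such value heads on resampled data and act greedily with respect to a randomly chosen head per episode, which drives deeper exploration than epsilon-greedy alone.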
Keywords/Search Tags:Penetration Testing, Partially Observable Markov Decision Process, Bootstrap Sampling, Reinforcement Learning