Research On Penetration Test Path Discovery Based On Deep Reinforcement Learning

Posted on:2024-06-10

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Wang

Full Text:PDF

GTID:2568307157983259

Subject:Master of Electronic Information (Professional Degree)

Abstract/Summary:

PDF Full Text Request

With the rapid development of Internet technology,network security issues have become increasingly prominent.Penetration testing,as a method to actively discover network security vulnerabilities,is of great significance to ensuring network security.However,traditional penetration testing methods have problems such as cumbersome manual operations and low efficiency.Aiming at these problems,this paper conducts research on penetration testing attack path discovery based on deep reinforcement learning.The main research contents include:(1)Aiming at the problems caused by complex attack graphs,such as excessively large state space,complex attack paths,difficulty in subsequent decision-making of the best attack path,and overly long attack paths that easily expose attackers,this paper first proposes a All Shortest Paths-Bidirectional Breadth-First Search(ASP-BBFS)algorithm.Based on the principle of bidirectional search,the algorithm constructs paths by traversing all possible parent nodes of the encountered nodes,so as to find all the shortest paths from the start point to the end point,and complete the simplification of the attack graph.Then apply the Markov Decision Process(MDP)to the attack graph,and use the Common Vulnerability Scoring System(CVSS)to add the reward and state transition probability information required in the MDP to complete the penetration Build test models.Finally,through experiments on two types of attack graphs,the experimental results prove that the ASP-BBFS algorithm is faster in calculating all the shortest attack paths,consumes less memory and is more robust than other attack graph path search algorithms.(2)Aiming at the problems that Deep Q Network(DQN)and its variant Deep Reinforcement Learning(DRL)algorithm are not efficient and difficult to converge in the penetration test attack path discovery scenario,this paper proposes a Normalized Entropic Dynamic Reward Scaled Clipped-Proximal Policy Optimization Algorithm(NHSC-PPO)is used to solve the optimal attack path.This algorithm improves the Proximal Policy Optimization(PPO)algorithm,and introduces advantage function normalization,policy entropy,dynamic reward scaling,and gradient clipping techniques.The NHSC-PPO algorithm first converts the network topology and vulnerability information into a state representation,and then through the continuous exploration and update strategy of the environment by the agent,finally finds the attack path with the highest reward value in the data set.Experimental results in various scenarios show that,compared with PPO,Double Dueling DQN(D3QN)and DQN algorithms,the attack path discovery of NHSC-PPO algorithm can more efficiently and accurately converge to the path with the highest reward value,and then Find the best attack path.

Keywords/Search Tags:

penetration testing, attack path discovery, attack graph, Markov Decision Process, deep reinforcement learning

PDF Full Text Request

Related items

1	Network Attack Path Discovery Based On Deep Reinforcement Learning
2	Attack Path Planning For Penetration Testing Based On Improved Drosophila Algorithm And Reinforcement Learning
3	Research On Attack Path Planning And Evaluation Method In Penetration Testing
4	Design And Implementation Of Penetration Testing System Based On MVC Framework
5	Research And Implementation Of Penetration Testing System Based On Reinforcement Learning
6	Research On Automatic Attack Framework Based On Priority Evaluation Of Virtualized Attack Modules
7	Research On Penetration Testing Method Based On Probability Attack Graph Model
8	Research On Host Penetration Test Of LAN Based On Partially Observable Markov Decision Process
9	Research Of Penetration Test Model Base On Minimal Attack Graph
10	Research On Automated Penetration Method Based On Attack Grap