| With the increasing use of computers, network security problems occur frequently, and people place higher demands on the security of network systems. Offensive testing (such as red teaming and penetration testing) assesses network security by simulating real hacker attacks, and its results are of great value to organizations in deciding how to protect their network systems. However, manual planning and execution of attacks by security experts incurs high economic, time, and labor costs. Automated offensive testing is therefore an active research topic, aiming at more efficient, lower-cost, and repeatable cybersecurity assessments. Decision-making for and construction of automated attack plans is an important part of automated offensive testing, intended to replace security experts in the attack decision-making process. In response to these problems and demands, this thesis studies automated offensive assessment: it combines offensive testing, automated planning, and reinforcement learning to design and implement an automated offensive assessment system. The main research work is as follows. Firstly, the overall design of the automated offensive assessment system is introduced, including demand analysis, feasibility analysis, overall architecture, design schemes, system functions, and evaluation indicators. Secondly, based on the ATT&CK attack behavior knowledge base and model, the adversary's attack behavior is summarized into 14 tactics and 61 common techniques/sub-techniques, and the summarized techniques/sub-techniques are coded, designed, and verified. Furthermore, the finite state machine and reinforcement learning decision-making schemes of the automated offensive assessment system are studied and tested in a simulation environment. The simulation results show that both the finite state machine decision-making scheme and the reinforcement learning decision-making scheme are feasible in the simulation test process. Among the reinforcement learning algorithms used to construct the attack plan in the reinforcement learning decision-making scheme, the PG (Policy Gradient) algorithm learns only a non-optimal attack plan and converges slowly, whereas the Q-Learning, SARSA, and DQN (Deep Q Network) algorithms all learn the optimal attack plan; of these three, Q-Learning converges fastest, followed by SARSA, and DQN is the slowest. Finally, to verify the feasibility and adaptability of the two schemes in a real network environment, 72 abilities of the automated offensive assessment system are tested in an experimental network topology. The test results show that the success rate of ability execution is 83.3%, indicating that the abilities of the automated offensive assessment system have good feasibility in the experimental test environment. The automated offensive assessment system is then tested and evaluated under both the finite state machine scheme and the reinforcement learning scheme. The experimental results show that, under both schemes, the system has good feasibility and adaptability in the experimental network environment. |
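
To make the idea of learning an attack plan with Q-Learning concrete, the following is a minimal sketch under stated assumptions, not the thesis's implementation. The four-stage attack chain, the two actions, the reward values, and the hyperparameters are all hypothetical and chosen only for illustration; the thesis's actual state space, action set (its ATT&CK-based abilities), and reward design are not given in the abstract.

```python
import random

# Hypothetical toy attack chain (an assumption, not the thesis's environment).
STATES = ["recon", "initial_access", "privilege_escalation", "goal"]
ACTIONS = ["advance", "noop"]
GOAL = len(STATES) - 1


def step(state, action):
    """Return (next_state, reward, done) for the toy environment."""
    next_state = min(state + 1, GOAL) if action == "advance" else state
    done = next_state == GOAL
    # Assumed reward design: small cost per step, bonus on reaching the goal.
    reward = 10.0 if done else -1.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in STATES]
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 50:  # step cap keeps every episode finite
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Off-policy target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            steps += 1
    return q


def greedy_plan(q):
    """Read the attack plan off the learned table by acting greedily."""
    state, plan = 0, []
    for _ in range(2 * len(STATES)):  # guard against a policy that never advances
        if state == GOAL:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        plan.append(ACTIONS[a])
        state, _, _ = step(state, ACTIONS[a])
    return plan


if __name__ == "__main__":
    print(greedy_plan(q_learning()))
```

SARSA would differ only in the update target: instead of `max(q[next_state])` it bootstraps from the value of the action the epsilon-greedy policy actually selects in the next state. This sketch says nothing about the relative convergence speeds reported above, which depend on the thesis's own environment.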