Research On Key Technologies Of Malware Dynamic Behavior Analysis

Posted on: 2021-04-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L L Wang
GTID: 1368330623982240
Subject: Computer Science and Technology
Abstract/Summary:
As system vulnerabilities continue to be discovered, the number of malware samples and their variants has grown exponentially. At the same time, large-scale APT attacks with varied purposes keep emerging, and attacks against personal computers and enterprise servers are becoming larger in scale and more systematic, intelligent, and complex. Attack methods keep multiplying and diversifying, which poses great challenges to traditional malware analysis and detection. As the self-protection capabilities of malware keep improving, traditional static analysis cannot exhaust all possible execution paths and fails to observe many behaviors, so it is increasingly unable to meet the requirements of malware analysis. The growing number of packed and variant samples also presents new challenges for dynamic analysis. A general and efficient dynamic analysis platform is therefore needed to capture the different behaviors of malware and abstract them into behavior features. In addition, to process the massive number of captured samples, an accurate malware classification model is needed that can analyze and judge suspicious programs, identify malicious features, and classify samples accurately.

In response to these challenges, and after a full analysis of the advantages and disadvantages of static and dynamic analysis, this dissertation designs a Cuckoo-based dynamic analysis sandbox system that creates a "simulated real environment" in which malware runs. The sandbox lets malicious behavior be fully exposed, captures all API sequences and their parameters during execution, and performs behavior abstraction. On this basis, the dissertation studies a deep residual network based on a self-attention mechanism, a matching algorithm based on the minimum behavior graph, and a recursive neural tensor network based on multi-layer semantic aggregation, in order to classify malware behavior accurately and improve detection accuracy. The main research results are as follows:

(1) Aiming at the problem that static analysis tools have insufficient ability to capture malware behaviors, a behavior-based abstraction method is proposed. The system constructs the environment a program requires at run time through the sandbox and captures all API sequences and parameter information during execution. By constructing an auxiliary table and analyzing the dependencies between APIs according to the system resources accessed at run time, the method abstracts the API sequence into behaviors and constructs the behavior feature vector of the sequence. In tests on four typical APIs, the behavior abstraction results produced by the method are completely consistent with the actual data, and the captured behaviors effectively characterize the behavior of the test program.

(2) Aiming at the problem that existing malware analysis has difficulty obtaining malicious behavior characteristics accurately, a deep residual neural network malware classification algorithm based on a self-attention mechanism is proposed. The algorithm borrows the deep residual network model from the field of image recognition, introduces a self-attention mechanism, and learns the similarity between similar samples by training on a large number of samples, automatically obtaining the differences that characterize different categories. The experimental results show that the detection rate of the deep residual neural network model is significantly higher than that of the machine learning algorithms, reaching 91.5%. In particular, after the self-attention mechanism is introduced, the detection rate increases by 2.5% compared with ResNet-50 and the false detection rate decreases by 3%, indicating that the self-attention mechanism helps extract more accurate features for classification and thereby improves the classification accuracy of the algorithm.

(3) Aiming at the problem that traditional behavior characterization cannot directly reflect the malicious attack intention of malware, a classification algorithm based on minimum behavior graph matching is proposed. Based on the malware behavior sequences captured by the sandbox system, the algorithm establishes a behavior relationship graph based on the "minimum behavior", proposes a behavior graph matching algorithm, and constructs 82 "minimum behavior" graphs of common malicious behaviors to describe them intuitively. The experimental results show that the "minimum behavior" graph matching algorithm can detect most malicious behaviors, and its capture ability is higher than that of common sandbox systems. Recognition-rate experiments were conducted on four major categories of malware samples. Except for the AutoRun category, the recognition rates for the other categories are all above 90%, indicating high detection accuracy.

(4) Aiming at the problem that general machine learning malware classification algorithms rely on program features without considering actual semantics, a malware analysis model that combines multi-layer semantic aggregation with a recursive neural tensor network (RNTN) is proposed. By studying the semantic aggregation relationships of malware, a multi-layer semantic aggregation model is proposed. Drawing on the way a recursive neural network computes layer by layer from the bottom up, and with the aim of reducing parameter computation, a malware analysis model combining multi-layer semantic aggregation and the recursive neural tensor network is constructed. To test the actual performance of the model, an RNTN-based malware analysis system with multi-layer semantic aggregation was built for the experiments. The experimental results show that the detection metrics of this model are better than those of the machine learning algorithms, so the model can improve malware analysis and detection performance and provides a good solution for malware analysis.
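The behavior abstraction in (1) and the graph matching in (3) can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm: the trace format (`api`, `ret`, `uses`), the function names, and the three patterns are hypothetical, and matching is simplified to an edge-subset test on API names instead of full graph matching. The `producer` dictionary is assumed to play a role loosely similar to the auxiliary table mentioned above, linking each resource handle to the call that created it.

```python
# Hypothetical trace: each call records the handle it returns ("ret")
# and the handles it consumes ("uses"). Handle values are made up.
TRACE = [
    {"api": "CreateFileW", "ret": 0x10, "uses": ()},
    {"api": "WriteFile", "ret": None, "uses": (0x10,)},
    {"api": "RegOpenKeyExW", "ret": 0x20, "uses": ()},
    {"api": "RegSetValueExW", "ret": None, "uses": (0x20,)},
    {"api": "CloseHandle", "ret": None, "uses": (0x10,)},
]

# Hypothetical "minimum behavior" patterns, each a set of required
# dependency edges (producer API, consumer API). These are illustrative
# and are not taken from the 82 graphs built in the dissertation.
PATTERNS = {
    "drop_file": {("CreateFileW", "WriteFile")},
    "set_registry_value": {("RegOpenKeyExW", "RegSetValueExW")},
    "inject_code": {
        ("OpenProcess", "WriteProcessMemory"),
        ("OpenProcess", "CreateRemoteThread"),
    },
}


def build_dependency_edges(trace):
    """Return edges (producer API, consumer API) linked by a shared handle."""
    producer = {}  # handle -> index of the call that produced it
    edges = set()
    for index, call in enumerate(trace):
        for handle in call.get("uses", ()):
            if handle in producer:
                edges.add((trace[producer[handle]]["api"], call["api"]))
        if call.get("ret") is not None:
            producer[call["ret"]] = index
    return edges


def match_behaviors(trace, patterns):
    """Return the sorted names of patterns whose edges all occur in the trace."""
    edges = build_dependency_edges(trace)
    return sorted(name for name, required in patterns.items() if required <= edges)


if __name__ == "__main__":
    print(match_behaviors(TRACE, PATTERNS))
```

Matching on API-name edges keeps the sketch short, but it would conflate two independent handles that happen to use the same API pair; a fuller implementation would match on graph structure and call parameters.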
Keywords/Search Tags: Malware behavior analysis, sandbox system, behavior abstraction, deep residual network, behavior graph matching