Research On Analysis Of Malware Based On Machine Learning And Intelligent Detection Technology

Posted on:2015-08-10

Degree:Doctor

Type:Dissertation

Country:China

Candidate:X Liu

Full Text:PDF

GTID:1108330464971599

Subject:Applied Mathematics

Abstract/Summary:

With the rapid spread of the Internet, personal computers, and mobile computing platforms, malicious software is emerging and growing just as rapidly, and it seriously threatens the information security of computer users. The author studies the difficult problems in malware behavior analysis.

In behavior analysis, the author proposes using time-series data to identify malware variants and designs the SimHash-LCS algorithm to address the shortcomings of existing solutions. To preserve the detailed information of malicious behaviors without reducing efficiency, the SimHash concept is introduced to convert behavior data into numerical values, and a corresponding fuzzy-equality algorithm is designed. For sequence matching, the longest common subsequence (LCS) is introduced; it suits similarity evaluation between two sequences of greatly different lengths and can filter out noise data. Experimental results show that the algorithm is much more effective than mainstream algorithms, including dynamic time warping and minimum edit distance, and that it can effectively identify malware variants. Because of these advantages, the algorithm can also be applied in other fields that require time-series analysis, especially where the sequences being matched differ greatly in length and noise filtering is a strong requirement.

A BP neural network is introduced into malware behavior classification. An appropriate data conversion algorithm is designed, and experiments are used to find the best combination of operators and parameters for the network, which leads to a suitable BP neural network design. Experiments show that the network achieves higher classification accuracy than the KNN and NB algorithms and already has some practical value.

The paper also attempts to introduce SVM into malware behavior classification.
First, 10-fold cross-validation is used to select the SVM algorithm; experiments are then designed to find the optimal parameters (C, g) for each SVM kernel function. To reduce the experimental workload, the author analyzes theoretically the region in which the optimal combination may appear, performs an initial search of this region with a grid method and a genetic algorithm, and then refines the search with genetic algorithms. This yields the best parameter pair for the SVM with an RBF kernel function. Experimental results show that the SVM achieves classification accuracy close to that of the preceding BP neural network.

The paper further shows that, under existing technical conditions, neither building the behavior library nor capturing behaviors can guarantee the accuracy and adequacy of the data. To address partly missing data, the paper introduces the concept of gray systems and integrates the gray system with the Extreme Learning Machine (ELM) to design gray Extreme Learning Machine models. Experiments test the model's anti-interference ability and other indicators, and the results show that the gray Extreme Learning Machine model adapts better than the ELM in malicious behavior analysis.

In constructing the malicious behavior library, the author gives a formal definition of malware and malicious behavior, uses existing security tools to set up an integrated platform for tracking and analyzing malware samples, and uses self-designed XML tags to describe malicious behaviors in detail. By these means, a relatively complete malicious behavior signature library is established.

For application-layer monitoring, the author also proposes a new method that combines module injection with module-free injection.
An ordinary module is injected first in a way meant to be overlooked by the malware; the module then removes itself, so that the malicious software cannot detect the presence of the monitoring software. Solutions are given for some typical technical problems that arise in application. Tests show that this method has good concealment and universality.

For kernel-level monitoring, a new technique called Secret Inline Hook is proposed as an optimization of the SSDT Inline Hook. Its basic idea is to hook the next-layer functions beneath those in the SSDT. Anti-monitoring by malware becomes almost impossible, because it would have to traverse all of the lower-layer functions, of which there are a large number, so the method is well concealed. The author gives an example to demonstrate the application of this method and verifies its security and effectiveness through experiments.
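
The sequence-matching idea summarized in the abstract (fingerprint each behavior record, compare fingerprints with a fuzzy notion of equality, then score two traces by their longest common subsequence) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the whitespace tokenization, the 64-bit BLAKE2b-based fingerprint, the Hamming-distance threshold of 3, the normalization by the shorter trace, and all function names are assumptions made for illustration only.

```python
import hashlib


def simhash(record: str, bits: int = 64) -> int:
    """Fingerprint a behavior record (assumed: whitespace tokens, equal weights)."""
    weights = [0] * bits
    for token in record.split():
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            weights[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when the votes for it are positive.
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def fuzzy_equal(a: int, b: int, max_distance: int = 3) -> bool:
    """Treat two fingerprints as equal if their Hamming distance is small (assumed threshold)."""
    return bin(a ^ b).count("1") <= max_distance


def lcs_length(xs, ys, equal) -> int:
    """Length of the longest common subsequence under a custom equality test."""
    prev = [0] * (len(ys) + 1)
    for x in xs:
        curr = [0]
        for j, y in enumerate(ys, start=1):
            if equal(x, y):
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(trace_a, trace_b, max_distance: int = 3) -> float:
    """Score in [0, 1]; normalizing by the shorter trace is an assumption."""
    if not trace_a or not trace_b:
        return 0.0
    fa = [simhash(r) for r in trace_a]
    fb = [simhash(r) for r in trace_b]
    common = lcs_length(fa, fb, lambda a, b: fuzzy_equal(a, b, max_distance))
    return common / min(len(fa), len(fb))


if __name__ == "__main__":
    base = ["create file temp.exe", "write registry run key", "connect remote host"]
    noisy = ["query system time", *base[:2], "sleep briefly", base[2], "read config"]
    print(similarity(base, noisy))
```

Because the LCS skips unmatched elements, the extra records interleaved in `noisy` do not lower the score, and dividing by the shorter length keeps the score meaningful when the two traces differ greatly in length. A different normalization or threshold would change the scores but not the overall structure.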

Keywords/Search Tags:

Malware behavior analysis, Secret inline hook, SimHash-LCS, Machine learning, Gray extreme learning machine

Related items

1	Research On Feature Extraction And Classification Of Malware Based On Machine Learning
2	Application Research On Feature Extraction And Classification Of EEG Signal With The Method Of ELM
3	Study On Gray Extreme Learning Machine Prediction Algorithm
4	Research And Application Of Classification Method Of Robust Extreme Learning Machine
5	Research On Hybrid Hierarchical Extreme Learning Machine Algorithm
6	Research And Implementation Of Malware Family Classification Method Based On The Extreme Learning Machine
7	Research On Object Detection Method Based On Extreme Learning Machine
8	Research On Behavior Detection And Source Tracing Of APT Malware Based On Machine Learning
9	Research On The Classification Of Stroke TCD Data Based On Extreme Learning Machine
10	Research On The Selection Of Hidden Layer Nodes For Extreme Learning Machine