Detection Techniques Of Malware Based On Code Texture

Posted on: 2020-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: J L Zhang
GTID: 2428330572467243
Subject: Communication and Information System
Abstract/Summary:
In recent years, the development of anti-detection technology has dramatically increased the amount of malware, and malicious code has become one of the main threats to network security. Security vendors are working on malicious code detection. Malware classification occupies an important position in that field and has become a hot research topic. Traditional malware classification suffers from many problems, including low classification efficiency, poor detection accuracy, feature extraction that lags behind the growth in viruses, and the inability to detect unknown viruses. This thesis proposes two methods for classifying malware families based on the homology between them.

The first method uses a "content + machine learning" approach: features are extracted from static malware files and fed into different machine learning algorithms for malware family classification. Feature extraction and fusion are an important part of malware analysis and detection research. This thesis starts from the virus file and its grayscale image to extract and fuse different features. First, two local features are extracted: Opcode N-grams from the virus file and grayscale texture from its converted grayscale image. Second, the grayscale histogram is adopted as a global feature. Finally, random forest models are trained on the single and fused feature vectors, and the results are compared.

However, this kind of feature extraction must be done manually by researchers, which consumes a lot of manpower. To extract malware features automatically, this thesis proposes a second method that uses deep learning for malware classification. First, the B2M algorithm converts malicious code files into grayscale images. Then, a single-channel convolutional neural network (CNN) structure is designed to automatically learn and mine features from the grayscale image training set. Finally, a softmax layer at the network output classifies the malware families.

Experiments and analysis show that fused features combined with the random forest algorithm classify malware families well, with an average accuracy of 99.59%. For the second method, after the neural network is trained, the model's performance is evaluated on the test set and a good classification result is obtained. The accuracy exceeds 90.47%, which shows the great development potential of deep learning in the security field. Finally, the thesis compares, analyzes, and summarizes the two classification methods in depth.
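
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the fixed image width of 256, the zero-padding of the last row, and the N-gram length are all assumptions made for illustration. The sketch shows a byte-to-pixel grayscale mapping in the spirit of B2M, a normalized grayscale histogram as a global feature, Opcode N-gram counting as a local feature, and a softmax over class scores.

```python
import math
from collections import Counter


def bytes_to_gray_image(data, width=256):
    """Map each byte (0-255) to one pixel and wrap rows at a fixed width.

    The width and the zero-padding of the final row are assumptions.
    """
    rows = [list(data[i:i + width]) for i in range(0, len(data), width)]
    if rows and len(rows[-1]) < width:
        rows[-1] += [0] * (width - len(rows[-1]))
    return rows


def gray_histogram(image):
    """Global feature: normalized 256-bin histogram of pixel intensities."""
    counts = [0] * 256
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel] += 1
            total += 1
    if total == 0:
        return [0.0] * 256
    return [c / total for c in counts]


def opcode_ngrams(opcodes, n=3):
    """Local feature: count every run of n consecutive opcodes."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    image = bytes_to_gray_image(bytes(range(10)), width=4)
    print(len(image), "rows of", len(image[0]), "pixels")
    print(opcode_ngrams(["push", "mov", "push", "mov"], n=2))
    print(softmax([1.0, 2.0, 3.0]))
```

In a pipeline like the one described, the histogram and N-gram counts over a fixed vocabulary would be concatenated into one fused feature vector before being passed to a random forest classifier. The grayscale image itself would be the input to the CNN, whose final scores go through the softmax.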
Keywords: Malware, Opcode N-gram, Code Texture, RF, Single Channel CNN