Research On Malware Classification Based On Deep Learning And Multi-Feature Fusion

Posted on:2022-08-29

Degree:Master

Type:Thesis

Country:China

Candidate:Z X Liu

Full Text:PDF

GTID:2518306749971859

Subject:Automation Technology

Abstract/Summary:

According to statistics, 261,603 malicious programs have been captured in China since 2020, and the consequences of malware attacks are severe. With so many samples, classifying malware is particularly important, and more accurate classification methods help defenders respond to attacks. As malware continues to evolve and the number of malware types grows, traditional static and dynamic classification methods cannot cope with emerging malware. This thesis therefore proposes a new classification model that combines multi-feature fusion with deep learning. Experimental results show that it classifies more accurately than traditional methods. The main contents and work of this thesis are as follows:

(1) The malware executable is disassembled to generate a .bytes file and an .asm file, and n-gram instruction features are extracted from the assembly file by a feature extraction algorithm. The B2M algorithm converts the binary files into grayscale images, from which texture features are extracted. A feature fusion algorithm is designed to fuse the two kinds of features.

(2) Building on the long short-term memory (LSTM) network, a classification model based on a bidirectional LSTM (BiLSTM) network is proposed. The optimal parameters of the BiLSTM model are obtained by the control variable method, that is, by varying one parameter at a time while holding the others fixed. The n-gram features, texture features, and fused features are each fed into the BiLSTM model. With the fused features as input, the average classification accuracy is 96.8%, which is 0.7 percentage points higher than the 96.1% obtained with a single feature. Comparative experiments also show that the BiLSTM model classifies malware families more accurately than traditional models such as random forest, SVM, and KNN.

(3) Based on the structural characteristics of the CNN model, a CNN-BiLSTM network model is designed and optimized by parameter tuning. Comparative experiments show that the CNN-BiLSTM model reaches a classification accuracy of 97.39% on malware families, which is 0.59 percentage points higher than that of the BiLSTM model. On top of this model, the thesis designs and introduces an oversampling method and a loss function, which improve the accuracy by a further 0.16 percentage points to 97.55%. The results show that the model performs well for malware family classification.
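
The feature extraction and fusion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the fixed image width, the two toy texture descriptors (mean intensity and horizontal contrast), and fusion by plain concatenation are all assumptions made for illustration, since the abstract does not specify the texture descriptor or the fusion algorithm.

```python
from collections import Counter
import math


def opcode_ngrams(opcodes, n=3):
    """Count every run of n consecutive opcode mnemonics."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def ngram_vector(opcodes, vocabulary, n=3):
    """Relative frequency of each vocabulary n-gram in one sample."""
    counts = opcode_ngrams(opcodes, n)
    total = sum(counts.values()) or 1  # avoid dividing by zero on short inputs
    return [counts[gram] / total for gram in vocabulary]


def bytes_to_gray_image(data, width=256):
    """Treat each byte as one 0-255 pixel and fill rows of `width` pixels.

    The fixed width is an assumption; the abstract does not say how the
    image width is chosen.
    """
    if not data:
        raise ValueError("empty input")
    rows = math.ceil(len(data) / width)
    padded = data + bytes(rows * width - len(data))  # zero-pad the last row
    return [list(padded[r * width:(r + 1) * width]) for r in range(rows)]


def texture_features(image):
    """Two toy descriptors, both scaled to [0, 1]: mean intensity and the
    mean squared difference between horizontally adjacent pixels."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels) / 255
    diffs = [(a - b) ** 2 for row in image for a, b in zip(row, row[1:])]
    contrast = (sum(diffs) / len(diffs) / 255 ** 2) if diffs else 0.0
    return [mean, contrast]


def fuse(ngram_features, image_features):
    """Fusion by plain concatenation (an assumption, not the thesis's algorithm)."""
    return list(ngram_features) + list(image_features)


if __name__ == "__main__":
    # Hypothetical sample: a short opcode sequence and a few raw bytes.
    opcodes = ["push", "mov", "call", "push", "mov", "call"]
    vocabulary = [("push", "mov"), ("mov", "call"), ("call", "push")]
    image = bytes_to_gray_image(bytes([0, 255, 10, 20, 30]), width=2)
    print(fuse(ngram_vector(opcodes, vocabulary, n=2), texture_features(image)))
```

In a pipeline like the one the abstract describes, a fused vector of this kind would be the input to the classifier. The sketch stops at feature construction and does not cover the BiLSTM or CNN-BiLSTM models.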

Keywords/Search Tags:

Malware code, N-gram, Grayscale texture, CNN, BiLSTM

Related items

1	Detection Techniques Of Malware Based On Code Texture
2	Malware Detection Based On Deep Learning
3	Android Malware Detection Research Based On The Feature Of Dalvik Instruction
4	Research On Grayscale Malware Image Classification Based On Convolutional Neural Network
5	Identifying malware using n-gram clustering metrics
6	Research On Malware Classification Based On Image Texture Features
7	Research On Key Technology Of Malware Detection
8	3D Facial Synthesis Based On Grayscale
9	Research On Android Malicious Code Recognition Based On Image Features
10	Static Virus Detection Based On Binary Opcode Semantic Optimization