Malware Identification Based On Fuzzy KNN And Visualization Analysis

Posted on:2020-11-12

Degree:Master

Type:Thesis

Country:China

Candidate:M D Tang

GTID:2428330599964891

Subject:Software engineering

Abstract/Summary:

The rapid development of the Internet has brought great convenience to society, but computer security problems have followed. Malware is one of the most important threats: it poses serious risks to individuals, organizations, financial institutions, the military and even entire countries. The study of malware has therefore long been a research hotspot in computer security. With the rapid development of automatic malware generation and obfuscation techniques, the types and quantities of malware have exploded, and traditional detection methods based on signature matching can no longer meet current security requirements. Machine learning and deep learning play an increasingly important role in malware research. The main work of this thesis includes:

1. A Fuzzy KNN (FKNN) algorithm is proposed by combining fuzzy set theory with the KNN algorithm, and it is applied to malware identification. In the feature extraction phase, the PE file structure information of each sample is first extracted through static analysis, and fuzzy set theory is then used to generate a fuzzy vector for the sample. The "maximum fuzzy region matching principle" is used to filter out the interference of outliers, and the Euclidean distance between fuzzy vectors is calculated to find the k nearest neighbors. In the classification phase, each neighbor's vote is weighted by the reciprocal of its index, which handles unbalanced data sets better. Finally, the class with the largest sum of voting weights is taken as the predicted label. On the public ClaMP dataset, the FKNN algorithm achieves an accuracy of 0.952, a recall of 0.977 and an AUC of 0.99, outperforming Classical KNN (CKNN), Local Mean KNN, SVM and the other algorithms compared.

2. A visualization method for dynamic API call sequences is proposed and combined with deep learning to perform malware classification. The method takes into account factors such as the type, time and frequency of the API calls made during dynamic execution, and generates a feature image that reflects the behavior pattern of the malware. A convolutional neural network (CNN) is used to learn and classify the feature images, thereby indirectly classifying the malware. Experiments on 9 malware families show that the method achieves a classification accuracy of 0.993, a recall of 0.993 and a false positive rate (FPR) of 0.00085. As the number of test samples increases, the time spent in the classification phase remains at the millisecond level, so the method is both accurate and efficient.
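The voting step described for FKNN can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that "reciprocal of index" means the reciprocal of a neighbor's rank when neighbors are sorted by Euclidean distance (1 for the nearest, 1/2 for the second, and so on), and it omits the fuzzy vector generation and the "maximum fuzzy region matching principle", which the abstract does not define. The function name `rank_weighted_knn` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def rank_weighted_knn(train_vectors, train_labels, query, k=3):
    """Predict a label by reciprocal-rank weighted voting over the k nearest neighbors.

    Assumption: a neighbor's "index" is its 1-based rank by Euclidean distance.
    """
    # Euclidean distance from the query to every training vector.
    scored = [
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    ]
    # Nearest first; Python's sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0])

    # The neighbor at rank r contributes a vote of weight 1 / r to its class.
    votes = defaultdict(float)
    for rank, (_, label) in enumerate(scored[:k], start=1):
        votes[label] += 1.0 / rank

    # The class with the largest summed weight is the prediction.
    predicted = max(votes, key=votes.get)
    return predicted, dict(votes)


# Toy, made-up data: one "benign" sample against two "malware" samples.
train_vectors = [(0.0, 0.0), (1.0, 0.0), (1.2, 0.0)]
train_labels = ["benign", "malware", "malware"]

label, votes = rank_weighted_knn(train_vectors, train_labels, (0.4, 0.0), k=3)
print(label)  # benign
```

In this toy case an unweighted majority vote over the three neighbors would return "malware", while the rank-weighted vote gives "benign" a weight of 1 against 1/2 + 1/3 for "malware". This is one way the weighting can favor a nearby minority-class sample, which is consistent with the abstract's remark about unbalanced data sets.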

Keywords/Search Tags:

Malware, k-nearest neighbors, visualization, deep learning, static analysis, API call sequence

Related items

1	Research On Android Malware Detection Method Based On Dynamic And Static Analysis
2	Malware Detection Based On Deep Learning Of Image Features
3	Research And Implementation Of Malware Classification Based On Deep Learning
4	Research On Malware Visualization For Detection And Classification Based On Deep Learning
5	Research On Visual Detection Of Malware Based On Deep Learning
6	OpCode-Level Function Call Graph Based Android Malware Classification Using Deep Learning
7	Research On Malware Detection Technology Based On Multi-feature Fusion And Deep Learning
8	Research On Internet Of Things Malware Classification Algorithm Based On Deep Learning
9	Malware Detection And Classification Based On Deep Learning
10	API Call Sequence-based Malware Detection Method For Windows Platform