Malware Variants Detection Using Density-Based Spatial Clustering With Global Opcode Matrix

Posted on: 2019-03-04 | Degree: Master | Type: Thesis
Country: China | Candidate: Z J Niu | Full Text: PDF
GTID: 2428330545950665 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, malware has become one of the most serious threats to Internet security and harms users' computer systems. Over the past decade the number of malware samples has grown rapidly, posing a serious threat to the security of computer networks and systems. Given the spread of malware and the growing number of malware variants, research on new malware detection methods is extremely important.

Many malware detection methods have been proposed in recent years. Traditional signature-based (characteristic code) methods build a signature library by reading the binary sequences of malicious samples and detect known malware by comparing new samples against that library, but they cannot detect malware variants or obfuscated malware. Some researchers therefore use machine learning algorithms and measure the similarity of features to detect malware variants effectively. However, many current machine-learning-based detection methods consume a great deal of training and detection time and spend considerable network resources when extracting features.

To address these shortcomings, this thesis proposes a malware detection method based on a global two-dimensional opcode matrix and density-based clustering. The method first disassembles the malware samples in the training set into assembly code and extracts the opcode sequences. It then uses the N-gram algorithm to transform them into two-dimensional opcode sequences and constructs a two-dimensional opcode matrix using information gain and related measures. Next, it applies Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to cluster the two-dimensional opcode matrices of the training set, and calculates the similarity between the opcode matrix of each test sample and the cluster matrices. Finally, each test sample is judged to be malware or not according to this similarity.

The main contributions of this thesis are as follows. The global opcode sequence is extracted without losing the behavior information of the malware, which preserves the accuracy of the detection process. A clustering algorithm extracts common behavior patterns, which reduces the search space when matching these patterns, so training time and detection time are greatly shortened. The experimental results show that the proposed method reduces the time cost without sacrificing accuracy: the accuracy reaches up to 90%, the training time is 0.045 s per sample, and the detection time is 0.46 s per sample. Compared with other detection methods, it has clear advantages in accuracy and time overhead. As the scale of malware continues to grow, the method has broad commercial value.
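The pipeline described in the abstract (opcode N-grams, information gain, DBSCAN, similarity matching) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes N = 2, a binary present/absent feature for information gain, labelled malicious and benign samples for feature selection, cosine distance inside DBSCAN, a flattened vector in place of the two-dimensional matrix, cluster centroids as the "cluster matrices", and a similarity threshold of 0.9. All function names and the toy opcode sequences are hypothetical.

```python
import math
from collections import Counter

def opcode_ngrams(opcodes, n=2):
    """Count every run of n consecutive opcodes in one disassembled sample."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def information_gain(present, labels):
    """Information gain of a binary feature (n-gram present or absent)."""
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for p, y in zip(present, labels) if p == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain

def select_features(counts, labels, k):
    """Keep the k n-grams with the highest information gain."""
    vocab = sorted({gram for c in counts for gram in c})
    scored = [(information_gain([g in c for c in counts], labels), g) for g in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [g for _, g in scored[:k]]

def to_vector(count, features):
    """Relative frequency of each selected n-gram (assumed flattened matrix)."""
    total = sum(count.values()) or 1
    return [count[g] / total for g in features]

def cosine_similarity(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN on cosine distance (an assumption); -1 marks noise."""
    def region(i):
        return [j for j, q in enumerate(points)
                if 1.0 - cosine_similarity(points[i], q) <= eps]
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = region(j)
            if len(more) >= min_pts:
                queue.extend(more)
    return labels

def centroids(points, labels):
    """Mean vector of each cluster, ignoring noise points."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            groups.setdefault(lab, []).append(p)
    return {lab: [sum(col) / len(ps) for col in zip(*ps)] for lab, ps in groups.items()}

def is_malware(opcodes, features, cents, threshold=0.9):
    """Flag a sample whose vector is close enough to any cluster centroid."""
    vec = to_vector(opcode_ngrams(opcodes), features)
    return any(cosine_similarity(vec, c) >= threshold for c in cents.values())

# Toy opcode sequences (hypothetical, for illustration only).
malware = [["xor", "xor", "call", "jmp", "xor", "xor", "call"],
           ["xor", "xor", "call", "jmp", "push", "xor", "xor"],
           ["push", "xor", "xor", "call", "jmp", "xor", "xor"]]
benign = [["mov", "add", "mov", "add", "ret"],
          ["mov", "add", "cmp", "mov", "add", "ret"],
          ["push", "mov", "add", "mov", "ret"]]

counts = [opcode_ngrams(s) for s in malware + benign]
features = select_features(counts, [1] * len(malware) + [0] * len(benign), k=4)
malware_vectors = [to_vector(c, features) for c in counts[:len(malware)]]
cluster_labels = dbscan(malware_vectors, eps=0.1, min_pts=2)
cluster_centroids = centroids(malware_vectors, cluster_labels)
```

A new sample is then checked with `is_malware(sample_opcodes, features, cluster_centroids)`. Matching against a handful of centroids rather than every training sample reflects the reduced search space the abstract attributes to clustering. The `eps`, `min_pts`, `k`, and `threshold` values are chosen only to suit the toy data.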
Keywords/Search Tags: Malware Detection, Opcode Sequence, Opcode Matrix, Information Gain, Density-Based Clustering, Cluster