Research On Automatic Malware Classification Techniques

Posted on: 2018-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhou
GTID: 2346330518993334
Subject: Information and Communication Engineering
Abstract/Summary:
The Internet is now deeply integrated into every aspect of daily life, and network security has become an urgent problem. With the rapid growth of malicious code, however, the current state of network security is not encouraging. China, one of the countries most seriously affected by cyber security threats, has made the security of cyberspace part of its national security strategy. As a foundation of network security, the automatic analysis of malicious code needs to be taken seriously. Based on the observation that malware samples from the same family share significant homology, this thesis studies automatic malware classification techniques using pattern recognition and machine learning.

First, this thesis proposes a malicious code classification method based on PRICoLBP. The method converts a binary file into a grayscale image, describes the image texture with the PRICoLBP feature, and then classifies the binary file with a classifier. Second, this thesis presents a sparse representation method for classifying malicious code. The method converts a binary file into a bit vector and reduces its dimension by pooling and random projection. A sparse representation of the malicious code is then generated by dictionary learning. Because the method relies on feature learning, no feature selection is required. Third, this thesis combines the two features above. The improved method applies a sliding window, extracts the PRICoLBP feature from each window, and finally generates a sparse representation of the whole feature corpus. Finally, this thesis designs a hierarchical malicious code classification system. The overall system substantially reduces the logarithmic loss at the cost of a small drop in accuracy, and is therefore more practical.

Empirical observation shows that the proposed methods not only distinguish different malware families well, but also yield features with good linear separability. Compared with other existing algorithms, the proposed methods have advantages in accuracy and in tolerance to obfuscation.
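
The two preprocessing steps described in the abstract (binary file to grayscale image, and binary file to bit vector followed by pooling and random projection) can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's implementation: the function names, the image width of 16, the use of max pooling, the Gaussian projection matrix, and the seed are all assumptions, and the PRICoLBP extraction and dictionary learning stages are not shown.

```python
import random


def bytes_to_grayscale(data, width=16):
    """Reshape raw bytes into rows of `width` pixels (values 0-255).

    The last row is zero-padded. The width is an assumed parameter.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def bytes_to_bits(data):
    """Expand each byte into 8 bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def max_pool(bits, size):
    """Non-overlapping max pooling (assumed pooling type).

    The final window may be shorter than `size`.
    """
    return [max(bits[i:i + size]) for i in range(0, len(bits), size)]


def random_projection(vec, out_dim, seed=0):
    """Project `vec` to `out_dim` dimensions with a seeded Gaussian matrix.

    Each output coordinate is the dot product of `vec` with a freshly drawn
    Gaussian row, scaled by 1/sqrt(out_dim). In practice all inputs would
    need the same length so that they share one projection matrix.
    """
    rng = random.Random(seed)
    scale = 1.0 / out_dim ** 0.5
    return [
        scale * sum(rng.gauss(0.0, 1.0) * x for x in vec)
        for _ in range(out_dim)
    ]


if __name__ == "__main__":
    sample = b"\x4d\x5a\x90\x00\x03\x00\x00\x00"  # made-up example bytes
    image = bytes_to_grayscale(sample, width=4)
    pooled = max_pool(bytes_to_bits(sample), size=4)
    reduced = random_projection(pooled, out_dim=3, seed=0)
    print(image)
    print(pooled)
    print(len(reduced))
```

The sketch uses only the standard library so that it stays self-contained; a practical version would more likely use an array library and a fixed input length per sample.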
Keywords/Search Tags: malware classification, sparse representation, PRICoLBP, OpCode n-gram