Research On Android Malware Classification Based On Deep Learning

Posted on: 2021-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: X J Pei
Full Text: PDF
GTID: 2518306128475994
Subject: Master of Engineering
Abstract/Summary:
With the development of mobile technology, the Android application market is booming, and malicious applications are constantly evolving. Traditional malicious code detection systems based on machine learning suffer from low detection accuracy, high computational overhead, or poor robustness, and they tend to lose potentially useful deep information during feature mining, so complex malicious behaviors cannot be fully described. To alleviate this situation, this thesis combines natural language processing, image analysis, and malicious code detection techniques, and proposes a feature extraction method based on multi-feature fusion. On this basis, a deep learning-based Android malicious code detection framework for malware detection and family attribution is built using a variety of deep learning models. The main research contents of this thesis are as follows:

1. We designed and implemented a feature fusion method based on static analysis that can extract a variety of static features. This method provides a fast and effective feature extraction mechanism by selecting the information that helps improve the classification performance of the model. Furthermore, word embedding is used to map the behavior information contained in a malware sample into a vector space and extract high-level semantic information.

2. Since traditional methods have difficulty exploiting the deep semantic information contained in malware, we proposed and implemented a multiple semantic feature extraction method. Furthermore, an attention mechanism, the Independently Recurrent Neural Network (IndRNN), and the Densely Connected Convolutional Network (DenseNet) are used to improve Android malware detection accuracy, and the experimental results verify the effectiveness of the proposed method.

3. We designed and implemented a lightweight detector with an efficient and concise feature representation. Because malware tends to adopt various obfuscation techniques to evade detection by anti-malware providers, a lexical feature is introduced based on statistical analysis of the static features. Furthermore, we combined the lexical features with the semantic features, which benefits subsequent model learning. The experimental results show that this method can effectively identify variant and obfuscated malicious code, even when multiple obfuscation techniques have been applied.

4. A new multi-family classification algorithm based on a parallel deep network is designed and implemented. The distribution of samples across malware families is seriously imbalanced and there are many categories, so it is difficult for traditional machine learning algorithms to assign detected samples to the correct malware family. In this thesis, based on semantic and image features, malware family classification is studied by combining an attention mechanism, the Independently Recurrent Neural Network, and the Capsule Network. The experimental results show that the proposed method can reliably and accurately classify most malware families.
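
The third contribution above combines lexical features with semantic features. The abstract does not say which lexical statistics or which embedding model the thesis uses, so the following is only a minimal Python sketch under stated assumptions: the semantic part averages vectors from a hypothetical toy embedding table (TOY_EMBEDDINGS), the lexical part uses three assumed statistics over identifier names (mean length, mean character entropy, share of very short names), and fusion is plain concatenation. All names and values here are illustrative, not taken from the thesis.

```python
import math
from collections import Counter

# Hypothetical toy embedding table (assumption). A real system would learn
# these vectors with a word-embedding model trained on many applications.
TOY_EMBEDDINGS = {
    "android.permission.SEND_SMS": [0.9, 0.1, 0.0],
    "android.permission.INTERNET": [0.2, 0.7, 0.1],
    "getDeviceId": [0.8, 0.3, 0.4],
}
EMBEDDING_DIM = 3


def semantic_vector(tokens):
    """Average the embeddings of known tokens; unknown tokens are skipped."""
    known = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not known:
        return [0.0] * EMBEDDING_DIM
    return [sum(column) / len(known) for column in zip(*known)]


def shannon_entropy(text):
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def lexical_vector(identifiers):
    """Assumed lexical statistics over identifier names.

    Returns [mean length, mean character entropy, share of names <= 2 chars].
    Name-obfuscation tools often produce many very short names, which is
    why the last statistic is included in this sketch.
    """
    if not identifiers:
        return [0.0, 0.0, 0.0]
    n = len(identifiers)
    mean_length = sum(len(name) for name in identifiers) / n
    mean_entropy = sum(shannon_entropy(name) for name in identifiers) / n
    short_ratio = sum(len(name) <= 2 for name in identifiers) / n
    return [mean_length, mean_entropy, short_ratio]


def fused_features(tokens, identifiers):
    """Concatenate the semantic and lexical vectors into one feature vector."""
    return semantic_vector(tokens) + lexical_vector(identifiers)


if __name__ == "__main__":
    tokens = ["android.permission.SEND_SMS", "getDeviceId", "unknownCall"]
    identifiers = ["a", "b", "ab"]
    print(fused_features(tokens, identifiers))
```

The fused vector has a fixed length (embedding dimension plus three lexical statistics), so it could be passed to any downstream classifier. Concatenation is chosen here only for simplicity; the thesis may fuse its features differently.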
Keywords/Search Tags: Android, deep learning, malicious code detection, obfuscation scheme