
Research And Implementation Of Android Malicious Application Detection Based On Deep Learning

Posted on: 2019-01-06 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Song
GTID: 2348330545958482 | Subject: Computer technology
Abstract/Summary:
With the rapid development of the mobile internet, Android malware detection has become an important research problem in recent years. On the one hand, the wide use of machine learning techniques has greatly improved both the quality and the efficiency of malware detection. However, several problems remain unsolved: heavy reliance on predefined features, high development and deployment costs, difficulty in improving performance, and poor scalability. On the other hand, although deep learning methods offer more powerful data analysis capabilities, research on applying them to malware detection is still at an early stage, and it is hard to find a real-world application that detects malware entirely with deep learning methods.

In this thesis, I propose a novel end-to-end model based on stratified convolution to explore the feasibility and practicality of detecting malware using deep learning techniques alone. The model analyzes the system call sequences generated while an application runs. To make the model more practical, I also design and implement an automated testing procedure for Android applications. Using this tool, I build a dataset consisting of 10,000 normal and 4,231 malware samples. To evaluate the model, I implement a method proposed by other researchers as a baseline. I also implement an automated parameter testing procedure based on grid search to find the best parameters.

In summary, this thesis addresses the problem that the sequences traditional methods can analyze effectively are too short, extending the effective analysis length from at most 5 to 10,000. The method requires no prior knowledge, feature engineering, or data preprocessing. Its only input is the original system call sequences, which both improves generalization and dramatically reduces development costs. Finally, the experiments achieve a detection accuracy of 97.3%, which is 6.44% higher than the baseline, and an F1 score of 95.36%, which is 9.75% higher. In addition, the training time of the best model is about 400 minutes, and the response time in the testing stage is at the microsecond level, which fully meets the requirements of real-world applications. This thesis therefore not only provides a new method for malware detection but also has high practical value.
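
The abstract mentions a grid-search-based parameter testing procedure and reports accuracy and F1 score, but gives no implementation details. The following is only a minimal Python sketch of the general technique, not the thesis's actual procedure. The parameter names (`kernel_size`, `learning_rate`), the grid values, and the `toy_evaluate` function are hypothetical stand-ins; in practice `evaluate` would train the model and return a validation score such as F1.

```python
from itertools import product


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, from confusion-matrix counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def grid_search(param_grid, evaluate):
    """Evaluate every combination in param_grid; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Hypothetical stand-in for 'train the model and score it on validation data'.

    The score peaks at kernel_size=5 and learning_rate=0.01 purely for illustration.
    """
    return -abs(params["kernel_size"] - 5) - abs(params["learning_rate"] - 0.01)


# Hypothetical grid; the thesis does not list the parameters it searched.
grid = {"kernel_size": [3, 5, 7], "learning_rate": [0.1, 0.01, 0.001]}
best_params, best_score = grid_search(grid, toy_evaluate)
```

Exhaustive grid search is simple and reproducible, but its cost grows multiplicatively with each added parameter, which matters when a single training run takes hours.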
Keywords/Search Tags: Deep Learning, Malware Detection, System Call Sequence, Stratified Convolution