
Research And Implementation Of Android Malware Detection Method Based On Co-training

Posted on: 2019-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: S S Zhang
Full Text: PDF
GTID: 2428330593450191
Subject: Computer technology
Abstract/Summary:
With the rapid development of the mobile Internet, smart devices are used in more and more scenarios. Mobile devices, with smartphones as the mainstream, not only add entertainment to people's lives but also make socializing, travel, shopping and reading more convenient. The Android system has been popular since its release because of its user-friendly interface and rich set of applications. However, the security of the Android system has attracted much attention. Since 2010, when the first Android virus, Trojan-SMS, was found, Android malware of many kinds has grown strongly. Research on Android malware detection methods is therefore very important.

In recent years, with the continuous development of machine learning, machine learning techniques have been applied to Android malware detection. However, detection based on machine learning has some potential problems. First, for newly emerging Android malware only a few samples can be collected, so the classifier learns insufficiently and classification accuracy is low. Second, in the standard multi-view co-training method, two classifiers are trained on two sufficiently redundant views. When predicting an unknown sample, if the two classifiers give opposite results with the same confidence, it is difficult to give an accurate result.

In view of this situation, the following three improvements are proposed.

First, three kinds of features are proposed to describe Android application software from different views. Based on a review of the literature and on experiments, this thesis uses the permission request features of Android applications, sensitive API call features and Dalvik OpCode features to construct three views that describe an Android application from different angles, and the chi-square test is used to filter out permissions that are only weakly related to malicious software. In this thesis, 120 commonly used system permissions are selected as the permission feature set, 33 sensitive API call sequences constitute the API feature set, and 89 different Dalvik OpCode instructions make up the OpCode feature set.

Second, a detection method based on three-view co-training is presented. Three sub-views are built from the permission features, API call features and OpCode features of Android application software, and the best machine learning algorithm for each view is selected from four candidates: Support Vector Machine, Naive Bayes, K-Nearest Neighbors and Random Forest. Then, following the co-training idea, a single classifier labels the unknown samples, and the sample with the highest confidence is added to the training sets of the other two classifiers, so that the performance of the three classifiers improves together.

Third, when predicting unknown samples, this thesis draws on the idea of ensemble learning and proposes that the three classifiers vote on the prediction for each unknown sample, with the final result decided by majority rule.

The experimental results show that, starting from the best-performing classifier for each single view, the performance of each classifier improves clearly after co-training across the three classifiers. Moreover, this scheme effectively improves the detection accuracy of Android malware when few labeled samples are available. A comparison with two-view co-training shows that this scheme compensates, to a certain degree, for the difficulty the two-view scheme has in giving an accurate result when the two predictions are opposite and equally confident.
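The co-training and voting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: `CentroidClassifier` is a hypothetical nearest-centroid stand-in for the SVM, Naive Bayes, K-Nearest Neighbors or Random Forest models the abstract mentions, the confidence score is an assumed distance-based measure, moving one pseudo-labeled sample per classifier per round is an assumption, and the toy feature vectors are invented for illustration.

```python
from collections import Counter


class CentroidClassifier:
    """Hypothetical stand-in classifier: nearest class centroid on one view."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_with_confidence(self, x):
        dist = {c: sum((a - b) ** 2 for a, b in zip(x, mu)) ** 0.5
                for c, mu in self.centroids.items()}
        label = min(dist, key=dist.get)
        total = sum(dist.values())
        # Assumed confidence: closer to the winning centroid means higher score.
        conf = 1.0 - dist[label] / total if total > 0 else 1.0
        return label, conf


def fit_all(train):
    """Train one classifier per view; train[k] is the training set of view k."""
    return [CentroidClassifier().fit([views[k] for views, _ in train[k]],
                                     [label for _, label in train[k]])
            for k in range(3)]


def co_train(labeled, unlabeled, rounds=2):
    """labeled: list of ((v0, v1, v2), label); unlabeled: list of (v0, v1, v2)."""
    train = [list(labeled) for _ in range(3)]
    pool = list(unlabeled)
    for _ in range(rounds):
        clfs = fit_all(train)
        for k in range(3):
            if not pool:
                break
            # Classifier k picks the pool sample it is most confident about ...
            best = max(range(len(pool)),
                       key=lambda i: clfs[k].predict_with_confidence(pool[i][k])[1])
            label, _ = clfs[k].predict_with_confidence(pool[best][k])
            sample = pool.pop(best)
            # ... and hands it, pseudo-labeled, to the other two classifiers.
            for other in range(3):
                if other != k:
                    train[other].append((sample, label))
    return fit_all(train)


def majority_vote(clfs, views):
    """Final label: the one predicted by at least two of the three classifiers."""
    votes = [clf.predict_with_confidence(v)[0] for clf, v in zip(clfs, views)]
    return Counter(votes).most_common(1)[0][0]


# Toy data (invented): views are permission, API and OpCode feature vectors;
# label 0 = benign, 1 = malware.
labeled = [(([0, 0], [0, 0], [0.1, 0.9]), 0),
           (([1, 1], [1, 1], [0.9, 0.1]), 1)]
unlabeled = [([0, 0.1], [0.1, 0], [0.2, 0.8]),
             ([1, 0.9], [0.9, 1], [0.8, 0.2]),
             ([0.1, 0], [0, 0.2], [0.1, 0.8]),
             ([0.9, 1], [1, 0.8], [0.9, 0.2])]
```

With three voters and two classes a strict majority always exists, which is how the third view resolves the case where two classifiers disagree with equal confidence. In practice each view would use whichever real classifier performs best on it, exposing a probability estimate in place of the distance-based score assumed here.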
Keywords/Search Tags: Android malware, machine learning, static analysis, multiple features, co-training