
Research And Implementation Of Android Malware Detection Method Based On Feature Fusion

Posted on: 2021-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: W C Lian
Full Text: PDF
GTID: 2518306050471954
Subject: Computer software and theory
Abstract/Summary:
With the development of mobile devices and Internet technologies, Android smart devices have become an indispensable part of our lives. As an open-source system, Android not only attracts many developers who build legitimate applications, but also provides "opportunities" for malicious software developers with ulterior motives. As the market share of Android smartphones has increased, malware targeting the Android platform has therefore shown explosive growth. Malware is a serious threat to users' data, privacy and money. To provide a safe and comfortable environment for users, Android malware detection has become a research hotspot in recent years.

At present, researchers at home and abroad mainly use static analysis or dynamic analysis to extract features for Android malware detection, and then directly train a machine learning algorithm on the extracted features. This approach cannot make good use of the rich semantic information carried by the many features, and its detection performance is often low. To improve detection performance and use these features effectively, this thesis proposes two detection methods based on feature fusion. It extracts not only Application Programming Interface (API) calls and permissions as features, but also code blocks, taken according to the methods in the code, and uses the chi-square test to further refine the features and remove redundant ones.

The first method is based on a heterogeneous information network, which represents applications together with their related APIs, permissions and code blocks. A meta-graph (or meta-structure) then describes the semantic associations between an application and its APIs, permissions and code. Meta-graph similarity is used to measure the similarity between applications, expressed as a similarity matrix. Finally, a support vector machine is trained with the similarity matrix as its kernel matrix to obtain the detection model.

The second method is based on a multi-level training feature fusion model. In the first stage, multiple models are trained on the three feature sets, and the training results for each type of feature set are combined by weighted fusion. The fused data then serves as the input features for the second stage of training, which yields the detection model. This model adopts the idea of stacked generalization to perform multi-stage learning, so as to improve the generalization ability of the detection algorithm, avoid the weaknesses of any single classifier, and improve prediction ability.

The experiments analyzed 40,394 applications, including 28,843 malicious applications and 11,551 benign applications. Using static analysis, we extracted a total of 104,851 API calls, 597 permissions, and 1,768,337 code blocks. Experimental results show that the feature fusion detection model based on the heterogeneous information network achieves a highest detection accuracy of 97%, and the feature fusion model based on multi-stage training achieves a highest accuracy of 99%.
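The chi-square feature refinement mentioned in the abstract could look roughly like the sketch below. The abstract names the test but gives neither the exact statistic nor the selection threshold, so this is only an illustration under stated assumptions: features are binary (present or absent in an app), labels are binary (1 = malicious, 0 = benign), the score is the classic Pearson statistic on a 2x2 contingency table, and the top k features are kept. The function names, feature names and toy data are all hypothetical.

```python
def chi_square_score(feature_column, labels):
    """Pearson chi-square statistic for one binary feature against a binary label.

    Assumption: this 2x2 form is an illustrative choice; the thesis does not
    state which chi-square variant it uses.
    """
    a = b = c = d = 0  # cells of the 2x2 contingency table
    for present, malicious in zip(feature_column, labels):
        if present and malicious:
            a += 1          # feature present, malicious app
        elif present:
            b += 1          # feature present, benign app
        elif malicious:
            c += 1          # feature absent, malicious app
        else:
            d += 1          # feature absent, benign app
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A feature that is always present or always absent carries no signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_top_features(samples, labels, feature_names, k):
    """Rank features by chi-square score and keep the k highest-scoring ones."""
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in samples]
        scores[name] = chi_square_score(column, labels)
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: 6 apps, 3 binary features (not from the thesis).
feature_names = ["SEND_SMS", "INTERNET", "getDeviceId"]
samples = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = malicious, 0 = benign

selected, scores = select_top_features(samples, labels, feature_names, k=2)
print(selected)
```

In this toy data, a feature that every app has (here `INTERNET`) scores zero and is dropped, which is the kind of redundancy removal the abstract refers to. On the feature counts reported in the abstract, a vectorized implementation would be the more practical choice.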
Keywords/Search Tags:Android malware detection, fusion feature, static analysis, machine learning