Research On Android Malicious Application Identification And Malicious Family Classification Technology Based On API Call Analysis

Posted on: 2022-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Zhang
GTID: 2518306740994599
Subject: Cyberspace security
Abstract/Summary:
How to analyze Android malicious applications effectively is currently a research hotspot. Traditional analysis of Android malicious applications has the following problems: (1) Methods that judge malicious behavior from permission requests easily misclassify legitimate applications that abuse system permissions as malicious. (2) Methods based on calls to official APIs (the set of callable functions officially defined by Google to implement specific functionality) can be more accurate, but they become invalid when an Android version update changes how the official APIs are called. (3) Family classification of malicious applications can improve the efficiency of analysis and help track the evolution of malicious applications, but existing research on malicious family classification cannot accurately classify families with few samples.

In response to these problems, an Android malicious application identification and family classification method is proposed. The method identifies malicious applications and classifies their families by recognizing features of the behavior patterns with which an application calls official APIs. In static analysis, the behavior pattern of the application is constructed by extracting the call graph of its user-defined functions and official APIs, and an improved graph attention network classification method is used to identify malicious applications. In dynamic analysis, the time series of official API calls is extracted by dynamically monitoring the execution of the application, and an improved shapelet analysis algorithm is used to identify malicious applications. For malicious family classification, a comprehensive identification method that combines clustering and classification algorithms is proposed to solve the problem of classifying malicious families with few samples.

The main work and innovations of this paper are as follows:

1. In the static analysis of applications, to address the high rate at which existing methods misjudge legitimate programs that abuse system permissions as malicious, and the failure of discrimination methods after Android system upgrades, a static analysis method for malicious applications based on the call graph of user-defined functions and official APIs is proposed. The method first extracts the call graph of user-defined functions and official APIs; it then marks the official APIs that use dangerous permissions in the call graph and extracts the behavior pattern features of the graph; it further adjusts the weights of important nodes of the function call graph according to the sensitivity of the dangerous permissions that the official APIs use; on this basis, a graph attention network classification method is used to discriminate malicious applications. Tests show that the method classifies well: its accuracy and F1-Score on Drebin and other data sets reach 0.9342 and 0.9336, and it outperforms the traditional graph attention classification method.

2. In the dynamic analysis of applications, to address the high time complexity of existing time series classification algorithms when applied to API call sequence analysis, an Android malicious application discrimination method based on locally optimal random shapelet analysis is proposed. The method first triggers activities with specific names to control how the application runs and so obtain a stable API call time series; it then proposes a random shapelet acquisition algorithm based on local features to extract the key features of the time series. This algorithm can effectively reduce the computational complexity of shapelet extraction. Tests show that its accuracy and F1-Score on Drebin and other data sets reach 0.8873 and 0.8613. When static analysis is combined with dynamic analysis, they increase to 0.9721 and 0.9574.

3. In the malicious family classification task, to improve the poor classification performance on malicious families with few samples, a comprehensive identification method based on the combination of clustering and classification is proposed. The method first applies an unsupervised clustering method based on autoencoders to find the potential families of sequences with malicious behavior according to the characteristics of the API call sequence; it then proposes a malicious family classification method based on graph machine learning, in which malicious modules extracted from the function call graph serve as features; finally, a comprehensive identification method that combines the clustering results with the classification results is proposed. Tests show that its accuracy and macro F1-Score are 0.9745 and 0.9710, which are greatly improved compared with other classifiers. It also performs better on malicious families with few samples, with an average accuracy of 0.925 on families with fewer than 40 samples.
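
The dynamic analysis described in the abstract builds on shapelets: short subsequences whose distance to a time series serves as a classification feature. The Python below is a minimal sketch of that general idea only, not of the thesis's locally optimal random shapelet algorithm, whose details the abstract does not give. It samples random candidate shapelets and computes the standard sliding-window minimum Euclidean distance. It assumes each API call has already been encoded as a number; the function names and that encoding are hypothetical.

```python
import math
import random


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any
    same-length contiguous window of the series."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        squared = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best = min(best, math.sqrt(squared))
    return best


def sample_random_shapelets(dataset, count, length, seed=0):
    """Draw `count` random contiguous subsequences of the given length.
    Random sampling stands in here for an exhaustive candidate search
    (an assumption; the thesis's selection rule is not specified)."""
    rng = random.Random(seed)
    candidates = [s for s in dataset if len(s) >= length]
    if not candidates:
        raise ValueError("no series is long enough for the requested length")
    shapelets = []
    for _ in range(count):
        series = rng.choice(candidates)
        start = rng.randrange(len(series) - length + 1)
        shapelets.append(list(series[start:start + length]))
    return shapelets


def shapelet_transform(dataset, shapelets):
    """Map each series to a feature vector of its distances to every shapelet."""
    return [[shapelet_distance(s, sh) for sh in shapelets] for s in dataset]
```

For instance, `shapelet_distance([0, 0, 1, 2, 3, 0], [1, 2, 3])` is `0.0` because the shapelet occurs exactly in the series. The feature vectors returned by `shapelet_transform` could then be passed to any ordinary classifier. Sampling a fixed number of candidates avoids scoring every possible subsequence, which is the usual source of the high time complexity the abstract mentions.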
Keywords/Search Tags: Android, Function Call Graph, Time Series, Malicious Application Detection, Malicious Family