
Research On Android Malware Detection Based On Data Feature

Posted on: 2018-08-15  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y P Xu  Full Text: PDF
GTID: 1318330518494740  Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet and broadband wireless access technology, the mobile Internet has emerged. It meets people's need to use Internet services anytime and anywhere. Mobile intelligent terminals are an important part of the mobile Internet industry chain and have become indispensable in daily life. Many kinds of mobile applications are installed on these terminals and provide a large number of mobile services to users. Android is one of the most popular mobile operating systems and is used on mobile terminals by millions of users. Android applications are popular with most users because they make life more interesting. However, malicious applications have also appeared and cause security problems such as privacy leakage, malicious fee deduction, and system damage.

In this dissertation, machine learning algorithms are used for Android malware detection. Based on the characteristics of the data, we focus on joint feature mining, adaptive computation of feature weights, and deep learning feature representation. The main work and results are as follows:

(1) Fine-grained, high-dimensional features reduce the efficiency of machine learning classifiers. To address this problem, we propose an Android malware detection model based on joint feature mining. First, the application samples are decompiled and fine-grained features are extracted. Second, a joint feature mining mechanism is built on regularization and particle swarm optimization (PSO) algorithms. The mechanism mines and exploits the classification information contained in the high-dimensional features and reduces the feature dimension. Third, machine learning classifiers are used to identify Android malware.
The experimental results show that joint feature mining is useful for feature preprocessing: it eliminates irrelevant and redundant features, reduces the number of features, and improves the efficiency and performance of the classifiers. In particular, regularization and PSO help to mine and exploit the classification information in the high-dimensional features.

(2) Because features are not equally important when evaluating their similarity to Android features, we propose a malware detection model that uses machine learning classifiers with feature weights computed by Information Gain (IG) and PSO algorithms. The IG weights are evaluated from the relevance between features and class labels, and the PSO weights are calculated adaptively to obtain the best fitness (the performance of the machine learning classification model). Moreover, to overcome the defects of basic PSO, we propose a new adaptive inertia weight method, FCAIW-PSO, which is based on the fitness and a chaotic term. The goal is to assign suitable weights to the features so as to achieve the best Android malware detection performance. The experimental results indicate that both the IG weights and the PSO weights improve the performance of machine learning classifiers, and that the PSO weights perform better than the IG weights.

(3) In imbalanced datasets, where the proportion of benign examples (majority class) is higher than that of malicious examples (minority class), machine learning classifiers can become biased toward the majority class, so that minority class examples are more likely to be misclassified. To solve this problem, we propose a new over-sampling method called Fuzzy-SMOTE, based on fuzzy set theory and the synthetic minority over-sampling technique (SMOTE).
As the imbalance factor grows, Fuzzy-SMOTE generates more synthetic examples for the minority examples in the fuzzy region, where minority examples have a low degree of membership in the minority class and are more likely to be misclassified. With the new synthetic examples, the decision boundary of the minority class is broadened, so that the classifiers are no longer biased toward the majority class. The experimental results indicate that Fuzzy-SMOTE achieves better accuracy on the minority class and better overall accuracy on the whole dataset.

In summary, we propose several useful Android malware detection schemes based on the characteristics of the data. The experimental results indicate that the proposed models and methods can improve the performance of Android malware detection.
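
The over-sampling idea in item (3) can be illustrated with a minimal Python sketch. It is not the dissertation's Fuzzy-SMOTE algorithm, whose membership function and sampling rule are not given in this abstract. The sketch assumes that a minority example's membership is the share of minority points among its k nearest neighbours, that base examples are drawn with probability proportional to one minus that membership, and that enough synthetic points are generated to balance the two classes. The function names `minority_membership` and `fuzzy_weighted_smote` are hypothetical.

```python
import math
import random


def minority_membership(i, minority, majority, k=3):
    """Assumed membership degree of minority[i] in the minority class:
    the fraction of its k nearest neighbours that are minority examples."""
    x = minority[i]
    labelled = [(p, 1) for j, p in enumerate(minority) if j != i]
    labelled += [(p, 0) for p in majority]
    nearest = sorted(labelled, key=lambda item: math.dist(x, item[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def fuzzy_weighted_smote(minority, majority, k=3, seed=0):
    """Generate synthetic minority points by SMOTE-style interpolation,
    choosing low-membership (hard to classify) examples more often."""
    if len(minority) < 2:
        raise ValueError("need at least two minority examples to interpolate")
    rng = random.Random(seed)

    # Assumption: generate enough points to balance the classes, so the
    # number of synthetic examples grows with the imbalance.
    n_new = max(len(majority) - len(minority), 0)

    # Low membership gives a high sampling weight; the small constant keeps
    # the weights from all being zero.
    weights = [1.0 - minority_membership(i, minority, majority, k) + 1e-6
               for i in range(len(minority))]
    indices = list(range(len(minority)))

    synthetic = []
    for _ in range(n_new):
        i = rng.choices(indices, weights=weights, k=1)[0]
        base = minority[i]
        # Standard SMOTE step: pick one of the k nearest minority neighbours
        # and interpolate a random point on the segment between the two.
        mates = sorted((p for j, p in enumerate(minority) if j != i),
                       key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(mates)
        gap = rng.random()
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic
```

Calling `fuzzy_weighted_smote(minority, majority)` with two lists of equal-length numeric tuples returns the synthetic minority points. These would then be added to the training set before fitting a classifier.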
Keywords/Search Tags: Android malware detection, joint feature, feature weight, imbalanced datasets, deep learning