The Research On Android Malware Detection Techniques In Supervised Machine Learning Classification Methods

Posted on:2015-11-25

Degree:Master

Type:Thesis

Country:China

Candidate:J X Li

Full Text:PDF

GTID:2308330461957935

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of mobile Internet technology, mobile devices face a growing number of security challenges. Since the first Android smart device was released in November 2008, Android has rapidly become the most widely used smart device operating system in the world. Only two years after its launch, Android had acquired 48 percent of the world market. According to a report for the third quarter of 2013, Android’s market share has risen to 81.3%, far ahead of competitors such as iOS. At the same time, Android has overtaken iOS as the first choice of mobile app developers. By 2017, the number of Android devices is expected to exceed 1 billion and the corresponding number of application downloads to exceed 5 billion. As a result, more and more Android malware will make its way into app markets and be downloaded by ordinary users just like benign applications.

This thesis makes the following contributions. First, to address the low reusability and flexibility of current methods, we propose a general framework for Android malware detection based on machine learning algorithms. The framework has two advantages: (1) following the design philosophy of compilers, we divide the overall process into independent sub-modules (feature extraction, feature selection, classification model generation, classification model validation), so that each module can be optimized independently and the modules can be assembled quickly; (2) the features of Android samples are stored in a database, so that they are extracted once and used many times. Second, based on Android byte sequences, permission requests and system calls, we propose three new representation methods for Android software samples, named N-Byte, N-Permission and N-System.
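The abstract does not spell out how N-Byte is computed. As a rough illustration only, the sketch below assumes it resembles a classic byte n-gram representation: slide a window of n bytes over a sample's byte sequence and count each distinct window. The function name `byte_ngrams`, the toy input and the choice n = 2 are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count every overlapping window of n consecutive bytes in data.

    Assumption: N-Byte is treated here as a plain byte n-gram count;
    the thesis may define it differently.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Inputs shorter than n produce an empty counter.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Toy input standing in for the bytes of an application file.
sample = b"\x12\x34\x12\x34\x56"
features = byte_ngrams(sample, n=2)
```

Each distinct n-gram becomes one candidate feature, which is why a representation like this usually needs a feature selection step before a classifier is trained.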
These representations have three advantages: (1) their extraction process is simple; (2) they are not tied to any single version of the Android operating system; (3) our experiments show that they keep the false positive rate low. Using "ContagioDump" as the source of Android malware samples and "WanDouJia" as the source of benign samples, applying the Fisher Score algorithm for feature selection, and training Support Vector Machines with multiple kernel functions, this thesis generates classification models based on the N-Byte and N-Permission feature extraction methods. The experiments show that N-Permission and N-Byte achieve a high detection rate (over 90%) and a low false positive rate (1%-10%) with three of the four Support Vector Machine models. These results indicate that N-Byte and N-Permission can distinguish Android malware from benign applications. Such machine-learning-based methods can be used to detect malware in Android application markets, and because it is extensible, our framework "AMDetector" makes it easy to introduce new Android sample representation methods, feature selection approaches and machine learning classification algorithms.
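To make the feature selection and evaluation steps concrete, here is a minimal sketch. It assumes the commonly used form of the Fisher Score (between-class scatter divided by within-class scatter, computed per feature) and the usual definitions of detection rate and false positive rate. The abstract does not state which variant the thesis uses, and the function names and toy data are hypothetical.

```python
from statistics import fmean, pvariance


def fisher_scores(samples, labels):
    """Return one Fisher score per feature column.

    score_j = sum_k n_k * (mean_kj - mean_j)^2 / sum_k n_k * var_kj,
    where k ranges over the classes. Higher means more discriminative.
    """
    classes = sorted(set(labels))
    scores = []
    for column in zip(*samples):
        overall_mean = fmean(column)
        between = within = 0.0
        for cls in classes:
            values = [v for v, y in zip(column, labels) if y == cls]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        # A feature that is constant inside every class has zero
        # within-class scatter; treat it as infinitely good if it still
        # separates the classes, and as useless otherwise.
        if within == 0.0:
            scores.append(float("inf") if between > 0.0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def detection_and_false_positive_rate(y_true, y_pred, malware=1):
    """Detection rate = TP / (TP + FN); false positive rate = FP / (FP + TN).

    Assumes y_true contains at least one malware and one benign sample.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == malware and p == malware for t, p in pairs)
    fn = sum(t == malware and p != malware for t, p in pairs)
    fp = sum(t != malware and p == malware for t, p in pairs)
    tn = sum(t != malware and p != malware for t, p in pairs)
    return tp / (tp + fn), fp / (fp + tn)


# Toy data: 4 samples, 2 features, label 1 = malware, 0 = benign.
X = [[1.0, 5.0], [2.0, 1.0], [8.0, 4.0], [9.0, 2.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
# Feature indices ordered from most to least discriminative.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Toy predictions from some classifier, for illustration only.
rates = detection_and_false_positive_rate(
    [1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 1],
)
```

In a pipeline like the one described, the top-ranked features would be passed to the classifier, and the two rates would be computed on held-out samples for each kernel function being compared.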

Keywords/Search Tags:

Android Malware, Supervised Machine Learning Classification Algorithms, Feature Database, Feature Selection, Support Vector Machine

Related items

1	Research On Android Malware Detection And Malware Family Classification
2	Research And Implementation On Android Malware Detection System Based On Machine Learning
3	Research On Multi-View Feature Selection And Semi-Supervised Support Vector Machine
4	Incorporating K-means, Triangle Area Support Vector Machine And Feature Selection Algorithms For Intrusion Detection System
5	Research On Text Classification Based-on Support Vector Machine
6	The Study Of Several Issues And Application In Statistical Pattern Recognition
7	Research On Network Traffic Classification Technology Based On Support Vector Machine
8	Study On Least Squares Support Vector Machine And Its Applications
9	Research On Models And Algorithms Of Semi-supervised Support Vector Machine
10	Research And Application On Machine Learning Methods For Health Assessment