Research On Classification And Identification Approach Of Malicious Traffic For Android

Posted on: 2020-12-17  Degree: Master  Type: Thesis
Country: China  Candidate: R Chen  Full Text: PDF
GTID: 2428330614972118  Subject: Computer technology
Abstract/Summary:
Because it is open source and integrates seamlessly with Google services, Android has become the mobile operating system with the highest market share. Mobile malware can eavesdrop on calls and steal private information, posing a serious threat to individuals, businesses and even countries. Detection of mobile malicious traffic has long been a hot issue in the research of major security companies. However, most current research does not take into account the imbalance of traffic data in real networks, so the classification results cannot fully meet practical needs. In addition, most existing mobile traffic classification algorithms are supervised and require a large number of labeled samples for training, which makes training costly. In view of these problems, this thesis studies the classification of mobile malicious traffic. The main research work and innovations are as follows.

Firstly, to address the imbalance of traffic data in real networks, an adaptive oversampling method based on the principle of genetic reproduction is proposed. Using the Mahalanobis distance, the data set is divided into two parts, a female-parent group and a male-parent group. Samples are then drawn in turn from the two groups and paired, and each pair is used to synthesize a new sample. The synthesized samples are in turn paired with existing samples to produce further offspring, until the number of samples reaches the preset sampling ratio. A diversity measure of the data set is then calculated at each sampling rate, which yields the optimal sampling rate and the corresponding optimal data set. Experiments verify that this sampling method reduces the imbalance of the traffic data set. When malware traffic accounts for only 5% of the total samples, the overall classification accuracy increases the most, by up to 12.8%. Compared with the traditional oversampling method, the proposed method gives the largest improvement in overall classification accuracy when the number of samples is smallest.

Secondly, manually annotating data sets is costly, and classification performance is poor on traffic generated by new and complex malware. To address these problems, this thesis proposes an Android malware traffic classification method based on Stacked Auto Encoders. An autoencoder learns deep abstract features of the traffic data without supervision, so a large number of manually labeled samples is not needed and the training cost is greatly reduced. Multiple autoencoders are stacked to form a Stacked Auto Encoders network, and the classification of Android malware traffic is improved through unsupervised feature extraction followed by supervised fine-tuning. Experimental results show that the Stacked Auto Encoders algorithm is superior to a convolutional neural network and a single-layer Sparse Auto Encoder network in accuracy and false alarm rate.
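The oversampling idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how the Mahalanobis distance determines the two parent groups or how a pair is combined, so the median split on distance to the class mean, the random linear interpolation between parents, and the names `mahalanobis_to_mean` and `parent_pair_oversample` are all assumptions made for illustration. For brevity the sketch pairs only original samples and omits both the re-mating of offspring and the diversity-based search for the optimal sampling rate.

```python
import numpy as np


def mahalanobis_to_mean(X):
    """Mahalanobis distance of each row of X to the sample mean of X."""
    mu = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = X - mu
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negative rounding errors


def parent_pair_oversample(X_min, n_target, seed=0):
    """Grow the minority class X_min to n_target rows by pairing two parent groups.

    Assumption (not from the thesis): samples are split at the median
    Mahalanobis distance to the class mean, and each child is a random
    point on the line segment between one parent from each group.
    """
    X_min = np.asarray(X_min, dtype=float)
    if len(X_min) < 2:
        raise ValueError("need at least two minority samples to form pairs")
    rng = np.random.default_rng(seed)

    order = np.argsort(mahalanobis_to_mean(X_min))
    half = len(order) // 2
    group_a = X_min[order[:half]]  # samples closer to the class mean
    group_b = X_min[order[half:]]  # samples farther from the class mean

    children = []
    i = 0
    while len(X_min) + len(children) < n_target:
        a = group_a[i % len(group_a)]  # take parents in turn from each group
        b = group_b[i % len(group_b)]
        t = rng.uniform(0.0, 1.0)
        children.append(a + t * (b - a))  # child lies between its two parents
        i += 1

    if not children:
        return X_min.copy()
    return np.vstack([X_min, np.array(children)])


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    minority = rng.normal(size=(10, 3))  # stand-in for malicious-traffic features
    balanced = parent_pair_oversample(minority, n_target=25)
    print(balanced.shape)
```

Because every child is a convex combination of two existing samples, the synthetic points stay inside the region already covered by the minority class. A fuller version would repeat this at several sampling rates and keep the rate with the best diversity score, as the abstract outlines.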
Keywords/Search Tags:Traffic classification, Stacked Auto Encoders, Genetic reproduction, Oversampling