
The Research And Implementation Of Android's Malicious Code Detection Technology

Posted on: 2018-12-05  Degree: Master  Type: Thesis
Country: China  Candidate: Z G Li  Full Text: PDF
GTID: 2348330563452738  Subject: Computer technology
Abstract/Summary:
The growth of the Internet has led to the emergence of new types of malicious code. Methods for detecting malicious code fall mainly into static detection and dynamic detection. Static methods analyze information about a program, or about its expected behavior, from its source code. The main advantage of static analysis is that it can examine a file without actually executing it, which allows rapid classification. Among static methods, signature-based anti-virus detection is very accurate, but it cannot detect new malicious code. Dynamic detection mainly uses instrumentation to execute a program and track the behavior of malicious applications. It also has shortcomings: first, it is difficult to simulate the environment that activates the malicious code, for example the conditions under which malware that exploits a vulnerable application is triggered; second, it is unclear how long the malicious activity of the software must be observed.

Recently, classification algorithms have been used successfully to detect unknown malicious code. However, most studies use only byte sequences of the binary code of Windows executable files. This thesis applies classification to the detection of malicious code in Android applications. It proposes opcodes (operators) as the features used to represent and classify malicious Android applications, uses the n-gram sequence model from natural language processing as a further representation of these features, which makes the opcode representation more informative, and stores the frequency of each opcode sequence as the feature value. The features are processed before being passed to the classifier. The thesis also takes advantage of the fast training of the deep learning framework Paddle to save training time. The main contents are as follows:

(1) Scripts are written to decompile Android applications, collect the virtual machine executable files of each application, and extract the opcodes in the code, ignoring the operands, as the features of the application. The n-gram sequence model is applied to these features to classify and describe the opcode sequences. The n-gram size n is set to 1, 2, 3, 4, 5 and 6, and the most appropriate size is selected to pave the way for the following experiments.

(2) With n fixed, the n-gram sequences are expressed by TF and by TF-IDF as feature representations of Android applications, and the better representation is selected. The experiments show that the two perform the same, so TF is chosen as the feature, because TF-IDF requires additional computation when new features appear.

(3) A decision tree, a random forest and a deep network are each used as the classifier of the detection model. The parameters of the random forest are set through multiple rounds of testing; the random sampling in the algorithm also helps avoid overfitting. The deep network, which is built using an AutoEncoder, is trained with the parallel architecture of the deep learning framework Paddle to speed up training. Finally, the results of the three classifiers on the test set are compared, and the better model is selected as the final model for predicting unknown malicious code.
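The opcode n-gram and TF representation described in (1) and (2) can be illustrated with a minimal Python sketch. It is not the thesis's own script: it assumes the decompiler emits smali-style text, the function names `extract_opcodes` and `ngram_tf` are hypothetical, and TF is taken here to be the relative frequency of each n-gram within one application, which the abstract does not specify.

```python
from collections import Counter

def extract_opcodes(smali_text):
    """Collect opcode mnemonics from smali-style method bodies, dropping operands.

    Assumption: input is smali-like text where each instruction line starts
    with its opcode and methods are wrapped in .method / .end method.
    """
    opcodes = []
    in_method = False
    for raw in smali_text.splitlines():
        line = raw.strip()
        if line.startswith(".method"):
            in_method = True
            continue
        if line.startswith(".end method"):
            in_method = False
            continue
        # Skip text outside methods, blank lines, directives (.locals),
        # labels (:cond_0) and comments (#).
        if not in_method or not line or line[0] in ".:#":
            continue
        opcodes.append(line.split()[0])  # first token is the opcode
    return opcodes

def ngram_tf(opcodes, n):
    """Map each opcode n-gram (as a tuple) to its relative frequency."""
    grams = [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: count / total for gram, count in Counter(grams).items()}

# Illustrative input only, not taken from the thesis data set.
SAMPLE = """
.method public foo()V
    .locals 2
    const/4 v0, 0x1
    const/4 v1, 0x2
    add-int v0, v0, v1
    :cond_0
    const/4 v1, 0x2
    return-void
.end method
"""

if __name__ == "__main__":
    ops = extract_opcodes(SAMPLE)
    for n in (1, 2):
        print(n, ngram_tf(ops, n))
```

Each application then becomes a sparse vector keyed by n-gram, which can be passed to any of the classifiers named in (3). Repeating the extraction for n from 1 to 6 gives the candidate representations among which the most appropriate size is chosen.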
Keywords/Search Tags: malicious code detection, opcode, natural language processing, Paddle, AutoEncoder