| In recent years, the development of the mobile Internet has made smartphones an indispensable part of daily life. However, users face privacy and property security risks when using mobile phones. Because Android is both popular and open, it quickly became a target for attackers. Effective detection of Android malware therefore has both academic and practical significance. Existing static analysis of Android malware is often limited to a single kind of feature: either string features such as permissions and API (Application Programming Interface) calls, or structural features such as function call graphs and data flow graphs. As malware attacks continue to evolve, analysis based on only one kind of feature often produces detection errors. This thesis therefore describes Android applications from the perspective of multiple features in order to improve detection accuracy. The main research contents and innovations of this thesis are as follows. (1) An Android malware detection model based on opcode sequences is proposed. The model combines the function call graph with the opcode text sequence: it reorders the opcode text sequence according to the order of function calls and builds the model with a Long Short-Term Memory (LSTM) neural network. On this basis, five feature sets, incorporating sensitive functions, function call sequences, and related information, were designed for comparative experiments. The experimental results show that the opcode sequence ordered by function calls has clear advantages over both the unordered opcode sequence and the pure function call sequence. (2) An Android malware detection model based on opcode images is proposed. This model transforms the opcode sequence into a grayscale image, so that an Android application is represented as an image. To improve the efficiency of training and detection, a lightweight convolutional neural
network (CNN) is selected to classify the grayscale images. Traditional methods that convert bytecode sequences directly into images often produce images of inconsistent size; the image generation method designed in this thesis avoids this problem. Two methods are used to convert opcode sequences into grayscale images: one is based on the frequency of opcode pairs, and the other is based on the TF-IDF value of opcode pairs. The experimental results show that malware detection using opcode images is effective. (3) An Android malware detection fusion model based on deep learning is proposed. To describe Android applications at multiple levels, this thesis proposes a fusion model based on opcode sequences and opcode images. The features produced by the LSTM from the opcode sequence ordered by function calls are concatenated with the features produced by the CNN from the opcode image, and the concatenated features are passed to a classifier to obtain the final detection result. The experimental results show that the fusion model designed in this thesis detects malware better than either single model. |
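
The frequency-based conversion of opcode pairs into a grayscale image can be illustrated with a minimal sketch. The abstract does not specify the exact layout, so the following is an assumption for illustration only: each pixel (i, j) holds the scaled count of opcode i being immediately followed by opcode j, which gives an image whose size depends only on the vocabulary and not on the length of the application. The function name `opcode_pair_image` and the four-entry `VOCAB` are hypothetical; a real Dalvik opcode vocabulary is far larger.

```python
from collections import Counter

# Hypothetical toy vocabulary (assumption); not the thesis's actual opcode set.
VOCAB = ["move", "const", "invoke", "return"]


def opcode_pair_image(opcodes, vocab=VOCAB):
    """Build a len(vocab) x len(vocab) grayscale image from adjacent opcode pairs.

    Pixel (i, j) is the count of opcode i followed directly by opcode j,
    scaled so that the most frequent pair maps to 255.
    """
    index = {op: i for i, op in enumerate(vocab)}
    n = len(vocab)

    # Count adjacent pairs, skipping opcodes outside the vocabulary.
    counts = Counter(
        (index[a], index[b])
        for a, b in zip(opcodes, opcodes[1:])
        if a in index and b in index
    )

    # An empty or single-opcode sequence yields an all-black image.
    peak = max(counts.values(), default=0)
    image = [[0] * n for _ in range(n)]
    for (row, col), count in counts.items():
        image[row][col] = 255 * count // peak
    return image


sample = ["const", "invoke", "move", "const", "invoke", "return"]
image = opcode_pair_image(sample)
```

Because the image dimensions are fixed by the vocabulary, sequences of any length map to the same shape, which is one plausible way to obtain the consistent image size described above. A TF-IDF variant would replace the raw pair counts with TF-IDF weights computed over a corpus of applications before scaling.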