Research On Malware Detection Technology Based On Deep Learning

Posted on: 2020-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Lu
Full Text: PDF
GTID: 2428330602450195
Subject: Computer Science and Technology
Abstract/Summary:
Malware poses a serious and growing threat to the security of computer systems. It damages systems, performs operations that users do not want, steals confidential or private information, and causes enormous economic losses to organizations and individuals. Research on malware detection technology is therefore of considerable significance and value. At present, most machine-learning-based malware detection models proposed by researchers are supervised learning models, whose performance depends on a large number of labeled samples. Labeling samples at that scale is very expensive, whereas unlabeled samples can often be obtained easily, yet supervised malware detection models cannot make effective use of them.

To address this, the thesis proposes a semi-supervised pre-training malware detection model that uses unlabeled samples to improve detection accuracy. Convolutional neural networks can build higher-level abstractions by combining low-level features. First, a convolutional neural network is pre-trained as a feature extractor with a small number of labeled samples. Second, a large number of unlabeled samples are mapped to a low-dimensional space for semi-supervised clustering, and the clustering results are used as pseudo-labels for the unlabeled samples. Then, the feature extractor is pre-trained again with the pseudo-labeled samples. Finally, for supervised training on the target task, the thesis proposes two models: an end-to-end model and a separation model. The end-to-end model adds a SoftMax classification layer to the pre-trained convolutional neural network feature extractor and trains the resulting network on the labeled samples. The separation model uses the convolutional neural network only to extract features: it maps the labeled samples to a low-dimensional space and trains a classifier there.

To further improve the semi-supervised pre-training malware detection model, the thesis introduces generative adversarial networks into it and proposes a semi-supervised pre-training malware detection model enhanced by generative adversarial networks. First, the benign and malicious samples in the labeled data set are separated, and a separate generative adversarial network is trained on each. Second, the original data set is augmented with samples generated by the two generative adversarial networks, and the augmented labeled samples are used to pre-train the convolutional neural network. Then, following the idea of Stacking ensemble learning, the thesis uses the convolutional neural network and the discriminator networks of the two generative adversarial networks as first-layer feature extractors, which map the original labeled samples to a low-dimensional feature space. Finally, the second-layer classifier is trained in this low-dimensional feature space to detect malware.

The thesis evaluates the models on two data sets. The experimental results show that both proposed models can detect malware effectively when trained with only a small number of labeled samples. The model enhanced by generative adversarial networks performs better in both evaluation metrics and stability: on the two data sets, its precision is 98.6% and 99.2%, its recall is 99.2% and 96.8%, and its accuracy is 98.4% and 98%.
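
The pseudo-labeling step described above can be illustrated with a minimal sketch. The abstract does not say which semi-supervised clustering algorithm the thesis uses, so this example assumes a seeded k-means variant: the labeled samples fix one starting centroid per class and keep their labels, and each unlabeled sample takes the label of the cluster it ends up in. The function names (`centroid`, `nearest`, `pseudo_label`) and the 2-D toy vectors are hypothetical stand-ins for the low-dimensional features that the pre-trained convolutional neural network would produce.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))


def pseudo_label(labeled_x, labeled_y, unlabeled_x, n_iter=10):
    """Seeded k-means (an assumed algorithm, not necessarily the thesis's).

    Labeled samples initialise one centroid per class and never change
    cluster; unlabeled samples are reassigned until the assignment is stable.
    Returns one pseudo-label per unlabeled sample.
    """
    classes = sorted(set(labeled_y))
    by_class = [[x for x, y in zip(labeled_x, labeled_y) if y == c] for c in classes]
    centroids = [centroid(members) for members in by_class]
    assign = [nearest(x, centroids) for x in unlabeled_x]
    for _ in range(n_iter):
        # Recompute each centroid from its labeled seeds plus current members.
        centroids = [
            centroid(seeds + [x for x, a in zip(unlabeled_x, assign) if a == k])
            for k, seeds in enumerate(by_class)
        ]
        new_assign = [nearest(x, centroids) for x in unlabeled_x]
        if new_assign == assign:
            break
        assign = new_assign
    return [classes[a] for a in assign]


# Toy features: class 0 (benign) near the origin, class 1 (malicious) near (5, 5).
labeled_x = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [4.8, 5.1]]
labeled_y = [0, 0, 1, 1]
unlabeled_x = [[0.1, 0.3], [4.9, 4.7], [0.4, -0.2], [5.3, 5.2]]

pseudo = pseudo_label(labeled_x, labeled_y, unlabeled_x)
print(pseudo)  # [0, 1, 0, 1]
```

In the pipeline the abstract outlines, the resulting `(unlabeled_x, pseudo)` pairs would then be used for the second round of pre-training of the feature extractor.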
Keywords/Search Tags: Malware Detection, Semi-supervised Pre-training, Ensemble Learning, Generative Adversarial Networks