| With the rapid development of the Internet, a growing volume of malware threatens network security. Malware family classification is an important defensive technology, and many family classification studies based on machine learning have been published. Most existing studies focus on closed set scenarios, in which the family of every sample to be classified is already included in the set of families used for training. In practice, however, malware family classification faces an open set scenario, in which a sample to be classified may belong to a new family never seen during training. Some techniques for open set malware family classification already exist, but their classification performance and training stability need to be improved. To address malware family classification in an open environment, this paper first explores models based on DCGAN and ACGAN and points out their shortcomings and possible improvements. Second, this paper proposes an open set malware family classification method based on adversarial training: a generator, a discriminator, and two classifiers form a joint training network, and one of the classifiers is trained as the open set classification model. In this method, a generative adversarial network generates data in the low-density regions of the old families to simulate new family samples, and the classifier is trained to produce different probability scores for old family samples and simulated new family samples. The method has two main characteristics: the use of two classifiers overcomes the impact of GAN training instability on the open set classification model, and multiple loss functions combined with a novel network training procedure allow the classifier to achieve good classification performance. The adversarial training process and the classification model are described in detail. Experiments on the standard Big2015 data set and the big MAL-99 data set verify the effectiveness of the proposed open set classification method and compare its classification performance with that of existing models. The results show that our model achieves higher AUROC, classification accuracy, and recall, as well as better training stability. Ablation experiments also demonstrate the necessity of each major component of the model. In summary, this paper proposes an effective and stable-to-train open set classification method for malware based on adversarial training. |
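
To make the open set decision and the AUROC metric mentioned above more concrete, the following Python sketch shows one common way such a decision can be made: a sample is assigned to a known family only if the classifier's highest class probability reaches a threshold, and is otherwise flagged as a new family. This is an illustration under stated assumptions, not the model proposed in this paper: the function names, the threshold of 0.5, and the use of the maximum softmax probability as the score are all hypothetical choices.

```python
import math


def softmax(logits):
    """Convert raw classifier outputs into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(logits, threshold=0.5):
    """Return the index of the predicted known family, or -1 for a new family.

    Assumption: the score is the maximum softmax probability, and the
    threshold value is an illustrative choice, not one taken from the paper.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else -1


def auroc(known_scores, new_scores):
    """Pairwise AUROC: the fraction of (known, new) pairs in which the known
    sample receives the higher score, with ties counted as one half."""
    wins = 0.0
    for k in known_scores:
        for n in new_scores:
            if k > n:
                wins += 1.0
            elif k == n:
                wins += 0.5
    return wins / (len(known_scores) * len(new_scores))
```

A confident output such as `open_set_predict([4.0, 0.0, 0.0])` is assigned to family 0, whereas a nearly flat output such as `open_set_predict([0.1, 0.0, 0.05])` is rejected as a new family. AUROC summarizes how well the scores separate known from new samples across all possible thresholds, which is why it is a natural metric for open set classification.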