Research On The Open Set Recognition Model For Malware Organizations

Posted on: 2021-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Chen
GTID: 2428330647456990
Subject: Software engineering
Abstract/Summary:
In recent years, APT (Advanced Persistent Threat) malware backed by nation states or hacker organizations has become increasingly prominent, posing a serious threat to cyberspace security. To respond effectively to APT attacks, attributing captured malware to the organization behind it can deter attackers and help develop defensive measures. The classification of APT malware by organization has therefore become an important security defense technology. Traditional machine-learning techniques for malware classification mostly rest on the closed-set assumption: every sample to be tested is assumed to belong to a category already seen during training. In practice, however, classifying APT malware by organization is an open-set problem: a test sample may come from a new organization that was never seen during training. It is therefore necessary to study open-set organization classification for APT malware, which classifies each test sample either as one of the old organizations seen during training or as a new, unseen organization.

This thesis combines deep forest and a convolutional neural network (CNN) to propose an open set recognition model for APT malware organizations. The model first uses the deep forest to generate the first representation vector of the sample and pre-classifies the sample into one of the old APT organizations. It then generates a new deep representation (called the secondary representation) through the CNN. Based on the distance between the sample's secondary representation vector and the center vector of the pre-classified organization, the model decides whether to accept the pre-classification result. A sample whose pre-classification result is accepted is assigned to the pre-classified old APT organization; otherwise it is assigned to a new organization. The model consists of four modules: the multi-grained scanning module, the cascade forest module, the CNN module, and the result determination module, which produce the first representation vector, the pre-classification result, the secondary representation vector, and the final recognition result of the sample, respectively. The thesis presents the process and details of this classification method.

To verify the effectiveness of the proposed open set recognition model, this thesis constructs a data set of malicious PE samples labeled with APT organizations for experimental research. The experimental results show that the proposed model achieves higher AUC, accuracy, precision, recall, and F1 values than the existing APT open set recognition method. In addition, the proposed model has the potential to solve open set recognition problems in other fields. The thesis explores this experimentally by applying the model to malware family classification, and the results show that the model transfers to other open set recognition fields.
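The result determination step described in the abstract (accept the pre-classification only if the secondary representation lies close enough to the center vector of the pre-classified organization) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not say how the center vector is computed, which distance is used, or how the acceptance threshold is chosen. The sketch assumes the center is the mean of the training vectors of an organization, the distance is Euclidean, and a single global threshold is applied. The names `class_centers`, `determine_result`, and `NEW_ORGANIZATION` are hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical label for "organization not seen during training".
NEW_ORGANIZATION = -1


def class_centers(
    vectors: Sequence[Sequence[float]], labels: Sequence[int]
) -> Dict[int, List[float]]:
    """Assumption: the center of an organization is the mean of its
    secondary representation vectors from the training set."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def determine_result(
    vector: Sequence[float],
    pre_label: int,
    centers: Dict[int, List[float]],
    threshold: float,
) -> int:
    """Accept the pre-classified organization if the sample is within
    `threshold` of that organization's center; otherwise report a new
    organization. Euclidean distance and a single global threshold are
    assumptions made for this sketch."""
    distance = math.dist(vector, centers[pre_label])
    return pre_label if distance <= threshold else NEW_ORGANIZATION


# Toy data standing in for CNN outputs (two known organizations, 0 and 1).
train_vectors = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]]
train_labels = [0, 0, 1, 1]
centers = class_centers(train_vectors, train_labels)

accepted = determine_result([1.5, 0.0], 0, centers, threshold=1.0)
rejected = determine_result([5.0, 5.0], 0, centers, threshold=1.0)
```

In this toy run the first sample lies near the center of organization 0, so its pre-classification is kept, while the second lies far from that center and is reported as a new organization. In a real setting the threshold would have to be tuned, for example on held-out data.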
Keywords/Search Tags: Open-set Recognition, Malware, APT organization, Deep Forest, CNN