Research On Malicious Code Detection Based On Integrated Strategy

Posted on:2022-09-01

Degree:Master

Type:Thesis

Country:China

Candidate:H F Yang

Full Text:PDF

GTID:2518306491966429

Subject:Computer technology

Abstract/Summary:


With the rapid development of the Internet and fifth-generation (5G) mobile communication technology, network applications have penetrated all aspects of daily life, and malicious code such as ransomware, phishing emails and worms has caused serious damage to information security. As the self-survival capability of malicious code grows stronger, a single detection model often detects it poorly. Research on malicious code detection and malicious code family clustering based on an integration strategy therefore has both research significance and practical value for maintaining information security and reducing the losses caused by malicious code. The main contributions and innovations of the thesis are as follows:

1. A large number of malicious code samples are analyzed with static and dynamic analysis methods, and their string characteristics, file behavior, process behavior, registry behavior and network behavior are systematically summarized and described.

2. Two feature integration methods for malicious code are proposed: homologous feature integration and heterologous feature integration. Homologous feature integration applies different feature extraction algorithms to the same data source and combines the results into a feature matrix; heterologous feature integration applies different feature extraction algorithms to several different data sources and combines the results into a feature matrix. After integration, the effectiveness of the integrated features is measured in one of two ways: by iterative verification with the designed classification model, to confirm whether the integrated features improve the model, or by performing feature selection on the feature matrix and checking the importance scores of the newly selected features. This approach extracts more discriminative information from multiple original feature sets while eliminating the redundant information caused by correlation between feature sets.

3. An integration strategy framework for malicious code detection is proposed. First, because the base models in a multi-model integration strategy may be strongly correlated, a detection model based on a weak-correlation integration strategy is constructed to reduce the correlation between base models. Second, to determine the weights of the base models, an accuracy-oriented weight determination method is constructed, and a variance-deviation equalization strategy is used to suppress model jitter and improve the overall stability of the model. Together, these measures address the poor performance of a single detection algorithm.

4. A malicious code family clustering algorithm based on dimensionality reduction visualization is proposed. For cluster analysis of unknown malicious code, the optimal number of clusters k is determined with a dimensionality reduction visualization method (t-SNE). Because the original algorithm performs poorly when k is very large, an improved algorithm with clustering-initialization purification iteration is proposed on top of it, making the clustering more effective and more widely applicable.

Finally, detailed experiments analyze and validate the algorithms above. The results show that the algorithms proposed in the thesis perform substantially better than traditional algorithms.
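
The accuracy-oriented weighting of base models mentioned in point 3 can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so this sketch assumes the simplest reading: each base model's weight is its validation accuracy divided by the sum of all accuracies, and the ensemble averages the models' class probabilities with those weights. The function names, accuracies and probabilities below are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def accuracy_weights(accuracies: Sequence[float]) -> List[float]:
    """Normalize validation accuracies so that the weights sum to 1.

    Assumption: weight is proportional to accuracy; the thesis may use
    a different rule.
    """
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one base model needs a positive accuracy")
    return [a / total for a in accuracies]


def weighted_vote(probabilities: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-model class-probability vectors."""
    n_classes = len(probabilities[0])
    combined = [0.0] * n_classes
    for model_probs, w in zip(probabilities, weights):
        for c, p in enumerate(model_probs):
            combined[c] += w * p
    return combined


# Hypothetical validation accuracies of three base models.
val_acc = [0.90, 0.80, 0.70]
weights = accuracy_weights(val_acc)

# Hypothetical [benign, malicious] probabilities from each model for one sample.
sample_probs = [
    [0.2, 0.8],  # most accurate model leans malicious
    [0.6, 0.4],
    [0.7, 0.3],
]
combined = weighted_vote(sample_probs, weights)

# Index of the class with the highest combined probability (1 = malicious).
label = max(range(len(combined)), key=combined.__getitem__)
```

In this made-up example two of the three models lean benign, yet the weighted average leans malicious because the most accurate model carries the largest weight. An unweighted majority vote would reach the opposite decision, which is the kind of difference a weighting scheme is meant to capture.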

Keywords/Search Tags:

Malicious Code, Integration Strategies, Model Jitter, Dimensionality Reduction Visualization, Clustering Research


Related items

1	Research On Visualization And Clustering Of Standard Synthetic Biology Parts Based On Nonlinear Dimensionality Reduction
2	A Perception-Driven Approach To Supervised Dimensionality Reduction For Visualization
3	Dimensionality reduction and fusion strategies for the design of parametric signal classifiers
4	Research On Key Technologies Of Malicious Code And Emergency Response In Communication Networks
5	Dimensionality Reduction Technique For Visualization In Wasserstein Space
6	Research On A Few Key Issues In Nonlinear Dimensionality Reduction Algorithms
7	Multi-label Learning Based On Dimensionality Reduction
8	Dimensionality Reduction Based On Manifold Visualization
9	Semi-Supervised Clustering And Dimensionality Reduction With Their Applications
10	Research Of Dimensionality Reduction And Clustering Based On Constraint Weight Learning And Dictionary Learning