Research On Malware Detection Based On Improved Information Gain And LDA

Posted on:2018-02-23

Degree:Master

Type:Thesis

Country:China

Candidate:Y Li

Full Text:PDF

GTID:2348330518499189

Subject:Computer technology

Abstract/Summary:

In recent years, with the rapid development of information technology and the growing popularity of the Internet, network information security problems have multiplied, and malicious software is among the most prominent. Traditional malware detection is mainly static and relies heavily on a signature database. With the number of malware samples exploding, static detection struggles to handle new malware and its efficiency keeps falling, so it no longer meets practical needs. Dynamic detection, which captures a program's Windows API call behavior, has become a research hotspot. The dynamic detection process involves several key steps, of which feature selection is one of the most important. This thesis studies malware detection based on dynamic detection and on API feature selection as its key technology.

First, samples of malware and non-malware are collected from professional websites and forums in China and abroad. In a dynamic monitoring environment, the WinAPIOverride tool is used to capture the API call logs of the sample software, and the API call names extracted from these logs serve as the basic features for malware detection.

Then, taking the traditional information gain feature selection method as the research object, the thesis analyzes its shortcomings in malware detection: it considers neither word frequency nor class distribution. To address these problems, the thesis implements an improved information gain feature selection method that introduces relative word frequency and a class dispersion index. Comparative experiments show that malware detection based on the improved information gain feature selection outperforms detection based on the traditional information gain feature selection.

Finally, because feature selection methods based on mathematical statistics may leave redundant features, the thesis proposes combining the improved information gain with LDA. In the feature selection step, the improved information gain performs the initial dimensionality reduction; an LDA model is then trained to extract topic features that further distinguish the categories. Comparative experiments show that combining the improved information gain with LDA achieves better malware detection results than the improved information gain feature selection alone.
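As a rough illustration of the baseline the abstract starts from, the traditional information gain score for API-name features can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation: each sample is assumed to be reduced to the set of API names in its call log, the feature is treated as binary (present or absent), and the function names and toy data are hypothetical. The relative word frequency and class dispersion terms of the improved method are not defined in the abstract, so they are not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(samples, labels, api_name):
    """Traditional IG of the binary feature 'api_name occurs in the sample'.

    samples: list of sets of API names (one set per program), an assumed format.
    labels:  list of class labels aligned with samples.
    """
    present = [y for calls, y in zip(samples, labels) if api_name in calls]
    absent = [y for calls, y in zip(samples, labels) if api_name not in calls]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def top_k_features(samples, labels, k):
    """Rank every API name by IG (ties broken alphabetically) and keep k."""
    vocab = sorted({api for calls in samples for api in calls})
    scored = [(information_gain(samples, labels, api), api) for api in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [api for _, api in scored[:k]]


# Hypothetical toy data: four programs, two per class.
samples = [
    {"CreateFileW", "WriteProcessMemory", "CreateRemoteThread"},
    {"CreateFileW", "WriteProcessMemory"},
    {"CreateFileW", "ReadFile"},
    {"CreateFileW", "ReadFile", "CloseHandle"},
]
labels = ["malware", "malware", "benign", "benign"]

selected = top_k_features(samples, labels, 2)
```

In this toy data, `CreateFileW` appears in every sample and therefore scores zero, while an API that appears in only one class scores highest. Because the binary feature ignores how often an API is called, the sketch also shows the word-frequency limitation that the abstract attributes to the traditional method.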

Keywords/Search Tags:

Windows API Call, Text Classification, Information Gain, LDA

Related items

1	Research On The Malware Detection Based On Windows API Call Behavior
2	Re-calculation Method, Based On The Text Characteristics Of The Significance Of Information Gain Right
3	Design And Implementation Of Mobile Information Management System On Windows Mobile
4	The Study Of Chinese Text Classify Based On Semantic Concept
5	Research And Application On Recommender Method For Ranking Lawyers Based On Text Mining
6	Research On Text Classification Based On Improved Information Gain And LDA
7	The Research And Implementation Of Text Classification Based On Meta-information And Optimization
8	The Research And Implementation Of Text Classification Based On Meta-Information And Optimization
9	Research On Term Weighting Approach Based On Information Gain And Entropy
10	Call Center Training Management System Design Based On Windows Azure