Vulnerability Classification Based On Text Classification Technology

Posted on: 2016-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhang
GTID: 2308330479993299
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, computers have become increasingly widespread and are now an indispensable tool in production and daily life. As a result, computer and network security problems have emerged and have become a research focus in information security in recent years. Security vulnerabilities in operating systems and application software are a principal cause of these problems. At the same time, because the number of vulnerabilities is growing rapidly, classifying existing vulnerabilities effectively has become a bottleneck in vulnerability management.
The main work of this thesis is a study of vulnerability classification techniques that uses text classification as the underlying technology, so that vulnerabilities are classified from their textual descriptions. It also studies in depth the machine learning theory related to information entropy, and uses it as the basis for two approaches: vulnerability classification based on SVM with a fuzzy entropy feature selection algorithm, and vulnerability classification based on a binary tree multi-class SVM with class entropy. Finally, the experimental data combine vulnerability descriptions collected from the internationally accepted Common Vulnerabilities and Exposures (CVE) list with the classification categories defined by the Common Weakness Enumeration (CWE).
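As a rough illustration of fuzzy entropy feature selection on vulnerability text, the sketch below scores each term with the De Luca-Termini fuzzy entropy and keeps the terms with the smallest values. The membership function (the share of a term's documents that fall in each category), the tie-breaking rule, and the toy descriptions are all assumptions made for this example; the thesis abstract does not specify them, and the sample texts are not real CVE entries.

```python
import math
from collections import Counter, defaultdict


def fuzzy_entropy(memberships):
    """De Luca-Termini fuzzy entropy of membership degrees in [0, 1]."""
    total = 0.0
    for mu in memberships:
        for p in (mu, 1.0 - mu):
            if p > 0.0:  # treat 0 * log(0) as 0
                total -= p * math.log(p)
    return total


def select_features(docs, labels, k):
    """Return the k terms with the smallest fuzzy entropy, plus all scores."""
    classes = sorted(set(labels))
    doc_freq = Counter()               # number of documents containing a term
    class_freq = defaultdict(Counter)  # the same count, split by category
    for text, label in zip(docs, labels):
        for term in set(text.lower().split()):
            doc_freq[term] += 1
            class_freq[term][label] += 1

    scores = {}
    for term, n in doc_freq.items():
        # Assumed membership: share of the term's documents in each category.
        memberships = [class_freq[term][c] / n for c in classes]
        scores[term] = fuzzy_entropy(memberships)

    # Ascending entropy; ties go to the more frequent term, then alphabetical.
    ranked = sorted(scores, key=lambda t: (scores[t], -doc_freq[t], t))
    return ranked[:k], scores


# Hypothetical descriptions labelled with CWE categories (illustration only).
docs = [
    "buffer overflow in parser allows remote code execution",
    "stack buffer overflow allows remote attackers to crash",
    "sql injection in login form allows remote attackers",
    "sql injection via id parameter allows data disclosure",
]
labels = ["CWE-119", "CWE-119", "CWE-89", "CWE-89"]

selected, scores = select_features(docs, labels, k=4)
print(selected)
```

Terms that occur in only one category get an entropy of zero, while terms spread evenly across categories, such as "allows" in the toy data, score highest and are dropped. Because a term seen in a single document also scores zero, a practical version would probably need a minimum document frequency; the tie-break above is a simple stand-in for that.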
The main contributions are as follows:
(1) Computer vulnerabilities are defined and the principles of vulnerability classification are explored; the characteristics of text classification and its six steps are analyzed; and the machine learning classification algorithms used in this thesis are studied in depth.
(2) Information entropy theory and fuzzy set theory are combined to define fuzzy entropy, which is applied to extracting text features of vulnerabilities; on this basis, vulnerability classification based on SVM with a fuzzy entropy feature selection algorithm is proposed. The fuzzy entropy computed for each feature is sorted in ascending order, the features with smaller entropy form the feature subset, and the features in the subset are then weighted to build the vector space of the vulnerabilities. Comparative classification experiments against two other commonly used feature selection methods that perform well demonstrate the advantages of the proposed algorithm.
(3) Combining the advantages of class entropy and binary classification, vulnerability classification based on a binary tree multi-class SVM with class entropy is proposed and applied to vulnerability classification. To quantify with entropy how mixed the sample distribution of each vulnerability category is, a minimum sphere and an extended sphere are defined for each category; they describe how tightly the samples of a category cluster in the sample space and how much they are mixed with surrounding samples of other categories.
(4) Finally, 3000 vulnerabilities are collected from the CVE list as experimental data: 2500 are used as training samples for the proposed binary tree multi-class SVM with class entropy, and the remaining 500 as test samples. Comparison experiments against vulnerability classification based on KNN and on a binary tree multi-class SVM verify the accuracy and advantages of the proposed algorithm. The experimental results show an average classification accuracy of up to 93.3%.
The findings of this thesis can greatly improve vulnerability management, for example the efficiency of repairing and analyzing vulnerabilities, and can reduce the human resources that vulnerability management requires, so the work has both practical and research value.
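The class entropy idea in item (3) can be sketched as follows, under several assumptions that the abstract does not confirm: the minimum sphere of a category is approximated by its centroid plus the distance to its farthest member, the extended sphere is that sphere scaled by a hypothetical factor `extend`, and categories with lower entropy are separated earlier in the binary tree. The 2-D points stand in for weighted text feature vectors and are invented for illustration.

```python
import math
from collections import Counter


def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def class_entropy(samples, labels, target, extend=1.0):
    """Shannon entropy of the category mix inside the sphere around `target`.

    The sphere is centred on the category centroid and reaches its farthest
    member (an approximation of the minimum enclosing sphere). A value of
    `extend` above 1 enlarges it to take in the surrounding samples.
    """
    members = [x for x, y in zip(samples, labels) if y == target]
    centre = centroid(members)
    radius = extend * max(distance(centre, x) for x in members)
    inside = Counter(
        y for x, y in zip(samples, labels) if distance(centre, x) <= radius
    )
    total = sum(inside.values())
    return sum(-(n / total) * math.log2(n / total) for n in inside.values())


def split_order(samples, labels, extend=1.0):
    """Order categories by ascending entropy (assumed order of tree splits)."""
    classes = sorted(set(labels))
    entropy = {c: class_entropy(samples, labels, c, extend) for c in classes}
    return sorted(classes, key=lambda c: (entropy[c], c)), entropy


# Hypothetical 2-D samples: A is isolated, B and C overlap.
samples = [
    (0, 0), (1, 0), (0, 1),            # category A
    (5, 5), (7, 5), (5, 7), (7, 7),    # category B
    (6, 6), (9, 9), (9, 10), (10, 9),  # category C
]
labels = ["A"] * 3 + ["B"] * 4 + ["C"] * 4

order, entropy = split_order(samples, labels)
print(order, entropy)
```

In the toy data, the sphere around category A contains only its own samples, so its entropy is zero and it would be split off first; the spheres around B and C each capture one sample of the other category and therefore score higher. Each node of the tree would then train one binary SVM that separates the next category in this order from the remaining ones.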
Keywords: Vulnerability classification, Machine learning, Fuzzy entropy, Class entropy