
Design And Implementation Of Vulnerability Database Based On Natural Language Processing

Posted on: 2021-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: R K Li
GTID: 2518306047986699
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of network and information technology, the number of vulnerabilities has increased dramatically, posing a serious threat to society. Collecting and organizing existing vulnerabilities and establishing a unified vulnerability database has therefore become increasingly important. Existing vulnerability databases draw on different sources of vulnerability data. The data are heterogeneous and redundant across sources, which lowers their quality and prevents a unified description and retrieval of the same vulnerability. Moreover, when heterogeneous vulnerability data from different sources are compared and merged, the similarity of the affected-software fields has not been analyzed. To address these deficiencies, this thesis designs and implements an intelligent fusion framework based on vulnerability data similarity measurement, which solves the problem of uniformly mapping heterogeneous vulnerability data. The innovations and main work of this thesis are as follows:

(1) This thesis designs and implements a distributed vulnerability data collection system and completes the collection of vulnerability data from 11 well-known vulnerability platforms at home and abroad. Because traditional data collection methods cannot collect vulnerability data quickly and efficiently, a collection system with asynchronous requests and distributed downloads is designed and implemented on top of the Scrapy information collection framework, which increases the collection rate. Incremental updating of vulnerability data is supported to keep the data current. To avoid repeated downloads and duplicate data, a Redis cache database is used to compare vulnerability links for similarity. Collection rules are customized to the characteristics of each heterogeneous platform, completing the collection of vulnerability data released by 11 authoritative vulnerability platforms at home and abroad.

(2) This thesis designs and implements IFVD, an intelligent fusion framework for vulnerability data, and completes the fusion of heterogeneous vulnerability data. To address the heterogeneity and redundancy of vulnerability data from different security vulnerability databases, a similarity measurement algorithm based on the name and version of the software affected by a vulnerability is proposed, and an intelligent fusion framework is designed and implemented on this algorithm. Through the framework, the affected-software attributes are extracted from the heterogeneous vulnerability data of three vulnerability databases: NVD, Secunia and Security Focus. Affected software versions are mapped to discrete values with reference to the CPE dictionary. Software names are compared through string matching and, together with a similarity measurement on the version field, this yields a similarity score for the vulnerability data. The heterogeneous vulnerability data are then fused to form the IFVD fusion vulnerability database. The experimental results show that the number of similar vulnerability entries between Secunia and NVD is 83,252, a similarity rate of 63.63%; the number of similar vulnerability entries between Security Focus and NVD is 40,124, a similarity rate of only 56.74%. The fusion database contains 1.99 times as many vulnerabilities as NVD, which expands vulnerability coverage and improves the quality of the vulnerability data.

(3) This thesis designs a vulnerability database management platform. The platform framework comprises a data layer, a processing layer, and a presentation layer. The data layer stores the initial and fused vulnerability data; the processing layer handles vulnerability data updates and interactive instructions; and the presentation layer provides the functional modules for data statistics, data retrieval, data customization, and so on. The corresponding modules are implemented, completing the management, retrieval and sharing of vulnerability data.
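
The abstract does not give the formula behind the similarity measurement in item (2), so the following is only a minimal sketch of how a score based on affected software name and version might be computed. It assumes each vulnerability record has already been reduced to a mapping from software name to a set of discrete version strings (for example after the CPE-based mapping), that names are matched after simple normalization, and that versions are compared by Jaccard overlap. The function names, the record layout, and the choice of Jaccard overlap are all illustrative assumptions, not the thesis's actual algorithm.

```python
import re


def normalize_name(name: str) -> str:
    """Lowercase a software name and collapse punctuation into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def version_similarity(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two discrete version sets (an assumed measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def record_similarity(rec_a: dict[str, set[str]],
                      rec_b: dict[str, set[str]]) -> float:
    """Average version overlap over all software names seen in either record.

    A name present in only one record contributes 0, so records that
    affect different software score low even if version strings coincide.
    """
    norm_a = {normalize_name(n): v for n, v in rec_a.items()}
    norm_b = {normalize_name(n): v for n, v in rec_b.items()}
    names = set(norm_a) | set(norm_b)
    if not names:
        return 0.0
    total = sum(
        version_similarity(norm_a.get(n, set()), norm_b.get(n, set()))
        for n in names
    )
    return total / len(names)


# Hypothetical entries describing the same vulnerability in two databases.
nvd_entry = {"Apache Tomcat": {"7.0.0", "7.0.1", "7.0.2"}}
secunia_entry = {"apache_tomcat": {"7.0.1", "7.0.2", "7.0.3"}}

score = record_similarity(nvd_entry, secunia_entry)
print(f"similarity score: {score:.2f}")
```

In a fusion step, pairs whose score exceeds some chosen threshold would be treated as the same vulnerability and merged; the threshold itself would have to be tuned on real data and is not stated in the abstract.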
Keywords/Search Tags: Vulnerability, Vulnerability Database, Data Collection, Similarity Measurement, Fusion Framework