
Research On Automated Identification Of Vulnerability Based On Multi-type Features Analysis

Posted on: 2019-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Deng
Full Text: PDF
GTID: 2428330563992537
Subject: Cyberspace security
Abstract/Summary:
In today's world of agile software development, developers mostly follow a model of continuous integration and continuous delivery, and they increasingly rely on free open source libraries to assemble and build software quickly. However, because of the focus on fast release cycles and a lack of security expertise, a significant number of vulnerabilities may go unidentified, or may be patched silently without public disclosure. These unidentified vulnerabilities can be exploited by malicious users and cause serious damage. There is therefore an urgent need for an automatic vulnerability identification system that finds unidentified vulnerabilities in open source libraries and secures modern software development.

Most software projects use bug-tracking systems to support the development process, so analyzing the bug reports in those systems is an efficient way to identify vulnerabilities in open source libraries. However, the majority of existing automatic vulnerability identification methods for bug-tracking systems consider only a limited set of features, leading to unsatisfactory precision and recall.

This thesis addresses the problem with an automated vulnerability identification method based on multi-type feature analysis, dubbed Security Bug Report Identifier (SBRer). Specifically, SBRer uses multiple kinds of information contained in a bug report: the non-textual fields of the report (meta features), the textual content of the report (textual features), and the code attributes of the bug patch file (code features). Based on these features, SBRer builds an identification model that automatically identifies vulnerabilities using natural language processing and machine learning techniques. The experimental results show that SBRer with imbalanced data processing identifies vulnerabilities with a precision of 99.4% and a recall of 79.9%. Compared to existing automated identification models, SBRer improves recall by 22.9% to 175.5% while maintaining high precision.
Keywords/Search Tags:Vulnerability Identification, Bug Report, Features Analysis, Natural Language Processing, Machine Learning
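
The abstract describes three feature groups (meta, textual, code) feeding one identification model, evaluated by precision and recall. The following is a minimal Python sketch of that idea, not SBRer's actual implementation. The record layout, the field names, the `SECURITY_TERMS` list, and all function names are hypothetical and chosen only for illustration; the abstract does not specify the real features or the classifier.

```python
import re
from collections import Counter

# Hypothetical bug-report record; the field names are illustrative
# assumptions, not the schema used in the thesis.
report = {
    "meta": {"priority": "P1", "comment_count": 12},
    "text": "Heap buffer overflow when parsing crafted header",
    "patch": "+ if (len > sizeof(buf)) return -1;\n- memcpy(buf, src, len);\n",
}

# Assumed keyword list, for illustration only.
SECURITY_TERMS = {"overflow", "injection", "xss", "crash", "exploit"}


def meta_features(meta):
    """Non-textual fields of the report, encoded as numeric features."""
    return {
        "meta:priority=" + meta["priority"]: 1.0,
        "meta:comment_count": float(meta["comment_count"]),
    }


def textual_features(text):
    """Counts of security-related terms in the report text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {"text:" + t: float(c) for t, c in counts.items() if t in SECURITY_TERMS}


def code_features(patch):
    """Simple attributes of the patch: added and removed line counts."""
    lines = patch.splitlines()
    added = sum(1 for line in lines if line.startswith("+"))
    removed = sum(1 for line in lines if line.startswith("-"))
    return {"code:lines_added": float(added), "code:lines_removed": float(removed)}


def extract_features(report):
    """Merge the three feature groups into one feature dictionary."""
    features = {}
    features.update(meta_features(report["meta"]))
    features.update(textual_features(report["text"]))
    features.update(code_features(report["patch"]))
    return features


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = security bug report)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


features = extract_features(report)
```

The merged dictionary could be passed to any classifier that accepts sparse feature vectors. Prefixing each key with its group name keeps the three feature types from colliding and makes it easy to test the contribution of each group separately.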