| Software vulnerabilities are complex and diverse, posing a significant threat to the secure operation of computer systems. Vulnerability classification is a fundamental step in analyzing vulnerability attributes and an important part of addressing software security threats. Currently, most researchers apply text classification techniques to vulnerability classification, but computer terminology is scattered throughout vulnerability data. Vulnerability descriptions contain information such as the cause, location, affected version number, and consequences of a vulnerability. Vulnerability code contains file structures, naming conventions, functions, and stack pointers that do not comply with security rules. Classification models designed for ordinary text have difficulty understanding these terms, which degrades vulnerability classification performance. This paper proposes two vulnerability classification techniques, one based on vulnerability trigger words and one based on vulnerability code patterns. Trigger words closely related to the vulnerability category are extracted from vulnerability descriptions to assist classification. Vulnerability patterns are generated by clustering vulnerable functions and are then used to classify vulnerability code. These techniques improve the accuracy and efficiency of vulnerability classification, providing stronger support for vulnerability analysis and repair. The main research content is as follows: (1) A vulnerability classification technique based on vulnerability trigger words is proposed for vulnerability descriptions. This method considers the relationship between trigger words and vulnerability categories, improving the accuracy of vulnerability classification. First, 4,769 vulnerability records were manually analyzed to construct a vulnerability trigger word dataset. Two question templates suitable for extracting vulnerability trigger words were designed based on the structural information of 
vulnerability descriptions. Then, a Bidirectional Encoder Representations from Transformers question-answering (BERT Q&A) model was trained to extract vulnerability trigger words from vulnerability descriptions. Finally, a vulnerability classification model was constructed based on Recurrent Convolutional Neural Networks for Text Classification (TextRCNN) using the vulnerability trigger words. Experimental results show that the F1-measure of this method reaches 80.8%, outperforming existing vulnerability classification methods. (2) A vulnerability classification technique based on vulnerability code patterns is proposed for vulnerability code. This method considers the similarity between code belonging to the same vulnerability category. First, the Abstract Syntax Tree-based Neural Network (ASTNN) code representation model is used to extract vulnerability features. Then, the XMeans clustering algorithm is used to cluster vulnerability code and generate code patterns. Finally, the Smith-Waterman algorithm is used to match vulnerability types. Experimental results show that the precision, recall, and F1-measure of this method reach 83.26%, 82.04%, and 81.99%, respectively, outperforming existing vulnerability code classification methods. (3) A software vulnerability classification system is built, applying the classification techniques proposed in this paper to actual development environments. The system includes four functional modules: vulnerability data analysis, vulnerability description classification, vulnerability code classification, and batch processing of vulnerability data, helping developers analyze and understand vulnerabilities. |
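
The pattern-matching step in (2) can be illustrated with a minimal sketch of Smith-Waterman local alignment. The abstract does not specify how code and patterns are encoded or scored, so everything here is an assumption made for illustration: code and patterns are treated as plain token sequences, the scoring values (`match`, `mismatch`, `gap`) are arbitrary, and the `classify` helper and the example patterns are hypothetical rather than taken from the paper.

```python
from typing import Dict, Sequence


def smith_waterman(a: Sequence[str], b: Sequence[str],
                   match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local-alignment score between two token sequences.

    The scoring parameters are illustrative assumptions, not the paper's values.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = h[i - 1][j] + gap      # gap in b
            left = h[i][j - 1] + gap    # gap in a
            # Clamping at 0 lets an alignment restart anywhere (local, not global).
            h[i][j] = max(0, diag, up, left)
            best = max(best, h[i][j])
    return best


def classify(tokens: Sequence[str], patterns: Dict[str, Sequence[str]]) -> str:
    """Hypothetical helper: pick the category whose pattern aligns best."""
    return max(patterns, key=lambda cat: smith_waterman(tokens, patterns[cat]))


# Hypothetical token-level patterns, one per vulnerability category.
patterns = {
    "buffer_overflow": ["char", "buf", "[", "N", "]",
                        "strcpy", "(", "buf", ",", "src", ")"],
    "use_after_free": ["free", "(", "p", ")", "p", "->", "field"],
}
print(classify(["strcpy", "(", "buf", ",", "input", ")"], patterns))
```

With these assumed patterns, the `strcpy` call aligns far better with the `buffer_overflow` pattern than with the `use_after_free` one, so that category is selected. Local alignment is a natural fit here because a matching fragment may sit anywhere inside a longer function.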