Vulnerability Text And Code Assessment Based On Pre-trained Models And Prompt Learning

Posted on: 2024-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: X W Li
Full Text: PDF
GTID: 2568306932462284
Subject: Computer software and theory
Abstract/Summary:
Software security vulnerabilities pose serious threats to the security and reliability of computer systems. With the number of vulnerability reports growing and potential vulnerabilities hidden in source code, vulnerability assessment is needed to prioritize repair work and improve software security. Vulnerability assessment mainly involves assessing the severity, exploitability, and other characteristics of known vulnerabilities, determining their types, and detecting unknown potential vulnerabilities at the source code level. However, assessment based on expert knowledge requires a large amount of domain knowledge and introduces a time delay, so high-performance automated vulnerability assessment models are urgently needed. The emergence of pre-trained models has brought profound changes to deep learning, and the fine-tuning and prompt learning paradigms provide efficient ways to use the knowledge these models contain. Working from both vulnerability text descriptions and source code, this thesis aims to improve the performance of automated vulnerability assessment by combining pre-trained models with the prompt learning paradigm.

At the level of vulnerability text descriptions, this thesis focuses on predicting vulnerability characteristics using pre-trained models, prompt learning, and vulnerability descriptions. The goal is to predict the severity and exploitability of a vulnerability based solely on its standardized description in Common Vulnerabilities and Exposures (CVE). On top of the pre-trained model, this work introduces the prompt learning paradigm, which adds prompt information to the input to guide the pre-trained model and thereby make better use of its knowledge. In the vulnerability severity prediction task, by constructing various prompt templates and label word mappers (verbalizers) and combining them with a prompt ensembling strategy, prompt learning outperforms both traditional network models and the fine-tuning paradigm. In the few-shot setting of the vulnerability exploitability prediction task, prompt learning shows strong few-shot learning ability and outperforms the fine-tuning paradigm. Combining it with transfer learning further improves its performance on this task.

At the source code level, this thesis focuses on predicting and classifying vulnerable code using a pre-trained model, prompt learning, and a graph neural network. The goal is to extract the semantic information of the source code sequence with a pre-trained code model, extract the structural dependency information of the code with a graph neural network, and fuse the two for code vulnerability prediction and classification. This work introduces prompt learning into the pre-trained code model with suitable adaptations, and uses "soft label words" to handle complex tasks and to generate the vectors used for fusion. Introducing prompt learning improves the pre-trained model's semantic representation of code sequences on specific tasks. The graph neural network obtains a structural semantic vector by processing the code property graph, and this vector is merged with the sequence semantic vector to form the multi-level fusion model of this work. On multiple real-world project datasets, the fusion model outperforms each single model, demonstrating the effectiveness of the proposed fusion model.
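
The abstract's combination of prompt templates, verbalizers, and prompt ensembling can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the thesis: the template wording, the label words, the two-class label set, and the averaging rule used for ensembling. The probabilities a pre-trained model would assign to words at the [MASK] position are passed in as plain dictionaries, so no model is loaded.

```python
from statistics import fmean

# Hypothetical templates; "[MASK]" marks the slot a pre-trained
# masked language model would fill in.
TEMPLATES = [
    "{description} The severity of this vulnerability is [MASK].",
    "{description} In short, it is a [MASK] risk.",
]

# Hypothetical verbalizer: each class maps to one or more label words.
VERBALIZER = {
    "LOW": ["low", "minor"],
    "HIGH": ["high", "critical"],
}


def build_prompt(template, description):
    """Wrap a vulnerability description in a prompt template."""
    return template.format(description=description)


def class_scores(word_probs, verbalizer=VERBALIZER):
    """Score each class as the mean [MASK] probability of its label words."""
    return {
        label: fmean(word_probs.get(word, 0.0) for word in words)
        for label, words in verbalizer.items()
    }


def ensemble(per_template_word_probs, verbalizer=VERBALIZER):
    """Average class scores over all templates and pick the best class."""
    per_template = [class_scores(p, verbalizer) for p in per_template_word_probs]
    averaged = {
        label: fmean(scores[label] for scores in per_template)
        for label in verbalizer
    }
    return max(averaged, key=averaged.get), averaged
```

In use, each template would be filled with the same CVE description, the model's [MASK] word probabilities would be collected once per template, and `ensemble` would be called on the resulting list. Averaging is only one possible ensembling rule; voting or learned weights are alternatives.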
Keywords/Search Tags:Vulnerability characteristics assessment, Code vulnerability prediction and classification, Pre-trained model, Prompt learning