
Research On Software Defect Prediction Based On Code Representation

Posted on: 2022-07-11  Degree: Master  Type: Thesis
Country: China  Candidate: Q Y Zhang  Full Text: PDF
GTID: 2518306560454844  Subject: Computer technology
Abstract/Summary:
As daily life becomes increasingly intelligent and modernized, software has become an important factor affecting it. Software defect prediction (SDP) helps developers and testers discover potential defects in a project in advance and allocate resources rationally, which improves the efficiency of the development process and helps ensure software reliability. Traditional software defect prediction models use specific metrics designed by experts (such as the number of lines of code or the degree of coupling between objects) as software features for analyzing and predicting defects. However, metrics designed from expert experience do not generalize across projects. Moreover, software functionality is realized mainly through code, and the code text carries a large amount of semantic and syntactic information that metrics do not capture. To address these problems, this dissertation studies defect prediction based on code representation. By analyzing the structural characteristics of source code, it proposes CB-Path2Vec, a new code representation model based on Abstract Syntax Tree (AST) paths. It defines a representation granularity between the file and the program expression, called the program block, and proposes a structured representation called the cross-block path. The set of paths serves as the code text feature and is encoded by a Bi-directional Long Short-Term Memory (BiLSTM) network. To use code text features and metric features together, a composite software feature combining the two is proposed and applied to software defect prediction. The dissertation also analyzes the importance of predicting both defect proneness and defect count, and constructs a separate prediction model for each task. Experiments on 7 project data sets from the PROMISE repository show that the proposed model can effectively identify software defects and outperforms other strong models in the field on both defect-proneness and defect-count prediction.
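As a rough illustration of what an AST path is, the sketch below enumerates leaf-to-leaf paths through a syntax tree. It is not the CB-Path2Vec method: the abstract does not give the exact definitions of program blocks or cross-block paths, so the code shows only generic AST paths. It also assumes Python source and Python's standard `ast` module for convenience, and the function names `root_to_leaf_paths` and `leaf_to_leaf_paths` are hypothetical.

```python
import ast
from itertools import combinations


def root_to_leaf_paths(tree):
    """Collect every root-to-leaf path as a list of AST nodes."""
    paths = []

    def walk(node, prefix):
        prefix = prefix + [node]
        # Skip Load/Store context markers so identifiers count as leaves.
        children = [c for c in ast.iter_child_nodes(node)
                    if not isinstance(c, ast.expr_context)]
        if not children:
            paths.append(prefix)
        for child in children:
            walk(child, prefix)

    walk(tree, [])
    return paths


def leaf_to_leaf_paths(source):
    """Return paths between each pair of leaves as lists of node type names.

    Each path climbs from the first leaf to the lowest common ancestor
    and then descends to the second leaf.
    """
    tree = ast.parse(source)
    result = []
    for a, b in combinations(root_to_leaf_paths(tree), 2):
        # Length of the shared prefix, compared by node identity.
        k = 0
        while k < min(len(a), len(b)) and a[k] is b[k]:
            k += 1
        up = [type(n).__name__ for n in reversed(a[k - 1:])]
        down = [type(n).__name__ for n in b[k:]]
        result.append(up + down)
    return result


if __name__ == "__main__":
    for path in leaf_to_leaf_paths("x = a + b"):
        print(" -> ".join(path))
```

For `x = a + b` this yields six paths, one per pair of leaves, including `Name -> BinOp -> Name` between the two operands. In a path-based model, sequences of this kind would be mapped to embeddings and fed to a sequence encoder such as a BiLSTM.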
Keywords/Search Tags: Software defect prediction, Abstract Syntax Tree, Long Short-Term Memory Network, Representation Learning