
Research On Software Defect Prediction Method Based On Semantic Information Of Program Source Code

Posted on: 2022-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhou
Full Text: PDF
GTID: 2518306569475594
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computer technology, the amount of software has grown explosively. Software now permeates society, and it is difficult to function without it. As a result, the consequences of unreliable software have become more severe. As a means of ensuring software reliability, software defect prediction can help developers locate possible problems, reduce testing costs, and optimize the allocation of test resources. Early research on software defect prediction generally made little use of information from the program source code. Recent research has turned to deep learning techniques to mine information from the source code. These methods are novel but have some limitations. First, they usually encode the abstract syntax tree as a vector obtained by preorder traversal, so the context information acquired deviates from the original information in the tree structure. Second, although these approaches capture features from the abstract syntax tree, they feed the features to a deep sequential network, which weakens the structural information. Finally, these methods are mostly supervised learning models and ignore the difficulty of obtaining labeled defect data in real projects. To address these problems, this thesis conducts an in-depth study of software defect prediction. The main research content consists of the following four parts: (1) An n-ary sequence encoding method built on the abstract syntax tree is proposed to capture the context information of each node in the tree structure. (2) To address the weakened syntactic information in the source code, a software defect prediction model is proposed that uses a long short-term memory network based on both the sequence and the tree structure. The encoded feature vector and the abstract syntax tree are fed into the model together to enhance the syntactic information. (3) To address the general lack of source code information in unsupervised learning, a spectral clustering method based on n-ary sequence encoding is proposed to explore the effect of source code information on unsupervised software defect prediction. (4) Based on the two models presented in this thesis, a software defect prediction system is designed and implemented. The system accepts real project code for training and visualizes the defect probability of the target software entity, helping developers understand the distribution of defects in the project's code.
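The first limitation noted in the abstract, that a preorder traversal loses tree-structure information, can be illustrated with a toy example. The sketch below is not the thesis's encoding method; the `(label, children)` tuple representation and the node labels are hypothetical stand-ins for real abstract syntax tree nodes. It only shows that two different trees can flatten to the same preorder sequence when the sequence carries no information about parent-child relations.

```python
from typing import List, Tuple

# Toy stand-in for an AST node: (label, list of child nodes).
# This representation is an assumption made for illustration only.
Tree = Tuple[str, list]


def preorder(tree: Tree) -> List[str]:
    """Flatten a tree into a list of node labels in preorder."""
    label, children = tree
    sequence = [label]
    for child in children:
        sequence.extend(preorder(child))
    return sequence


# "Call" nested inside "If" ...
nested = ("Block", [("If", [("Call", [])])])
# ... versus "Call" placed after "If" as a sibling.
sibling = ("Block", [("If", []), ("Call", [])])

print(preorder(nested))
print(preorder(sibling))
# The trees differ, yet their preorder label sequences are identical.
print(nested != sibling and preorder(nested) == preorder(sibling))
```

Any encoding that aims to preserve a node's context would need to add something beyond the bare label sequence, for example depth or parent information; how the proposed n-ary sequence encoding does this is not described in the abstract.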
Keywords/Search Tags: software defect prediction, program source code, n-ary sequence encoding, long short-term memory network, spectral clustering