Research and Implementation of Software Defect Prediction Method Based on Source Code Semantics

Posted on: 2022-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: J H Lin
Full Text: PDF
GTID: 2518306569480964
Subject: Computer technology
Abstract/Summary:
With the diversification of software products, the scale and complexity of software are increasing exponentially, bringing new challenges to software testing and quality assurance. Software defect prediction builds a predictive model from historical code modules and then uses it to identify potentially defective modules in the current project. It helps developers allocate limited test resources reasonably and optimize the test process, thus supporting software quality. Traditional software defect prediction methods design software metrics that describe statistical characteristics of the source code and use them as the input features of the defect prediction model. However, these handcrafted metrics cannot adequately capture the grammatical structure and semantic information of the source code. Moreover, in cross-project software defect prediction, the data distributions of different software projects differ, so the knowledge a defect prediction model learns from one project is difficult to apply directly to another. To solve these problems, this thesis studies and optimizes software defect prediction based on source code semantics, using deep learning to extract the semantic features of the source code from the abstract syntax tree, an alternative representation of the source code. The specific research work includes: (1) To solve the word embedding problem for nodes in the abstract syntax tree, a weighted abstract syntax tree encoding method is proposed that builds on the continuous bag-of-words model and assigns different weights to the parent node and the child nodes of the central node; (2) For the source code semantic extraction problem, a within-project defect prediction model is presented that, after encoding the abstract syntax tree nodes, extracts the semantic features of the source code from both the pre-order sequence and the in-order sequence of the abstract syntax tree using a two-layer long short-term memory network; (3) To address data distribution differences in cross-project software defect prediction, a domain adaptation method from transfer learning is introduced to improve the performance of the defect prediction model. The data distribution differences between the source project and the target project are reduced in a reproducing kernel Hilbert space so that the knowledge learned from the source project becomes transferable to the target project. Finally, a software defect prediction system is designed and implemented using Web application technology. A series of experiments on the PROMISE datasets shows that, compared with traditional defect prediction methods and with deep learning-based defect prediction methods that extract source code semantics, the method proposed in this thesis improves defect prediction performance and provides a new research direction for software defect prediction.
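The abstract does not say how the distribution difference in the reproducing kernel Hilbert space is measured. A common choice in domain adaptation is the maximum mean discrepancy (MMD), so the sketch below assumes MMD with a Gaussian (RBF) kernel purely for illustration; the function names, the `gamma` parameter, and the toy feature vectors are hypothetical and are not taken from the thesis.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two samples.

    source, target: lists of equal-length feature vectors, e.g. semantic
    features extracted from the modules of two projects (hypothetical inputs).
    """
    def mean_kernel(xs, ys):
        # Average kernel value over all pairs drawn from xs and ys.
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


# Toy data: "near" resembles the source sample, "far" is shifted away from it.
src = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near = [[0.1, 0.0], [1.0, 0.1], [0.0, 0.9]]
far = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]

print(mmd_squared(src, near, gamma=0.5))
print(mmd_squared(src, far, gamma=0.5))
```

Under these assumptions, a smaller value means the two feature distributions look more alike in the kernel space. A domain adaptation model would typically add a term like this to its training loss so that source-project and target-project features are pulled together while the classifier is trained on the labeled source project.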
Keywords/Search Tags: software defect prediction, abstract syntax tree, deep learning, long short-term memory network, domain adaptation