Research On Software Defect Prediction Method Based On Integrated Learning

Posted on: 2020-03-13  Degree: Master  Type: Thesis
Country: China  Candidate: Q Liu  Full Text: PDF
GTID: 2428330599960275  Subject: Computer Science and Technology
Abstract/Summary:
In software engineering, software defect prediction can effectively assist software testing, help ensure the quality of software products, and improve software security. This thesis combines software metrics with ensemble (integrated) learning algorithms to study defective code. It proposes two methods for predicting defects in software source code: a heterogeneous ensemble algorithm based on threshold shifting by imbalance rate, and an extremely randomized trees (extreme random tree) feature selection algorithm based on recursive feature elimination. The main contents are as follows.

First, the research status of software defect prediction is analyzed, and the different types of defects in a software security defect library are studied. Structural software metrics are used to predict software defects. The class imbalance problem in software defect prediction is studied in combination with ensemble learning, and a feature analysis of memory security defects, the most dangerous class of software defects, is carried out in combination with a search strategy.

Second, to address the imbalance of defective code in software defect prediction, a heterogeneous ensemble algorithm based on threshold shifting by imbalance rate is proposed. Following the idea of heterogeneous ensembles, and without altering the distribution of the defect data, decision tree and logistic regression algorithms are used as base classifiers for model fusion, which increases the structural diversity of the base classifiers. Shifting the decision threshold according to the imbalance rate of the historical defect data effectively improves the accuracy with which the ensemble predicts software defects.

Third, to analyze the characteristics of software memory security defects, defect code is constructed at the function level, function classes are extracted based on data flow analysis, and an extremely randomized trees feature selection algorithm based on recursive feature elimination is proposed. By introducing heuristic rules, the relationship between memory security defect metrics and defective functions is analyzed, and the metrics most strongly correlated with defective functions are obtained, which improves the accuracy of random forest prediction of defective functions.

Finally, software defect prediction experiments on C/C++ data sets verify the validity and correctness of both the heterogeneous ensemble algorithm based on decision tree and logistic regression and the extremely randomized trees feature selection algorithm based on recursive feature elimination.
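
The threshold shift described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact shifting formula, so the rule below (setting the decision threshold to the share of defective samples in the historical data) is an assumption, as are the function names, the simple probability averaging used for model fusion, and all sample values.

```python
from typing import List, Sequence


def imbalance_threshold(labels: Sequence[int]) -> float:
    """Assumed rule: use the share of defective samples (label 1)
    in the historical data as the decision threshold."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(labels) / len(labels)


def fuse_and_predict(tree_proba: Sequence[float],
                     logit_proba: Sequence[float],
                     threshold: float) -> List[int]:
    """Average the defect probabilities of two base classifiers,
    then label a module defective when the average reaches the threshold."""
    if len(tree_proba) != len(logit_proba):
        raise ValueError("both classifiers must score the same modules")
    fused = [(t + l) / 2.0 for t, l in zip(tree_proba, logit_proba)]
    return [1 if p >= threshold else 0 for p in fused]


# Hypothetical history: 2 defective modules out of 10.
train_labels = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
threshold = imbalance_threshold(train_labels)

# Hypothetical defect probabilities from a decision tree and a
# logistic regression model for four unseen modules.
tree_proba = [0.10, 0.40, 0.70, 0.05]
logit_proba = [0.20, 0.30, 0.50, 0.15]

shifted = fuse_and_predict(tree_proba, logit_proba, threshold)
default = fuse_and_predict(tree_proba, logit_proba, 0.5)
print(threshold, shifted, default)
```

With these made-up scores, the second module is flagged as defective only under the shifted threshold. This shows the intended effect: the minority (defective) class is favored without resampling, so the distribution of the defect data stays unchanged.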
Keywords/Search Tags:software metrics, software defect prediction, memory security defects, heterogeneous integration, extreme random tree
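
The feature selection step named in the abstract and keywords, recursive feature elimination driven by extremely randomized trees, can be sketched with scikit-learn. This is an illustrative setup, not the thesis implementation: the synthetic data stands in for function-level memory security metrics, the choice of 5 retained features is arbitrary, and the heuristic rules mentioned in the abstract are not modeled.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for a defect data set: 200 functions, 12 metrics,
# with roughly 15% of the functions labeled defective.
X, y = make_classification(
    n_samples=200,
    n_features=12,
    n_informative=4,
    n_redundant=2,
    weights=[0.85, 0.15],
    random_state=0,
)

# Extremely randomized trees supply the feature importances.
ranker = ExtraTreesClassifier(n_estimators=100, random_state=0)

# RFE refits the ranker repeatedly and drops the least important
# metric each round (step=1) until 5 metrics remain.
selector = RFE(estimator=ranker, n_features_to_select=5, step=1)
selector.fit(X, y)

selected = [i for i, keep in enumerate(selector.support_) if keep]
X_reduced = selector.transform(X)
print("selected metric indices:", selected)
print("reduced shape:", X_reduced.shape)
```

The reduced matrix `X_reduced` could then be passed to a downstream classifier such as the random forest mentioned in the abstract.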