
Metrics-Based Software Defect Prediction

Posted on: 2015-10-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X X Yang
Full Text: PDF
GTID: 1228330434966088
Subject: Computer software and theory
Abstract/Summary:
Metrics-based software defect prediction uses software metrics to construct prediction models, which can then be used to predict defect information for new software modules. The predicted defect information both reflects the quality of software modules and helps to allocate software testing resources. The most common prediction goals are predicting whether new software modules have defects, and predicting an order of software modules according to their number of defects. According to these goals, current metrics-based software defect prediction can be categorized into defect prediction with a classification task and defect prediction with a ranking task. In this dissertation, we investigate both, building on existing research.

Software defect prediction with a classification task predicts whether software modules have defects, which helps developers decide whether a module should be tested. Prediction models for the classification task require both a high detection rate of defect-prone modules and little wastage of testing resources (wastage is caused by erroneously predicting defect-free software modules as defective). However, the two objectives conflict with each other, and different applications have different requirements for the detection rate and the acceptable wastage. Most existing studies construct classification defect prediction models by optimizing a trade-off between the two objectives. The potential problem is that the resulting model might not satisfy an application's specific requirements for the two objectives. Therefore, we propose a multi-objective evolutionary approach, a Non-dominated Sorting Genetic Algorithm II (NSGA-II)-based support vector machine (SVM), which combines NSGA-II with a sensitive SVM in order to optimize the two objectives simultaneously.
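To make the two conflicting objectives concrete, the following is a minimal sketch, not the dissertation's implementation. It assumes that wastage is measured as the false alarm rate (the share of defect-free modules predicted as defective), and it shows only the Pareto-dominance filter that NSGA-II builds on, not the evolutionary search or the SVM training. The function names, candidate models, and labels are all hypothetical.

```python
from typing import Dict, List, Tuple


def detection_and_false_alarm(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (detection rate, false alarm rate); label 1 means defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    detection = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return detection, false_alarm


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a dominates b: detection no lower, false alarm no higher, one strictly better."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(scores: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names of candidates that no other candidate dominates."""
    return [
        name
        for name, score in scores.items()
        if not any(dominates(other, score) for other in scores.values())
    ]


# Hypothetical ground truth (3 defective, 5 defect-free modules) and
# hypothetical predictions from four candidate models.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
candidates = {
    "A": [1, 1, 1, 1, 1, 0, 0, 0],  # finds every defect, two false alarms
    "B": [1, 1, 0, 1, 0, 0, 0, 0],  # misses one defect, one false alarm
    "C": [1, 0, 0, 0, 0, 0, 0, 0],  # misses two defects, no false alarms
    "D": [1, 0, 0, 1, 1, 0, 0, 0],  # misses two defects, two false alarms
}
scores = {name: detection_and_false_alarm(y_true, pred) for name, pred in candidates.items()}
front = pareto_front(scores)
print(front)
```

In this toy data, candidate D is dominated (A detects more defects with the same false alarm rate), while A, B, and C each represent a different trade-off. Keeping such a set of non-dominated models, rather than one model tuned to a fixed trade-off, is what lets an application pick the model that matches its own requirements.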
Empirical studies demonstrate that the multi-objective evolutionary approach can not only construct diverse models with different detection rates and the corresponding least wastage, which better meet the different requirements of applications, but also construct better models even when the trade-off between the two objectives is known.

Software defect prediction models with a ranking task give an order of software modules based on their defect-proneness, so that modules can be tested in that order (modules with more defects are tested first) according to the testing resources available. Predicting the exact number of defects in each module is often unnecessary, because it is the module order that matters for allocating testing resources. However, existing model construction algorithms are mainly regression and classification methods, which obtain defect prediction models by maximum likelihood estimation or least squares and so focus on fitting each sample. The potential problem is that a model that fits each sample well could still give a poor result according to ranking performance measures. Therefore, we suggest a learning-to-rank approach that directly optimizes the ranking performance of software defect prediction models. Our empirical studies demonstrate that directly optimizing the ranking performance measure of prediction models can give a better ranking than optimizing individual-based loss functions, especially when there are many software metrics.

The software defect prediction process mainly includes two parts: data and model construction methods. Software metrics are essential to the quality of the data. As more and more software metrics are introduced, the value of many of them for constructing defect prediction models is questionable. Researchers have therefore studied which metrics are effective for constructing defect prediction models.
However, the majority of existing studies that analyze software metrics address the classification task rather than the ranking task. Previous comparisons of different sets of software metrics for the ranking task could not give detailed information about which specific metrics are effective for constructing models, and methods such as correlation coefficients may not reflect that effectiveness. Therefore, we conduct a comprehensive investigation of the effectiveness of metrics for building ranking-task models, starting from the models’ prediction goal and using methods designed directly for the ranking task. Empirical studies show the rationality of these analysis methods and yield some novel findings.
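The abstract's point that a model fitting each sample well can still rank modules poorly can be illustrated with a small sketch. It is an illustration under stated assumptions, not the dissertation's method or data: the ranking measure used here (the share of all defects found in the top k modules) is an assumed stand-in for whatever ranking performance measure the dissertation optimizes, and all names and numbers are hypothetical.

```python
from typing import Sequence


def squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Sum of squared errors: the per-sample fitting loss of least squares."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def top_k_defect_share(actual: Sequence[float], predicted: Sequence[float], k: int) -> float:
    """Share of all defects contained in the k modules ranked highest by the model."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    found = sum(actual[i] for i in order[:k])
    total = sum(actual)
    return found / total if total else 0.0


# Hypothetical defect counts for five modules.
actual = [8, 2, 1, 0, 0]

# Hypothetical model 1: close to the true counts on most modules,
# but it ranks module 1 above the most defective module 0.
fit_pred = [2.0, 2.5, 1.0, 0.0, 0.0]

# Hypothetical model 2: badly scaled predictions, but a perfect order.
rank_pred = [30.0, 20.0, 10.0, 5.0, 1.0]

fit_sse = squared_error(actual, fit_pred)
rank_sse = squared_error(actual, rank_pred)
fit_share = top_k_defect_share(actual, fit_pred, k=1)
rank_share = top_k_defect_share(actual, rank_pred, k=1)

print(f"model 1: SSE={fit_sse:.2f}, top-1 defect share={fit_share:.2f}")
print(f"model 2: SSE={rank_sse:.2f}, top-1 defect share={rank_share:.2f}")
```

Model 1 has the far lower squared error, yet testing its top-ranked module first finds fewer defects than following model 2. This gap between fitting loss and ranking quality is the motivation for optimizing a ranking measure directly.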
Keywords/Search Tags: Software defect prediction, Software metrics, Classification models, Regression models, Metric analysis