Design And Implementation Of Instance Selection Based Ensemble Cross-project Defect Prediction Method

Posted on:2018-12-20

Degree:Master

Type:Thesis

Country:China

Candidate:L P Wang

GTID:2428330569495354

Subject:Computer technology

Abstract/Summary:

Software defect prediction is one of the most important research areas in software engineering data mining. It first mines a project's historical software repositories; then, by analyzing the source code or the development process, a set of metrics related to software defects is designed. In practice, however, the project to be predicted (i.e., the target project) may be new or may have little training data. The problem of how to effectively transfer knowledge from a source project to build a defect prediction model for the target project is called cross-project defect prediction.

In this thesis, we focus on homogeneous cross-project defect prediction, which assumes that the same metrics are used for both the source and the target projects. We propose a Box-Cox transformation based ensemble learning approach named BCEL. The method has four stages. In the first stage, different distance formulas (including Euclidean distance, cosine similarity, and correlation coefficient) are used for instance selection to obtain different training sets from the candidate set. In the second stage, the Box-Cox transformation is applied to these data sets to normalize the metric values. In the third stage, a specific classification method (i.e., Logistic Regression) is used to construct different base classifiers, and the diversity of their prediction results is analyzed. In the fourth stage, if the prediction results are diverse, ensemble learning is used to further improve the prediction performance of the model.

In the empirical study, the thesis mainly uses the AEEEM data set and evaluates model performance with the F-measure metric. We choose three baseline methods, each based on a single distance measure: ED denotes instance selection based on Euclidean distance, CS denotes instance selection based on cosine similarity, and CC denotes instance selection based on the correlation coefficient. The experimental results show that BCEL provides better prediction performance for cross-project defect prediction: it improves on the ED method by 35.9%, on the CS method by 20.5%, and on the CC method by 24%. In addition, a prototype tool that incorporates the BCEL ensemble cross-project defect prediction framework is designed and implemented.
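
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not state the selection rule, the Box-Cox parameter, or the ensemble rule, so the k-nearest selection per target instance, the caller-supplied `lam`, and the majority vote below are all assumptions, and every function name is hypothetical. Training the Logistic Regression base classifiers (the third stage) is omitted; the vote simply takes their 0/1 predictions as input.

```python
import math

def euclidean_similarity(a, b):
    # Negated Euclidean distance, so that a larger score always means "more similar".
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def correlation_coefficient(a, b):
    # Pearson correlation: cosine similarity of the mean-centered vectors.
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])

def select_instances(candidates, targets, similarity, k=1):
    """Assumed rule: keep each candidate that is among the k most similar
    to at least one target instance. Returns sorted candidate indices."""
    chosen = set()
    for t in targets:
        ranked = sorted(range(len(candidates)),
                        key=lambda i: similarity(candidates[i], t),
                        reverse=True)
        chosen.update(ranked[:k])
    return sorted(chosen)

def box_cox(value, lam):
    """Box-Cox transform of a strictly positive metric value.
    How lam is chosen is left to the caller (an assumption of this sketch)."""
    if value <= 0:
        raise ValueError("Box-Cox requires a strictly positive input")
    return math.log(value) if lam == 0 else (value ** lam - 1) / lam

def majority_vote(predictions):
    """Combine 0/1 predictions; `predictions` holds one list per base classifier."""
    return [1 if 2 * sum(votes) > len(votes) else 0 for votes in zip(*predictions)]

def f_measure(actual, predicted):
    # Harmonic mean of precision and recall for the defective class (label 1).
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Passing a different similarity function to `select_instances` can yield a different training subset from the same candidate set, which is what gives the base classifiers a chance to disagree and makes the ensemble step worthwhile.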

Keywords/Search Tags:

software defect prediction, cross-project defect prediction, instance selection, ensemble learning, empirical study

Related items

1	Research On Software Defect Prediction Method Based On Training Data Selection
2	Research On Cross-Project Software Defect Prediction
3	Research On Some Key Technologies Of Software Defect Prediction
4	Research On Software Defect Prediction Based On Ensemble Learning
5	Research On Cross Project Software Defect Prediction Based On Feature Transfer And Instance Transfer
6	Research On Automatic Software Defect Prediction Model Selection Based On Bi-Level Optimization
7	Search Based Semi-supervised Ensemble Learning Research For Cross-project Defect Prediction
8	Research On Cross-Project Software Defect Prediction Based On Multi-Source Transfer Learning
9	Software Defect Prediction Strategy Design For Imbalanced Data
10	Research On Software Defect Prediction Method Based On Fusion Feature Selection And Ensemble Learning