Cross-project Defect Prediction Based On Transfer Learning

Posted on: 2016-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: B W Li
GTID: 2308330476953473
Subject: Software engineering
Abstract/Summary:
Defect prediction approaches are widely used in the software development process to predict and locate software defects. Early in the life cycle, a project may lack the data needed to build a reliable predictor. One line of solutions trains the predictor on data from other projects, which is called cross-project defect prediction. Cross-project defect prediction not only eases the shortage of data early in a project, but also avoids the delay and cost of extracting features, while picking up feature information from similar projects that would otherwise be ignored. Unfortunately, the performance of cross-project defect prediction is generally poor, largely because of the differences between the source and target projects. Focusing on this project-variation issue, this paper proposed a cross-project defect prediction framework and two high-accuracy approaches based on transfer learning.

This paper first applied a universal cross-project defect prediction framework. The modeling approach of the framework allowed predictors to use a small amount of labeled target data, together with data from other projects, to construct a high-quality classification model for the target project.

Next, two cross-project defect prediction modeling approaches were proposed, one using feature-based transfer and one using instance-based transfer. The first approach was cross-project defect prediction modeling with feature-based transfer and schema evaluation (TrSchemaEval). TrSchemaEval transferred highly related data based on labeled target data samples and chose learning schemas for the transferred data sets through performance evaluation. Models were then built according to the evaluated schema to predict defects in the target project.

The second approach, instance-based cross-project defect prediction modeling, was proposed to address the single-source, single-target limitation of TrAdaBoost. The approach contained two improved prediction algorithms: MergeTrAdaBoost and MultiTrAdaBoost. MergeTrAdaBoost merged and filtered multiple sources before constructing a model. MultiTrAdaBoost weighted each weak classifier in addition to each instance, and output a combined classification model.

To demonstrate the effectiveness of the proposed approaches, this paper compared them with existing defect prediction algorithms. TrSchemaEval was compared with the intra-project and inter-project approaches. The results showed that TrSchemaEval achieved a comparable F-measure and a better AUC than the intra-project and inter-project approaches. The experiment comparing MergeTrAdaBoost and MultiTrAdaBoost with TrAdaBoost showed that both algorithms handled the multi-source, single-target scenario and could achieve better accuracy than TrAdaBoost. Finally, this paper compared the two approaches with the Peters filter (the state-of-the-art cross-project defect prediction approach). The results showed that both approaches performed better than the Peters filter.
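
The instance-based algorithms above build on TrAdaBoost. As a rough illustration of that baseline only, the sketch below shows one round of the standard TrAdaBoost instance reweighting: misclassified source-project instances are down-weighted, while misclassified target-project instances are up-weighted. It is not the thesis's MergeTrAdaBoost or MultiTrAdaBoost. The function name `tradaboost_reweight` and its arguments are hypothetical, and the weight normalization and weak-learner training that surround this step are omitted.

```python
import math


def tradaboost_reweight(w_src, w_tgt, miss_src, miss_tgt, n_rounds):
    """One TrAdaBoost-style weight update (illustrative sketch).

    w_src, w_tgt: current weights of source and labeled target instances.
    miss_src, miss_tgt: 0/1 flags, 1 if the current weak classifier
        misclassified that instance.
    n_rounds: total number of boosting rounds planned.
    """
    n = len(w_src)
    # The weighted error is measured on the labeled target instances only.
    eps = sum(w * m for w, m in zip(w_tgt, miss_tgt)) / sum(w_tgt)
    if not 0.0 < eps < 0.5:
        raise ValueError("target error must lie in (0, 0.5)")
    # Fixed factor (<= 1) that shrinks misclassified source instances.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / n_rounds))
    # Round-specific factor; its inverse grows misclassified target instances.
    beta_tgt = eps / (1.0 - eps)
    new_src = [w * beta_src ** m for w, m in zip(w_src, miss_src)]
    new_tgt = [w * beta_tgt ** (-m) for w, m in zip(w_tgt, miss_tgt)]
    return new_src, new_tgt, eps
```

Over many rounds, source instances that keep disagreeing with the target data lose influence, which is the mechanism that the multi-source variants described in the abstract extend.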
Keywords/Search Tags: Cross-Project Defect Prediction, Transfer Learning, Feature-Based Transfer, Instance-Based Transfer