Research On Software Defect Prediction Based On Feature Selection And Instance Transfer

Posted on: 2019-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2428330566480000
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the software industry, the scale of software keeps expanding, which inevitably leads to more software defects. These defects can cause heavy losses in production and daily life, so people have begun to recognize the importance of software quality. If the defects hidden in software can be found before release, testing resources can be allocated reasonably and effectively, and repair effort can be concentrated where it is needed. Software defect prediction has therefore attracted wide attention. Software defect prediction builds a prediction model by mining historical data, such as the software development process and the software code, and uses that model to predict defects in new project modules.

Most current research focuses on within-project defect prediction. In actual development, however, it is often necessary to predict defects for a completely new project, or for a project with very little labeled data. Cross-project defect prediction arose to meet this need: it uses the abundantly labeled data of other projects (the source projects) to build a model that predicts defects in the current project (the target project). This thesis addresses three problems that most models face in practical application: (1) the defect data contain many redundant or irrelevant features; (2) the defect data are class-imbalanced; (3) the data distributions of the source project and the target project differ greatly. Based on these problems, this thesis presents two software defect prediction methods.

(1) A software defect prediction method based on feature selection. To deal with the redundant or irrelevant features in defect data, this thesis proposes a software defect prediction method based on feature selection. Starting from the source project data set, the method removes irrelevant and redundant features through correlation analysis and redundancy analysis, then selects the corresponding features from the target project, which addresses the high-dimensionality problem in cross-project defect prediction. The effectiveness of the method is verified by experiments.

(2) A software defect prediction method based on instance transfer. To deal with the class imbalance in software defect data and the difference in data distribution between the source project and the target project, a defect prediction method based on instance transfer is proposed. First, the method performs class-imbalance learning: it randomly samples the majority-class (i.e., non-defective) instances in the source project data to obtain multiple class-balanced source project training sets. Then each of these source training sets is combined with the target project training set and TrAdaBoost is applied, yielding several sub-classifiers. Finally, the sub-classifiers are integrated to obtain the final classifier. Experiments on the AEEEM and ReLink datasets show that this method can achieve better performance.

(3) Finally, the two defect prediction methods proposed in this thesis are compared comprehensively. In the comparison of the AUC values of the prediction results, both methods achieve better performance.
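The first step of the instance-transfer method, building several class-balanced source training sets by randomly sampling the non-defective class, can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the function name `balanced_training_sets`, the label encoding (1 for defective, 0 for non-defective), the sampling without replacement, and the choice to sample exactly as many non-defective instances as there are defective ones are all assumptions. The later TrAdaBoost and integration steps are not shown.

```python
import random


def balanced_training_sets(instances, labels, n_sets, seed=0):
    """Return n_sets class-balanced lists of (instance, label) pairs.

    Assumption: label 1 marks a defective module, label 0 a non-defective
    one, and non-defective modules form the majority class.
    """
    rng = random.Random(seed)
    defective = [x for x, y in zip(instances, labels) if y == 1]
    clean = [x for x, y in zip(instances, labels) if y == 0]
    if len(defective) > len(clean):
        raise ValueError("expected non-defective modules to be the majority")

    sets = []
    for _ in range(n_sets):
        # Draw as many non-defective instances as there are defective ones,
        # without replacement inside a single training set.
        sampled_clean = rng.sample(clean, len(defective))
        data = [(x, 1) for x in defective] + [(x, 0) for x in sampled_clean]
        rng.shuffle(data)
        sets.append(data)
    return sets


if __name__ == "__main__":
    # Toy source project: 3 defective and 9 non-defective modules,
    # each described by a single hypothetical metric value.
    features = [[v] for v in range(12)]
    labels = [1, 1, 1] + [0] * 9
    for training_set in balanced_training_sets(features, labels, n_sets=4):
        print(training_set)
```

Each returned set keeps all defective instances and a different random subset of the non-defective ones, so that in the described method each set could then be combined with the target project training set to train one sub-classifier.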
Keywords/Search Tags: Defect Prediction, Feature Selection, Instance-Based Transfer, Class Imbalance Learning