Software Defect Prediction Based On Feature Selection And Ensemble Learning

Posted on: 2018-07-15  Degree: Master  Type: Thesis
Country: China  Candidate: H Cheng  Full Text: PDF
GTID: 2348330515966762  Subject: Computer technology
Abstract/Summary:
As software systems grow in scale and scope of application, more and more product failures are caused by software defects. Finding sound methods to improve software quality has therefore become an important research direction in software engineering. Machine learning and data mining can be used to predict whether a software module contains defects, which helps to locate the defects and improve software quality. The key to software defect prediction is to extract features from software modules and then use those features to predict defects. In practice, however, the features do not contribute equally to prediction. Feature selection is therefore applied to reduce computational complexity and improve prediction accuracy. In addition, after comparing the performance of several classifiers, this thesis introduces ensemble learning into software defect prediction to further improve prediction accuracy. The main contributions are as follows.

First, a feature selection method for software defect prediction, called CHCP, is proposed based on hierarchical clustering and feature ranking. The method uses an adaptive threshold to control both the cluster size and the size of the target feature subset. Experiments show that CHCP performs well for predicting software defects. Compared with CFS, IG, GR, and two other feature selection methods, the proposed method improves accuracy and precision by at least 2.76% and 5%, respectively.

Second, ensemble learning is introduced to address the problem of low precision in software defect prediction. The influence of different weak classifiers on ensemble learning is discussed, and comparison experiments between single classifiers and ensemble classifiers are performed. The experimental results show that, in terms of AUC, the three ensemble learning methods (Bagging, AdaBoost, and RandomSubSpace) outperform the four single classifiers, which include NB, J48, and two other methods. AdaBoost performs best among them. These results indicate that ensemble classifiers outperform single classifiers and are better suited to software defect prediction.
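
The general idea behind the first contribution, grouping redundant features by hierarchical clustering and then keeping the best-ranked feature of each group, can be illustrated with a minimal sketch. This is not the CHCP algorithm itself, whose details the abstract does not give. The sketch assumes a fixed distance threshold (CHCP uses an adaptive one), average linkage, a distance of 1 - |Pearson correlation| between features, and |correlation with the defect label| as the ranking score. All function names and the sample metric names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cluster_features(columns, max_distance):
    """Average-linkage agglomerative clustering of feature columns.

    Assumption: distance between two features is 1 - |Pearson correlation|.
    Merging stops once the closest pair of clusters is farther apart
    than max_distance (a fixed threshold, unlike CHCP's adaptive one).
    """
    names = list(columns)
    dist = {(a, b): 1.0 - abs(pearson(columns[a], columns[b]))
            for a in names for b in names}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        d, i, j = min(pairs)
        if d > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def select_features(columns, labels, max_distance=0.3):
    """Keep, from each cluster, the feature most correlated with the label."""
    clusters = cluster_features(columns, max_distance)
    relevance = {name: abs(pearson(col, labels)) for name, col in columns.items()}
    return sorted(max(cluster, key=relevance.get) for cluster in clusters)


if __name__ == "__main__":
    # Hypothetical module metrics; 'statements' is nearly redundant with 'loc'.
    metrics = {
        "loc": [10, 20, 30, 40, 50, 60],
        "statements": [12, 19, 33, 41, 48, 63],
        "comment_ratio": [0.5, 0.1, 0.4, 0.2, 0.5, 0.1],
    }
    defective = [0, 0, 0, 1, 1, 1]
    print(cluster_features(metrics, 0.3))
    print(select_features(metrics, defective))
```

In this toy data the two size metrics fall into one cluster, so only one of them survives selection, while the weakly related comment metric stays in a cluster of its own. A lower threshold keeps more features and a higher one keeps fewer, which is the trade-off an adaptive threshold is meant to manage.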
Keywords/Search Tags: Software Defect Prediction, Feature Selection, Hierarchical Clustering, Classification, Ensemble Learning