Software Defect Prediction Strategy Design For Imbalanced Data

Posted on:2019-06-21

Degree:Master

Type:Thesis

Country:China

Candidate:Y Niu

GTID:2428330566976387

Subject:Software engineering

Abstract/Summary:

Software defect prediction is an active research topic in software testing. A good prediction strategy saves testing resources and cost and improves software quality. This thesis addresses three problems in software defect prediction.

First, two issues are crucial for software defect prediction: the class imbalance of the datasets and the parameter settings of the support vector machine (SVM). Existing studies mostly address only one of the two, which can significantly reduce prediction accuracy. This thesis proposes a hybrid multi-objective cuckoo search undersampling method based on SVM (HMOCS-US-SVM) that handles both issues simultaneously, using the probability of false alarm and the probability of detection as the objectives. Three undersampling strategies for class imbalance are designed: (1) samples are selected uniformly from all non-defective modules; (2) the K-means clustering algorithm divides the non-defective modules into several clusters, and samples are then selected uniformly from all clusters; (3) the K-means clustering algorithm divides the non-defective modules into several clusters, and samples are then selected from the single cluster containing the most modules. The approach is evaluated on eight benchmark datasets and compared with eight other prediction models. The results show that strategy (3) achieves the best performance.

Second, inspired by the oversampling idea of SMOTE (a well-known oversampling algorithm), the thesis proposes a hybrid multi-objective cuckoo search oversampling method based on SVM (HMOCS-SMOTE-SVM). This method optimizes the neighbor parameter of SMOTE and the parameters of the SVM simultaneously. Experiments show that the proposed model effectively improves the performance of SMOTE.

Finally, to tackle the class imbalance of the datasets in cross-project software defect prediction, a three-stage data selection prediction model is designed. In the project selection phase, a hybrid similarity measure is proposed to select the most similar project. In the instance selection phase, the Burak filter is employed. In the class imbalance phase, the proposed software defect prediction models (undersampling and oversampling) are employed. The experimental results show that the proposed models outperform seven other prediction algorithms.
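The three undersampling strategies and the two objectives named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the cuckoo search and the SVM tuning are omitted, the small `kmeans` helper, all function names, and the equal per-cluster quota in strategy (2) are assumptions made for illustration, and the two objectives are computed with their usual confusion-matrix definitions.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Tiny Lloyd's k-means (illustrative helper, not from the thesis).
    Returns k clusters, each a list of indices into `points`."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires k <= len(points)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for i, p in enumerate(points):
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[nearest].append(i)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster went empty
                dims = range(len(points[0]))
                centers[c] = tuple(
                    sum(points[i][d] for i in members) / len(members)
                    for d in dims
                )
    return clusters


def undersample_uniform(non_defective, n, seed=0):
    """Strategy (1): draw n indices uniformly from all non-defective modules."""
    return random.Random(seed).sample(range(len(non_defective)), n)


def undersample_all_clusters(non_defective, n, k, seed=0):
    """Strategy (2): cluster, then draw from every cluster.
    Assumption: each non-empty cluster gets an equal quota."""
    rng = random.Random(seed)
    clusters = [c for c in kmeans(non_defective, k, seed=seed) if c]
    chosen = []
    for members in clusters:
        quota = min(len(members), n // len(clusters))
        chosen.extend(rng.sample(members, quota))
    return chosen


def undersample_largest_cluster(non_defective, n, k, seed=0):
    """Strategy (3): cluster, then draw only from the largest cluster."""
    rng = random.Random(seed)
    largest = max(kmeans(non_defective, k, seed=seed), key=len)
    return rng.sample(largest, min(n, len(largest)))


def pd_pf(y_true, y_pred):
    """Probability of detection (recall on defective modules) and
    probability of false alarm (false positive rate); label 1 = defective."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pd = tp / (tp + fn) if tp + fn else 0.0
    pf = fp / (fp + tn) if fp + tn else 0.0
    return pd, pf
```

Each sampling function returns indices into the list of non-defective modules. In a full pipeline the selected modules would be combined with the defective modules to form the training set, and a multi-objective search would then seek a high probability of detection together with a low probability of false alarm.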

Keywords/Search Tags:

Software defect prediction, Class imbalance, SVM, Cross-project software defect prediction

Related items

1	Research On Data Preprocessing Technology In Cross Project Software Defect Prediction
2	Research And Implementation Of Software Defect Prediction Model Construction And Sharing Methods
3	Research On Software Defect Prediction Based On Extreme Learning Machine
4	Correlation Analysis Based Cross-project Software Defect Prediction
5	Research On Cross-project Software Defect Prediction Method Based On Active Learning
6	Research On Software Defect Prediction Based On Improved Balanced Distribution Adaption Algorithm
7	Research On Software Defect Prediction Method Based On Training Data Selection
8	Study On Cross-Project Defect Prediction Based On Transfer Learning
9	Design And Tool Implementation Of Cross-project Software Defect Prediction Method Via Active Transfer Learning
10	Wide Research Of Data Mining With Machine Learning On Software Defect Prediction