Ensemble Learning For Software Defect Detection

Posted on:2016-08-11

Degree:Master

Type:Thesis

Country:China

Candidate:W C Huang

Full Text:PDF

GTID:2308330473465515

Subject:Information security

Abstract/Summary:

In recent years, software defect prediction has become one of the most important research topics in software engineering. It can generally be divided into two types: dynamic and static defect prediction. Static prediction uses a software project's historical data to predict defects; it is widely applicable and accurate, and has been extensively researched and used. The key to static defect prediction is how to analyze the historical data in order to build an accurate classification model that distinguishes defective software from defect-free software.

In software defect detection, samples of defective software are far fewer than samples of defect-free software, which leads to a serious class-imbalance problem. Solving this problem is a critical issue for software defect detection. It is generally handled with re-sampling methods or cost-sensitive methods. Here, we use a re-sampling-based Bagging method: each time a weak classifier is trained, we re-sample a class-balanced subset of the samples. By fusing the weak classifiers, we obtain a strong classifier that improves the generalization ability and the classification accuracy of the model.

In ensemble learning, classification performance can be improved further in three ways: by improving the individual classifiers, by increasing the randomness of the weak classifiers, or by finding a suitable fusion method. The down-sampling described above improves the individual classifiers. On this basis, we increase the randomness and optimize the fusion method to further improve the model's results.

In ensemble learning, the more independent the weak classifiers are, the better the final result. Compared with methods based on random samples, methods based on random features yield more independent weak classifiers and a more stable and accurate strong classifier. In this thesis, we propose a novel approach that employs a feature-structure-based random subspace method for software defect prediction, which further improves the final result.

The above method produces a series of weak classifiers. Because of the special nature of binary classification, overall accuracy cannot accurately describe the performance of a binary classification model. We therefore propose a weighted fusion method based on the comprehensive evaluation index F-measure to further optimize the model's results.

For all of the proposed methods, we conducted experiments on the NASA software defect datasets and compared our method with some of the most popular methods of recent years. The experiments show that our method achieves the best results on the ten NASA software defect datasets.
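The three ingredients described in the abstract (class-balanced down-sampling for each weak classifier, a random feature subspace per classifier, and F-measure-weighted fusion) can be illustrated with a minimal sketch. This is not the thesis's implementation. Several choices are assumptions made only for illustration: the weak classifier is a one-feature decision stump, the feature subspace is drawn uniformly at random (the thesis's feature-structure-based construction is not specified in the abstract), every minority-class sample is kept in each balanced subset, and each classifier's weight is its F-measure on the full training set. All function names are hypothetical.

```python
import random

def f_measure(y_true, y_pred):
    """F1 score for the positive (defective = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def stump_predict(stump, X):
    """Predict 1 when polarity * (x[feature] - threshold) >= 0."""
    j, thr, pol = stump
    return [1 if pol * (row[j] - thr) >= 0 else 0 for row in X]

def fit_stump(X, y, features):
    """Assumed weak learner: the stump with the best F-measure on (X, y)."""
    best_score, best_stump = -1.0, (features[0], 0.0, 1)
    for j in features:
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                score = f_measure(y, stump_predict((j, thr, pol), X))
                if score > best_score:
                    best_score, best_stump = score, (j, thr, pol)
    return best_stump

def train_ensemble(X, y, n_estimators=15, n_features=2, seed=0):
    """Bagging with balanced down-sampling and random feature subspaces."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    ensemble = []
    for _ in range(n_estimators):
        # Balanced subset: all minority samples plus an equal-sized
        # random draw (without replacement) from the majority class.
        idx = minority + rng.sample(majority, len(minority))
        # Random feature subspace (uniform sampling is an assumption).
        feats = rng.sample(range(len(X[0])), n_features)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx], feats)
        # Fusion weight: F-measure on the full training set (an assumption).
        weight = f_measure(y, stump_predict(stump, X))
        ensemble.append((stump, weight))
    return ensemble

def predict(ensemble, X):
    """Weighted vote: defective if at least half of the total weight says so."""
    total = sum(w for _, w in ensemble)
    votes = [0.0] * len(X)
    for stump, w in ensemble:
        for i, p in enumerate(stump_predict(stump, X)):
            votes[i] += w * p
    return [1 if total > 0 and v >= 0.5 * total else 0 for v in votes]
```

To use the sketch, pass a list of numeric feature rows and a list of 0/1 labels to `train_ensemble`, then call `predict` on new rows. Weighting by F-measure rather than accuracy means a weak classifier that labels everything defect-free gets a weight of zero, even though its accuracy on imbalanced data would look high.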

Keywords/Search Tags:

Software defect detection, Improved Bagging, Feature Construction, Random Feature Subspace, Classifier Fusion

Related items

1	Research On Locality Preserving Subspace Methods For Facial Feature Extraction And Recognition
2	Research And Application Of Feature Selection For Software Defect Data
3	The Design Of Classifier On Gastric Mucosa Tumor Microscopic Image
4	The Based On Improved Image Saliency Characteristic Detection Billet Surface Defect Detecting Technology
5	Research On Surface Defect Recognition Of Steel Strips Based On AdaBoost Classifier
6	The Realization Of High-precision Image Classifier Based On Feature Subspace
7	Solder Ball Defect Detection Of BGA Chip Based On Machine Vision
8	Object Tracking Algorithm Based On Multi-Feature Fusion And Selection
9	Pedestrians Detection Using Feature Fusion In Static Image
10	An Improved SSD Object Detection Algorithm Based On Multi-layer Feature Fusion