Software Defect Prediction Based On Feature Extraction And Cost-sensitive Learning

Posted on: 2016-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Lu
Full Text: PDF
GTID: 2308330473465464
Subject: Machine learning and pattern recognition
Abstract/Summary:
Software defects are a major problem in software development. Software defect prediction based on machine learning learns from historical data on existing software modules and predicts defects in new modules. This thesis focuses on redundant features and the class imbalance problem, and proposes three methods:
(1) To handle redundant features in software modules, we propose Supervised Kernel Laplacian Eigenmaps (SKLE), which applies kernel techniques and supervised learning to Laplacian Eigenmaps (LE). We present a theoretical analysis and an experimental comparison with existing feature extraction methods on the NASA software database. The results show that the low-dimensional features extracted by SKLE effectively eliminate redundant information between software modules and achieve a higher F-measure.
(2) To address class imbalance and unequal misclassification costs, we propose the Cost-Sensitive Back Propagation Neural Network (CSBPNN). We add cost-sensitive information to the error function and adjust the weights and biases of the BPNN accordingly. Experiments show that this method improves recall and F-measure in software defect prediction.
(3) Because a single classifier has limited classification capacity, we propose Cost-Sensitive Ensemble Learning (Cost-Adaboost) to further improve performance. Compared with several recent software defect prediction methods, Cost-Adaboost achieves the best F-measure and also reduces the misclassification cost.
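The abstract does not give the exact error function used in CSBPNN, so the following is only a minimal sketch of the general idea behind method (2): a squared-error loss in which mistakes on defective modules are weighted more heavily than mistakes on clean ones. The function names, the cost values (`cost_fn`, `cost_fp`), and the 0.5 decision threshold are assumptions chosen for illustration, not the thesis's actual formulation.

```python
import numpy as np


def cost_sensitive_error(y_true, y_pred, cost_fn=5.0, cost_fp=1.0):
    """Weighted squared error (illustrative assumption, not the thesis's formula).

    Errors on defective modules (label 1) are scaled by cost_fn,
    errors on clean modules (label 0) by cost_fp.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = np.where(y_true == 1, cost_fn, cost_fp)
    return 0.5 * np.sum(weights * (y_true - y_pred) ** 2)


def cost_sensitive_gradient(y_true, y_pred, cost_fn=5.0, cost_fp=1.0):
    """Gradient of the error above with respect to the network outputs.

    In back propagation this term would replace the usual (y_pred - y_true)
    output-layer error signal.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = np.where(y_true == 1, cost_fn, cost_fp)
    return weights * (y_pred - y_true)


def f_measure(y_true, y_label):
    """F-measure (harmonic mean of precision and recall) for the defective class."""
    y_true = np.asarray(y_true)
    y_label = np.asarray(y_label)
    tp = np.sum((y_true == 1) & (y_label == 1))
    fp = np.sum((y_true == 0) & (y_label == 1))
    fn = np.sum((y_true == 1) & (y_label == 0))
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data: two defective and two clean modules (made-up values).
y_true = [1, 0, 1, 0]
y_pred = [0.2, 0.1, 0.9, 0.6]
y_label = (np.asarray(y_pred) >= 0.5).astype(int)  # assumed 0.5 threshold

print(cost_sensitive_error(y_true, y_pred))
print(cost_sensitive_gradient(y_true, y_pred))
print(f_measure(y_true, y_label))
```

With these weights, the missed defective module (prediction 0.2 for label 1) dominates the error, so gradient descent pushes hardest on exactly the mistakes that hurt recall. Setting `cost_fn` equal to `cost_fp` recovers the ordinary squared error.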
Keywords/Search Tags: Software Defect Prediction, Feature Extraction, Cost-Sensitive, Ensemble Learning