Research Of The Software Defect Prediction Method For Imbalanced Data

Posted on: 2015-11-21
Degree: Master
Type: Thesis
Country: China
Candidate: L Yang
Full Text: PDF
GTID: 2308330503475089
Subject: Computer Science and Technology
Abstract/Summary:
As software grows in complexity and functionality, identifying risky modules consumes considerable manpower and resources. Identifying defective modules in time is therefore critical to ensuring the quality of high-reliability software. Many researchers have studied ways to improve software reliability and have reported substantial results. Previous studies, however, often ignore the problem of imbalanced data in software. Moreover, traditional methods aim only to improve the accuracy of software defect prediction and pay little attention to its efficiency, so the prediction results are insufficient and their practicality is limited.

To address these problems, this thesis takes software defect prediction for imbalanced data as its research object, proposes a software defect prediction model based on RSFSBoost, and discusses the basic theory and methods of software defect prediction for imbalanced data. We focus on how to process the data effectively during software defect prediction, how to select features in imbalanced data sets, and how to ensure both the efficiency and the accuracy of the prediction. The goal is to find high-risk modules in high-reliability software in time, so that limited manpower and resources can be concentrated on repairing them.

First, we propose a data processing method based on a mixed sampling technique (RSmote), which effectively resolves the imbalanced data problem when the total number of samples is limited. Second, we propose a data processing method based on mixed sampling and feature selection, which addresses the coexistence of imbalanced data and redundant features. Finally, we construct a software defect prediction model based on RSFSBoost, which takes full account of the imbalanced data problem in software and ensures both the efficiency and the accuracy of defect prediction. In this way the work supports the goal of improving software quality and reliability.

The proposed methods and model are verified by experiments. The results show that the RSFSBoost-based software defect prediction method ensures both the efficiency and the accuracy of prediction, and can provide technical support that helps software developers improve software quality and reliability.
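
The abstract does not spell out how RSmote works. As a rough illustration of what a mixed sampling step can look like, the sketch below combines random undersampling of the majority class with SMOTE-style interpolation for the minority class. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names `smote_like_oversample` and `mixed_sample`, the parameters `target_per_class` and `k`, and the label convention (1 for defective modules as the minority class, 0 for non-defective modules).

```python
import random


def smote_like_oversample(minority, n_new, k=3, rng=None):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = rng or random.Random(0)
    if n_new > 0 and len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def mixed_sample(X, y, target_per_class, k=3, seed=0):
    """Hypothetical mixed sampling: undersample class 0 and oversample
    class 1 until each has (at most) target_per_class samples."""
    rng = random.Random(seed)
    majority = [x for x, label in zip(X, y) if label == 0]
    minority = [x for x, label in zip(X, y) if label == 1]

    # Random undersampling of the majority (non-defective) class.
    kept_majority = rng.sample(majority, min(target_per_class, len(majority)))

    # Synthetic oversampling of the minority (defective) class.
    n_new = max(0, target_per_class - len(minority))
    new_minority = minority + smote_like_oversample(minority, n_new, k, rng)

    X_out = kept_majority + new_minority
    y_out = [0] * len(kept_majority) + [1] * len(new_minority)
    return X_out, y_out
```

Undersampling and oversampling toward a shared target keeps the resampled set small, which is one plausible reading of why a mixed scheme suits data sets with few samples overall. The actual RSmote procedure may differ.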
Keywords/Search Tags: software defect prediction, imbalanced data, feature selection, RSFSBoost