A Study On Software Defects Prediction Based On Machine Learning

Posted on:2019-12-20

Degree:Master

Type:Thesis

Country:China

Candidate:Y Q Bao

Full Text:PDF

GTID:2428330596464635

Subject:Information and Communication Engineering

Abstract/Summary:

With the development of smart tools, software is updated in ever more frequent iterations. Finding software defects and providing solutions in advance can substantially reduce labor and time costs. A software defect prediction model based on machine learning can find defects quickly and help testers allocate resources rationally and test defect-prone modules first, which makes it an efficient way to reduce losses and ensure software quality.

However, software defect prediction based on machine learning usually faces two problems: imbalanced data and cost sensitivity. In practice, software defects exist in only a small number of software modules, and the number of defective modules is much smaller than the number of normal modules. Software defect prediction is therefore an imbalanced-data problem, and imbalanced data often degrades the classification accuracy of traditional machine learning methods. The second problem is cost sensitivity. Traditional classification algorithms assume that the different types of errors made by the classifier incur the same cost. In practical applications, however, different types of errors incur different costs. For instance, the cost of identifying a defective software module as a flawless one is far greater than the cost of identifying a flawless software module as a defective one. The latter wastes only the labor, material resources, and time spent testing non-defective modules that were misclassified as defective, but the former can directly lead to software errors or even complete software failure.

This thesis makes the following contributions, based on the characteristics of software defect datasets.

1. We use IMMFIA (which has low time and space complexity) to obtain frequent itemsets, generate association rules that satisfy the confidence and support thresholds, and prioritize the small class (defective software modules) based on relevance and new rules to obtain a classifier. For unmatched cases (test cases that no rule in the classifier satisfies) and over-matched cases (test cases that multiple rules satisfy), we use EDSVM to classify. Experiments show that FREDAVM achieves high precision compared with current software defect prediction methods.

2. To account for cost sensitivity, we construct a new loss function and propose CostXGBoost, a software defect prediction model based on XGBoost. Compared with related software defect prediction models on the NASA datasets, CostXGBoost achieves higher precision and recall than traditional software defect prediction models.
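
The abstract does not state the loss function used by CostXGBoost, so the following is only a minimal sketch of one common cost-sensitive choice: a logistic loss in which missed defects are weighted more heavily than false alarms. The names `cost_fn` and `cost_fp` and the 5:1 cost ratio are assumptions made for illustration, not values from the thesis. The functions return the per-sample gradient and Hessian with respect to the raw score, which is the information a gradient-boosting custom objective typically needs.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability of the module being defective."""
    return 1.0 / (1.0 + math.exp(-z))


def cost_weighted_logloss(z, y, cost_fn=5.0, cost_fp=1.0):
    """Cost-weighted logistic loss for one module.

    z: raw (pre-sigmoid) score; y: 1 for defective, 0 for non-defective.
    cost_fn / cost_fp are hypothetical misclassification costs for a
    missed defect and a false alarm, respectively.
    """
    p = sigmoid(z)
    return -(cost_fn * y * math.log(p) + cost_fp * (1 - y) * math.log(1.0 - p))


def grad_hess(z, y, cost_fn=5.0, cost_fp=1.0):
    """First and second derivative of the loss above with respect to z."""
    p = sigmoid(z)
    grad = cost_fp * (1 - y) * p - cost_fn * y * (1.0 - p)
    hess = (cost_fn * y + cost_fp * (1 - y)) * p * (1.0 - p)
    return grad, hess


if __name__ == "__main__":
    # An undecided score (z = 0) on a defective vs. a clean module.
    print(grad_hess(0.0, 1))
    print(grad_hess(0.0, 0))
```

With these assumed costs, an undecided prediction on a defective module is pushed toward the "defective" label five times harder than the same prediction on a clean module is pushed away from it, which is how the asymmetry between the two error types enters training.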

Keywords/Search Tags:

Software defect prediction, rule matching, CostXGBoost, imbalanced data, cost sensitive

Related items

1	Research On Software Defect Prediction Method Based On Cost Sensitive Learning Adacost
2	Research On Software Defect Prediction Algorithm Based On Cost-sensitive Learning
3	Research On Software Defect Prediction Method Based On Cost Sensitive Learning
4	A Cost-sensitive Hybrid Software Defect Prediction Model
5	Research On Fusion Cost Sensitive Sampling And Integration Algorithms In Software Defect Prediction
6	Cost-Sensitive Feature Selection Algorithms With Application In Software Defect Prediction
7	Research Of The Software Defect Prediction Method For Imbalanced Data
8	Software Defect Prediction Strategy Design For Imbalanced Data
9	Research On Software Defect Prediction Methods
10	Research On Methods Of Ranking-Oriented Software Defect Prediction