
Research and Application of Software Defect Prediction Methods Based on Ensemble Learning

Posted on: 2021-02-13 | Degree: Master | Type: Thesis
Country: China | Candidate: X Y Mu | Full Text: PDF
GTID: 2428330611462808 | Subject: Computer technology
Abstract/Summary:
Computer software is now widely used in many fields, and as demand grows, its functions become increasingly complex. As systems grow in scale, their quality problems become more prominent, which makes software testing especially important. Software testing is used to detect defects in a system and so avoid serious accidents. In practice, however, constraints such as project schedules and labor costs mean that testing rarely covers the entire project, so the software may still contain hidden defects. Within the software life cycle, the later a defect is detected, the more it costs to repair, and if a problem appears after release, the cost of detecting and repairing it is higher still. This thesis therefore focuses on software defect prediction techniques, which identify in advance the program modules that may contain defects so that sufficient test resources can be allocated to them.

Software defect prediction is currently approached mainly through machine learning. Examining the characteristics of work in this direction, this thesis identifies the following main problems:

1. Software defect data generally contains many redundant or irrelevant features, and these excess features seriously degrade the performance of the defect prediction model.
2. Software defect data suffers from severe class imbalance: the proportions of positive and negative samples differ greatly, which weakens the model's generalization ability and undermines the value of building the model.
3. Many types of software defect prediction classifiers are available, and most of them are single classifiers. Once the accuracy of such an algorithm reaches a certain level, it hits a bottleneck and is hard to optimize further.
4. Current software engineering environments are complex and diverse, so a general-purpose software defect prediction system that works across development environments and programming languages is urgently needed.

In response to these problems, this thesis carries out research and engineering work in the following areas:

1. Feature selection. To address redundant or irrelevant features in defect prediction data sets, this thesis applies feature selection, mainly using the information gain ratio to choose the best features for the experiments. Experimental verification shows that this method effectively improves the accuracy of the results.
2. Class imbalance processing. To address class imbalance in defect prediction data sets, this thesis uses sampling techniques, namely oversampling and undersampling. First, oversampling is applied to bring the minority class closer to the majority class. Then undersampling removes majority-class samples so that positive and negative samples in the data set are balanced. Experimental verification shows that sampling effectively balances the experimental data and improves the model's generalization ability.
3. Ensemble learning. To address the limited prediction accuracy of single classifiers, this thesis adopts ensemble learning. The ensemble algorithms Stacking, Bagging, and AdaBoost are introduced into software defect prediction for the first time, with logistic regression, decision tree, naive Bayes, and neural network models as base classifiers. Across the combinations tested, ensembles of base classifiers outperform a single classifier. The best-performing combination, AdaBoost with a decision tree as the base classifier, is used to build the defect prediction model.
4. Engineering application. Based on the preceding research results and the design presented in this thesis, a software defect prediction system is developed. The system is applied to the testing of two software projects that use different programming languages and development environments, and its predictions are compared with the defects actually found. The verification shows that the system effectively predicts defects in different projects, which gives it substantial engineering value for real-world software testing.
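The feature selection step can be illustrated with a small sketch of information gain ratio ranking. This is not the thesis's implementation: the function names, the feature names, and the toy data are hypothetical, and the sketch assumes every feature has already been discretized (continuous software metrics would need binning first).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of labels given the feature, divided by split information."""
    total = len(labels)
    # Group the labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)  # entropy of the feature's own distribution
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return (name, gain ratio) pairs sorted from most to least informative."""
    scores = {name: gain_ratio(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one row per module, 1 = defective.
columns = {
    "high_complexity": [1, 1, 1, 0, 0, 0, 0, 0],
    "has_comments":    [1, 0, 1, 0, 1, 0, 1, 0],
}
defective = [1, 1, 1, 0, 0, 0, 0, 0]

ranking = rank_features(columns, defective)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Features at the bottom of such a ranking are the candidates for removal. Where to set the cutoff is a tuning decision that this sketch leaves open.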
Keywords/Search Tags: Defect prediction, Feature selection, Class imbalance processing, Ensemble learning, AdaBoost algorithm
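To make the AdaBoost idea from the abstract concrete, the following is a minimal sketch that boosts one-level decision trees (decision stumps). It is an illustration under stated assumptions, not the thesis's model: the thesis does not specify tree depth or other settings here, and the single metric, the labels (+1 = defective, -1 = clean), and all function names are hypothetical.

```python
import math


def train_stump(X, y, w):
    """Find the one-feature threshold rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    """Apply a stump: `sign` above the threshold, `-sign` otherwise."""
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign


def adaboost_fit(X, y, rounds):
    """Train a list of (alpha, stump) pairs; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, row):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


# Hypothetical single-metric data that no single threshold can separate.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = adaboost_fit(X, y, rounds=1)
boosted = adaboost_fit(X, y, rounds=3)
single_acc = sum(adaboost_predict(single, r) == t for r, t in zip(X, y)) / len(y)
boosted_acc = sum(adaboost_predict(boosted, r) == t for r, t in zip(X, y)) / len(y)
print(single_acc, boosted_acc)
```

On this toy data a single stump cannot fit every sample, while three boosted stumps do, which mirrors the abstract's observation that an ensemble can get past the accuracy ceiling of a single classifier. Accuracy here is measured on the training data only, so it says nothing about generalization.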