Research On Software Defect Prediction Method Based On Training Data Selection

Posted on:2018-03-02

Degree:Master

Type:Thesis

Country:China

Candidate:S Q Peng

GTID:2348330512497931

Subject:Systems analysis and integration

Abstract/Summary:

The unprecedented development of the Internet has completely changed our way of life. Software plays an increasingly prominent role and has penetrated every aspect of daily life, so expectations for software quality keep rising. Software maintenance accounts for about 70% of total software development cost, and predicting and repairing defects is one of its main tasks. Software defect prediction helps identify the modules most likely to contain problems, so that testing resources can be allocated sensibly, the development process improved, and development quality raised. It has long been a topic of active interest in software engineering.

The traditional approach to software defect prediction builds a prediction model from the historical data of the project itself and then applies it to subsequent versions. A high-quality prediction model requires sufficient historical data, which is hard to obtain for new projects or for software projects that are not yet active. In recent years, more and more data has become available on the Internet, and some researchers have used data from other, similar software projects to train cross-project defect prediction models, easing the historical-data bottleneck of traditional defect prediction.

However, existing work on cross-project training data selection relies mostly on the similarity of source code metrics and ignores defect attribute information, such as the number of defects. In practice, when several training instances have the same similarity value with respect to a target instance, the selection process must decide which one or which few of them to prefer. From an empirical software engineering point of view, training instances with more defects should be preferred, because they carry more information. This thesis therefore proposes a new cross-project software defect prediction method based on training data selection that incorporates defect count information:

(1) On top of the commonly used source code metrics, defect information is introduced into the calculation of similarity between instances, and five typical normalization methods are applied to the defect information.

(2) Three commonly used similarity measures are combined with the different normalization methods from (1), and the quality of the resulting training instances is examined.

(3) Based on six typical single classifiers (LR, J48, NB, SVM, KNN and RF), an ensemble defect prediction model is built to exploit the strengths of each individual classifier. The F-measure is used to evaluate the predictive performance of each classifier, and voting-based and weighted ensembles are proposed to predict whether a target instance is defective.

To verify the soundness of this approach, experiments were carried out, and the results show that:

(1) Introducing defect count information helps improve the quality of cross-project defect data.

(2) The choice of similarity measure and normalization method affects the prediction results. Performance is better when Manhattan distance is used to measure the similarity of source code metrics between instances, or when linear normalization is applied to the defect counts.

(3) The weighted ensemble of the proposed model further improves prediction performance.
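
The selection idea described above (prefer the most similar training instances, and among equally similar ones prefer those with more defects) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's actual procedure: the abstract does not give the exact similarity formula, so the sketch assumes Manhattan distance over metric vectors that are already on comparable scales, min-max (linear) normalization of defect counts, and a simple tie-break. The names `select_training`, `min_max` and `manhattan` are hypothetical.

```python
from typing import List, Sequence, Tuple

# A candidate training instance: (source code metric vector, defect count).
Instance = Tuple[Sequence[float], int]


def min_max(values: Sequence[float]) -> List[float]:
    """Linearly rescale values to [0, 1]; all zeros if they are constant."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two metric vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_training(target: Sequence[float],
                    candidates: List[Instance],
                    k: int) -> List[Instance]:
    """Pick the k candidates closest to the target instance.

    Assumption: ties in distance (after rounding to 6 decimals) are broken
    in favour of the candidate with the larger normalized defect count.
    """
    distances = [round(manhattan(target, metrics), 6)
                 for metrics, _ in candidates]
    defects = min_max([float(count) for _, count in candidates])
    order = sorted(range(len(candidates)),
                   key=lambda i: (distances[i], -defects[i]))
    return [candidates[i] for i in order[:k]]


if __name__ == "__main__":
    pool = [([1.0, 2.0], 0), ([2.0, 1.0], 3), ([5.0, 5.0], 1)]
    # The first two candidates are equally distant from the target,
    # so the one with 3 defects is selected first.
    print(select_training([1.5, 1.5], pool, k=1))
```

Keeping the defect count out of the distance itself and using it only as a tie-breaker is one possible design; folding the normalized count into the similarity score, as item (1) suggests, would be another.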
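
The two ensemble strategies in item (3), voting and weighting, can be illustrated with a short sketch. The abstract does not state how the weights are derived, so this example assumes each classifier's vote is weighted by its F-measure on some validation data; that weighting scheme and the function names `f_measure`, `majority_vote` and `weighted_vote` are assumptions for illustration only. Labels are 1 for defective and 0 for non-defective.

```python
from typing import Sequence


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure (harmonic mean of precision and recall) for the class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_vote(votes: Sequence[int]) -> int:
    """Predict 1 only if strictly more than half of the classifiers vote 1."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def weighted_vote(votes: Sequence[int], weights: Sequence[float]) -> int:
    """Predict 1 if the classifiers voting 1 hold more than half the weight.

    Assumption: weights are non-negative, e.g. validation F-measures.
    Falls back to a plain majority vote when all weights are zero.
    """
    total = sum(weights)
    if total == 0:
        return majority_vote(votes)
    score = sum(w for v, w in zip(votes, weights) if v == 1)
    return 1 if score * 2 > total else 0


if __name__ == "__main__":
    votes = [1, 0, 0]            # predictions of three classifiers
    weights = [0.9, 0.3, 0.3]    # hypothetical validation F-measures
    print(majority_vote(votes), weighted_vote(votes, weights))
```

In this made-up case the two strategies disagree: the plain vote follows the two weaker classifiers, while the weighted vote follows the single classifier with the highest F-measure.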

Keywords/Search Tags:

Software quality assurance, defect prediction, cross-project defect prediction, similarity

Related items

1	Software Defect Prediction Strategy Design For Imbalanced Data
2	Research On Some Key Technologies Of Software Defect Prediction
3	Research On Cross-Project Software Defect Prediction
4	Design And Implementation Of Instance Selection Based Ensemble Cross-project Defect Prediction Method
5	Cross-Project Software Defect Prediction Methods Based On Autoencoder
6	Research On Cross-project Software Defect Prediction By Transfer Learning
7	Research On Cross-project Knowledge Reuse Technology For Defect Management
8	Cross-project Software Defect Prediction Based On Machine Learning
9	Research On Software Defect Prediction Based On Extreme Learning Machine
10	Research On Cross-Project Software Defect Prediction