Research On Software Defect Prediction Based On Machine Learning

Posted on: 2020-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: P Liu
GTID: 2428330575462051
Subject: Computer Science and Technology
Abstract/Summary:
Software defect prediction, one of the active areas of computer science research today, has made significant progress in the last decade. Its main purpose is to guide software testers toward the parts of a software system where problems are most likely to occur, so that manpower and material resources are not wasted on testing parts that have no problems. Software defect prediction works mainly by analyzing historical data and extracting software features, and then using a defect prediction model to uncover the hidden defects in the software. It not only enriches defect prediction theory but also drives the application of new methods in computer science, so it is of great significance. This dissertation studies two aspects: feature selection and the software defect prediction model. The main research contents are as follows:

Defect data sets contain too few defective instances and too many redundant items, suffer from class imbalance, and have features that are not clearly discriminative. To address these problems, this dissertation proposes a weighted nearest neighbor feature selection method. The method improves on the nearest neighbor feature selection method: it uses distance weighting and attribute weighting to update the weights, assigns different weights to different attributes and different distances, and gives priority to features with high weights. The proposed method is evaluated on the NASA public data sets. The experiments compare four common feature selection methods (RF, OR, CL and GR) with the proposed method, and 10 plots compare the results of the weighted nearest neighbor feature selection method with those of the RF method. To further demonstrate that the proposed method is effective, the Wilcoxon signed-rank test and Cohen's d effect size are used to analyze the experimental results statistically. The experimental results show that the proposed weighted nearest neighbor feature selection method outperforms the four methods above.

To address the low accuracy of existing software defect prediction models, this dissertation proposes a software defect prediction method based on association rules and artificial neural networks (GRAR_ANN). The method consists of two parts: data processing and model training. In data processing, features are selected from the data set using the weighted nearest neighbor feature selection method. Model training consists of the GRAR-Mining algorithm and the GRAR classification algorithm. The data set after feature selection is trained on the two artificial neural networks selected in this dissertation (MLP and RBFN) to obtain the GRAR-Mining algorithm, and the output of the GRAR-Mining algorithm is used as the input of the GRAR classification algorithm. A data set whose features and sample distribution differ from those of the training set is used as the test set. Following this rule, experiments are designed on 10 public data sets for verification, and the results are compared with those of 15 common defect prediction models. The experimental results show that the accuracy of the proposed GRAR_ANN defect prediction method is about 5% higher than that of the comparison methods. The GRAR_ANN method therefore achieves better experimental results than similar prediction methods.
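The statistical comparison mentioned in the abstract (the Wilcoxon signed-rank test together with an effect size, read here as Cohen's d) can be illustrated with a minimal, standard-library-only sketch. This is not the thesis's code. The function names `cohens_d` and `wilcoxon_signed_rank` and the scores in the usage section are hypothetical placeholders, and the sketch only computes the rank sums W+ and W- rather than a p-value.

```python
import math
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d for two samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def wilcoxon_signed_rank(a, b):
    """Return (W_plus, W_minus) for paired samples.

    Zero differences are dropped and tied absolute differences
    receive the average of the ranks they span.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over the run of equal absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


if __name__ == "__main__":
    # Hypothetical per-data-set scores for two feature selection methods;
    # these numbers are placeholders, not results from the thesis.
    proposed = [0.81, 0.77, 0.90, 0.68, 0.85]
    baseline = [0.78, 0.79, 0.84, 0.66, 0.80]
    print("W+, W- =", wilcoxon_signed_rank(proposed, baseline))
    print("Cohen's d =", cohens_d(proposed, baseline))
```

In practice the smaller of W+ and W- is compared with a critical value (or converted to a p-value by a statistics library), while Cohen's d indicates how large the difference is relative to the spread of the scores.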
Keywords/Search Tags:Software Defect Prediction, Machine Learning, Feature Selection, Association Rules, Artificial Neural Network