
Research On Software Defect Prediction For Cross-version Software

Posted on: 2019-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Li
Full Text: PDF
GTID: 2428330596451112
Subject: Engineering
Abstract/Summary:
Software defect prediction identifies latent defects in software modules in advance and guides test resource allocation and management decisions. In cross-version prediction, the new version of the software does not have enough labeled samples, so data from the previous version is used for training. However, previous-version data usually "lags": modules with similar feature attributes are still predicted to be defective after the defect has been repaired. This thesis proposes solutions to these problems. To deal with irrelevant and redundant features, a feature subset selection method based on cluster analysis results is proposed. To address the "small sample" problem in the new version, transfer learning is introduced so that previous-version data can be used for more effective defect prediction. The main work is as follows:

(1) To address the high dimensionality and low prediction accuracy caused by redundancy between features in software defect prediction, a feature selection algorithm combining cluster analysis and subset selection is proposed. First, the sample data set is clustered. Then, based on the clustering result, wrapper-based feature subset selection is performed to obtain the optimal feature subset. Clustering addresses the large search space of subset search, and wrapper-based subset selection further reduces the redundancy between features. Experiments on the public NASA data sets show that the proposed method reduces the redundancy rate of the feature subset and improves the performance of the prediction model.

(2) To address the shortage of training data for the new version, the idea of instance transfer is introduced and an improved Boosting method is proposed. During model training, an improved misclassification cost is added and the weights of the previous-version samples are dynamically adjusted, so that samples favorable to prediction on the target version are emphasized and the interference of misclassified samples with the model is reduced. In addition, change metrics related to defects during software version evolution are used when predicting defects on the target version. Validation on a public data set shows that this approach improves the performance of the prediction model.

(3) A cross-version software defect prediction system is implemented. It integrates different data preprocessing methods, feature selection methods, and classification learning algorithms into 100 candidate combinations (learning rules), from which the best combination of algorithms is chosen for each data set to achieve the optimal prediction results.
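The sample reweighting described in item (2) can be illustrated with a minimal sketch. The abstract does not give the update formulas, so the code below assumes a TrAdaBoost-style rule: previous-version samples that the weak learner misclassifies lose weight, while misclassified target-version samples gain weight. The function name `transfer_boost_update` and the per-sample `src_cost` exponent are hypothetical stand-ins for the thesis's "improved misclassification cost", not the author's actual method.

```python
import math


def transfer_boost_update(src_w, tgt_w, src_wrong, tgt_wrong, n_rounds, src_cost=None):
    """One TrAdaBoost-style reweighting round (illustrative sketch only).

    src_w, tgt_w:         current weights of previous-version / target-version samples
    src_wrong, tgt_wrong: booleans, True where the weak learner misclassified the sample
    n_rounds:             total number of boosting rounds planned
    src_cost:             hypothetical per-sample cost (>= 1); a larger cost shrinks a
                          misclassified previous-version sample faster (an assumption,
                          not taken from the thesis)
    Assumes both sample lists are non-empty.
    """
    if src_cost is None:
        src_cost = [1.0] * len(src_w)

    # Weighted error rate measured on the target version only.
    eps = sum(w for w, bad in zip(tgt_w, tgt_wrong) if bad) / sum(tgt_w)
    eps = min(max(eps, 1e-10), 0.499)  # keep beta_tgt strictly inside (0, 1)

    beta_tgt = eps / (1.0 - eps)
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(len(src_w)) / n_rounds))

    # Misclassified previous-version samples are down-weighted (beta_src <= 1).
    new_src = [w * beta_src ** c if bad else w
               for w, bad, c in zip(src_w, src_wrong, src_cost)]
    # Misclassified target-version samples are up-weighted (beta_tgt < 1).
    new_tgt = [w / beta_tgt if bad else w
               for w, bad in zip(tgt_w, tgt_wrong)]

    # Normalize so that all weights sum to 1 before the next round.
    total = sum(new_src) + sum(new_tgt)
    return [w / total for w in new_src], [w / total for w in new_tgt]


# Example: two previous-version samples and three target-version samples.
src, tgt = transfer_boost_update(
    src_w=[0.2, 0.2], tgt_w=[0.2, 0.2, 0.2],
    src_wrong=[True, False], tgt_wrong=[True, False, False],
    n_rounds=10,
)
```

In this example the misclassified previous-version sample ends up with less weight than the correctly classified one, while the misclassified target-version sample ends up with more. Over successive rounds, previous-version samples that disagree with the target version therefore have less influence on the model.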
Keywords/Search Tags: Software defect prediction, Feature selection, Transfer learning, Semi-supervised learning