Research on Software Defect Prediction Method Based on Fusion Feature Selection and Ensemble Learning

Posted on: 2022-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q T Wu
GTID: 2518306536996759
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of computer software and hardware, everyday necessities such as food, clothing, housing, and transportation have improved greatly, and daily life has become more convenient. However, as software continues to grow in scale, the number of software defects has gradually increased, and software quality problems have become increasingly serious. If the modules that are likely to contain defects can be predicted in advance, more testing resources can be invested in those modules, which improves testing efficiency and helps assure software quality. Research on software defect prediction technology is therefore becoming more and more important. This thesis studies software defect prediction based on machine learning methods. It introduces data preprocessing methods and defect prediction methods for software defect data sets, with the aim of improving both the quality of the data sets and defect prediction performance. The main content of the thesis is as follows.

First, a data processing algorithm is proposed to address the high dimensionality and class imbalance of software defect data sets. For the high-dimensionality problem, a fusion feature selection algorithm is proposed: by analyzing the correlation between features and classes and the redundancy among features, it selects an optimal feature subset and thereby reduces dimensionality. For the class imbalance problem, a combined sampling method is used. It focuses on synthesizing minority-class (defective) samples and removes noise samples while keeping the data set balanced, which improves data quality.

Second, a weighted ensemble learning algorithm is proposed to address the limited predictive performance of a single classifier on defect data with diverse distributions. Different base classifiers are selected to increase classifier diversity in defect prediction, and the classifiers are weighted and combined according to the AUC (Area Under the Curve) value that each achieves on the test set to obtain the final prediction. This improves prediction performance and also broadens the applicability of the model. The ensemble learning algorithm and the data processing algorithm are then combined to construct a software defect prediction method based on weighted ensemble learning.

Finally, the data processing algorithm, the weighted ensemble learning algorithm, and the constructed software defect prediction method are verified experimentally, and the results are analyzed. A series of comparative experiments confirms that the proposed algorithms and the constructed prediction method are effective in improving the efficiency and accuracy of software defect prediction.
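
The AUC-based weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the version below assumes that each classifier's weight is its AUC divided by the sum of all AUCs and that the final score is the weighted average of the classifiers' predicted defect probabilities. The function names (`auc`, `auc_weights`, `weighted_ensemble`) and the toy scores are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    AUC equals the fraction of (defective, clean) pairs in which the
    defective module receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_weights(labels: Sequence[int],
                per_classifier_scores: Sequence[Sequence[float]]) -> List[float]:
    """Assumed weighting rule: weight_i = AUC_i / sum of all AUCs."""
    aucs = [auc(labels, scores) for scores in per_classifier_scores]
    total = sum(aucs)
    if total == 0:
        raise ValueError("all classifiers have AUC 0; weights are undefined")
    return [a / total for a in aucs]


def weighted_ensemble(weights: Sequence[float],
                      per_classifier_scores: Sequence[Sequence[float]]) -> List[float]:
    """Weighted average of the classifiers' scores for each module."""
    n_samples = len(per_classifier_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, per_classifier_scores))
        for i in range(n_samples)
    ]


# Toy data: 1 = defective module, 0 = clean module (illustrative only).
labels = [1, 0, 1, 0]
clf_a = [0.9, 0.2, 0.8, 0.4]  # ranks every defective module above every clean one
clf_b = [0.6, 0.7, 0.4, 0.3]  # ranks only half of the pairs correctly

weights = auc_weights(labels, [clf_a, clf_b])
combined = weighted_ensemble(weights, [clf_a, clf_b])
```

In this toy case the stronger classifier receives twice the weight of the weaker one, so the combined scores follow its ranking more closely. The abstract states that the AUC values are computed on the test set; in practice a separate validation split is often used for this step to avoid an optimistic estimate, which is a design choice this sketch leaves open.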
Keywords/Search Tags:software defect prediction, high dimensionality, class imbalance, feature selection, ensemble learning