Research On Automatic Software Defect Prediction Model Selection Based On Bi-Level Optimization

Posted on: 2022-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Xiang
GTID: 2518306524989719
Subject: Master of Engineering
Abstract/Summary:
With software systems becoming increasingly pervasive in modern society, avoiding the impact of software defects so that these systems run stably is increasingly important. Software defect prediction uses the target project's data to predict which parts of the software system may contain defects. On this basis, engineers can allocate limited resources sensibly to assure software quality, which can greatly reduce the impact of defects on the software system. Cross-Project Defect Prediction (CPDP), which borrows data from similar projects by combining a transfer learner with a classifier, has emerged as a promising way to predict software defects when the available data about the target project is insufficient. However, developing such a model is challenging because it is difficult to determine the right combination of transfer learner and classifier along with their optimal hyper-parameter settings. In this thesis, we propose a tool, dubbed BiLO-CPDP, which is the first of its kind to formulate automated CPDP model discovery from the perspective of bi-level programming. It uses bi-level optimization to solve the CPDP model parameter selection problem, where the parameters comprise both the combination of transfer learning algorithm and classifier and their corresponding hyper-parameters. In particular, bi-level programming carries out the optimization at two nested levels in a hierarchical manner. Specifically, the upper-level optimization routine searches for the right combination of transfer learner and classifier, while the nested lower-level optimization routine optimizes the corresponding hyper-parameter settings. To find the optimal parameters at each level, after trying a variety of optimization algorithms, we use Tabu search to solve the combinatorial problem at the upper level and TPE (Tree-structured Parzen Estimator) to solve the expensive hyper-parameter selection problem at the lower level. To evaluate BiLO-CPDP, we conduct experiments on 20 projects to compare it with a total of 21 existing CPDP techniques, along with its single-level optimization variant and Auto-Sklearn, a state-of-the-art automated machine learning tool. Empirical results show that BiLO-CPDP achieves better prediction performance than all 21 existing CPDP techniques on 70% of the projects, while being overwhelmingly superior to Auto-Sklearn and its single-level optimization variant in all cases. Furthermore, the unique bi-level formulation in BiLO-CPDP makes it possible to allocate more budget to the upper level, which significantly boosts performance. In a nutshell, by automatically selecting the model-related parameters, BiLO-CPDP realizes the automatic discovery of cross-project software defect prediction models, and in most cases the discovered models have better prediction performance than existing models, which points to a different path for constructing cross-project defect prediction models in the future.
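The nested structure described in the abstract can be illustrated with a minimal sketch. This is not the BiLO-CPDP implementation: the search space, the component names, and the `toy_score` function are hypothetical stand-ins, and the lower level uses plain random search in place of TPE so that the example depends only on the Python standard library. The upper level is a simple Tabu search over (transfer learner, classifier) pairs, and every upper-level candidate is scored by running the lower-level hyper-parameter search for it.

```python
import random
from collections import deque

# Hypothetical search space: component names and integer ranges are illustrative only.
TRANSFER_LEARNERS = {"none": {}, "nn_filter": {"k": (1, 30)}, "tca": {"n_components": (2, 20)}}
CLASSIFIERS = {"naive_bayes": {}, "knn": {"n_neighbors": (1, 50)}, "random_forest": {"n_estimators": (10, 200)}}

def toy_score(learner, classifier, params, space):
    """Stand-in for training a CPDP model and measuring its performance (assumed, not real data)."""
    base = {"none": 0.50, "nn_filter": 0.55, "tca": 0.60}[learner]
    base += {"naive_bayes": 0.00, "knn": 0.05, "random_forest": 0.10}[classifier]
    # Penalise each hyper-parameter by its normalised distance from the middle of its range.
    penalty = sum(abs(v - (space[n][0] + space[n][1]) / 2) / (space[n][1] - space[n][0])
                  for n, v in params.items())
    return base - 0.1 * penalty

def lower_level(learner, classifier, budget, rng):
    """Lower level: tune hyper-parameters for a fixed combination (random search, standing in for TPE)."""
    space = {**TRANSFER_LEARNERS[learner], **CLASSIFIERS[classifier]}
    best_params, best_score = None, float("-inf")
    for _ in range(budget):
        params = {name: rng.randint(lo, hi) for name, (lo, hi) in space.items()}
        score = toy_score(learner, classifier, params, space)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def upper_level(iterations=6, lower_budget=20, tabu_size=4, seed=0):
    """Upper level: Tabu search over (transfer learner, classifier) combinations."""
    rng = random.Random(seed)
    current = ("none", "naive_bayes")
    params, score = lower_level(*current, lower_budget, rng)
    best = (score, current, params)
    tabu = deque([current], maxlen=tabu_size)  # recently visited combinations
    for _ in range(iterations):
        # Neighbours differ from the current combination in exactly one component.
        neighbours = [(l, current[1]) for l in TRANSFER_LEARNERS if l != current[0]]
        neighbours += [(current[0], c) for c in CLASSIFIERS if c != current[1]]
        candidates = [n for n in neighbours if n not in tabu]
        if not candidates:
            break
        evaluated = []
        for combo in candidates:
            p, s = lower_level(*combo, lower_budget, rng)
            evaluated.append((s, combo, p))
        # Move to the best non-tabu neighbour, even if it is worse than the current one.
        s, current, p = max(evaluated, key=lambda t: t[0])
        tabu.append(current)
        if s > best[0]:
            best = (s, current, p)
    return best

if __name__ == "__main__":
    print(upper_level())
```

Because each upper-level move triggers a full lower-level search, the two budgets (`iterations` and `lower_budget` in this sketch) can be traded off against each other, which is the kind of budget allocation between levels that the abstract refers to.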
Keywords/Search Tags: Cross-project defect prediction, transfer learning, classification techniques, automated parameter optimization, configurable software and tool comparison