An Empirical Study Of The Impact Of Model Building And Evaluation Methods On The Performance Of Defect Prediction Models

Posted on: 2020-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Bin
Full Text: PDF
GTID: 2518305732477324
Subject: Computer technology

Abstract/Summary:
The limited Software Quality Assurance (SQA) resources of software development teams are best spent on the software modules (e.g., classes in source code) that are most likely to be defective in the future. Software defect prediction models are therefore trained with machine learning (classification) techniques to identify which modules are likely to be defective. The construction and evaluation of a defect prediction model is affected by several factors, including imbalanced data, automated parameter optimization of the classification techniques, and the choice of model evaluation method.

Imbalanced data means that software defect datasets consist of only a few defective modules and a large number of non-defective ones. In practice, learning from imbalanced data can lead to poor model performance. Classification techniques often have configurable parameters that control the performance of the classifiers they produce. Since the optimal parameter settings are unknown ahead of time, default values are usually used when building models; parameter tuning becomes necessary when these defaults cause classifiers to underperform. Finally, different model evaluation methods can yield different evaluation results. Since existing studies employ different evaluation methods, it is not uncommon to see conflicting conclusions reported; selecting an appropriate evaluation method helps produce more reliable conclusions.

To date, few studies have examined these factors jointly on the same datasets, especially the model evaluation methods, and different researchers have reached conflicting conclusions about their impact. To study these factors in detail, this thesis conducts empirical studies through case studies of systems spanning both proprietary and open-source domains. The main contributions of this thesis, each illustrated with a minimal sketch after this list, are summarized as follows:

(1) Impact of imbalanced data: For Logistic Regression and Random Forest, models built on rebalanced data significantly outperform models built on imbalanced data on the threshold-dependent metrics (F-score, G-measure, Balance, MCC); the conclusion for the threshold-independent metric (AUC) is similar. For Naive Bayes, however, the improvement is not significant, and on the probability of false alarm (pf), rebalancing actually degrades performance. We therefore conclude that, for a given classification technique and performance metric, rebalancing should be applied with care.

(2) Impact of automated parameter optimization: Automated parameter optimization significantly improves the performance of most, but not all, classification techniques, so it too should be weighed per technique. Moreover, optimized classifiers are at least as stable as classifiers trained with the default settings; automated parameter optimization does not make classifiers less stable.

(3) Impact of model evaluation methods: Among the global-based evaluation methods, the conclusions drawn from CD graphs agree with those drawn from Scott-Knott ESD less than 50% of the time, whereas the conclusions drawn from the algorithm graph agree with Scott-Knott ESD nearly 70% of the time. Meanwhile, the conclusions of the win/tie/loss analysis (a local-based method) are almost identical to those of the algorithm graph. When we revisit the performance of classifiers, the conclusions obtained with the algorithm graph are also consistent with previous research. We therefore recommend the algorithm graph as the model evaluation method when a study aims to compare its results against many previous studies.
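As context for contribution (1), the following is a minimal sketch, not the thesis's actual pipeline, of how rebalancing can be compared against training on the raw imbalanced data. Simple random oversampling and synthetic data stand in for whatever rebalancing technique and defect datasets the study used; only the metric names (MCC, AUC) are taken from the abstract.

```python
# Compare one classifier trained on imbalanced vs. rebalanced data,
# using a threshold-dependent (MCC) and a threshold-independent (AUC) metric.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic "defect" data: ~10% defective modules (minority class).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rng = np.random.default_rng(0)

def oversample(X, y):
    """Duplicate minority-class rows until both classes are the same size."""
    minority = np.flatnonzero(y == 1)
    extra = rng.choice(minority, size=(y == 0).sum() - minority.size)
    idx = np.concatenate([np.arange(len(y)), extra])
    return X[idx], y[idx]

for name, (Xb, yb) in {"imbalanced": (X_tr, y_tr),
                       "rebalanced": oversample(X_tr, y_tr)}.items():
    clf = LogisticRegression(max_iter=1000).fit(Xb, yb)
    print(f"{name:10s} MCC={matthews_corrcoef(y_te, clf.predict(X_te)):.3f} "
          f"AUC={roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]):.3f}")
```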
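For contribution (2), a sketch of automated parameter optimization: a random search over a Random Forest's parameters scored by AUC, compared against the default configuration. The parameter grid, search budget, and data here are illustrative assumptions, not taken from the thesis.

```python
# Random search over Random Forest parameters vs. the default settings,
# both evaluated with 5-fold cross-validated AUC.
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, cross_val_score

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

default_auc = cross_val_score(RandomForestClassifier(random_state=0),
                              X, y, cv=5, scoring="roc_auc").mean()

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": randint(50, 500),
                         "max_depth": randint(2, 20),
                         "min_samples_leaf": randint(1, 10)},
    n_iter=25, cv=5, scoring="roc_auc", random_state=0)
search.fit(X, y)

print(f"default AUC={default_auc:.3f}  tuned AUC={search.best_score_:.3f}")
print("best params:", search.best_params_)
```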
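For contribution (3), a sketch of the win/tie/loss analysis, assuming its common formulation: per dataset, two classifiers' repeated performance scores are compared with a Wilcoxon signed-rank test at alpha = 0.05, and an insignificant difference counts as a tie. The score matrices below are fabricated placeholders, and the thesis's actual setup (as well as the CD graph, Scott-Knott ESD, and algorithm graph comparisons) may differ.

```python
# Win/tie/loss analysis: per dataset, compare two classifiers' repeated
# (e.g., bootstrap) scores with a Wilcoxon signed-rank test; count a tie
# when the difference is not significant, otherwise the better median wins.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
# Rows = datasets, columns = repeated AUC scores (placeholder values).
scores_a = rng.normal(0.75, 0.03, size=(10, 30))
scores_b = rng.normal(0.72, 0.03, size=(10, 30))

win = tie = loss = 0
for a, b in zip(scores_a, scores_b):
    _, p = wilcoxon(a, b)
    if p >= 0.05:
        tie += 1
    elif np.median(a) > np.median(b):
        win += 1
    else:
        loss += 1
print(f"classifier A vs. B: win={win} tie={tie} loss={loss}")
```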
Keywords/Search Tags: Defect prediction, Imbalanced data, Automated parameter optimization, Model evaluation methods, Empirical study