A Study On Post-Inspection Defect Prediction Based On Capture-recapture Approach

Posted on:2016-02-04

Degree:Doctor

Type:Dissertation

Country:China

Candidate:G P Rong

Full Text:PDF

GTID:1108330482452156

Subject:Application software engineering

Abstract/Summary:

Post-inspection defect prediction is a critical step in evaluating the quality of software inspections. It supports quality-management decisions and thereby helps ensure that the objectives of an inspection are met. However, there is still no convenient, objective, and effective approach to predicting the defects that remain after a software inspection. One way to address this is to build prediction models, but most prediction models rely on large amounts of high-quality historical data, which are hard to obtain. Moreover, such process data are usually valid only within a particular project context; when the context changes, models built on the historical data may fail. These models are therefore not easy to use. As a result, practitioners usually rely on subjective judgment in real-world software inspections, which carries a high risk of being misled. For example, a higher-than-expected number of detected defects could result either from poor quality of the software artifacts or from an effective inspection.

By borrowing the capture-recapture method from biological research, software engineering researchers have tried to design prediction models that do not depend on historical data. Many studies have reported promising results on the potential efficacy of the capture-recapture method. Its independence from historical data gives the method great potential for application in real-world software projects. However, over the past 20 years the method has been adopted in only a few cases, and in those cases only the maximum likelihood estimator (which, according to many studies, is one of the worst estimators) has been used.

From a pragmatic perspective, this study aims to explore and improve the adoption of the capture-recapture method in real-world software projects. In general, there are three critical questions regarding its adoption: (1) how to select a suitable capture-recapture estimator; (2) how to evaluate the estimate produced by a given application of the method; and (3) how to improve the adoption of the method. Because existing studies do not provide satisfactory answers to these three questions, we applied empirical methods to investigate them.

Selection of estimators. Although this is one of the most studied topics regarding the capture-recapture method, the current state of research does not support the selection well. On the one hand, the results of the relevant studies are inconclusive; on the other hand, these studies usually use small data sets to compare and evaluate the performance of the various capture-recapture estimators. As a consequence, the evaluation results are unstable, which to some degree affects practitioners' understanding of the method and makes it harder to apply in practice.

Evaluation of estimating results. This topic has been neglected by most researchers in this area. In fact, the estimates generated by most capture-recapture estimators have a fair chance of being extremely poor. Because such estimates convey wrong information and can mislead decisions, effective approaches to identifying them are needed.

Improvement to the method. Several studies have proposed ways to improve the application of the capture-recapture method. One of the most frequently mentioned is to increase the number of unique defects detected by the inspection team in order to improve estimation accuracy, for example by adding inspectors, improving reading techniques, or training inspectors. However, this proposal has never been empirically investigated, and its validity is not evident from the calculation methods behind the various capture-recapture estimators.

In this thesis, we applied empirical methods (i.e., systematic literature review and controlled experiments) to answer the above three questions. The results and contributions of the study can be summarized as follows:

1. We provided and verified a selection from among the various capture-recapture estimators, and we also optimized the usage of the selected estimators.
2. We designed and verified an approach for evaluating the quality of estimates produced by the capture-recapture method, so that extreme estimates can be identified and misleading decisions avoided.
3. We verified the assumption underlying a series of proposed improvements to the capture-recapture method, namely that a higher Detected Rate (DR, the percentage of seeded defects that are detected) leads to more accurate capture-recapture estimates. This result can serve as the basis for many similar improvements to the adoption of the method.
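
To make the idea of a capture-recapture estimate concrete, the following is a minimal sketch for the simplest case of two inspectors. It uses the textbook Lincoln-Petersen estimator and Chapman's bias-corrected variant; the abstract does not say which estimators the thesis selects, so these are assumptions chosen only for illustration. All function names and the sample defect IDs are hypothetical.

```python
def lincoln_petersen(found_a, found_b):
    """Estimate the total number of defects from two inspectors' findings.

    Illustrative textbook estimator (n1 * n2 / m), not necessarily one
    of the estimators selected in the thesis.
    """
    a, b = set(found_a), set(found_b)
    overlap = len(a & b)  # defects found by both inspectors
    if overlap == 0:
        raise ValueError("no overlap: the estimate is undefined")
    return len(a) * len(b) / overlap


def chapman(found_a, found_b):
    """Chapman's bias-corrected variant; defined even with zero overlap."""
    a, b = set(found_a), set(found_b)
    return (len(a) + 1) * (len(b) + 1) / (len(a & b) + 1) - 1


def remaining_defects(found_a, found_b, estimator=chapman):
    """Estimated defects still in the artifact after the inspection."""
    unique_found = len(set(found_a) | set(found_b))
    return max(0.0, estimator(found_a, found_b) - unique_found)


def detected_rate(found, seeded):
    """DR: share of the seeded defects that the inspection detected."""
    seeded = set(seeded)
    return len(set(found) & seeded) / len(seeded)


# Hypothetical data: defect IDs reported by two inspectors,
# out of ten seeded defects numbered 1..10.
inspector_a = {1, 2, 3, 4, 5, 6}
inspector_b = {4, 5, 6, 7, 8}

print(lincoln_petersen(inspector_a, inspector_b))
print(chapman(inspector_a, inspector_b))
print(remaining_defects(inspector_a, inspector_b))
print(detected_rate(inspector_a | inspector_b, range(1, 11)))
```

The estimate needs only the defect lists from the current inspection, which is the sense in which the method is free of historical data. It also shows why extreme estimates can occur: when the two inspectors share few or no defects, the denominator becomes very small and the estimate grows sharply.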

Keywords/Search Tags:

Software Inspection, Defect Prediction, Capture-recapture, Empirical Software Engineering

Related items

1	Application And Improvement Of Capture-Recapture Method In Software Inspection
2	The Research And Improvement Of Capture-recapture Models In Software Engineering
3	Software Defect Prediction Based On Social Software Engineering
4	Research On Key Technologies Of Defect Analysis In Software Engineering
5	Research And Application Of Software Defect Prediction Based On Latent Dirichlet Allocation
6	Research And Implementation Of Software Defect Prediction Model Construction And Sharing Methods
7	Software Defect Prediction Strategy Design For Imbalanced Data
8	Software Defect Prediction Research For Unlabeled Datasets
9	Statistical aspects of using genetic markers for individual identification in capture-recapture studies
10	Research On Software Defect Prediction Method Based On Training Data Selection