Bug Triage Based On Data Reduction

Posted on: 2014-02-20

Degree: Master

Type: Thesis

Country: China

Candidate: W Q Zou

Full Text: PDF

GTID: 2248330395498862

Subject: Computer application technology

Abstract/Summary:

Bug fixing is an important process in software development and maintenance. Bug triage, i.e., assigning a new bug to an appropriate fixer, is a key step in bug fixing. The main approaches to the bug triage problem are based on text classification. However, these approaches suffer from large-scale, low-quality data sets.

In this thesis, a data reduction technique based on feature selection and instance selection is proposed to improve the accuracy of bug triage. Data reduction has two goals: reducing the scale of the data and increasing its quality. Feature selection and instance selection are combined to achieve both. To evaluate the effectiveness of the data reduction technique, two feature selection algorithms and two instance selection algorithms are applied to data sets from Eclipse, Gnome and NetBeans. For each data set, 70% of words and 50% of instances are removed. Experimental results show that the reduced data sets achieve better accuracy than the original ones.

From the experimental results on the three project data sets, we find that the order in which feature selection and instance selection are applied has a strong impact on the final triage accuracy on different data sets. Hence, to predict the best combination order for a new data set, an order prediction model is built. 300,000 consecutive bug reports from the bug repositories of Eclipse and Mozilla are chosen and resampled to obtain data sets of different sizes. Each data set has 18 attributes. Experimental results show that the prediction model based on a decision tree achieves an accuracy of 71.8%.
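As a rough illustration of the two reduction steps and of why their order can matter, the following Python sketch applies a word filter and a report filter to a toy data set in both orders. The selection criteria (document frequency for words, distinct-word count for reports), the function names, and the sample reports are all assumptions made for illustration; the abstract does not name the algorithms used in the thesis. Only the removal ratios (70% of words, 50% of instances) come from the abstract.

```python
from collections import Counter


def select_features(reports, keep_ratio):
    """Keep the words that occur in the most reports (assumed stand-in criterion)."""
    doc_freq = Counter(w for words, _ in reports for w in set(words))
    n_keep = max(1, round(len(doc_freq) * keep_ratio))
    # Rank by descending document frequency, then alphabetically for determinism.
    ranked = sorted(doc_freq, key=lambda w: (-doc_freq[w], w))
    kept = set(ranked[:n_keep])
    return [([w for w in words if w in kept], dev) for words, dev in reports]


def select_instances(reports, keep_ratio):
    """Keep the reports with the most distinct words (assumed stand-in criterion)."""
    n_keep = max(1, round(len(reports) * keep_ratio))
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(reports, key=lambda r: -len(set(r[0])))
    return ranked[:n_keep]


def reduce_data(reports, order, word_keep=0.3, instance_keep=0.5):
    """Apply both steps in the given order: 'FS->IS' or 'IS->FS'."""
    if order == "FS->IS":
        return select_instances(select_features(reports, word_keep), instance_keep)
    if order == "IS->FS":
        return select_features(select_instances(reports, instance_keep), word_keep)
    raise ValueError(f"unknown order: {order}")


# Hypothetical toy data: (tokenized bug report, developer who fixed it).
reports = [
    (["crash", "on", "startup", "null", "pointer"], "alice"),
    (["crash", "on", "save"], "bob"),
    (["ui", "button", "misaligned", "on", "startup"], "carol"),
    (["crash", "null", "pointer", "editor"], "alice"),
]

fs_then_is = reduce_data(reports, "FS->IS")
is_then_fs = reduce_data(reports, "IS->FS")

print("FS->IS:", fs_then_is)
print("IS->FS:", is_then_fs)
```

On this toy input the two orders keep different reports and different words, which is the kind of order sensitivity the abstract describes. The sketch says nothing about which order gives better triage accuracy on real data.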

Keywords/Search Tags:

Bug Triage, Data Reduction, Order Prediction Model, Feature Selection, Instance Selection

Related items

1	Research On The Data Set Reduction Method For Bug Triaging
2	Research On Prototype And Attribute Collaborative Reduction Of Unsupervised Symbol Data
3	Key Technology Research On Parallel Instance Selection In Data Reduction
4	Research On Software Defect Prediction Based On Feature Selection And Instance Transfer
5	Instance Selection Based On Boundary Feature And Surrogate Model
6	Research On Dual Reconstruction For Feature And Instance Selection
7	Research On Data Preprocessing Technologies For Software Defect Prediction
8	Research And Application Of Dimension Reduction Method Based On Feature Selection
9	Instance Selection Strategy For Time Series Classification
10	Research On Embedding-Space-Based Multiple-Instance Learning Algorithms With Feature Selection