Automatic Bug Triage Based On Active Learning

Posted on: 2014-01-03  Degree: Master  Type: Thesis
Country: China  Candidate: L K Li  Full Text: PDF
GTID: 2248330395998869  Subject: Software engineering
Abstract/Summary:
Software bug triage is the process of assigning a newly submitted bug to the developer who should fix it. In bug triage, expert developers manually assign bugs to the appropriate developers based on the descriptions submitted by users. In large open source projects, such as Mozilla and Eclipse, bug triage occupies more and more of the developers' time and resources. This thesis analyzes existing bug triage methods and identifies the following two issues: (1) In traditional manual triage, the number of bugs submitted per day is large. Since developers often lack the relevant knowledge, manual triage costs a great deal of money and time. (2) In automatic triage methods, labeling is very expensive. To ensure the accuracy of the categorization model, a large number of labeled bugs is needed. In this thesis, an automatic bug triage approach based on active learning is proposed. The approach analyzes bug reports from the perspective of machine learning. A training set is built from the descriptions of bugs, developers are mapped onto bug categories, and a Naive Bayesian model is established. For active learning, the query-by-committee method is used, which selects the most controversial bugs and asks for their category labels. Experiments are conducted on 186774 bugs from the bug repository of the Eclipse project. First, three Naive Bayesian classifiers are trained with 1% of the training data. These three classifiers vote on the label of each bug. Then the bugs with the largest entropies are selected and added to the training set, which is used to retrain the three classifiers. Finally, the bugs in the test set are labeled with these three new classifiers. Experimental results show that, on average, the automatic triage approach based on active learning assigns 4.1 bugs per second. The approach achieves better classification results on 25.541% of bugs when the initial training set is 1% of the bugs.
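
The selection step of query by committee can be illustrated with a minimal Python sketch. It assumes the disagreement measure is the vote entropy of the committee's predicted labels; the abstract only states that bugs "with the largest entropies" are selected, so the exact formula used in the thesis may differ. The function names, bug ids, and developer names below are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy (in bits) of the labels voted by a committee for one bug."""
    total = len(votes)
    counts = Counter(votes)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def select_most_controversial(committee_votes, k):
    """Return the ids of the k bugs whose committee votes have the largest entropy."""
    ranked = sorted(
        committee_votes,
        key=lambda bug_id: vote_entropy(committee_votes[bug_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical votes from three classifiers (assumed data, not from the thesis).
votes = {
    "bug-1": ["alice", "alice", "alice"],  # full agreement, entropy 0
    "bug-2": ["alice", "bob", "alice"],    # partial disagreement
    "bug-3": ["alice", "bob", "carol"],    # full disagreement, highest entropy
}

# The two bugs the committee disagrees on most would be sent for labeling.
queried = select_most_controversial(votes, k=2)
print(queried)
```

In a full loop, the queried bugs would be labeled by a human triager, added to the training set, and the three classifiers retrained, as the abstract describes.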
Keywords/Search Tags: Automatic Bug Triage, Active Learning, Naive Bayesian, Query by Committee