Learning to Rank Relevant Files for Bug Reports Using Domain Knowledge, Replication and Extension of a Learning-to-Rank Approach

Posted on:2019-08-31

Degree:M.S

Type:Thesis

University:Rochester Institute of Technology

Candidate:Safdari, Nasir


GTID:2478390017989401

Subject:Computer Science

Abstract/Summary:

Bug localization is one of the most important stages of the bug-fixing process, and bad practices make debugging a tedious task. Investigating bugs can account for a large portion of the total cost of a software project. An automated strategy that provides a list of source code files, ranked by how likely each is to contain the root cause of the problem, would help development teams reduce the search space and increase productivity. In this work, I have replicated the bug localization approach presented by Ye et al. (2014), which applies the learning-to-rank technique to rank the relevant files for each bug report. This technique applies domain knowledge by evaluating the textual similarity between bug reports and both source code files and API specification documents, together with bug-fixing and code change history. For a given bug report, the ranking function is a linear combination of weighted features, with the weights trained on previously solved bug reports. In addition to replicating this technique, I have extended the study by evaluating how different text preprocessing techniques, such as stemming and lemmatization, and a randomized selection of training folds affect the overall performance of the ranking model. I found that lemmatization of the words and randomized selection of the training folds both have an adverse effect on the ranking model, resulting in lower accuracy and precision.
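
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the two features (term-frequency cosine similarity and a normalized count of earlier fixes), and the hand-picked weights are assumptions made for illustration only. In the replicated approach the weights are learned from previously solved bug reports rather than set by hand.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (no stemming or lemmatization)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the raw term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def score(weights, features):
    """Linear combination of weighted features."""
    return sum(w * f for w, f in zip(weights, features))


def rank_files(bug_report, files, weights):
    """Return file names ordered from most to least likely to be relevant.

    `files` maps a file name to (source_text, prior_fix_count).
    Both features below are hypothetical stand-ins for the feature set
    used in the thesis.
    """
    scored = []
    for name, (source_text, fix_count) in files.items():
        features = [
            cosine_similarity(bug_report, source_text),  # textual similarity
            fix_count / (1 + fix_count),                 # bug-fixing history, squashed into [0, 1)
        ]
        scored.append((score(weights, features), name))
    return [name for _, name in sorted(scored, key=lambda p: p[0], reverse=True)]


if __name__ == "__main__":
    files = {
        "Parser.java": ("parse token stream syntax error", 0),
        "Button.java": ("render button click widget", 0),
    }
    # Weights are hand-picked here; a learning-to-rank method would fit them.
    print(rank_files("syntax error while parsing token", files, [1.0, 0.5]))
```

Applying stemming or lemmatization would change only the `tokenize` step, which is where the preprocessing variants compared in the study would plug in.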

Keywords/Search Tags:

Bug reports, Files

Related items

1	Research On Application Software Of Laser Engraving System
2	Research And Implementation Of Anti-attack Technology For The Files Of Computer
3	Localizing Relevant Source Code Files For Bug Reports
4	Research Of Problems And Countermeasures Of Abandoning Files
5	Research And Application Of ARP Virus Protection Based On Log Files
6	A Study On The Windows Log Forensic And Recovery Technology
7	Research On The Key Technologies For Wireless P2P Files Sharing Systems
8	The Design And Implementation Of Finding Mechanism Of Isolated Web Files
9	A Research And Design On Files Customization System In Distributed Environment
10	The Research And Implementation Of Method Regarding To The Small Files Problem Of Hadoop