Source code retrieval for bug localization using latent Dirichlet allocation, and its relationship to stability of agilely developed software

Posted on:2010-12-26

Degree:Ph.D

Type:Dissertation

University:The University of Alabama in Huntsville

Candidate:Lukins, Stacy K

GTID:1448390002977119

Subject:Computer Science

Abstract/Summary:

In bug localization, a developer uses information about a bug to locate the portion of the source code to modify to correct the bug. Developers expend considerable effort performing this task. Some recent static techniques for automatic bug localization have been built around modern information retrieval (IR) models such as latent semantic indexing (LSI); however, latent Dirichlet allocation (LDA), a modular and extensible IR model, has significant advantages over both LSI and probabilistic LSI (pLSI). We present an LDA-based static technique for automating bug localization.

We apply our technique to agile software systems, which are becoming increasingly common in the current world of software development. Agile software development practices, characterized by frequent software releases with associated numerous changes, can potentially result in unstable code. Since our approach is based on semantic information found in the code, such as comments and identifiers, any source code modifications that change the content of or reduce the amount of semantic information present have the potential to negatively affect our technique. Therefore, we investigate whether our approach works similarly on all iterations of agilely developed software or whether code changes resulting in unstable software adversely affect our approach.

We present four case studies designed to determine the effectiveness of our LDA-based approach to bug localization as well as to study the relationship between the accuracy of the approach and the instability of agile software systems. The case studies involve the examination of over 300 bugs across twenty-five iterations of two agilely developed software systems.

The results show our LDA approach to be an effective technique for bug localization. The results also show our technique performs at least as well as the LSI technique for all bugs, and performs better, often considerably so, than the LSI-based technique for most bugs. From our study, we are able to conclude no significant relationship exists between stability and the accuracy of our approach, based on the visual examination of scatter plots together with the lack of significant correlations between the two variables.
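
The abstract notes that the approach relies on semantic information found in the code, such as comments and identifiers. As a rough illustration only, and not the dissertation's actual preprocessing, the sketch below shows one way a source snippet might be reduced to a bag of terms that a topic model such as LDA could consume. The function names (`split_identifier`, `extract_terms`), the camelCase and snake_case splitting rules, the tiny stop list, and the sample Java-like snippet are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop list (a few keywords and common words); a real one would be larger.
STOP_WORDS = {"public", "private", "void", "int", "return", "new",
              "class", "if", "else", "the", "a", "of", "to"}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    parts = []
    # First break on underscores, digits, and any other non-letter characters.
    for chunk in re.split(r"[_\W\d]+", identifier):
        # Then break on case boundaries, keeping acronym runs such as "XML" whole.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", chunk))
    return [p.lower() for p in parts]


def extract_terms(source):
    """Count the terms found in the comments and identifiers of a source snippet."""
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    terms = []
    for word in words:
        terms.extend(split_identifier(word))
    # Drop stop words and single-letter fragments.
    return Counter(t for t in terms if t not in STOP_WORDS and len(t) > 1)


# Made-up snippet used purely for illustration.
snippet = (
    "// Save the bug report to disk\n"
    "public void saveBugReport(BugReport report) { reportWriter.write(report); }"
)
term_counts = extract_terms(snippet)
print(term_counts.most_common(3))
```

Under these assumptions, renaming `saveBugReport` to something like `doIt` or deleting the comment would shrink the term counts for that method, which is the kind of loss of semantic information the abstract identifies as a potential risk to the technique.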

Keywords/Search Tags:

Bug localization, Source code, Software, Agilely developed, Approach, Latent, Relationship, Information

Related items

1	Research On Relationship Between Code Quality And Software Defects For Open Source Software
2	An Outer Approximate Approach to TOA-Based Multiple Source Localization Problem
3	Combining information retrieval modules and structural information for source code bug localization and feature location
4	A source code search engine for keyword based structural relationship search
5	Investigating Rostral Anterior Cingulate Cortex in Major Depression: An EEG Source Localization Approach
6	Source code retrieval from large software libraries for automatic bug localization
7	A Tool-Chain Approach to Source Lines of Code Estimation Using Petri Nets and Directed Graphs
8	Research On Source Code Analysis, Display And Application Based On SrcML
9	Research On Software Defect Prediction Method Based On Semantic Information Of Program Source Code
10	An information retrieval approach to concept location in source code