Targeting Cross-project Bugs In Software Ecosystems: Understanding And Analysis Techniques

Posted on: 2020-07-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W W Y Ma
Full Text: PDF
GTID: 1368330578482745
Subject: Computer Science and Technology
Abstract/Summary:
GitHub, the most popular social software development platform, has fostered a variety of software ecosystems in which individual projects rely on infrastructure or functional components provided by other projects, leading to complex inter-dependencies. Through these inter-project dependencies, a bug in an upstream project may have a profound impact on a large number of downstream projects, resulting in cross-project bugs. Compared with within-project bugs, cross-project bugs have a wider impact throughout the ecosystem. Their fixing process is more complex, since it involves collaboration between developers from multiple projects. Moreover, cross-project bugs are harder to expose, because some of them break only downstream projects, so upstream test suites may not capture them. With the increasing popularity of GitHub and the rapid development of software ecosystems, the number of cross-project bugs is growing. Because they differ from within-project bugs, existing conclusions about bug understanding and analysis do not necessarily hold for them. Similarly, techniques and tools for detecting and fixing within-project bugs may not be applicable to cross-project bugs. However, past research on software ecosystems and bugs has not paid enough attention to cross-project bugs; in particular, it lacks investigation of their fixing process and supporting tools for detection and repair that take the characteristics of software ecosystems into account.

Therefore, this thesis focuses on understanding cross-project bugs and on designing techniques to support their detection and fixing. Specifically, our first work studies how developers deal with cross-project bugs, targeting two unique difficulties: cross-project root-cause tracking and the coordination between upstream and downstream developers. We explore the common practices developers follow when facing these two difficulties and identify which kinds of techniques and tools are needed, in order to inspire the subsequent work and provide empirical support for it. Then, based on the developers' requirements obtained from the empirical study, we design approaches for improving the quality of patches and for detecting potential cross-project bugs. In particular, our second work aims to identify the affected downstream modules (classes or methods) for a specific upstream bug, which helps upstream developers understand the bug's impact range and severity and know with whom they should communicate. Our third work designs a cross-project regression testing framework tailored to the GitHub ecosystem, which selects the most relevant downstream test cases to complement the upstream test suites and detect potential cross-project bugs.

In conclusion, the main contributions of this thesis are summarized as follows:

(1) To gain a deep understanding of how developers deal with cross-project bugs, we conduct an empirical study on a specific form of them, cross-project correlated bugs, i.e., causally related bugs reported to different projects. The study focuses on three research questions: 1) the difficulty of finding the root cause of cross-project correlated bugs; 2) the factors that help in tracking the root cause; and 3) the coordination between downstream and upstream developers when fixing cross-project bugs. Through manual inspection of 271 pairs of bug reports collected from the scientific Python ecosystem and an online survey with 116 developers, this study reveals the common practices of developers and the various factors involved in fixing cross-project bugs. These findings provide implications for future ecosystem-wide software bug analysis and shed light on the requirements that such bugs place on issue trackers.

(2) We present an approach to estimating the impact of a cross-project bug within its ecosystem by identifying the affected downstream modules. Note that a downstream project that uses a buggy upstream function may not be affected, because its usage may not satisfy the failure-inducing preconditions. For a reported bug with a known root-cause function and known failure-inducing preconditions, we first collect the candidate downstream modules through an ecosystem-wide dependence analysis. Then, the paths to the call sites of the buggy upstream function are encoded as symbolic constraints. Solving these constraints together with the failure-inducing preconditions identifies the affected downstream modules. Our evaluation on the scientific Python ecosystem shows that the approach is highly effective: from 25490 candidate downstream modules, it identifies a total of 1132 modules that are affected by 31 bugs, pruning 95.6% of the candidates in total and 66.5% for individual bugs on average.

(3) We propose a cross-project regression testing framework, EcoTest, for GitHub ecosystems and design a two-step test selection strategy. EcoTest consists of three components: 1) the central repository, which processes and stores ecosystem-wide fine-grained dependencies, test execution traces, and historical build logs; 2) the two-step test selector, which implements regression test selection (RTS) strategies; and 3) ntc, which runs the selected tests and notifies relevant projects of the results. Our evaluation is conducted on 133 popular projects in the scientific Python ecosystem. By evaluating four kinds of RTS strategies (simple, coverage-based, centrality-based, and history-based), we show that EcoTest is cost-effective in complementing upstream test suites and finding bugs.

We believe that our findings contribute to a deeper understanding of cross-project bugs. The supporting techniques and tools can effectively help in fixing and detecting such bugs, as well as in assuring the health of the ecosystem.
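As an illustration of the coverage-based strategy mentioned in contribution (3), the following is a minimal sketch, not EcoTest's actual implementation. It assumes that test execution traces are available as a mapping from downstream test identifiers to the upstream functions each test executed. The function name `select_downstream_tests`, the trace format, and all project and function names are hypothetical.

```python
from typing import Dict, List, Set


def select_downstream_tests(
    changed_upstream: Set[str],
    coverage: Dict[str, Set[str]],
) -> List[str]:
    """Select downstream tests whose trace touches a changed upstream function.

    `coverage` maps a downstream test id to the set of upstream functions
    that the test executed (an assumed, simplified trace format).
    """
    # Keep only tests that executed at least one changed upstream function.
    selected = [
        test for test, covered in coverage.items()
        if covered & changed_upstream
    ]
    # Rank tests that exercise more of the changed functions first;
    # ties are broken by test id so the order is deterministic.
    selected.sort(key=lambda t: (-len(coverage[t] & changed_upstream), t))
    return selected


# Hypothetical traces: downstream test id -> upstream functions executed.
coverage = {
    "projA::test_fit": {"upstream.linalg.solve", "upstream.core.asarray"},
    "projB::test_plot": {"upstream.core.asarray"},
    "projC::test_io": {"upstream.io.load"},
}
changed = {"upstream.linalg.solve", "upstream.core.asarray"}

print(select_downstream_tests(changed, coverage))
```

In this sketch the test from `projC` is skipped because its trace never reaches a changed function, which is the basic cost saving that regression test selection aims for. A centrality-based or history-based strategy would replace the ranking key with a different score.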
Keywords/Search Tags:Software ecosystem, GitHub, cross-project bugs, fixing process, impact analysis, regression testing