Font Size: a A A

Research On Fault Localization And Program Repair Using Propagation Chain

Posted on:2017-01-19Degree:DoctorType:Dissertation
Country:ChinaCandidate:W S LiFull Text:PDF
GTID:1318330536967130Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
As the inevitable product of software design and maintenance,the number of fault is increasing with the increasing with the scale and complexity of software system.So researchers put attantion to the automation of fault localization and program repair,which is very important to improve the efficiency of software testing and software quality.Over the years,the academia and industry are committing to the research of automated fault localization and program repair techniques,and have made many achieveme.However,due to the complexity mechanism of fault,these techniques also face many challenges.Based on the process of error propagation during software execution and the three steps(fault localization,patch generation and patch verification)of program repair,this paper studies and analyzes the reapir results of two existing automated program repair techniques,and plans to 1)study the optimization method of fault localization techniques by clustering test cases,2)investigate the suspicious code set filtering approach that can greatly increasing repair efficiency,and 3)iterative repair technique based on grouping of test cases.This paper also presents a serial of techniques aiming to improve the effectiveness of fault localization and the effectiveness and efficiency of automated program repair,and justifies the advantage of each technique via the experiments on the real-life,large-scale faults in open source,and demonstrate its effectiveness and efficiency using rigorous statistical approaches.The key contributions of this paper are listed as follows.1)Analyze the repair results of the existing automated program repair tools by studying the propagation chain from fault to failureFault tolerance and Fault removal are two main ways to improve the reliability and security of software.Existing automated program repair techniques focus on improving software reliability by generating fixed patches which can avoid failure,but ignore the research on these generated patches.In this paper,we collected the fixed patches that are generated by two main automated program repair tools-Genprog and CETI,and analyzed the physical and logical relationship between the code which are modified by patches and the fault,so as to find whether a fixed patch is a tendency to Fault tolerance or Fault removal.In addition,this paper also put forward the concept of error complexity,and defined an error‘s complexity by two ways: the number of code lines that are involved by a fault,and the length of the propagation chain from fault to failure.Finally,we analyzed the relationship between error complexity and its repair result(Fault tolerance or Fault removal).Analysis results show that,the repair result of existing automated program repair tools is more likely to be Fault tolerance rather than Fault removal,and the lower the complexity of error is,the higher probability of repair result belongs to Fault removal,and on the contrary,the higher the complexity of error is,the higher probability of repair result belongs to Fault tolerance.It also guides the follow-up of the improvement and optimization of the existing fault localization technology and automated program repair technology,which is based on the premise of the full realization of Fault tolerance.2)Presenting an optimization method for fault localization based on test case clustering.Statistical-based fault localization(SFL)is one of the most popular techniques for fault localization.However,its accuracy may be greatly affected by coincidental correctness.Coincidental correctness occurs when faulty elements are executed,but the program produces the correct output,which means fault was not triggered or the error was not propagated to failure.Because SFL usually uses the coverage information and results of test cases to calculate the suspicious value of program elements to be fault,the more coincidentally correct test cases used,the more likely that the accuracy of SFL is affected.In this paper,by use SFL to localizated the faults of some programs,we found that the error propagation chains of coincidentally correct test cases are very similar.Based on this observation,we assume that there is a high degree of similarity between coincidentally correct test cases and we proposed an optimization method of fault localization based on test case clustering – TSMR(Test case Selection and Matrix Reconsitution).By clustering all test cases and reconstructing the coverage matrix,TSMR can weak the effect of coincidental correctness to the accuracy of SFL as soon as possible.The experimental results show that,TSMR can effectively improve the accuracy of SFL in most cases.3)Presenting an optimization method for automated program repair based on Suspicious Faulty Code Snippet FilteringAutomated program repair techniques often generated fixed patches by modifying one or more program elements.These elements that can be modified were often obtain by SBFL and be called as Suspicious Faulty Code Snippet.Currently,Suspicious Faulty Code Snippet obtained by traditional fault localization techniques often contains many elements which are unrelated to success repair and reduces the efficiency of automated repair.In this paper we propose a filtering approach,SFCSF(Suspicious Faulty Code Snippet Filtering),which can effectively reduce the size of the suspicious code snippet while ensuring the quality of repair.SFCSF obtained suspicious value of all program elements by SBFL and the error propagation chains by Program Slicing techniques,and then SFCSF picks up some elements,of which the suspicious values are different to the suspicipus value of their following elements in error propagation chains,from Suspicious Faulty Code Snippet,finally SFCSF uses these selected statements for automated repairing.The experimental results show that SFCSF can greatly improve the efficiency of automated program repair in most cases.And we also found that the validity of SFCSF is also restricted by test suite: if the spectrum of all test cases are quiet similar,SFCSF may fail.4)Presenting an iterative repair technique based on test case groupingDuring patch validation,automated program repair techniques often used test suite to verify the validity of the generated patches: only after running all test cases successful and obtaining correct output,the generated patch can be considered as correct fixed patch.This verification indicates that existing automated program repair techniques still generate and verify patches based on the idea of Fault removal,which is in conflict with our previous conclusion that most fixed patches generated by existing automated program repair techniques are belong to Fault tolerance.This contradiction makes a great impact on the success rate and efficiency of automated repair,especially for multi-point bugs,the repair effect of the existing automatic repair technology is not ideal.From the view of Fault tolerance,this paper proposed a n iterative repair technique based on test case grouping.According to the execution information,the test cases are grouped,and the program is repaired based on the iteration method: the goal of iteration is only to generate an intermediate patch that can satisfy one or more groups of test cases,and obtain the final fixed path that can satisfy all test cases after several iterations.The experimental results show that this method can greatly improve the ability of program repair,especially for the multi-point bugs‘ repair.
Keywords/Search Tags:fault localization, automated program repair, error propagation chain, fault tolerance, fault removal, error complexity, coincidental correctness, clustering, suspicious faulty code snippet filtering, group-based iterative repair
PDF Full Text Request
Related items