| Software debugging and testing are more difficult than ever because of the growing size and complexity of modern software. Software debugging consists of fault localization and fault repair; fault localization aims to identify which program elements could contain a fault. Coverage-based fault localization (CBFL) techniques use the coverage information and error information obtained during test case execution to locate the fault. CBFL approaches assign each executable element a suspiciousness value, which is the relative probability that the element contains a fault. The executable statements are then ranked by suspiciousness, and the positions of the faulty statements in the ranked list are commonly used to evaluate the performance of a fault localization technique. A good technique places the faulty statements near the top of the list, so that only a few statements need to be checked to locate the faults. There has been considerable research on CBFL techniques, and researchers have proposed different approaches to improve the effectiveness of fault localization. Gong Dandan et al. introduced a test-suite reduction approach to improve fault localization effectiveness. Vidroha Debroy et al. proposed a statement grouping strategy to improve the effectiveness of fault localization techniques. None of these approaches takes into account the influence of coincidentally correct tests, that is, tests that execute the faulty statement but reveal no error during execution. In recent years, some researchers have pointed out that coincidental correctness occurs frequently and that, in most cases, coincidentally correct tests have a negative effect on the performance of fault localization techniques. Our research focuses on improving fault localization by identifying and removing coincidentally correct tests. 
Specifically, given a test suite that includes passed and failed tests, we identify coincidentally correct tests in three steps: first, divide the tests into two subsets, passed tests and failed tests; second, cluster the passed tests into two clusters using K-means clustering; third, select the cluster with the higher similarity to the failed tests as the subset of coincidentally correct tests. Finally, we apply two strategies to handle the recognized coincidentally correct tests; the feasibility of each strategy is supported by theoretical analysis and an empirical study. We evaluate our techniques from two aspects: the ability of our approach to recognize coincidentally correct tests, and the improvement it brings to fault localization. Our experimental results show that the recognition approach helps identify coincidentally correct tests and that both strategies improve the performance of fault localization. |
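
The three identification steps can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the abstract: each test is represented by a binary statement-coverage vector, similarity to the failed tests is measured as the squared Euclidean distance between a cluster centroid and the centroid of the failed tests, and K-means is initialized deterministically. The names `two_means` and `suspected_coincidentally_correct` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two coverage vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_means(points: Sequence[Vector],
              max_iter: int = 50) -> Tuple[List[int], List[List[float]]]:
    """Plain K-means with k = 2.

    Assumption: deterministic initialization (the first point and the point
    farthest from it) so that the example is reproducible.
    """
    c0 = list(points[0])
    c1 = list(max(points, key=lambda p: sq_dist(p, c0)))
    centroids = [c0, c1]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            0 if sq_dist(p, centroids[0]) <= sq_dist(p, centroids[1]) else 1
            for p in points
        ]
        # Update step: recompute each centroid (an empty cluster keeps its old one).
        for k in (0, 1):
            members = [p for p, lab in zip(points, new_labels) if lab == k]
            if members:
                centroids[k] = mean(members)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


def suspected_coincidentally_correct(passed: Sequence[Vector],
                                     failed: Sequence[Vector]) -> List[int]:
    """Return indices of passed tests in the cluster closest to the failed tests."""
    labels, centroids = two_means(passed)
    failed_centroid = mean(failed)
    # Assumption: "higher similarity" means a smaller centroid-to-centroid distance.
    closest = min((0, 1), key=lambda k: sq_dist(centroids[k], failed_centroid))
    return [i for i, lab in enumerate(labels) if lab == closest]


if __name__ == "__main__":
    # Toy coverage data: rows are tests, columns are statements (1 = executed).
    failed_tests = [[1, 1, 1, 0], [1, 1, 1, 1]]
    passed_tests = [[1, 0, 0, 1], [1, 1, 1, 0], [1, 0, 0, 0], [1, 1, 1, 1]]
    print(suspected_coincidentally_correct(passed_tests, failed_tests))
```

In this toy data, the passed tests at indices 1 and 3 cover the same statements as the failed tests, so they form the cluster flagged as suspected coincidentally correct tests. How the flagged tests are then handled is left out, since the abstract does not specify the two strategies.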