
Data Mining Software-based Fault Location Techniques

Posted on:2015-02-05
Degree:Master
Type:Thesis
Country:China
Candidate:H B Chen
GTID:2268330425488116
Subject:Software engineering
Abstract/Summary:
As software applications continue to expand, software systems are growing larger and their structures more complex. Faults hidden in complex software are difficult to locate, which makes the software unreliable. How to locate faults quickly and efficiently when software fails is therefore a valuable research direction. However, most existing debugging techniques rely on setting breakpoints and re-running the program, so they use the dynamic execution information of only a single failed run and ignore the large volume of test data collected from many test cases. Making effective use of this test data to discover fault-related information should make fault localization more effective.
Building on a study of Tarantula, this thesis finds that association rule mining, a data mining technique, can be applied to software fault localization. Association rule mining is used to find rules of the form (e(i), F), meaning that executing statement e(i) is associated with the program failure F. The value lift(e(i), F) is taken as the suspiciousness of an executable statement, and the executable statements are sorted in descending order of lift(e(i), F) to produce a fault report ranked by suspiciousness.
The Tarantula method measures the suspiciousness of a statement using only the number of passed and failed runs that execute it, and ignores the supporting evidence provided by the execution complement of the statement. This thesis therefore proposes a method that uses execution complement information to improve lift(e(i), F), denoted complement_lift’.
Finally, experiments are carried out on 109 versions of the Siemens test suite using five methods: lift computed from association rules, the improved method complement_lift’, Tarantula, Wong and Dice. The five methods are then compared using charts.
The experimental results show that the improved association rule method based on the execution complement of statements is superior to the other four methods: programmers need to examine fewer statements before reaching the faulty statement. The association rule lift method is better than Tarantula, performs the same as Dice, and is slightly inferior to Wong. Further analysis on the large program Space shows that fault localization based on association rule mining can be more effective at locating faults in large programs.
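The ranking step described above can be illustrated with a minimal Python sketch. It assumes the standard association-rule definition lift(e, F) = P(e and F) / (P(e) P(F)) and the usual Tarantula formula; the exact formulation used in the thesis, and the definition of complement_lift’, are not given in this abstract, so complement_lift’ is not shown. The function names and the toy coverage data are hypothetical and stand in for real coverage collected with a tool such as Gcov.

```python
from typing import List, Set, Tuple


def lift(cov_failed: int, cov_passed: int, total_failed: int, total_passed: int) -> float:
    """Standard lift of the rule (statement executed => run failed).

    lift = P(e and F) / (P(e) * P(F)), written with raw counts.
    Assumed formulation; the thesis may define it differently.
    """
    total = total_failed + total_passed
    covered = cov_failed + cov_passed
    if covered == 0 or total_failed == 0:
        return 0.0
    return (cov_failed * total) / (covered * total_failed)


def tarantula(cov_failed: int, cov_passed: int, total_failed: int, total_passed: int) -> float:
    """Tarantula suspiciousness: failed ratio / (failed ratio + passed ratio)."""
    fail_ratio = cov_failed / total_failed if total_failed else 0.0
    pass_ratio = cov_passed / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def rank_statements(coverage: List[Set[int]], failed: List[bool], metric=lift) -> List[Tuple[int, float]]:
    """Rank statements by descending suspiciousness (ties broken by statement id)."""
    total_failed = sum(failed)
    total_passed = len(failed) - total_failed
    statements = sorted(set().union(*coverage))
    scores = []
    for stmt in statements:
        cov_failed = sum(1 for cov, f in zip(coverage, failed) if f and stmt in cov)
        cov_passed = sum(1 for cov, f in zip(coverage, failed) if not f and stmt in cov)
        scores.append((stmt, metric(cov_failed, cov_passed, total_failed, total_passed)))
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: each set lists the statement ids executed by one test.
coverage = [{1, 2, 3}, {1, 2, 4}, {1, 3}, {1, 4}]
failed = [False, True, False, True]

lift_ranking = rank_statements(coverage, failed, metric=lift)
tarantula_ranking = rank_statements(coverage, failed, metric=tarantula)
print(lift_ranking)
print(tarantula_ranking)
```

In this toy example statement 4 is executed only by failing tests, so both metrics place it first. A programmer following either report would inspect statement 4 before the others, which is the sense in which the abstract counts the number of statements that must be checked.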
Keywords/Search Tags:fault localization, association rules, execution complement of statements, Gcov, Siemens test suite, Tarantula