
Research On Multi-factor Test Flakiness Detection

Posted on: 2022-10-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Shi
GTID: 2518306569996599
Subject: Software engineering
Abstract/Summary:
As software has grown in size in recent years, keeping it stable has become a major challenge, and flaky tests are a typical example. A flaky test is one that, when run multiple times without any change to code or configuration, produces inconsistent results (both passes and failures). Flaky tests reduce trust in testing and undermine its value: code that passes its tests is usually assumed to be acceptable, but flaky tests make it hard to judge code quality from test results. The traditional way to detect a flaky test is to re-run it several times and mark it flaky if the results differ, but this consumes a lot of resources and slows down development, so industry urgently needs a better detection method. Flaky tests have only been studied systematically in recent years, so few detection techniques exist. The purpose of this thesis is to develop more lightweight, efficient, and accurate test flakiness detection methods.

This thesis proposes two detection methods: traceback coverage and multi-factor detection. Traceback coverage is suited to cases where the test passed in its previous run, and it is the more accurate of the two. Multi-factor detection requires a dataset to train a KNN model and is less accurate, but it does not require the test to have passed in its previous run. Multi-factor detection combines five factors: traceback coverage, flakiness-inducing test smells, test size, flaky frequency, and last run state. Flakiness-inducing test smells are poor design or implementation choices made when writing a test that may cause it to be flaky; this thesis identifies 15 such smells in Python.

For evaluation, 1277 flaky tests and 732 non-flaky tests were obtained from three Python projects by re-running tests locally, and these data were used to verify the effectiveness of both methods. The results show an accuracy of 96.5% for traceback coverage and 87% for multi-factor detection. However, traceback coverage achieves good results only under the precondition that the test previously passed. The thesis also analyzes, based on the data, how flaky test smells, test size, and flaky frequency correlate with test flakiness. It finds no significant relationship between test size and flakiness. Tests with more flaky test smells and a higher flaky frequency are more likely to be flaky, but this correlation is much weaker than that of traceback coverage.

Finally, the thesis develops a system for detecting flaky tests in Python based on traceback coverage and multi-factor detection. The system automatically obtains and parses the data needed for detection from Travis CI and GitHub, and then determines whether a failed test is flaky.
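
The traditional re-run baseline described in the abstract (run the same test several times and call it flaky if the outcomes disagree) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the names `rerun_outcomes`, `classify_by_rerun`, and `make_intermittent_test` are hypothetical, and the simulated test fails on every third call only to stand in for real nondeterminism.

```python
from typing import Callable, List


def rerun_outcomes(test: Callable[[], None], runs: int = 10) -> List[bool]:
    """Run `test` `runs` times; True means the run passed, False means it raised."""
    if runs < 2:
        raise ValueError("need at least 2 runs to observe disagreement")
    outcomes = []
    for _ in range(runs):
        try:
            test()
            outcomes.append(True)
        except Exception:  # AssertionError and other test failures
            outcomes.append(False)
    return outcomes


def classify_by_rerun(test: Callable[[], None], runs: int = 10) -> str:
    """Label a test 'pass', 'fail', or 'flaky' from repeated runs."""
    outcomes = rerun_outcomes(test, runs)
    if all(outcomes):
        return "pass"
    if not any(outcomes):
        return "fail"
    return "flaky"  # both passes and failures were observed


def make_intermittent_test() -> Callable[[], None]:
    """Hypothetical stand-in for a flaky test: fails on every third call."""
    calls = {"n": 0}

    def test() -> None:
        calls["n"] += 1
        assert calls["n"] % 3 != 0, "simulated intermittent failure"

    return test


if __name__ == "__main__":
    print(classify_by_rerun(make_intermittent_test(), runs=5))
```

The cost of this approach grows linearly with the number of runs, which is the resource concern the abstract raises, and a finite number of re-runs can still miss a test that fails only rarely.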
Keywords/Search Tags: flaky test, test smell, KNN, multi-factor flakiness detection, traceback coverage