
Research On Just-in-time Software Defect Prediction Method Based On Learning To Rank

Posted on: 2021-01-07  Degree: Master  Type: Thesis
Country: China  Candidate: W Q Peng  Full Text: PDF
GTID: 2428330620472610  Subject: Software engineering
Abstract/Summary:
As the Internet becomes ever more closely tied to production and daily life, the software industry is developing rapidly, and software quality assurance is becoming increasingly important. Software defect prediction helps testers locate likely defects in a software system. Identifying and repairing defects early can greatly reduce development cost and ease later maintenance. Traditional software defect prediction generally targets coarse-grained software modules, such as class files. In a complex software system, a single source file may be large and modified by many developers, so inspecting it takes considerable time and effort. In contrast, just-in-time software defect prediction targets each code change a developer submits, and the change can be checked immediately after submission, so the approach is both fine-grained and immediate. This thesis focuses on how to build a better and more reasonable just-in-time software defect prediction model, and studies the problem from two aspects.

(1) When a classification algorithm is used to build a just-in-time software defect prediction model, the classifier first predicts the class label of a code change or the probability that the change is defective. The relative defect density of the change is then calculated, and finally the code changes are sorted by relative defect density. Researchers have proposed three different definitions of relative defect density under the classification model. To determine which definition is more effective, and how the choice affects the performance of the prediction model, this thesis evaluates the three calculation methods on four performance measures under three validation methods (out-of-sample bootstrap validation, cross-project validation, and time-wise validation). The experimental results show that the first and third calculation methods are the most effective. The first method defines the relative defect density as the ratio of the predicted class label to the size of the code change. The third method depends on the predicted probability that the code change is defective: when the probability is greater than or equal to 0.5, the relative defect density is the probability divided by the size of the code change; when the probability is less than 0.5, it is the probability minus 1, divided by the size of the code change. The results also show that the first method benefits the model's Recall@20% and Popt, while the third method benefits its F1@20% and Precision@20%.

(2) Models built with classification or regression algorithms often rank code changes poorly. To address this problem, this thesis proposes a just-in-time software defect prediction method based on learning to rank. The method uses a multi-objective optimization algorithm to directly optimize multiple effort-aware performance measures over the code changes, and thereby obtains the parameters of the prediction model. It tends to rank code changes with high relative defect density first, so that when testing resources are limited, those changes can be inspected first. To evaluate the method, this thesis conducts an empirical study on 6 public project data sets containing 227,417 code changes in total, under the same three validation methods (out-of-sample bootstrap validation, cross-project validation, and time-wise validation). The proposed method is compared with 15 commonly used baseline methods in terms of Recall@20%, Precision@20%, F1@20%, Popt, and the ratio of the proportion of bugs inspected to the proportion of changes inspected. The experimental results show that the proposed method based on learning to rank achieves better prediction results.
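
The first and third relative defect density definitions described in the abstract can be illustrated with a minimal Python sketch. All function names, dictionary keys, and sample values below are hypothetical and chosen only for illustration; they are not taken from the thesis. The abstract does not define the second calculation method, so it is omitted. The `recall_at_effort` function rests on a further assumption: that effort is measured by change size and that Recall@20% is the share of defective changes found when changes are inspected in ranked order until 20% of the total size is used up. The thesis may define the measure differently.

```python
def density_from_label(label, size):
    """First method: predicted class label (0 or 1) divided by change size."""
    return label / size


def density_from_probability(prob, size):
    """Third method: prob / size if prob >= 0.5, otherwise (prob - 1) / size."""
    if prob >= 0.5:
        return prob / size
    return (prob - 1.0) / size


def rank_changes(changes, density):
    """Sort code changes so the highest relative defect density comes first."""
    return sorted(changes, key=density, reverse=True)


def recall_at_effort(ranked, budget=0.2):
    """Assumed effort-aware recall: inspect changes in ranked order while the
    cumulative size stays within `budget` of the total size, then return the
    fraction of all defective changes that were inspected."""
    total_effort = sum(c["size"] for c in ranked)
    total_defects = sum(c["defective"] for c in ranked)
    spent = 0
    found = 0
    for c in ranked:
        if spent + c["size"] > budget * total_effort:
            break
        spent += c["size"]
        found += c["defective"]
    return found / total_defects if total_defects else 0.0


# Hypothetical code changes: size in changed lines (assumed > 0),
# predicted defect probability, and the true label.
changes = [
    {"id": "c1", "size": 10, "prob": 0.9, "defective": 1},
    {"id": "c2", "size": 100, "prob": 0.8, "defective": 1},
    {"id": "c3", "size": 20, "prob": 0.2, "defective": 0},
    {"id": "c4", "size": 70, "prob": 0.4, "defective": 0},
]

# Ranking with the first method (label obtained by thresholding at 0.5).
ranked_by_label = rank_changes(
    changes, lambda c: density_from_label(int(c["prob"] >= 0.5), c["size"])
)

# Ranking with the third method (uses the probability directly).
ranked_by_prob = rank_changes(
    changes, lambda c: density_from_probability(c["prob"], c["size"])
)

print([c["id"] for c in ranked_by_label])
print([c["id"] for c in ranked_by_prob])
print(recall_at_effort(ranked_by_prob))
```

In this sketch the two methods differ mainly in how they order changes predicted as clean: the first method assigns all of them a density of 0, while the third assigns negative densities, which pushes small, confidently clean changes to the very end of the ranking.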
Keywords/Search Tags: Just-in-time software defect prediction, Learning to rank, Multi-objective optimization, Relative defect density