The Research Of Fine-grained Code Change Bug Prediction In Cross-projects

Posted on:2018-10-04

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Zuo

Full Text:PDF

GTID:2348330515997928

Subject:Software engineering

Abstract/Summary:

In software development, finding and fixing bugs is essential to software quality, but it is time-consuming and laborious. If we could predict whether a code change will introduce a bug as soon as the change is made, or at the latest during testing, and then fix it quickly, the efficiency of software development would improve significantly. Version control systems record a project's evolution history, including the bug-fixing process. Machine learning uses existing information and knowledge to make inferences about new instances. Applied here, it lets us trace bug-introducing changes back from bug-fixing changes, and then learn from these bug-prone changes and their attributes which kinds of changes are likely to introduce a bug. With that knowledge, we can estimate the bug proneness of new code changes from their feature values later in development.

Mature projects generally provide rich historical data for learning such a model. New projects and projects in specialized domains, however, often lack sufficient historical data, or the data is hard to collect. In that case we can use cross-project learning: learn from data-rich projects and apply the result to data-poor ones. Cross-project bug prediction has been studied extensively, but its predictive performance is generally unsatisfactory, and a model learned from one project often does not carry over to another. The first reason is that the data distributions of the two projects differ too much, whereas traditional machine learning methods assume that the training set and the test set follow the same distribution. The second is that most of these studies predict at the level of file changes. At that coarse granularity, once a file has been flagged as bug-prone, developers still have to find the specific lines that introduced the bug, and for large files this cost is unacceptable.

This thesis starts from the historical evolution information of software development, converts file-level code changes into fine-grained code change entities, and predicts bug proneness at this fine-grained level. To address the inconsistent data distributions that cause poor cross-project performance, it borrows from transfer learning and proposes a transfer bug prediction learning method: the datasets of the two projects are mapped into a latent feature space so as to minimize the difference between their distributions, and a traditional classification learner is then applied to predict the bug proneness of code changes. To improve prediction further, the thesis also proposes a feature selection method, a normalization selection method, and a class-imbalance processing method, which improve performance step by step.

For comparison with other methods, the experiments use the ReLink and AEEEM datasets. Three sets of controlled experiments compare the feature selection, normalization selection, and class-imbalance processing methods to show how each improves bug prediction performance. Comparison with other methods then demonstrates the effectiveness of the proposed cross-project bug prediction method. Finally, we empirically evaluate the advantages of using fine-grained code changes for bug prediction.
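The cross-project pipeline described above (reduce the distribution gap between projects, rebalance the classes, then classify) can be illustrated with a small sketch. This is not the thesis's method: the abstract does not specify its latent-space mapping, so per-project z-score normalization is swapped in as a much simpler stand-in that only aligns each feature's mean and variance. Random oversampling and a nearest-centroid classifier are likewise assumptions chosen for brevity, and the two-feature toy data and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def zscore_per_project(rows):
    """Standardize each feature column with this project's own mean and std."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid dividing by 0
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def oversample_minority(X, y, seed=0):
    """Randomly duplicate minority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    small, large = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(small) for _ in range(len(large) - len(small))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_centroids(X, y):
    """Nearest-centroid classifier: store one mean vector per class."""
    centroids = {}
    for label in set(y):
        members = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


def cross_project_predict(src_X, src_y, tgt_X, normalize=True):
    """Train on the source project, predict labels for the target project."""
    if normalize:
        src_X, tgt_X = zscore_per_project(src_X), zscore_per_project(tgt_X)
    bal_X, bal_y = oversample_minority(src_X, src_y)
    model = fit_centroids(bal_X, bal_y)
    return [predict(model, x) for x in tgt_X]


# Hypothetical toy data: two change metrics per code change; 1 = bug-introducing.
source_X = [[1, 1], [2, 1], [3, 2], [2, 2], [1, 2], [3, 1], [9, 5], [10, 6]]
source_y = [0, 0, 0, 0, 0, 0, 1, 1]
# The target project records the same metrics on a roughly 10x larger scale.
target_X = [[10, 10], [20, 10], [30, 20], [20, 20], [90, 50], [100, 60]]

raw_pred = cross_project_predict(source_X, source_y, target_X, normalize=False)
norm_pred = cross_project_predict(source_X, source_y, target_X, normalize=True)
print(raw_pred)   # [1, 1, 1, 1, 1, 1]: the scale mismatch flags every change
print(norm_pred)  # [0, 0, 0, 0, 1, 1]: only the two largest changes are flagged
```

On this toy data the model trained on raw source features flags every target change as bug-introducing, because the target's values all lie far above the source's clean-change centroid. After per-project normalization only the two outlying changes are flagged. Oversampling has little effect on a nearest-centroid classifier and is included only to show where a rebalancing step sits in the pipeline.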

Keywords/Search Tags:

Fine-grained Code Change, Transfer Bug Prediction Learning, Feature Selection, Normalization Selection, Class Unbalanced Processing

Related items

1	Research On Fine-grained Software Defect Prediction Based On Deep Learning
2	Research On Software Defect Prediction Based On Feature Selection And Instance Transfer
3	Clone Code Harmfulness Prediction Research Of Unbalanced Classification And Feature Selection Problem
4	Research And Application Of Feature Selection Algorithm Based On Dynamic Weights Using Redundancy
5	Researches On Software Defect Prediction Methods Under Different Scenarios
6	Deep Active Learning For Fine-grained Visual Categorization
7	Fine-grained Image Classification With Click Prediction
8	Fine-grained Image Classification Based On Feature Selection And Multi-scale Feature Fusion
9	Text Feature Selection For Transfer Learning
10	Researches And Applies On Software Defect Prediction Method Based On Ensemble Learning