Research On Cross-Project Software Defect Prediction Method Based On Machine Learning

Posted on:2024-03-26

Degree:Master

Type:Thesis

Country:China

Candidate:W T Lin

Full Text:PDF

GTID:2568306914972449

Subject:Control Science and Engineering

Abstract/Summary:

Software defect prediction is an effective means of assuring software quality. In practice, however, a project often lacks sufficient training data, and cross-project software defect prediction is needed to make up for this deficiency. In cross-project defect prediction, the model is trained on data from other projects, which alleviates the shortage of data in the target project. Current cross-project defect prediction research has two key problems to solve. One is the difference in feature distribution between the source project and the target project. The other is class imbalance. This thesis studies these two problems. The research content is as follows.

(1) This thesis proposes a cross-project defect prediction model based on feature selection and ensemble learning. In the domain adaptation stage, a feature selection method based on the Pearson correlation coefficient is used. This method looks for similar features between the source and target projects so as to reduce the difference in feature distribution between them. In the classification stage, the model integrates several representative base classifiers by majority voting. Because the classifiers in a majority vote correct one another's errors, the effect of class imbalance is reduced.

(2) This thesis proposes a cross-project defect prediction model based on two-stage feature amplification. In the domain adaptation stage, the model introduces the idea of semi-supervised learning and uses a feature search technique based on greedy search to transfer features and to amplify features that are strongly correlated with the class. In this way, the relationship between classes is taken into account in addition to the learned relationship between features and classes. The model makes the feature distributions of the source and target projects more similar by constructing a feature space specific to the target project. In the classification stage, random forest, an ensemble learning method sensitive to the feature-class relationship, is selected as the classifier to further amplify the role of features in classification and reduce the influence of class imbalance.

Based on the above research content, this thesis designed 7 experiments on the public AEEEM dataset for the two proposed methods, a total of 140 groups of experiments, to test the proposed models. Both ablation experiments and comparison experiments were included. The experimental results show that the proposed methods mitigate the influence of the feature distribution difference and class imbalance between source and target projects, and improve the overall performance of cross-project defect prediction models.
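The two building blocks named for the first model, the Pearson correlation coefficient and majority voting, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say how source and target feature values are paired before the coefficient is computed, nor which base classifiers are combined, so the sketch only shows the coefficient itself and hard voting over label lists that are assumed to be given. The function names `pearson` and `majority_vote` are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption of this sketch: a constant sequence has no defined
        # correlation, so it is reported as 0.0 instead of raising.
        return 0.0
    return cov / sqrt(var_x * var_y)


def majority_vote(predictions):
    """Hard voting over per-classifier label lists.

    `predictions` holds one list of predicted labels per base classifier,
    all of the same length. Each sample receives the label most classifiers
    chose; use an odd number of classifiers to avoid ties on binary labels.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    # Toy values, not data from the thesis.
    print(pearson([1, 2, 3], [2, 4, 6]))
    # Three hypothetical classifiers, three samples (1 = defective).
    print(majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]]))
```

In the voting example, a single classifier that mislabels a sample is outvoted by the other two, which is the mutual-correction effect the abstract attributes to majority voting.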

Keywords/Search Tags:

cross-project defect prediction, machine learning, domain adaptation, feature distribution difference, class imbalance

Related items

1	Research On Software Defect Prediction Based On Improved Balanced Distribution Adaption Algorithm
2	Research On Class-imbalanced Cross-project Defect Prediction Based On Adversarial Learning
3	Research On Key Technologies Of Cross-Project Heterogeneous Defect Prediction
4	Research On Software Defect Prediction Based On Extreme Learning Machine
5	Study On Cross-Project Defect Prediction Based On Transfer Learning
6	Research On Data Preprocessing Technology In Cross Project Software Defect Prediction
7	Research On Cross-project Software Defect Prediction Method Based On Active Learning
8	Research On Prediction Method Of Software Defect Quantity Based On Machine Learning
9	Research On Cross-Project Software Defect Prediction Via Knowledge Transferring
10	Design And Tool Implementation Of Cross-project Software Defect Prediction Method Via Active Transfer Learning