Research On Data Preprocessing Technology In Cross Project Software Defect Prediction

Posted on:2022-12-07

Degree:Master

Type:Thesis

Country:China

Candidate:T Zhang

GTID:2518306749458174

Subject:Art

Abstract/Summary:

With the continued development of computing and the Internet, demand for software grows day by day. Although software brings great convenience to daily life, software defects impose enormous costs. Software defect prediction is one of the important means of addressing this problem. Machine learning has achieved good results for defect prediction within a single project. Cross-project software defect prediction, however, has more practical significance than within-project prediction. Research on cross-project software defect prediction has found that directly using a large amount of data from other projects often produces poor predictions, because of class imbalance and feature differences between projects. Data preprocessing can alleviate both problems, so it is very important in cross-project software defect prediction. The main work of this thesis is as follows:

(1) To address feature differences, this thesis proposes a filter-based feature selection method, cpfrfs (cross project of feature selection and feature redundancy). The method screens for a feature set with low feature redundancy and high feature similarity. The transferred feature set constructed from it is smaller than the original feature set, which improves the effect of cross-project software defect prediction.

(2) To address class imbalance, this thesis proposes a hybrid sampling method, msksmote (K-means mixed SMOTE method). The method deletes noise points, removes ambiguous majority-class samples at the boundary, and adds minority-class samples at the boundary, which makes the class boundary clearer and balances the data.

(3) To further improve the effect of cross-project software defect prediction, this thesis combines the msksmote and cpfrfs algorithms into a cross-project software defect prediction model, MSK + CP. The msksmote hybrid sampling method is first applied to the data set, and the cpfrfs algorithm is then used to select the optimal feature set. Experimental results show that the model achieves better F1 values than classical cross-project software defect prediction algorithms.
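The ordering described in (3), sampling first and feature selection second, can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the abstract does not say how msksmote uses K-means or how cpfrfs measures similarity and redundancy, so the K-means step is omitted and everything else here is an assumption. The function names `hybrid_sample` and `select_features`, the rule that a noise point is one with no same-class neighbor, the standardized mean gap as a similarity score, the 0.9 correlation threshold, and the convention that label 1 is the minority (defective) class are all hypothetical.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbours of each row, excluding the row itself."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def hybrid_sample(X, y, k=3, seed=0):
    """Assumed stand-in for hybrid sampling: drop samples with no same-class
    neighbour (treated as noise), then add SMOTE-style interpolated minority
    samples until both classes have the same size. Label 1 = minority."""
    rng = np.random.default_rng(seed)
    nn = knn_indices(X, k)
    keep = (y[nn] == y[:, None]).any(axis=1)
    X, y = X[keep], y[keep]
    minority = X[y == 1]
    n_new = int((y == 0).sum() - (y == 1).sum())
    if n_new <= 0 or len(minority) < 2:
        return X, y
    m_nn = knn_indices(minority, min(k, len(minority) - 1))
    base = rng.integers(0, len(minority), size=n_new)
    neigh = m_nn[base, rng.integers(0, m_nn.shape[1], size=n_new)]
    gap = rng.random((n_new, 1))
    # Each synthetic point lies on the segment between a minority sample
    # and one of its minority-class neighbours.
    synth = minority[base] + gap * (minority[neigh] - minority[base])
    return np.vstack([X, synth]), np.concatenate([y, np.ones(n_new, dtype=y.dtype)])

def select_features(Xs, Xt, max_corr=0.9):
    """Assumed stand-in for filter-based selection: rank features by how close
    the source and target means are, then greedily skip any feature whose
    absolute correlation with an already chosen feature exceeds max_corr."""
    pooled_std = np.vstack([Xs, Xt]).std(axis=0) + 1e-12
    mean_gap = np.abs(Xs.mean(axis=0) - Xt.mean(axis=0)) / pooled_std
    order = np.argsort(mean_gap)  # smallest gap (most similar) first
    corr = np.nan_to_num(np.abs(np.corrcoef(Xs, rowvar=False)))
    chosen = []
    for j in order:
        if all(corr[j, c] <= max_corr for c in chosen):
            chosen.append(int(j))
    return chosen

# Toy data (hypothetical): 40 clean modules, 10 defective ones, 4 metrics,
# where the last metric duplicates the first up to scale.
rng = np.random.default_rng(0)
X_src = np.vstack([rng.normal(0, 1, (40, 3)), rng.normal(4, 1, (10, 3))])
X_src = np.hstack([X_src, 2.0 * X_src[:, :1]])
y_src = np.array([0] * 40 + [1] * 10)
X_tgt = rng.normal(0, 1, (30, 4))

X_bal, y_bal = hybrid_sample(X_src, y_src)   # step 1: balance the source data
features = select_features(X_bal, X_tgt)     # step 2: pick transferable features
print(len(y_bal), features)
```

A classifier would then be trained on `X_bal[:, features]` and applied to `X_tgt[:, features]`. Running the sampler before the selector mirrors the order stated in the abstract, so the similarity and redundancy scores are computed on the balanced data rather than on the original skewed data.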

Keywords/Search Tags:

software defect prediction, feature transfer, feature selection, class imbalance

Related items

1	Research On Software Defect Prediction Based On Feature Selection And Instance Transfer
2	Research On Software Defect Prediction Method Based On Feature Selection
3	Research On High-dimensional Data Processing In Software Defect Prediction
4	Research On Software Defect Prediction Method Based On Fusion Feature Selection And Ensemble Learning
5	Researches And Applies On Software Defect Prediction Method Based On Ensemble Learning
6	Cross-project Software Defect Prediction Based On Feature Transfer
7	Research On Cross Project Software Defect Prediction Based On Feature Transfer And Instance Transfer
8	Research On Software Defect Prediction Method Based On Cost Sensitive Learning Adacost
9	Research On Software Defect Prediction Method Based On Semi-supervised Integration
10	Research On Heterogeneous Software Defect Prediction Based On Transfer Learning