Cross-Project Software Defect Prediction Methods Based On Autoencoder

Posted on:2021-03-24

Degree:Master

Type:Thesis

Country:China

Candidate:J J Li

Full Text:PDF

GTID:2428330614465817

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

Software defect prediction (SDP) has long been an active research topic in software engineering. Its main goal is to discover defects in software in order to improve software quality. Previous research has mainly focused on within-project defect prediction (WPDP), which uses the historical data of one project to train a prediction model and then predicts the defect proneness of software instances from the same project. However, when the project does not have enough historical data, the performance of WPDP degrades significantly. Cross-project defect prediction (CPDP) offers an alternative: it builds a prediction model from the plentiful historical data of other projects and uses it to predict defects in the instances of a new project. Its prediction performance is nevertheless usually poor, because of the difference in data distribution between source and target projects and because of the class imbalance problem. To address these two problems, deep autoencoder technology is applied to CPDP, and three methods are proposed to improve defect prediction performance.

Firstly, to address the data distribution difference, a shared hidden layer autoencoder method for cross-project defect prediction (SHLA-SDP) is proposed. SHLA-SDP first designs a shared hidden layer autoencoder network, which reduces the feature distribution difference between source and target projects through a hidden layer parameter sharing mechanism. An intra-class compactness loss function is then designed to constrain the features in the common subspace of the hidden layer, improving the compactness of intra-class features. Finally, the deep features of the source project are used to build the defect prediction model, which improves its accuracy.

Secondly, to address class imbalance and the scarcity of labeled data, a semi-supervised cost-sensitive improved autoencoder method for cross-project defect prediction (CSSHLA-SDP) is proposed. CSSHLA-SDP combines supervised and unsupervised learning in training the deep autoencoder: the intra-class compactness loss is added to the supervised part and the reconstruction loss to the unsupervised part. In addition, cost-sensitive learning is introduced to alleviate class imbalance by assigning different misclassification costs to samples of different classes, which further improves defect prediction performance.

Finally, to obtain intra-class features with better compactness and inter-class features with better separation, an improved focal loss based autoencoder method for cross-project defect prediction (FLSHLA-SDP) is proposed. In training the deep autoencoder, FLSHLA-SDP uses an intra-class compactness loss and an inter-class separation loss to make the distributions of the source and target projects more similar in the common subspace. In addition, an improved focal loss function handles class imbalance by combining class weighting and difficulty weighting: different weights are first applied to samples of different classes, and then, according to how difficult each sample is to classify, different weights are applied to hard-to-classify and easy-to-classify samples.

Experiments on the RELINK, NASA and AEEEM datasets show that, compared with five classical CPDP algorithms, the three proposed methods improve defect prediction performance.
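
The combination of class weighting and difficulty weighting described for FLSHLA-SDP can be illustrated with a minimal sketch. The abstract does not give the formula of the improved focal loss, so the code below assumes the standard binary focal loss instead: a class weight `alpha` for the defective class (and `1 - alpha` for the clean class) and a focusing exponent `gamma` that down-weights easy samples. The function name, parameter names and default values are hypothetical and are not taken from the thesis.

```python
import math

def focal_loss(probs, labels, alpha=0.75, gamma=2.0, eps=1e-7):
    """Mean binary focal loss (standard form, assumed for illustration).

    probs  -- predicted probabilities that each instance is defective
    labels -- 1 for defective, 0 for clean
    alpha  -- class weight for the defective class (1 - alpha for clean)
    gamma  -- focusing exponent; larger values down-weight easy samples more
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)            # keep log() finite
        p_t = p if y == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha # class weighting
        difficulty = (1.0 - p_t) ** gamma          # difficulty weighting
        total += -alpha_t * difficulty * math.log(p_t)
    return total / len(probs)

# A confidently correct (easy) defective sample versus a misclassified (hard) one.
print(focal_loss([0.9], [1]), focal_loss([0.1], [1]))
```

With `gamma = 0` and `alpha = 0.5` the expression reduces to half the ordinary cross-entropy, which makes the separate contribution of each weighting term easy to check.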

Keywords/Search Tags:

Deep autoencoder, shared hidden layer mechanism, cost-sensitive learning, focal loss, cross-project defect prediction

Related items

1	Static Metrics Based Cross-Project Software Defect Prediction
2	Research On Software Defect Prediction Method Based On Cost Sensitive Learning
3	Research On Software Defect Prediction Algorithm Based On Cost-sensitive Learning
4	Cross-project Software Defect Prediction Based On Deep Learning
5	Study On Cross-Project Defect Prediction Based On Transfer Learning
6	Research On Software Defect Prediction Method Based On Cost Sensitive Learning Adacost
7	Research On Cross-Project Software Defect Prediction Based On Multi-Source Transfer Learning
8	Research On Cross-Project Software Defect Prediction
9	Research On Cross-Project Software Defect Prediction
10	Research On Cross Project Software Defect Prediction Based On Feature Transfer And Instance Transfer