As software grows in scale and complexity, defects are increasingly likely to be introduced during development. Software defects can cause crashes at run time and may even endanger people's lives and property. If defects can be found before a release, development teams can allocate testing resources more effectively, reduce development costs, and improve software quality. In practice, it is often necessary to predict defects for a new project or for a project with little labeled data, which motivated heterogeneous software defect prediction. In recent years, transfer learning has been introduced to reduce the feature differences between heterogeneous projects. However, existing heterogeneous defect prediction methods still suffer from several problems: class imbalance, redundant or irrelevant features, insufficient information from a single source, and data islands. To address these problems, this paper designs and implements two heterogeneous software defect prediction methods based on transfer learning.

The first is a heterogeneous defect prediction method based on manifold transfer learning. First, the source project is resampled with reference to the majority class, and new defective samples are generated on the density curve to balance the dataset. Second, extreme gradient boosting and the Las Vegas algorithm are used to compute the importance of each feature and the similarity between datasets, so that irrelevant and redundant features can be removed. Finally, transfer learning between the source project and the target project is carried out in the manifold space, where maximizing the correlation between the two projects avoids the distortion of the original space. Logistic regression is then used to predict defects in the target project. Experimental results show that this method not only handles heterogeneity but also reduces the influence of class imbalance and metric redundancy on the prediction model and improves prediction performance.

The second is a heterogeneous defect prediction method based on federated transfer learning. Private models communicate through knowledge distillation and share information through the softmax outputs of the private models on a public project. Each participant's private model is obtained from a model pre-trained on public data through transfer learning and fine-tuning, which addresses insufficient labels and heterogeneity. Before the parties communicate, each party's private data is protected with homomorphic encryption based on secret sharing, which preserves data privacy without affecting the prediction results. Experimental results show that the proposed method achieves better prediction performance.
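The communication step of the federated method can be illustrated with a minimal sketch. It is one possible reading of the description above, not the implementation used in this work: each party's softmax output on a public sample is split into additive secret shares, and only the average over all parties is reconstructed, which could then serve as the soft target for knowledge distillation. The names `PRIME`, `SCALE`, `make_shares`, and `secure_average`, the fixed-point encoding, and the example logits are all assumptions introduced for illustration.

```python
import math
import random

PRIME = 2**61 - 1  # modulus for share arithmetic (illustrative choice)
SCALE = 10**6      # fixed-point scale for encoding probabilities (illustrative choice)

def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def encode(probs):
    """Map probabilities to non-negative integers so they can be shared modulo PRIME."""
    return [round(p * SCALE) for p in probs]

def make_shares(values, n_parties, rng):
    """Split each integer into n additive shares that sum to it modulo PRIME."""
    shares = [[rng.randrange(PRIME) for _ in values] for _ in range(n_parties - 1)]
    last = [(v - sum(s[i] for s in shares)) % PRIME for i, v in enumerate(values)]
    return shares + [last]

def secure_average(private_probs, rng):
    """Average the parties' softmax vectors using only their shares."""
    n = len(private_probs)
    n_classes = len(private_probs[0])
    # Party i splits its encoded vector and sends share j to party j.
    all_shares = [make_shares(encode(p), n, rng) for p in private_probs]
    # Party j adds the shares it received; one partial sum alone looks random.
    partial = [
        [sum(all_shares[i][j][k] for i in range(n)) % PRIME for k in range(n_classes)]
        for j in range(n)
    ]
    # Combining the partial sums reveals only the total over all parties.
    total = [sum(col) % PRIME for col in zip(*partial)]
    return [t / (SCALE * n) for t in total]

# Made-up logits of three private models on one public sample (clean, defective).
rng = random.Random(0)  # seeded for repeatability; not a secure randomness source
logits = [[2.0, 0.5], [1.0, 1.0], [0.2, 1.7]]
probs = [softmax(row) for row in logits]
consensus = secure_average(probs, rng)
plain = [sum(p[k] for p in probs) / len(probs) for k in range(2)]
```

The reconstructed `consensus` matches the plain average `plain` up to the fixed-point rounding error, which is consistent with the claim that the protection does not affect the prediction results. A real deployment would need a cryptographically secure random source and a threat model, neither of which this sketch covers.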