Research On Cross-Project Software Defect Prediction Based On Multi-Source Transfer Learning

Posted on: 2022-08-26
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
GTID: 2518306335458374
Subject: Automation Technology
Abstract/Summary:
In recent years, with the upgrading of China's software industry and the market's rising demands on software quality, ensuring that software products are highly reliable and stable has become an urgent problem. Traditional software defect detection is largely manual, which is inefficient and costly. As machine learning theory and its applications have spread, data-driven defect detection has gradually replaced manual detection. Machine learning-based methods can detect defects in a timely manner, improve software quality, optimise the allocation of test resources, and reduce maintenance costs. However, early research focused on defect prediction for software projects with rich historical data. For newly developed projects, or projects lacking historical data, the shortage of labelled code leads to a 'cold start' problem for defect prediction. Using transfer learning for cross-project defect prediction has therefore become an active research topic. Two factors largely determine the effectiveness of such models: how to select suitable source projects and build a better single-source transfer learning model from each of them, and how to combine the classifiers from multiple source domains.

To address these two problems, this thesis combines the ideas of data selection and ensemble learning and proposes a cross-project software defect prediction method, MTrADL (Multi-Source TrAdaBoost based on deep learning). The algorithm consists of two stages: 1) a non-parametric similarity measure, based on an improved maximum mean discrepancy, selects the top K source projects most similar to the target project, so that a better base classifier can be generated for each source domain; 2) cross-project defect prediction based on multi-source information integration fuses the base classifiers into an ensemble model for the target domain with strong generalisation capability and high robustness, compensating for the one-sided information obtained when transferring from a single project. A convolutional neural network serves as the base learner on each source defect dataset, and the base learners' outputs are combined, following the ensemble idea, into a single prediction of whether the code is defective.

Experiments are conducted on the classical PROMISE defect dataset. The results show that MTrADL achieves better defect prediction performance and overall outperforms common defect prediction algorithms. The results also show that both the proposed source project selection method and the use of a convolutional neural network as the base classifier are effective, with the convolutional neural network playing the more important role in the MTrADL model.
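The source-selection stage can be illustrated with a minimal sketch. The abstract does not specify how the maximum mean discrepancy (MMD) is "improved", so the code below uses the standard biased estimate of squared MMD with an RBF kernel as a stand-in. The function names, the `gamma` bandwidth, and the synthetic feature matrices are all assumptions made for illustration and do not come from the thesis.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between the rows of a and the rows of b."""
    # Squared Euclidean distance between every pair of rows.
    sq_dist = (
        (a ** 2).sum(axis=1)[:, None]
        + (b ** 2).sum(axis=1)[None, :]
        - 2.0 * (a @ b.T)
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2(source, target, gamma=0.5):
    """Standard biased estimate of squared MMD (not the thesis's improved variant)."""
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st


def select_top_k_sources(sources, target, k, gamma=0.5):
    """Return the names of the k source projects with the smallest MMD to target.

    sources: dict mapping a (hypothetical) project name to its feature matrix.
    """
    scores = {name: mmd2(x, target, gamma) for name, x in sources.items()}
    return sorted(scores, key=scores.get)[:k]


if __name__ == "__main__":
    # Synthetic stand-ins for per-module metric vectors of each project.
    rng = np.random.default_rng(0)
    target = rng.normal(0.0, 1.0, size=(200, 3))
    sources = {
        "near": rng.normal(0.0, 1.0, size=(200, 3)),
        "mid": rng.normal(1.0, 1.0, size=(200, 3)),
        "far": rng.normal(5.0, 1.0, size=(200, 3)),
    }
    print(select_top_k_sources(sources, target, k=2))
```

A smaller MMD means the source project's feature distribution is closer to the target's, so ranking by ascending score and keeping the first K entries matches the "top K most similar source projects" step. In practice the features would usually be standardised first, and `gamma` would be tuned or set by a heuristic such as the median pairwise distance.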
Keywords/Search Tags:Cross-project defect prediction, Similarity metrics, Multi-source transfer learning, Deep learning, Ensemble Learning