
Research On Software Defect Prediction Based On Machine Learning Algorithm

Posted on: 2019-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: T Liu
Full Text: PDF
GTID: 2428330548969567
Subject: Computer system architecture
Abstract/Summary:
Software defect prediction is one of the most active research fields in software engineering. Defect prediction models identify error-prone source code components or changes, so that quality assurance teams can allocate their limited resources effectively by concentrating validation effort on the code most likely to contain defects. As software projects grow in scale, defect prediction will play a crucial role in developers' work: it helps them build more reliable software products and shortens time to market.

This thesis studies and analyzes software defect prediction from the perspective of machine learning. Many machine learning based defect prediction methods have been developed, and most of them are effective for within-project prediction, but several problems remain:
(1) The performance of cross-project defect prediction is usually poor. This is mainly due to the difference in feature distribution between the source and target projects, whereas most machine learning classifiers assume that training and test data share the same feature space and come from the same data distribution.
(2) Many studies focus on predicting defects at a coarse-grained level, such as files, packages, or modules.
(3) If the class imbalance of a defect data set is not handled properly, the learned defect prediction model will be biased toward the majority defect-free class, which seriously degrades its prediction performance.

To address these problems, this thesis studies state-of-the-art machine learning methods for software defect prediction, including transfer component analysis and deep learning, and on this basis proposes practical techniques for solving them. The main contributions are:
1. An improved transfer component analysis method is proposed. It is based on transfer learning and refines the transfer defect learning approach built on the existing transfer component analysis (TCA). TCA maps the data of the source and target projects to a latent feature space and transforms the data according to the new feature representation, reducing the difference in data distribution in cross-project prediction. In addition, because the performance of cross-project defect prediction with TCA is sensitive to normalization, the TCA+ decision rules for selecting an appropriate normalization method for each source and target project pair are applied. A learning model with better cross-project prediction performance is then obtained by changing the underlying machine learning classifier. The experimental results show that the prediction model obtained by this method achieves a high F-measure.
2. To address the imbalance of defect data sets, the TCA principle is used to preprocess the data. TCA maps the data sets of two different projects to a latent space. Once the latent space is found, minority-class (defective) instances from the source project are selected and added to the training set of the target project, so that the numbers of defective and non-defective instances in the target project are equal. This reduces the imbalance of the data set. When training on the target project (that is, training a within-project defect prediction model), all data in the target project must also be mapped to the latent space, so the transformed instances serve as the training data for the model. The method alleviates the imbalance of the defect data set to some extent and helps preserve the performance of the subsequently trained defect prediction model.
3. A just-in-time defect prediction method based on an improved deep belief network is proposed. This method works at the fine-grained level of individual changes and combines it with the deep belief network (DBN), a deep learning method, to propose a just-in-time defect prediction technique. The study found that just-in-time defect prediction is more useful because it examines a smaller amount of code (a single change rather than an entire file or package) and makes it easier to assign defect fixes to developers. More importantly, this thesis combines the deep belief network with fine-grained just-in-time defect prediction and with the TCA-based treatment of data imbalance to propose an improved DBN-based just-in-time defect prediction method, and it experiments with different machine learning classifiers on top of the DBN. The results verify the effectiveness of TCA in addressing data imbalance and show that the defect prediction model learned by the improved method has better prediction performance.
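The TCA mapping and the balancing step described in contribution 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it uses the standard TCA formulation with a linear kernel, and the function names, the `mu` regularization value, the label convention (1 = defective, 0 = defect-free), and the random choice of which source defective instances to add are all hypothetical choices made for the example.

```python
import numpy as np

def tca_fit_transform(Xs, Xt, n_components=2, mu=1.0):
    """Map source rows Xs and target rows Xt into a shared latent space.

    Standard TCA with a linear kernel (an assumption made for this sketch):
    the projection W is given by the leading eigenvectors of
    (K L K + mu I)^-1 K H K.
    """
    X = np.vstack([Xs, Xt])
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    K = X @ X.T  # linear kernel matrix over all instances
    # L encodes the distance between the source and target sample means (MMD).
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])[:, None]
    L = e @ e.T
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    A = np.linalg.solve(K @ L @ K + mu * np.eye(n), K @ H @ K)
    vals, vecs = np.linalg.eig(A)
    order = np.argsort(-vals.real)[:n_components]
    W = vecs[:, order].real
    Z = K @ W  # latent representation of every instance
    return Z[:ns], Z[ns:]

def balance_target_with_source(Zs, ys, Zt, yt, seed=0):
    """Add defective source instances (label 1) to the target training set
    until defective and defect-free counts match, or the source runs out.
    Picking the instances at random is an assumption of this sketch."""
    rng = np.random.default_rng(seed)
    deficit = int((yt == 0).sum() - (yt == 1).sum())
    src_defective = np.flatnonzero(ys == 1)
    if deficit <= 0 or len(src_defective) == 0:
        return Zt, yt
    take = rng.choice(src_defective, size=min(deficit, len(src_defective)),
                      replace=False)
    Z_bal = np.vstack([Zt, Zs[take]])
    y_bal = np.concatenate([yt, np.ones(len(take), dtype=yt.dtype)])
    return Z_bal, y_bal
```

Both projects are embedded first, so the borrowed source instances and the target instances live in the same latent space before any classifier is trained on the balanced set.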
Keywords/Search Tags:Software Defect Prediction, Transfer Learning, Transfer Component Analysis, Deep Belief Network, Just-In-Time Defect Prediction