Research On Software Defect Prediction Methods

Posted on: 2020-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: J Sun
Full Text: PDF
GTID: 2428330590495406
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Software defect prediction uses historical software development data to predict which software modules are defective. The predictions allow test resources to be allocated sensibly, which supports both development efficiency and software quality. In practice, however, there is often too little historical data to build a prediction model. This thesis proposes two solutions to that problem. The first is a deep ladder network, which learns from unlabeled historical data together with a small amount of labeled historical data and achieves better prediction accuracy than traditional semi-supervised and supervised prediction. The second is transfer learning, which carries over knowledge learned from the historical data of other projects to improve prediction performance.

First, a software defect prediction method based on an improved ladder network (ILR-SDP) is proposed. The method introduces the semi-supervised ladder network model, improves its denoising decoding function, and builds the prediction model from both unlabeled and labeled data. The prediction accuracy of the model is greatly improved.

Second, the class imbalance problem is taken into consideration. The imbalanced nature of software defect datasets makes learning harder for the predictors. Cost-sensitive learning is therefore introduced, and a cost penalty term is added to the supervised part of the deep ladder network (CILR-SDP). Assigning different misclassification costs to the defective and defect-free classes effectively alleviates the negative impact of class imbalance.

Finally, a method termed geodesic flow kernel software defect prediction (GFK-SDP) is proposed to handle the different distributions of the source domain and the target domain. The source and target datasets are embedded into a Grassmann manifold, and a geodesic flow is constructed that integrates over the infinitely many intermediate subspaces between the source point and the target point. The feature data are projected into these subspaces to form an infinite-dimensional feature vector. The inner product between such feature vectors defines a kernel function that can be computed in closed form on the original feature space. The kernel captures the incremental changes between the two domains, reflecting both their differences and their commonalities. After this transformation, the source data and the target data lie in two spaces with approximately the same distribution. A traditional within-project software defect classifier is then used to predict the labels.

Compared with classical semi-supervised within-project prediction methods and cross-project defect prediction methods, the proposed methods achieve better prediction performance on the NASA, AEEEM and ReLink datasets.
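
The idea of assigning different misclassification costs to the defective and defect-free classes can be illustrated with a cost-weighted cross-entropy. The abstract does not give the form of the cost penalty term used in CILR-SDP, so the sketch below is only a generic illustration: the function name, the cost values `cost_fn` and `cost_fp`, and the weighting scheme are all assumptions, not the thesis's formulation.

```python
import math


def cost_sensitive_cross_entropy(y_true, p_defect, cost_fn=5.0, cost_fp=1.0, eps=1e-12):
    """Mean binary cross-entropy with class-dependent misclassification costs.

    y_true   : list of 0/1 labels (1 = defective module)
    p_defect : list of predicted probabilities for the defective class
    cost_fn  : hypothetical cost of missing a defective module (false negative)
    cost_fp  : hypothetical cost of flagging a defect-free module (false positive)
    """
    total = 0.0
    for y, p in zip(y_true, p_defect):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            # Errors on the (usually rare) defective class are weighted by cost_fn.
            total += -cost_fn * math.log(p)
        else:
            # Errors on the defect-free class are weighted by cost_fp.
            total += -cost_fp * math.log(1.0 - p)
    return total / len(y_true)


# An equally confident mistake costs more on a defective module than on a clean one.
missed_defect = cost_sensitive_cross_entropy([1], [0.1])
false_alarm = cost_sensitive_cross_entropy([0], [0.9])
print(missed_defect > false_alarm)
```

With both costs set to 1 the function reduces to ordinary binary cross-entropy, so the cost ratio is the only added knob; in a semi-supervised setting such a term would apply only to the labeled samples.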
Keywords/Search Tags: software defect prediction, ladder network, cost sensitive, geodesic, transfer learning