
Research On Software Defect Prediction Model Based On Multi-layer Feature Selection

Posted on: 2022-02-10  Degree: Master  Type: Thesis
Country: China  Candidate: Q Li
GTID: 2518306536496584  Subject: Master of Engineering
Abstract/Summary:
In software engineering, software defects are the factors that have the greatest impact on software quality. The speed and efficiency of manual code review are increasingly unable to meet the needs of software system production and development, so efficient software defect prediction has become more and more important. This paper conducts an in-depth study of the feature selection problem in software defect prediction. Using support vector machines as the classifier, a software defect prediction model is built, and multiple evaluation indicators are used to test its performance. The main research contents are as follows:

Firstly, a comprehensive ranking-and-filtering feature selection algorithm based on multiple relevance indicators is proposed. To evaluate the relationship between features and labels reasonably, the algorithm considers three different perspectives: statistics, probability, and sample relationship. It evaluates features comprehensively, weighting the three perspectives equally, and uses a dynamic automatic threshold adjustment strategy to select the feature subset automatically. The degree of correlation with the label is used to reduce the dimensionality of the defect data.

Secondly, a wrapper feature selection algorithm based on clustering is proposed. To reduce the redundancy between features, the algorithm measures the relationship between features with Pearson's correlation coefficient and clusters the features with a hierarchical agglomerative clustering algorithm. Following the idea of wrapper feature selection, a forward selection strategy gradually selects the optimal features from the clusters until the model performance no longer improves, at which point feature selection is complete. Because the features are clustered according to the relationships between them, the redundancy of the resulting feature subset is reduced.

Thirdly, a software defect prediction model based on multi-layer feature selection is established. The model applies the multi-layer feature selection method, follows the "Minimal Redundancy Maximal Relevance" strategy, and uses support vector machines as the base classifier to predict software defects.

Finally, experiments were carried out on NASA's public software defect dataset SDP to verify the effectiveness of the feature selection algorithms proposed in this paper. The software defect prediction model in this paper was compared with the models of other scholars, and the experimental results were analyzed.
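The clustering-based wrapper step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the distance `1 - |r|`, average linkage, the merge `threshold`, and the rule of taking at most one feature per cluster are all assumptions made for illustration. The `score` callable is a hypothetical stand-in for the model evaluation; in the thesis setting it would be the performance of a support vector machine trained on the candidate feature subset.

```python
from math import sqrt
from typing import Callable, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def cluster_features(columns: Sequence[Sequence[float]],
                     threshold: float) -> List[List[int]]:
    """Average-linkage agglomerative clustering of feature columns.

    Distance is 1 - |r| (an assumption); merging stops once the closest
    pair of clusters is farther apart than `threshold`.
    """
    n = len(columns)
    dist = [[1 - abs(pearson(columns[i], columns[j])) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]

    def linkage(a: List[int], b: List[int]) -> float:
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def forward_select(clusters: Sequence[Sequence[int]],
                   score: Callable[[List[int]], float]
                   ) -> Tuple[List[int], float]:
    """Greedy forward selection, at most one feature per cluster.

    Each round adds the feature that gives the best score and retires its
    cluster; selection stops when the score no longer improves.
    """
    selected: List[int] = []
    best = float("-inf")
    remaining = [list(c) for c in clusters]
    while remaining:
        trials = [(score(selected + [f]), ci, f)
                  for ci, c in enumerate(remaining) for f in c]
        top_score, ci, f = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break
        selected.append(f)
        best = top_score
        del remaining[ci]
    return selected, best


# Toy data (hypothetical): feature 1 is a scaled copy of feature 0,
# feature 2 is uncorrelated with both.
columns = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [1, 0, -1, 0, 1]]
gains = {0: 6, 1: 5, 2: 2}  # made-up per-feature usefulness


def toy_score(subset: List[int]) -> float:
    # Placeholder for "train an SVM on these features and evaluate it".
    return sum(gains[f] for f in subset)


clusters = cluster_features(columns, threshold=0.3)
selected, best = forward_select(clusters, toy_score)
print(clusters, selected)
```

On the toy data, the two perfectly correlated features fall into one cluster and only one of them is selected, which is the redundancy reduction the abstract describes.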
Keywords/Search Tags: software defect prediction, feature selection algorithm, multi-layer feature selection, software defect prediction model