Software cannot be developed all at once; it runs and evolves over a long period. During software evolution, modifications to the source code may introduce defects, so evolution is in effect a process of continuously introducing and eliminating defects. Maintaining such evolving projects is difficult and expensive. Software defect prediction for evolving projects builds a defect prediction model on the defect datasets of previous versions and then predicts potential defects in the current version. It helps allocate testing resources reasonably and find problems during development, thereby improving the quality of subsequent versions. The study of software defect prediction for evolving projects therefore has both theoretical significance and practical value.

Researchers at home and abroad have done extensive work on software metrics, feature selection methods, and methods for constructing defect prediction models. However, compared with traditional software defect prediction, research on software defect prediction for evolving projects is still insufficient. This thesis presents methods that address existing problems in software defect prediction for evolving projects.

(1) To explore the importance of process metrics in introducing and eliminating defects, we present a method for analyzing the degree of influence of process metrics on defect introduction and defect elimination. We compare and analyze the influence of process metrics on changes of defect state in evolving projects from two aspects: the correlation between process metrics and changes of defect state, and the performance of process metrics in classifying defect introduction and defect elimination. The experimental results show that Number of Distinct Committers
plays an important role in changes of defect state. We also offer some suggestions for software development and software defect prediction.

(2) To address the problems of data distribution difference and irrelevant features in software defect prediction for evolving projects, we propose an approach based on instance selection and feature selection. Instance selection mitigates the difference in data distribution between the datasets of previous versions and the dataset of the current version, and feature selection removes irrelevant features from the datasets. In the instance selection stage, Euclidean distance is used to measure the similarity between each instance of the previous version and each instance of the current version, and the previous-version instances that are close to current-version instances are selected to form the training set. In the feature selection stage, we propose a feature ranking algorithm that combines three classical feature ranking methods by averaging the three weights they assign to each feature. The features are ranked by these averaged weights; the top-ranked features are selected and the irrelevant features are removed. The experimental results verify the effectiveness of instance selection and feature selection for software defect prediction for evolving projects, and show that feature selection influences prediction performance more than instance selection does.

(3) A defect prediction tool for evolving projects, DPTEP, is designed and implemented. DPTEP implements the approach based on instance selection and feature selection for software defect prediction for evolving projects, and lets users configure the parameters and view the prediction results.
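
The two stages described in contribution (2) can be illustrated with a minimal sketch. It is not the implementation used in the thesis or in DPTEP. Several details are assumptions made only for illustration: the function names `select_instances` and `average_feature_ranking`, the rule that "close" means the `k` nearest previous-version instances of each current-version instance, and the min-max scaling applied before averaging. The three classical feature ranking methods are not named here, so their weights are simply passed in as precomputed vectors.

```python
import numpy as np


def select_instances(prev_X, cur_X, k=1):
    """Return sorted indices of previous-version rows that are among the
    k nearest (Euclidean) neighbours of at least one current-version row.
    The k-nearest rule is an assumption; the text only says "close"."""
    prev_X = np.asarray(prev_X, dtype=float)
    cur_X = np.asarray(cur_X, dtype=float)
    # Pairwise Euclidean distances, shape (n_current, n_previous).
    diffs = cur_X[:, None, :] - prev_X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # For each current instance, keep its k closest previous instances.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # np.unique flattens, removes duplicates, and sorts the indices.
    return np.unique(nearest)


def average_feature_ranking(weight_vectors, top_n):
    """Average several per-feature weight vectors and return
    (indices of the top_n features, best first; averaged weights)."""
    scaled = []
    for w in weight_vectors:
        w = np.asarray(w, dtype=float)
        span = w.max() - w.min()
        # Min-max scaling (an assumption) so that no single ranking
        # method dominates the average because of its numeric range.
        scaled.append((w - w.min()) / span if span > 0 else np.zeros_like(w))
    mean_weight = np.mean(scaled, axis=0)
    # Stable sort on negated weights: descending order, ties by index.
    order = np.argsort(-mean_weight, kind="stable")
    return order[:top_n], mean_weight


if __name__ == "__main__":
    # Toy data: four previous-version instances, two current-version ones.
    prev_X = [[0, 0], [1, 1], [10, 10], [5, 5]]
    cur_X = [[0.9, 1.2], [9, 9]]
    train_idx = select_instances(prev_X, cur_X, k=1)
    print("selected training rows:", train_idx.tolist())

    # Hypothetical weights from three unnamed ranking methods, 3 features.
    weights = [[0.1, 0.9, 0.5], [1, 3, 2], [0.0, 0.2, 0.4]]
    top, mean_w = average_feature_ranking(weights, top_n=2)
    print("top features:", top.tolist())
```

In this sketch the selected rows would form the training set, and only the columns returned by `average_feature_ranking` would be kept before a classifier is trained. A distance threshold could replace the `k`-nearest rule without changing the overall structure.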