Software defect prediction helps developers allocate testing resources more effectively by predicting whether a software module is defect-prone. Most defect prediction research focuses on within-project defect prediction, which requires sufficient training data from the same project. In real software development, however, a project that needs defect prediction is often new or lacks historical data. Cross-project defect prediction, which trains on data from several other projects and predicts defects in a different target project, has therefore become an active research topic. The main research challenges in cross-project defect prediction are the difference in data distribution between the source and target projects and the class imbalance problem within the datasets. Inspired by search-based software engineering, this paper proposes S3EL, a search-based semi-supervised ensemble learning approach. By adjusting the distribution ratio in the training dataset, we build several Naïve Bayes classifiers as base learners, and then use a small number of labeled target instances and a genetic algorithm to combine these base classifiers into a final prediction model. We compare S3EL with recent and classical cross-project defect prediction approaches (such as Burak Filter, Peters Filter, TCA+, CODEP, and HYDRA) on the AEEEM and Promise datasets. The results show that S3EL achieves the best prediction performance in most cases in terms of the F1 measure.
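
The combination step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the base learners have already produced defect probabilities for a small set of labeled target instances, and it searches for ensemble weights that maximize F1 on that set. The function names, the weighted-average voting rule, the 0.5 threshold, and the genetic operators (truncation selection, one-point crossover, Gaussian mutation) are all illustrative assumptions.

```python
import random


def f1_score(y_true, y_pred):
    """F1 for the defective class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ensemble_predict(weights, base_probs, threshold=0.5):
    """Weighted average of base-learner probabilities (assumed voting rule).

    base_probs[k][i] is the defect probability that base learner k
    assigns to instance i.
    """
    n = len(base_probs[0])
    total = sum(weights)
    if total == 0:
        return [0] * n
    preds = []
    for i in range(n):
        score = sum(w * probs[i] for w, probs in zip(weights, base_probs)) / total
        preds.append(1 if score >= threshold else 0)
    return preds


def evolve_weights(base_probs, y_labeled, pop_size=20, generations=30, seed=0):
    """Search for ensemble weights that maximize F1 on the labeled target set."""
    rng = random.Random(seed)
    k = len(base_probs)

    def fitness(w):
        return f1_score(y_labeled, ensemble_predict(w, base_probs))

    population = [[rng.random() for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            j = rng.randrange(k)       # Gaussian mutation of one weight
            child[j] = min(1.0, max(0.0, child[j] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Toy data (made up): one informative base learner and one inverted one.
labels = [1, 0, 1, 0, 1, 0]
probs = [
    [0.9, 0.1, 0.8, 0.2, 0.7, 0.3],
    [0.1, 0.9, 0.2, 0.8, 0.3, 0.7],
]
best_weights = evolve_weights(probs, labels)
best_f1 = f1_score(labels, ensemble_predict(best_weights, probs))
```

In this toy setting the search only has to weight the informative learner above the inverted one. In a realistic setting the base-learner probabilities would come from Naïve Bayes classifiers trained on differently resampled source data, and performance would be measured on the unlabeled remainder of the target project.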