
Research On Cross-project Knowledge Reuse Technology For Defect Management

Posted on: 2019-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Zeng
GTID: 2428330611993646
Subject: Software engineering
Abstract/Summary:
Software defect management has always been an important part of software development. With the growing popularity of open source, more and more developers host their software projects on social coding platforms. To attract external contributors from different regions, the open source community has adopted a set of lightweight management tools. On the one hand, these informal contribution mechanisms make it easier for a collaborative project to collect contributions from a wider range of the community, which promotes continuous, iterative improvement of the project. On the other hand, a growing number of arbitrary, half-baked, or undesirable contributions flow into the project along with the high-quality ones, which poses serious hidden risks to the health of open source projects. As a result, automated defect management is becoming increasingly important within the collaborative open source ecosystem.

However, existing automated management methods are based on traditional machine learning models and are limited by the number of training samples available. For new projects, or projects with insufficient historical data, building a good model is challenging because collecting and labeling new data is expensive. To address defect management for such sample-deficient projects, we conduct an extensive empirical study of defect-oriented cross-project knowledge reuse, based on a large-scale data set from GitHub. The main contributions are summarized as follows:

First, based on the data resources accumulated by the open source community, we measure quality within a project and relationships between projects separately, and propose a multi-dimensional quantitative measurement system for open source projects. The within-project quality measures cover 4 dimensions and 9 metrics, and the cross-project correlation measures cover 3 dimensions and 15 metrics. In addition, we propose a project quality evaluation method based on the multi-dimensional quality metrics, and verify its effectiveness through experiments.

Second, based on the multi-dimensional correlation measures between projects, we analyze cross-project transferability in two scenarios: issue report classification and defect prediction. Through an empirical analysis of a set of high-quality projects on GitHub, we find that in the issue report classification scenario, the number of participants shared between projects has the greatest impact on cross-project transferability: the more participants two projects have in common, the better the cross-project classification performs. In the defect prediction scenario, the self-prediction performance of the source model has the greatest impact on cross-project transferability: the higher the prediction accuracy of the source project's model within its own project, the better the cross-project prediction performs.

Third, we propose a novel cross-project approach that integrates multiple models learned from various source projects to classify a target project. Based on the large-scale data set from GitHub, we evaluate our approach through comparative experiments in the issue report classification and defect prediction scenarios, and find that it achieves good prediction results.
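The idea of integrating models from several source projects can be illustrated with a minimal Python sketch. This is not the thesis's actual method, which the abstract does not specify. It assumes a simple nearest-centroid classifier per source project and a vote weighted by each model's accuracy on its own project; all class names, function names, and toy data below are hypothetical.

```python
from collections import defaultdict


class CentroidClassifier:
    """Nearest-centroid classifier; a stand-in for any per-project model."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for features, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(features)
            sums[label] = [s + f for s, f in zip(sums[label], features)]
            counts[label] += 1
        # One mean feature vector (centroid) per label.
        self.centroids = {
            label: [s / counts[label] for s in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, features):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(features, centroid))
        return min(self.centroids, key=lambda lbl: sq_dist(self.centroids[lbl]))


def within_project_accuracy(model, X, y):
    # Assumption: accuracy on the source project's own data is used as the
    # weight. A real study would measure this on held-out data.
    correct = sum(model.predict(f) == label for f, label in zip(X, y))
    return correct / len(y)


def train_source_models(source_projects):
    """Return one (model, weight) pair per source project."""
    models = []
    for X, y in source_projects.values():
        model = CentroidClassifier().fit(X, y)
        models.append((model, within_project_accuracy(model, X, y)))
    return models


def ensemble_predict(models, features):
    """Weighted vote of all source models on one target-project sample."""
    votes = defaultdict(float)
    for model, weight in models:
        votes[model.predict(features)] += weight
    return max(votes, key=votes.get)


# Hypothetical toy data: two source projects with two numeric features each.
source_projects = {
    "project_a": ([[0, 0], [1, 0], [9, 9], [10, 9]],
                  ["clean", "clean", "defect", "defect"]),
    "project_b": ([[0, 1], [1, 1], [8, 8], [9, 10]],
                  ["clean", "clean", "defect", "defect"]),
}
models = train_source_models(source_projects)
target_label = ensemble_predict(models, [9.0, 8.0])
```

Weighting by within-project accuracy is one plausible design choice, loosely motivated by the finding above that a source model's self-prediction performance relates to transferability in the defect prediction scenario. Other schemes, such as weighting by the number of shared participants, would fit the same structure.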
Keywords/Search Tags: Cross-project Prediction, Defect Management, Issue Report Classification, Defect Prediction