Prediction And Recommendation Of Cross-project Correlated Issues

Posted on: 2020-03-14; Degree: Master; Type: Thesis
Country: China; Candidate: H Ren
GTID: 2518305735951819; Subject: Computer technology
Abstract/Summary:
During software maintenance, developers locate and repair the issues that users submit while using the software. To improve maintenance efficiency, researchers have studied the relationships between issues extensively, including the detection and recommendation of duplicate and similar bug reports. These studies mainly apply natural language processing to bug reports and build detection and recommendation models from measures of their textual content. As the number of open source projects grows, the call dependencies between projects become more complex. Issues in different projects can therefore be correlated; such issues are referred to as cross-project correlated issues. Because they reside in different projects, they are harder to locate and repair, which poses new challenges for developers. Previous work has nevertheless paid little attention to them. Their text is often sparse and differs substantially between the correlated issues, so the text-based methods proposed in earlier research may not apply. To address these problems, this thesis studies two tasks: prediction and recommendation of cross-project correlated issues. It comprises four main pieces of work:

(1) Information is collected from seven popular open source projects on GitHub to construct the dataset, and regular-expression matching is applied to the link information of issues to identify cross-project correlated issues. This yields a dataset for predicting cross-project correlated issues, containing both true cross-project correlated issues and ordinary issues. The same process yields cross-project correlated issue pairs; unrelated issue pairs are constructed by subsampling, which gives the dataset for recommending cross-project correlated issues.

(2) A new method is proposed for predicting cross-project correlated issues. Features are extracted from the process metrics of issues (including textual statistics of issues and the historical activity of issue reporters), and a predictive model (PM) is built on these features. In addition, TF-IDF and word embeddings are used to process the issue text, giving two text-based models (TI and WE). Three hybrid models combine process metrics with text features: P+T, P+W, and P+T+W.

(3) For recommendation, features are extracted from pairs of cross-project correlated issues. They mainly cover the similarity between the issues, the cooperation between the developers of the issues, and the familiarity of the developers with the projects. Recommendation models are built on these features. For comparison, cross-project correlated issues are also recommended directly from the text similarity between issues, computed with three methods: TF-IDF, word embeddings, and BM25.

(4) The proposed prediction and recommendation models are validated on the dataset. Two evaluation metrics (MCC and F1) are used to assess the predictive models, and the PM model improves significantly on both metrics over the other predictive models. The recommendation models are evaluated with three metrics: MAP, MRR, and Recall-rate@k. The experiments cover three scenarios: recommendation within specific target projects, recommendation across all projects, and cross-project recommendation. The results show that the recommendation models based on the extracted features outperform the similarity-based recommendation methods in the different scenarios.
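
The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. The thesis's own implementation is not given in this abstract, so the function names, the binary label encoding (1 for a cross-project correlated issue, 0 otherwise), and the reading of Recall-rate@k as "at least one correct issue appears in the top k" are assumptions made for illustration only.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = correlated, 0 = ordinary)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN); 0.0 when undefined."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item in the ranking, else 0.0."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def average_precision(ranked, relevant):
    """Mean of precision@rank taken at each relevant item's rank."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def hit_at_k(ranked, relevant, k):
    """Assumed per-query Recall-rate@k: 1.0 if any relevant item is in the top k."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0

def mean_over_queries(metric, queries, *args):
    """Average a per-query metric over (ranked, relevant) pairs: MRR, MAP, Recall-rate@k."""
    values = [metric(ranked, relevant, *args) for ranked, relevant in queries]
    return sum(values) / len(values) if values else 0.0
```

With these helpers, MRR is `mean_over_queries(reciprocal_rank, queries)`, MAP is `mean_over_queries(average_precision, queries)`, and Recall-rate@k is `mean_over_queries(hit_at_k, queries, k)`, where each query is one issue together with its ranked candidate list and its set of truly correlated issues.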
Keywords/Search Tags:Cross-project Correlated Issues, Process Metrics, Prediction Model, Feature Extraction, Recommendation Model