
Deep Learning-Based Automatic Issue Classification For Open Source Community

Posted on: 2021-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Zhu
GTID: 2428330647950885
Subject: Software engineering
Abstract/Summary:
With the growth of the open-source software movement, more and more users and developers use issue tracking systems (ITSs) to report issues (also called issue reports), including bugs, feature requests, and enhancement suggestions. The user feedback contained in issue reports is valuable: it helps improve software quality and also supports requirements collection. However, the volume of issues is huge: a single project can involve tens of thousands of issues. In addition, the quality of issue reports is often low. Manually analyzing issues is therefore costly for developers, and it is harder still for them to focus on bug-related issue reports, which need to be handled most urgently. We therefore study automatic issue report analysis. Specifically, we aim to identify issues related to software defects so that developers can address them and repair the defects promptly.

An issue commonly contains information such as a title, description, submitter, and priority, of which the title and description are textual properties. Issues in different ITSs have different formats and properties, which poses challenges for issue identification and analysis. Existing research seldom focuses on automatic issue analysis, and existing issue classification approaches can hardly handle accurate, cross-platform defect identification.

In this paper, we propose a deep learning-based, two-stage approach to automatically identify bug-reporting issues across various ITSs. First, we train an attention-based bidirectional long short-term memory (ABLSTM) network on a dataset of about 1.5 million labelled issues to identify bug reports. Second, we extract information from each issue's description and neighbors. Together with the prediction of the first-stage classifier, the features from descriptions and neighbors are fed into a support vector machine (SVM) to correct the results of the deep learning classifier. Because the information our model uses is shared across multiple ITSs, the model can support the different mainstream ITSs on the market.

Experimental evaluation shows that our approach achieves an average F-score of 0.866 in distinguishing bugs from other issues, significantly outperforming other benchmarks and state-of-the-art approaches. Further experiments verify the effectiveness of the first-stage and second-stage classifiers individually, and show that the model works well across different ITSs.
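The second stage described above can be sketched as a small feature-assembly step. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not list the actual description or neighbor features, nor does it define what counts as a neighbor, so the `Issue` class, the stack-trace cue, the word count, and the neighbor statistics below are all hypothetical. The first-stage probability is assumed to come from the ABLSTM classifier.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Issue:
    title: str
    description: str
    # Assumed: bug probability in [0, 1] produced by the first-stage classifier.
    stage1_bug_prob: float


# Hypothetical textual cue; the abstract does not name its description features.
ERROR_CUE = re.compile(r"Traceback|Exception")


def description_features(issue: Issue) -> List[float]:
    """Illustrative features taken from the issue description."""
    text = issue.description
    return [
        1.0 if ERROR_CUE.search(text) else 0.0,  # mentions an exception or traceback
        float(len(text.split())),                # description length in words
    ]


def neighbor_features(neighbors: List[Issue]) -> List[float]:
    """Illustrative features from neighboring issues (neighbor definition assumed)."""
    if not neighbors:
        return [0.0, 0.0]
    probs = [n.stage1_bug_prob for n in neighbors]
    return [sum(probs) / len(probs), float(len(probs))]


def stage2_vector(issue: Issue, neighbors: List[Issue]) -> List[float]:
    """Concatenate the stage-one prediction with description and neighbor features."""
    return (
        [issue.stage1_bug_prob]
        + description_features(issue)
        + neighbor_features(neighbors)
    )


issue = Issue("Crash on save", "App throws NullPointerException when saving", 0.9)
neighbors = [Issue("a", "b", 0.8), Issue("c", "d", 0.4)]
vector = stage2_vector(issue, neighbors)
```

Vectors built this way would then be passed, together with the ground-truth labels, to an SVM that learns when to keep or overturn the first-stage prediction. Because every input here is generic issue text or a score derived from it, nothing in the sketch depends on a particular ITS.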
Keywords/Search Tags:Issue tracking system, Issue report, Issue classification, Recurrent neural network, Open source community