Analysis And Research On The Factors Of Pull Request Rejection In Open Source Projects

Posted on: 2023-09-19  Degree: Master  Type: Thesis
Country: China  Candidate: Y F Cheng  Full Text: PDF
GTID: 2568306836473934  Subject: Software engineering
Abstract/Summary:
The pull request is an essential mechanism for contributing code on GitHub. When developers want to merge code changes from their local machine into the central repository that stores the project's source code, they submit a pull request, because changes cannot be merged into the central repository without permission. If the merge is allowed, the pull request status is shown as accepted; otherwise it is shown as rejected. A submitted pull request may be rejected because of many factors, such as source code quality, developer experience, the interaction between developers and reviewers, and the duration of the pull request process. Fixing a rejected pull request costs developers extra effort and time, which affects project cost and timeline.

To address this problem, this thesis uses association rule mining to identify the factors that influence the rejection of pull requests on GitHub and the relationships among those factors. The factors analyzed include the characteristics of the open-source code, the code quality metrics of each pull request, the experience of the developers, and the interaction during the pull request process. These factors can be added to the coding guidelines of an organization or open-source project as a checklist that developers use to verify their source code before merging changes into the central repository. They can also be turned into inspection rules in unit test code to help developers avoid having their pull requests rejected.

Finally, this thesis runs a prediction experiment on pull request rejection using a logistic regression model. The input features are the factors found through association rule mining to influence rejection. They mainly comprise the code features of the modification, the text features of the pull request description, the contributor features derived from developers' previous behavior, and the interaction during the pull request process. The experiment was evaluated on 140,155 pull requests from 12 open-source projects. The results show that the logistic regression model performs well, with an accuracy of 0.84, a recall of 0.99, and an F1 score of 0.91, and that all evaluation metrics improve on the baseline method. The analysis and prediction results indicate that the factors found through association rule mining do influence whether a pull request is merged. The findings can therefore help developers focus on the primary factors, allocate more resources to fixing critical issues, and avoid having their pull requests rejected, which reduces project cost and time.
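The association-rule step described above can be illustrated with a minimal sketch. The attribute names and the six toy records below are hypothetical and are not taken from the thesis. The code only shows how support, confidence, and lift are computed for a candidate rule such as {no_tests} -> {rejected}.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy data: each pull request is a set of discretized attributes
# plus its outcome. These records are illustrative, not the thesis dataset.
PULL_REQUESTS: List[FrozenSet[str]] = [
    frozenset({"large_change", "no_tests", "new_contributor", "rejected"}),
    frozenset({"large_change", "no_tests", "rejected"}),
    frozenset({"small_change", "has_tests", "core_member", "accepted"}),
    frozenset({"small_change", "no_tests", "new_contributor", "rejected"}),
    frozenset({"large_change", "has_tests", "core_member", "accepted"}),
    frozenset({"small_change", "has_tests", "new_contributor", "accepted"}),
]


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[FrozenSet[str]],
) -> Dict[str, float]:
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    s_both = support(antecedent | consequent, transactions)
    s_ante = support(antecedent, transactions)
    s_cons = support(consequent, transactions)
    confidence = s_both / s_ante if s_ante else 0.0
    lift = confidence / s_cons if s_cons else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


metrics = rule_metrics(
    frozenset({"no_tests"}), frozenset({"rejected"}), PULL_REQUESTS
)
print(metrics)
```

In this toy data every pull request without tests is rejected, so the rule has confidence 1.0, and a lift above 1 indicates that the attribute co-occurs with rejection more often than chance would predict. Rules that score highly on such measures are the kind of candidate factors that could then be passed to a classifier as input features.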
Keywords/Search Tags: Pull Request, Influencing Factors, Association Rules, Regression Model