Empirical Study On Community-based Crowdsourced Contribution Quality And Collaboration Mechanism

Posted on:2020-11-08

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Lu

GTID:1368330611993010

Subject:Software engineering

Abstract/Summary:

With the rapid development of Internet technology, more and more online services use crowdsourcing to gather the wisdom and power of the crowd on the Internet in the form of communities, creating a large number of high-quality artifacts. Crowdsourced development communities and crowdsourced Q&A communities are the most typical examples. In these two types of communities, participants from all over the world collaborate to contribute code, knowledge, and experience, producing high-quality open source software (OSS) libraries and knowledge bases that provide important resources for software reuse and problem solving. While these communities provide powerful productivity, the differences among individuals, the open management model, and the loose organizational structure pose challenges to the management and assurance of contribution quality. How to effectively ensure the quality of the crowd's contributions has become an important research topic. Based on the big data accumulated in the crowdsourced development community GitHub and the Q&A community Stack Overflow (SO), this thesis takes the quality of crowdsourced contributions as its starting point, analyzes the crowd synergy mechanism, studies best practices for crowdsourced contributing, and explores the mechanism design of crowdsourced communities. This thesis has achieved the following innovative results:

1. For the internal code quality of casual contributors in GitHub, an internal code quality measurement method based on static code quality issues is proposed, the patterns of internal code quality of developers in different roles are discovered, and empirical evidence is provided for the necessity of continuous inspection as an internal quality assurance paradigm in the crowdsourced development community. First, the thesis defines the concept of the casual contributor and proposes the code quality issue as a measure of the internal quality of code. The thesis selects the 21 most popular OSS projects in GitHub and scans the entire version history of each project with the popular static analysis tool SonarQube to obtain the density of code quality issues introduced by each developer to a specific project. Further, the thesis quantitatively analyzes the difference in contribution quality between casual contributors and main contributors, as well as the difference in contribution quality of the same developer when she plays different roles in different projects. In addition, the thesis conducts an online survey of 81 developers in the community to understand their perceptions of, and the challenges they encounter in, internal quality assurance and management. The thesis draws the following major conclusions: (1) the internal quality of the code introduced by casual contributors is significantly lower than that of main contributors; (2) when developers play different roles in different projects, the internal quality of the code they introduce is not significantly different; (3) most core developers and casual contributors value the internal quality of code, while different core developers focus on different aspects of internal quality and tend to check internal quality manually; (4) obtaining a comprehensive understanding of the project is a challenge for casual contributors, and the varying internal quality requirements of different projects puzzle them. Based on these findings, the thesis discusses the necessity of continuous inspection as an internal quality assurance paradigm for the crowdsourced development community.

2. For the gamification-influenced participation motivations and contributions of developers in SO, a gamification-influenced participation motivation model is proposed, and the inherent patterns linking developers' characteristics, participation motivations, and contribution outcomes are discovered, which provide guidance and method support for mechanism design and community building in crowdsourced Q&A communities. First, the thesis refers to research on developer motivation in the crowdsourced development community and draws on self-determination theory in psychology. The thesis conducts a first-round survey of 282 developers to determine the participation motivations in SO. Based on the survey results, the thesis then conducts a second-round main survey of 656 respondents in SO to verify the completeness of the discovered motivations. Further, by connecting the data of the second-round survey with the SO platform data, the thesis models developers' motivations and contributions using partial least squares regression, analyzes the relationship between individual characteristics and participation motivation, and studies how diverse developer motivations affect the quantity and quality of their contributions. The thesis draws the following major conclusions: (1) in spite of the presence of gamified incentives, the main motivation of developers to contribute in SO is intrinsic motivation; (2) developers' external, integrated, and identified motivations are significantly negatively correlated with their development experience, and developers with more development experience are less motivated by gamified incentives; (3) the extrinsic motivation regarding career prospects and the intrinsic motivations regarding helping others and self-improvement are significantly positively correlated with the quantity and quality of contributions, and the incentive effect of the extrinsic motivations is approximately twice that of the intrinsic motivations; (4) developers who often participate in OSS communities have more perseverance in solving difficult problems and provide high-quality questions and answers; (5) the satisfaction of the needs for ability and autonomy has a positive impact on the quality and quantity of developer contributions and on the effort invested in solving difficult problems.

3. For the gamification-influenced fast answer (FA) phenomenon and its influence on contribution quality in SO, the patterns of development of the FA phenomenon and developers' practices regarding FAs are discovered, and asker-oriented and crowd-oriented answer quality assessment models are proposed, which provide practical advice for the mechanism design of crowdsourced Q&A communities as well as for developers' best practices. First, the thesis collects 10 years of SO data, starting from its launch in 2008, and quantitatively analyzes the popularity of the FA phenomenon and developers' practices regarding FAs. Further, the thesis conducts statistical analysis and regression modeling on the quality indicators of FAs, and analyzes the relationship between FAs and the crowd-oriented and asker-oriented quality assessments. Based on the results of the quantitative analysis, the thesis finally conducts a qualitative analysis of 300 FA instances to understand the relationship between FAs and problem solutions. The thesis draws the following major conclusions: (1) developers who actively post FAs account for only a small proportion of all developers, but they contribute more than half of the answers in SO; (2) the quality of FAs is generally low: the length and readability of FAs are significantly lower than those of non-FAs, and although the crowd-oriented quality assessment of FAs is higher than that of non-FAs, FAs have no significant relationship with asker acceptance; (3) most FAs provide solutions to problems, but 14% of the FAs solve the problems by interacting with the askers in comments.

4. For the crowdsourced learning (CL) phenomenon in GitHub, the development trend of the CL phenomenon and the learners' CL practices are discovered, a CL model is proposed, and a supporting platform, LearnerHub, is developed, which provide practical guidance and platform support for platform design and practice. First, the thesis uses regular expressions to extract the learning projects on GitHub from the GHTorrent dataset and analyzes the development trends of these learning projects. The thesis selects the 105 most popular learning projects and analyzes the CL activities of the learners in these projects. Further, in order to understand practitioners' CL behaviors and their benefits and challenges, the thesis conducts an online survey of 40 core learners and 261 external learners who participated in these popular learning projects. The thesis draws the following major conclusions: (1) the CL phenomenon in GitHub is becoming more and more popular, and the ratio of the annual growth of learning projects to that of all projects is steadily increasing; these learning projects attract more participants, and the number of watchers exhibits exponential growth over several months; (2) the learning projects show different characteristics from those of OSS projects, e.g., the learning projects have very few long-term contributors; (3) the learners benefit from the contributions of the community, high-quality learning content, and personalized learning; (4) the learners face challenges in platform support, quality assurance, and maintaining learning motivation; (5) the thesis proposes the conceptual model and the learner behavior model of CL; and (6) the thesis develops the CL platform LearnerHub.
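As a rough illustration of the issue-density measure mentioned in result 1, the sketch below computes a density for each developer and project and summarizes it by role. The abstract does not give the exact formula, so the normalization (issues per 1,000 added lines), the record layout, the role labels, and all sample values are assumptions made for this example; they are not the thesis's method or data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (developer, project, role, issues_introduced, lines_added).
# All values are invented for illustration; they are not data from the thesis.
RECORDS = [
    ("alice", "proj-a", "main", 4, 2000),
    ("bob", "proj-a", "casual", 3, 300),
    ("carol", "proj-b", "casual", 1, 50),
    ("dave", "proj-b", "main", 10, 10000),
]


def issue_density(issues, lines_added):
    """Return issues per 1,000 added lines (an assumed normalization)."""
    if lines_added <= 0:
        raise ValueError("lines_added must be positive")
    return 1000.0 * issues / lines_added


def median_density_by_role(records):
    """Group per-developer, per-project densities by role and take the median."""
    by_role = defaultdict(list)
    for _developer, _project, role, issues, lines_added in records:
        by_role[role].append(issue_density(issues, lines_added))
    return {role: median(values) for role, values in by_role.items()}


if __name__ == "__main__":
    print(median_density_by_role(RECORDS))
```

A real analysis would draw the issue counts from the static analysis results and apply a significance test before comparing roles; the abstract does not name the test used, so none is shown here.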

Keywords/Search Tags:

Crowdsourced Development, Question and Answer Community, Contribution Quality, Casual Contributor, Gamification, Incentive Mechanism, Participation Motivation, Fast Answer, Crowdsourced Learning

Related items

1	Research On Question-type Sensitive Answer Summarization In Community Question Answering
2	Finding Experts In Community Question Answering
3	Incentive Mechanism Design For Crowdsourced WIFI Community Network
4	Research On Differentially Private Mechanisms In The Utilization Of Crowdsourced Preference Data
5	Research On The Evaluation Of Answer Quality In Q&A Communities Based On Multiple Models
6	Answer Selection For Non-factoid Question
7	Mutual Promotion Of Question Retrieval And Answer Ranking In Community Question Answering
8	Research And Application Of Question Classification And Answer Evaluation In Community Question Answering System
9	Shared Economic Perspective To Pay Question And Answer Platform Of The Network Transmission Mechanism Research
10	Candidate Answer Sentences Selection Based On Deep Learning