Research On The Inherent Mechanisms And Approaches Of Efficient Aggregation Of Crowd Contribution For Open Source Development Ecosystem

Posted on: 2020-06-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 1368330611993111
Subject: Software engineering
Abstract/Summary:
The open source development paradigm has brought huge changes to the software development process. Developers around the world are free to participate in software development and to interact and collaborate over the Internet. They can contribute to multiple projects at the same time and use different development tools, services, and platforms along the way. A contribution to a project is no longer limited to code; it also covers discussion, fixing, management, testing, deployment, and other aspects of development. Today, the deep integration of the crowd-based open source innovation paradigm with enterprise-level software production technology makes people, products, data, and other elements of open source development interact closely with the collaborative environment. As a result, open source software has evolved into an open source development ecosystem. Within this ecosystem, the active involvement and sustained contribution of large-scale crowds are the key factors and driving forces behind the continuous growth of open source software. However, the continuous influx of crowd contributions is huge, highly fragmented, and diverse, and the development resources involved in contribution management exchange information in complex and varied ways. Traditional methods of processing and publishing contributions therefore struggle to meet demand, which seriously reduces the efficiency of aggregating crowd contributions. Exploring efficient integration mechanisms for development resources, improving the efficiency of crowd collaboration, management and maintenance, and resource reuse, and building a continuously evolving collaborative environment have thus become urgent needs for the efficient aggregation of crowd contributions in the open source development ecosystem.

This dissertation focuses on the efficiency of crowd contribution aggregation in the open source development ecosystem. Based on the large body of software development data accumulated by the open source community, it carries out data-driven empirical analysis and method research from four aspects: development tasks, stakeholders, development information, and the collaborative environment. It analyzes in depth the intrinsic mechanisms and best practices behind the efficient aggregation of crowd contributions. The main work and contributions are summarized as follows.

Firstly, regarding the impact of development task management on the efficiency of contribution processing during contribution planning, the usage patterns of open source management with the milestone tool are discovered, and multi-dimensional models of the benefits of milestone-based management are proposed. These provide practical advice for managing crowd intelligence and improving contribution processing efficiency in the open source development ecosystem. We first use quantitative methods to summarize the basic characteristics of milestone tool usage in task management, such as processing time cost and the relationship with project characteristics, and use a survey to analyze the motivations and unmet needs of developers who use the milestone tool. We then use mixed-effects linear regression to construct a contribution resolution time model, a code productivity model, a release quantity model, and a project popularity model. The effects of factors such as project age, team size, milestone usage frequency, and milestone settings are quantitatively analyzed along multiple dimensions, and the best practices in each dimension are identified and summarized.

Secondly, regarding the influence of stakeholder interaction on the efficiency of contribution processing during contribution reviewing, the interaction rules that internal and external contributors follow when using social tools are discovered, and a method for modeling socialized collaborative networks from interaction data and a method for measuring developers' socialized collaborative capability are proposed. These provide theoretical guidance and method support for rapid contribution reviewing and efficient collaboration among contributor groups in the open source development ecosystem. From a micro perspective, we first analyze the usage locations, scenarios, and stakeholder patterns of social tools in the contribution reviewing process, and quantitatively analyze the impact of social tool usage on contribution processing efficiency. Then, based on large-scale stakeholder interaction data, we construct a socialized collaborative network among stakeholders. From a macro perspective, we analyze the attributes, stability, and evolution characteristics of this network, and propose a PageRank-based metric of developers' socialized collaboration for identifying the influential contributors among project stakeholders.

Thirdly, regarding the impact of development information linking on the efficiency of contribution processing during contribution sharing, the rules of internal and external development information linking are found, and an embedding-based development information linking method is proposed. These provide practical advice and method support for improving the efficiency of contribution processing and the linking of development information resources in the open source development ecosystem. We first use qualitative methods to summarize the different patterns of linking practice that stakeholders follow when sharing development information during contribution processing, and then quantitatively analyze the frequency, evolution, and other characteristics of these linking patterns. Next, we use multiple linear regression to build a contribution resolution latency model and a contribution discussion length model, quantitatively exploring the impact of different link factors on the contribution resolution process. Finally, based on information retrieval technology and embedding models, we propose a hybrid development information linking method that automatically recommends relevant internal development information within the core project.

Fourthly, regarding the impact of collaborative environment construction on the efficiency of contribution publishing during contribution testing and deployment, two continuous deployment workflows are discovered, one based on auto-builds and one based on continuous integration services, and a clustering-based method for modeling the evolutionary trajectories of continuous deployment configuration files is proposed. These provide effective support for the efficient integration of automation tools, services, and platform resources, and help improve the efficiency of contribution publishing in the open source development ecosystem. We first use a survey to analyze the motivations, construction methods, unmet needs, and other experiences of developers who construct continuous deployment workflows. We put forward a number of hypotheses about factors that may affect the efficiency and quality of contribution publishing, and test them with multiple regression analysis along four dimensions: release frequency, build results, configuration stability, and build latency. We summarize the actual differences and trade-offs between the continuous deployment workflows. Furthermore, we explore how the configuration files of continuous deployment workflows evolve, propose a configuration file evolutionary trajectory model based on a clustering algorithm, and use multiple regression analysis to study the impact of configuration details and different evolutionary patterns on publishing efficiency and configuration file quality.

In summary, this dissertation studies the internal mechanisms and key methods of efficient aggregation of crowd contributions in the open source development ecosystem across the planning, reviewing, sharing, testing, and deployment of contributions. It has important theoretical value for the analysis of open source ecosystems and crowd software development in the Internet age, and important practical value for software ecosystem modeling, development resource integration, and development configuration optimization.
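
As an illustration of the PageRank-based identification of influential contributors mentioned in the second contribution, the following is a minimal sketch only. The abstract does not say how the collaboration network's edges or weights are defined, so the edge meaning used here (a commenter pointing to the author of the contribution they commented on, weighted by comment count), the contributor names, the damping factor, and the iteration count are all assumptions made for illustration, not the dissertation's actual metric.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over weighted directed edges (src, dst, weight)."""
    nodes = sorted({n for s, d, _ in edges for n in (s, d)})
    n_nodes = len(nodes)

    # Total outgoing weight per node, used to normalize each edge's share.
    out_weight = defaultdict(float)
    for s, _, w in edges:
        out_weight[s] += w

    rank = {n: 1.0 / n_nodes for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        base = (1 - damping) / n_nodes + damping * dangling / n_nodes
        new_rank = {n: base for n in nodes}
        for s, d, w in edges:
            new_rank[d] += damping * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank


# Hypothetical interaction data (assumed, not from the dissertation):
# (commenter, contribution author, number of comments)
interactions = [
    ("alice", "bob", 3),
    ("carol", "bob", 2),
    ("bob", "alice", 1),
    ("dave", "alice", 1),
    ("dave", "bob", 1),
]

ranks = pagerank(interactions)
ranking = sorted(ranks, key=ranks.get, reverse=True)
print(ranking)
```

Under these assumptions, contributors who attract interaction from many others, or from other highly ranked contributors, rise to the top of `ranking`. A real study would need to decide which interaction types count as edges and how to weight them.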
Keywords/Search Tags:Open Source Development Ecosystem, Social Coding, Continuous Integration, Continuous Deployment, Resource Integration, Crowd Collaboration, Task Management, Knowledge Reuse, Continuously Evolving