
Open Source License Selection For Maven Repository

Posted on: 2022-10-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Liu
GTID: 2518306530998309
Subject: Computer application technology
Abstract/Summary:
With the growth of open source communities and the popularity of Free and Open Source Software (FOSS), the reuse of open source components is becoming increasingly prevalent. Open source licenses affect how developers' rights and obligations are protected, as well as the development process and the application of open source projects. This thesis is supported by the National Key Research and Development Plan of China (Cloud Computing and Big Data Special Project) and studies the ecosystem and mechanisms of open source licenses. Given the variety of open source licenses and the complexity of their terms, we concentrate on the selection of open source licenses for the Maven package repository. The results help promote the compliant use of software and a reliable supply of open source software. A study of license selection at the component level is therefore significant for implementing the national development strategy for open source software.

In software engineering, existing work has focused on empirical studies of license selection, the analysis of license compatibility, and the development of license selection tools. Researchers have analyzed the usage, trends, and features of license selection in development communities such as GitHub and SourceForge, giving developers empirical evidence for license selection. A small number of studies have focused on license selection for package managers (e.g., npm, RubyGems, CRAN, PyPI). As one of the largest and oldest software repositories, the Maven repository provides a large number of mature, reusable open source projects, but it lacks an empirical analysis of license selection. Although reusing open source components has become a trend in current development, previous work ignored a key factor: software dependencies. Software dependencies can impose restrictions when choosing a license for a new software system.

We take the open source projects hosted on the Maven repository as our research object and use quantitative analysis to study the features and trends of license selection and its influence on project development speed. Based on the development dependencies of the Maven ecosystem, this thesis proposes a license selection model. The main contributions are as follows:

(1) License selection has different features and trends in different programming communities and repositories. We collect projects hosted on the Maven repository and build a data model that captures the relationships among projects, versions, and dependencies. On the basis of this dataset, three questions guide our research: (a) Which licenses are commonly used in the Maven repository? (b) How are open source licenses used in the Maven package repository? (c) Does the choice of different types of open source licenses affect the development speed of open source projects?
Before answering these research questions, we process the data as follows: 1. Counting the license distribution of projects hosted on the Maven repository and developed in Java, Scala, JavaScript, Shell, or Kotlin from 2009 to 2018. 2. Calculating the development speed of the most popular open source projects. 3. Grouping open source licenses into the categories OSI, Copyleft, Permissive, SPDX, Multiple, and Other. From the perspective of a package manager repository, we provide developers with empirical evidence for license selection.

(2) Previous work ignored a key factor: software dependencies. This thesis proposes an open source license selection model and analyzes how license selection is affected by software dependencies and license compatibility. In the proposed model, dependency data from open source projects are used to construct a dependency network and to calculate the similarity between two projects. A process then uses a modified license compatibility graph together with the user project's dependencies to detect license violations. We evaluate the license selection model on a test set randomly selected from Libraries.io. The results demonstrate the validity of the model in selecting applicable licenses.
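
The violation check and project similarity described above could be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation. The toy compatibility graph, the choice of Jaccard similarity, and every function name (`can_flow_into`, `find_violations`, `candidate_licenses`, `jaccard_similarity`) are assumptions made for the example. The abstract does not specify the modified compatibility graph or the similarity measure.

```python
from collections import deque

# Hypothetical toy compatibility graph: an edge A -> B means code under
# license A may be included in a project released under license B.
# These edges are illustrative assumptions, not the thesis's modified graph.
COMPATIBLE_INTO = {
    "MIT": {"BSD-3-Clause"},
    "BSD-3-Clause": {"Apache-2.0"},
    "Apache-2.0": {"LGPL-3.0"},
    "LGPL-3.0": {"GPL-3.0"},
    "GPL-3.0": set(),
}


def can_flow_into(src, dst, graph=COMPATIBLE_INTO):
    """Return True if dst equals src or is reachable from src in the graph."""
    if src == dst:
        return True
    seen, queue = {src}, deque([src])
    while queue:  # breadth-first search over compatibility edges
        current = queue.popleft()
        for nxt in graph.get(current, ()):
            if nxt == dst:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def find_violations(project_license, dependency_licenses, graph=COMPATIBLE_INTO):
    """List dependencies whose license cannot flow into the project's license."""
    return sorted(
        name
        for name, lic in dependency_licenses.items()
        if not can_flow_into(lic, project_license, graph)
    )


def candidate_licenses(dependency_licenses, graph=COMPATIBLE_INTO):
    """List licenses in the graph that cause no violation for these dependencies."""
    return sorted(
        lic for lic in graph
        if not find_violations(lic, dependency_licenses, graph)
    )


def jaccard_similarity(deps_a, deps_b):
    """Similarity of two projects as the overlap of their dependency sets."""
    a, b = set(deps_a), set(deps_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Usage with made-up dependency names:
deps = {"lib-a": "MIT", "lib-b": "GPL-3.0"}
print(find_violations("Apache-2.0", deps))  # lib-b cannot flow into Apache-2.0
print(candidate_licenses(deps))
```

Treating compatibility as reachability in a directed graph keeps the check to a single graph search per dependency. A fuller model would also need to handle multi-licensed projects and dependency scopes.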
Keywords/Search Tags:Open source license, Component reusing, License selection, Software dependencies, License compatibilities