
Influence Analysis Of Multi-label Classification On Community Smell Co-occurrence

Posted on: 2022-11-12
Degree: Master
Type: Thesis
Country: China
Candidate: D Guo
Full Text: PDF
GTID: 2518306749483334
Subject: Master of Engineering
Abstract/Summary:
Community smells are suboptimal organizational structures and socio-technical problems in software development communities. They weaken the cohesion of communication and cooperation among community members, add development and maintenance costs, and harm the maintainability, understandability, and testability of the software system. Community smells are widespread in open source software communities and can be evaluated and predicted with a set of community technical indicators. However, these indicators are quantified for a single community smell and do not account for the more complex case in which multiple community smells co-occur.

To address community smell co-occurrence, this thesis proposes a machine-learning-based multi-label classification (MLC) model that quantifies and analyzes the impact of community technical indicators on the co-occurrence of multiple community smells. Working on the multi-label label set, the model compares four basic machine learning classification algorithms for classifying and predicting the co-occurrence and intensity of multiple community smells. The experiment covers 243 project versions from 6 open source communities, 18 community technical indicators, and 4 community smells, and studies how the co-occurrence and intensity of multiple community smells change as project versions evolve. Model performance is evaluated with multi-label evaluation metrics. The results show that the decision-tree-based model performs best at predicting the co-occurrence and intensity of multiple community smells, with both reaching more than 80%.

The thesis uses the Information Gain algorithm and the SHAP interpretability tool to analyze how the community technical indicators influence the model's predictions of the co-occurrence and intensity of community smells. Information Gain yields a feature contribution value, computed from the reduction in entropy attributable to a feature; SHAP yields a SHAP value, computed from the feature's marginal contribution. The feature contribution values show the impact of each community technical indicator on community smell co-occurrence, and the contributions are ranked using the SK-ESD test. The SHAP values show whether each community technical indicator is positively or negatively associated with the predicted result, and the indicators are plotted in descending order of marginal contribution. Community technical consistency (socio-technical congruence) ranks first in both contribution value and SHAP value, which confirms the importance of this feature for community smell co-occurrence.

The thesis then examines community technical consistency, the feature with the highest contribution value and interpretability for community smell co-occurrence: how it changes across project versions, and how it is associated with the future success of the community. Correlation analysis shows that the two are strongly related. Communities whose community technical consistency stays stable or improves across versions have a higher probability of future project success; conversely, the more a community's community technical consistency fluctuates, the higher its future project failure rate. Based on the correlation analysis of community technical consistency and community smells, the thesis analyzes and verifies a threshold of community technical consistency for predicting project success: when community technical consistency reaches 0.5 or more, the project is often successful. This result confirms the effectiveness of the method.
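
The information gain step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes that the multi-label set is handled by treating each combination of smells as one combined class (a label powerset transformation), and it scores a feature by the entropy reduction from a single threshold split. The data, the feature names `congruence` and `team_size`, and the split thresholds are all invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of hashable class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def label_powerset(Y):
    """Map each multi-label row, e.g. [1, 0, 1, 0], to one combined class."""
    return [tuple(row) for row in Y]


def information_gain(feature, classes, threshold):
    """Entropy reduction from splitting `classes` on feature <= threshold."""
    left = [c for x, c in zip(feature, classes) if x <= threshold]
    right = [c for x, c in zip(feature, classes) if x > threshold]
    n = len(classes)
    # Weighted entropy that remains after the split (empty sides are skipped).
    remainder = sum(len(p) / n * entropy(p) for p in (left, right) if p)
    return entropy(classes) - remainder


# Toy data, invented for illustration: six project versions, two
# hypothetical indicators, and four smell labels per version (1 = present).
congruence = [0.2, 0.3, 0.4, 0.6, 0.7, 0.8]
team_size = [5, 12, 7, 6, 11, 8]
smells = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

classes = label_powerset(smells)
gains = {
    "congruence": information_gain(congruence, classes, threshold=0.5),
    "team_size": information_gain(team_size, classes, threshold=7.5),
}
# Rank the indicators from most to least informative.
ranking = sorted(gains, key=gains.get, reverse=True)
print(ranking, gains)
```

In this toy data the `congruence` split separates the smell-free versions from the rest, so it receives the larger gain and ranks first. A real analysis would search over thresholds (or use a library implementation) and apply a statistical ranking such as the SK-ESD test to the resulting contribution values.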
Keywords/Search Tags: community smells, multi-label, co-occurrence, socio-technical congruence, information gain