
Research On Socio-Technical Congruence Metrics For Open Source Software Quality

Posted on: 2020-05-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Q Zhang
Full Text: PDF
GTID: 1368330578963095
Subject: Software engineering
Abstract/Summary:
Coordination among developers is important in software development, but because coordination activities are complicated and difficult to capture, research in this area is limited. Cataldo et al. proposed a framework for measuring the degree to which coordination requirements are satisfied by actual coordination activities; the resulting metric is called Socio-Technical Congruence (STC). In an experiment on commercial software projects, they found that STC was significantly related to software quality. However, very few studies have examined STC in Open Source Software (OSS) projects. OSS and commercial projects follow different software processes and leave different development data in their software repositories, so the previous method of computing STC needs to be adjusted for OSS. The adjustments concern how to establish links between a pair of files, between a pair of developers, and between a file and a developer, and how to compute weighted STC. Whether the resulting STC measure is still related to software quality in OSS must be verified experimentally, and different types of STC must be compared in terms of how well they explain and predict software failures. For these reasons, this dissertation consists of three parts: a basic method for measuring file-level STC in OSS; file-level weighted STC in OSS; and build-level STC in the context of continuous defect prediction in OSS.

The first part studies how to measure file-level STC in OSS and how it relates to software failures. Unlike the previous method, the proposed method is based on individual files rather than individual development tasks. It first collects project data from the source code, commit logs, and bug reports; it then establishes file networks, developer networks, and links between files and developers from the collected data; next it computes coordination requirements and coordination activities from these links; finally it computes the STC measure. In addition, a transformed variant of STC called Missing Developer Links (MDL) is proposed to measure the amount of coordination breakdown. The experimental data come from 9 development releases of 2 OSS projects. The analysis compares 18 types of STC metrics by examining the correlation between file-level STC and the number of software failures. The 18 types are the combinations of 3 types of file networks (syntax dependency, logical dependency, and the combination of the two), 3 types of developer networks (commit overlap, comment interest, and the combination of the two), and 2 types of measure (STC and MDL). The results show that STC is still significantly related to software failures in OSS, and that MDL, the variant derived from STC, is more strongly related to software failures than STC itself. Among all combinations of file networks and developer networks, the STC or MDL derived from the logical dependency file network and the commit overlap developer network is the most strongly related to software failures.

Building on the basic method, the second part studies how to compute file-level weighted STC in OSS and its impact on software failures. The computation process is similar to that of the first part, but uses only the best combination of file network and developer network. This part aims to compare different methods of computing weighted STC. First, all types of links are assigned weights. The coordination requirement of each file can then be computed with 5 methods: the original method, which ignores weights; a method that uses the number of dependent files as the weight; a method that considers the weights of the 3 types of links; a method that considers the involved files; and a method that considers both the weights of the 3 types of links and the involved files. 3 methods are proposed for computing the degree to which coordination requirements are satisfied by coordination activities: treating a requirement as either fully satisfied or not satisfied at all; defining a satisfaction percentage based on coordination frequency; and comparing normalized coordination requirements with normalized coordination activities. This yields 15 combinations of STC. The experiment builds regression and prediction models and compares the explanatory power and predictive power of the 15 types of STC. The results show that more complicated STC measures tend to be more strongly related to software failures and to improve the performance of prediction models more. The best combination computes coordination requirements from the weights of the 3 types of links together with the involved files, and defines the satisfaction percentage based on coordination frequency.

Besides file-level STC, the third part studies the effect of build-level STC on predicting build outcomes in continuous integration. This part does not compute file-level STC from the data of a development release. Instead, build-level STC is computed just in time, as soon as a commit triggers continuous integration. When a commit triggers a build, the commit history within a particular time interval before the current time is used to compute the file-level coordination requirement (CR) of each involved file. These CRs are then merged into the CR of the build. Finally, build-level STC is obtained from the CR of the build and the coordination activities (CA) extracted from the same data. The experiment uses build outcome as the dependent variable to build binary logistic regression models on 10 GitHub projects. The results show that STC has a significant impact on build failures and can improve the performance of build outcome prediction models. The experiment also finds that build-level MDL has more explanatory power and predictive power for build failures than build-level STC.

In summary, this dissertation investigates different methods of computing STC in OSS, verifies the impact of STC on software quality by experiment, compares the impact of different types of STC on software failures, and identifies the best STC measure. The proposed method shows promise for practical use in helping software teams identify coordination issues in software development and improve software quality.
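
The unweighted file-level computation described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: a coordination requirement is taken to be a pair of distinct developers who work on the target file and on one of the files it depends on, a requirement counts as satisfied when that pair is linked in the developer network, STC is the satisfied fraction, and MDL is the count of unsatisfied pairs. The function name `file_level_stc`, the input layout, and the convention that a file with no requirements has STC 1.0 are all hypothetical.

```python
from itertools import product


def file_level_stc(target, file_deps, file_devs, dev_links):
    """Return (stc, mdl) for one file under the simplified assumptions above.

    target    -- name of the file being measured
    file_deps -- dict: file -> set of files it depends on (file network)
    file_devs -- dict: file -> set of developers linked to that file
    dev_links -- set of frozenset({dev_a, dev_b}) pairs (developer network)
    """
    owners = file_devs.get(target, set())

    # Coordination requirements: developer pairs implied by file dependencies.
    required = set()
    for dep in file_deps.get(target, set()):
        for a, b in product(owners, file_devs.get(dep, set())):
            if a != b:  # a developer does not need to coordinate with themselves
                required.add(frozenset((a, b)))

    # Coordination activities: required pairs that actually appear as links.
    satisfied = {pair for pair in required if pair in dev_links}

    mdl = len(required) - len(satisfied)  # missing developer links
    # Assumed convention: no requirements means nothing is unsatisfied.
    stc = len(satisfied) / len(required) if required else 1.0
    return stc, mdl


# Toy data, invented purely for illustration.
file_deps = {"a.c": {"b.c", "c.c"}}
file_devs = {"a.c": {"alice"}, "b.c": {"bob"}, "c.c": {"carol", "alice"}}
dev_links = {frozenset(("alice", "bob"))}

stc, mdl = file_level_stc("a.c", file_deps, file_devs, dev_links)
print(stc, mdl)
```

In this toy data, a.c requires the pairs alice-bob and alice-carol, and only alice-bob is linked, so half of the requirements are satisfied and one developer link is missing. Swapping in a different file network or developer network only changes the inputs, which is how the 18 combinations compared in the first part could be produced with one routine.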
Keywords/Search Tags: socio-technical congruence, software quality, failure prediction, developer network, coordination breakdown, continuous integration, open source software