
Study And Implementation Of Multidimensional Open Source Crowdsourcing Code Annotation Evaluation Method

Posted on: 2021-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: R M Wang
Full Text: PDF
GTID: 2518306548494654
Subject: Software engineering
Abstract/Summary:
As a product of collaborative group innovation, the rapid development of open source software projects has accumulated massive high-quality resources, providing a solid foundation for learning and practicing software innovation. However, the rapid iteration and evolution of these projects also bring challenges to the retrieval and reuse of project resources. Most search engines and open source communities currently retrieve open source code through keyword search, but the keywords are mostly drawn from the code itself. As a result, when developers do not know how to implement a method, they cannot search for the snippet they want. If the quality of a project's comments is low, the comments offer little help, and users must spend a great deal of time analyzing the code themselves. Effectively evaluating and improving code comment quality is therefore an important way to improve code reuse, development efficiency, and software retrieval efficiency.

Based on excellent open source projects on GitHub and their code comments, we construct a comment quality evaluation method that combines code structure and comment semantics. In addition, we built CodePedia, an online code annotation platform, and organized a large-scale code annotation competition on it; through a carefully designed scoring mechanism, the quality evaluation method was used to assess the comments produced in the competition. The main contributions of this paper include the following three aspects.

First, a comment importance evaluation method based on code structure features. Relying on the well-defined structure and semantics of code, the method extracts code structure features and code semantic features from the context of the current line of code as the main basis for judging comment importance, and uses them to train a comment importance evaluation model.

Second, a multidimensional crowdsourced comment quality evaluation method based on code semantics, covering the readability, completeness, and accuracy of code comments. Accuracy is assessed with keywords extracted from the comments together with code syntax analysis. For readability, we build an N-gram language model and derive a readability formula from its perplexity. For completeness, we extract the keywords expected for each comment type according to the conventions of that type.

Third, on the application side, we built CodePedia, a reusable code retrieval system based on group intelligence. Through careful design of the game content, competition stages, and rating mechanism, we successfully hosted a national open source code annotation contest on the CodePedia platform. In addition, we designed an annotation processing pipeline based on clone detection and useless-comment filtering, and integrated expert scoring with the comment importance and comment quality evaluation methods.
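The first contribution trains a model on structural and semantic features drawn from the context of a code line to estimate how much that line needs a comment. The sketch below illustrates only the general idea under stated assumptions: the feature set and the logistic-regression classifier are hypothetical stand-ins, since the abstract does not specify the exact features or model used in the thesis.

```python
# Hypothetical sketch of a structure-based comment importance model.
# Features and classifier are illustrative assumptions, not the thesis's method.
import re
from sklearn.linear_model import LogisticRegression

CONTROL = re.compile(r"\b(if|for|while|switch|try|catch)\b")

def structural_features(line, context):
    """Simple structural features of a code line and its surrounding context."""
    return [
        len(line),                                    # line length
        line.count("(") + line.count("{"),            # syntactic density
        len(line) - len(line.lstrip()),               # indentation as a nesting proxy
        1 if CONTROL.search(line) else 0,             # control-flow statement on this line
        sum(1 for c in context if CONTROL.search(c)), # control flow in the context window
    ]

# Training data: (line, context window, label) triples, where the label marks
# whether the line deserves a comment; labels would come from well-commented code.
samples = [
    ("if (user == null) {", ["return cache.get(key);", "}"], 1),
    ("int i = 0;", ["for (int j = 0; j < n; j++) {"], 0),
]
X = [structural_features(line, ctx) for line, ctx, _ in samples]
y = [label for _, _, label in samples]

model = LogisticRegression().fit(X, y)
print(model.predict([structural_features("while (!queue.isEmpty()) {", [])]))
```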
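The readability dimension is derived from the perplexity of an N-gram language model over the comment text. A minimal sketch of this idea follows, assuming a bigram model with add-k smoothing and an illustrative mapping from perplexity to a score in (0, 1]; the thesis's exact formula, training corpus, and smoothing are not given in the abstract, so the functions train_bigram, perplexity, and readability below are hypothetical.

```python
import math
from collections import Counter

def train_bigram(corpus_sentences):
    """Count unigrams and bigrams over a list of tokenised comment sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in corpus_sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def perplexity(tokens, unigrams, bigrams, vocab_size, k=1.0):
    """Perplexity of one comment under the bigram model with add-k smoothing."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + k) / (unigrams[prev] + k * vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(padded) - 1))

def readability(tokens, unigrams, bigrams, vocab_size):
    """Map perplexity to (0, 1]: lower perplexity means a more readable comment."""
    return 1.0 / (1.0 + math.log(perplexity(tokens, unigrams, bigrams, vocab_size)))

# Example: train on existing high-quality comments, then score a new one.
corpus = [["returns", "the", "sum", "of", "two", "integers"],
          ["checks", "whether", "the", "given", "list", "is", "empty"]]
uni, bi = train_bigram(corpus)
vocab = len(set(t for s in corpus for t in s)) + 2  # include <s> and </s>
print(readability(["returns", "the", "sum"], uni, bi, vocab))
```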
Keywords/Search Tags:Code Comment, Code Comment Quality, Open Source, Crowdsourcing