
Design And Implementation Of Intelligent Text Annotation Platform For Multi-user Collaboration

Posted on: 2020-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: J F Wang
Full Text: PDF
GTID: 2428330575459714
Subject: Computer Science and Technology
Abstract/Summary:
In the era of artificial intelligence, natural language processing (NLP) technology is applied ever more widely across many fields. Various algorithms, such as machine learning and deep learning methods, are used to solve NLP problems. However, current unsupervised algorithms still cannot achieve good results in specific domains such as medicine, e-commerce, and finance. Semi-supervised and supervised algorithms require labeled data, and different algorithms have different requirements for the quality and quantity of that data. There are three main approaches to data annotation for NLP tasks: crowdsourced annotation, expert annotation, and algorithmic annotation. All three have limitations and cannot meet the full variety of data-labeling requirements: expert labeling is expensive, while the quality of crowdsourced and algorithmic labeling is low.

In response to these problems, this paper designs a three-stage annotation framework building on existing research, and implements an intelligent, multi-user collaborative annotation platform based on that framework. Specifically, the research content and innovations of this paper are:

(1) For general annotation tasks, we propose a three-stage annotation framework based on active learning. Annotation tasks are completed step by step by a pre-labeling algorithm, common users, and expert users. The algorithm's pre-labeling improves efficiency, and an interactive error-feedback mechanism iteratively improves users' labeling accuracy.

(2) We design a task assignment mechanism based on user accuracy and task similarity. It improves the match between users and tasks, and thereby the accuracy and efficiency of labeling. An assignment algorithm based on task similarity and user preference is implemented for several specific text annotation tasks.

(3) Based on the above framework and algorithms, this paper builds an interactive Web 2.0 annotation platform. Both the framework and the platform are modular and componentized, and many of the algorithm components can be flexibly extended to support other kinds of annotation tasks.

(4) Through simulation experiments and user surveys, we verify the effectiveness of the framework in improving labeling efficiency and quality.
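The three-stage routing described in (1) can be sketched as uncertainty-based triage: items the pre-labeling model is confident about keep the model's label, uncertain items go to common users, and highly uncertain items go to experts. The thesis does not specify its uncertainty measure or thresholds, so the entropy criterion, the `low`/`high` cutoffs, and all names below are illustrative assumptions, not the author's actual design:

```python
import math

def entropy(probs):
    """Shannon entropy (nats) of a class-probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def route_items(predictions, low=0.3, high=1.0):
    """Route each item to one of three annotation stages by model uncertainty.

    `predictions` maps item id -> class-probability list from the
    pre-labeling model. Thresholds `low`/`high` are illustrative only.
    """
    routed = {"auto": [], "crowd": [], "expert": []}
    for item, probs in predictions.items():
        h = entropy(probs)
        if h < low:
            routed["auto"].append(item)    # confident: keep the model's label
        elif h < high:
            routed["crowd"].append(item)   # uncertain: send to common users
        else:
            routed["expert"].append(item)  # very uncertain: send to experts
    return routed

predictions = {
    "doc1": [0.98, 0.01, 0.01],  # low entropy -> auto
    "doc2": [0.60, 0.35, 0.05],  # medium entropy -> crowd
    "doc3": [0.50, 0.30, 0.20],  # high entropy -> expert
}
print(route_items(predictions))
```

In an active-learning loop, the crowd and expert labels collected this way would be fed back to retrain the pre-labeling model, shrinking the uncertain set over time.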
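The assignment mechanism in (2) scores each candidate user by combining historical labeling accuracy with the similarity between the new task and tasks the user has handled before. The thesis does not give its similarity measure or scoring formula, so the bag-of-words cosine similarity, the `accuracy × similarity` score, and the example data below are a minimal hypothetical sketch of the idea:

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two texts as bag-of-words frequency vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def assign_task(task_text, users):
    """Pick the user whose labeling history best matches the task.

    `users` maps user id -> (historical accuracy in [0, 1],
    concatenated text of tasks the user previously labeled).
    Score = accuracy * similarity, an assumed combination.
    """
    def score(user):
        accuracy, history_text = users[user]
        return accuracy * cosine_similarity(task_text, history_text)
    return max(users, key=score)

users = {
    "annotator_a": (0.95, "drug dosage clinical trial symptoms diagnosis"),
    "annotator_b": (0.80, "product review shipping refund seller rating"),
}
print(assign_task("clinical symptoms of a new drug", users))  # -> annotator_a
```

A production version would likely replace the bag-of-words vectors with TF-IDF or learned embeddings, but the matching logic stays the same: weight topical fit by demonstrated reliability.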
Keywords/Search Tags:corpus annotation, human-machine cooperation, crowdsourcing quality control, active learning, text similarity