
Design And Implementation Of Data Annotation Crowdsourcing Platform System

Posted on: 2021-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Wu
Full Text: PDF
GTID: 2428330614971528
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, machine learning and deep learning have made great progress. Data has always been at the core of the Internet age, but much raw data yields little value until it is labeled, so manual data annotation has become important work. Fairly mature data annotation crowdsourcing platforms now exist both in China and abroad, and the foreign platforms are the more mature of the two. However, because of network access and language barriers, the foreign platforms have few domestic users and are impractical for frequent annotation tasks. Domestic platforms, meanwhile, are of uneven quality, and all of them lack annotators with a linguistic background. There is therefore a need for a data annotation crowdsourcing platform aimed at users with some linguistic background, with complete functionality, on which annotation tasks are convenient to publish, accept and complete.

The platform's users fall into two categories: consumers, who publish tasks, and contributors, who accept them. A task publisher selects a task template and, once the task is published, can monitor it and manage tasks in batches. A contributor earns growth points by completing tasks, and the level of task the contributor may accept rises as the contributor's own level rises. Consumers submit the raw data, receive the labeled data once contributors have annotated it, and pay a fee for the work.

The project is developed in Java and covers single sign-on and authentication, task management, order management, task search, user management, quality control and other functions. I was mainly responsible for the design and development of the login authentication, task management, order management, task search, user management and quality control services. Development is based mainly on the Spring Cloud and Spring Boot frameworks, combined with MyBatis to form the basic framework, with MySQL, Redis and Elasticsearch as data stores and several other established frameworks alongside them. The design follows the principles of high performance, high stability and high scalability.

The project is now in daily use at the company, where it makes data collection and data annotation more convenient and improves the efficiency and quality of the annotated data. More high-quality manually annotated data has been put to use in linguistics-related research, contributing to further academic results.
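
The growth-point mechanism mentioned in the abstract (contributors earn points by completing tasks, and a higher level unlocks higher-level tasks) could look roughly like the following Java sketch. This is only an illustration under stated assumptions: the class names, the point thresholds and the reward values are all hypothetical and are not taken from the thesis.

```java
public class GrowthLevelSketch {

    // Hypothetical annotation task: the level needed to accept it
    // and the growth points awarded on completion.
    static class AnnotationTask {
        final String name;
        final int requiredLevel;
        final int rewardPoints;

        AnnotationTask(String name, int requiredLevel, int rewardPoints) {
            this.name = name;
            this.requiredLevel = requiredLevel;
            this.rewardPoints = rewardPoints;
        }
    }

    // Hypothetical contributor whose level is derived from growth points.
    static class Contributor {
        // Assumed thresholds: points needed to reach levels 1, 2, 3 and 4.
        static final int[] LEVEL_THRESHOLDS = {0, 100, 300, 600};

        int growthPoints = 0;

        // Highest level whose threshold has been reached.
        int level() {
            int level = 1;
            for (int i = 0; i < LEVEL_THRESHOLDS.length; i++) {
                if (growthPoints >= LEVEL_THRESHOLDS[i]) {
                    level = i + 1;
                }
            }
            return level;
        }

        // A task is acceptable only if the contributor's level is high enough.
        boolean canAccept(AnnotationTask task) {
            return task.requiredLevel <= level();
        }

        // Completing an acceptable task adds its reward to the growth points.
        void complete(AnnotationTask task) {
            if (!canAccept(task)) {
                throw new IllegalStateException("Level too low for task: " + task.name);
            }
            growthPoints += task.rewardPoints;
        }
    }

    public static void main(String[] args) {
        Contributor contributor = new Contributor();
        AnnotationTask easy = new AnnotationTask("part-of-speech tagging", 1, 60);
        AnnotationTask hard = new AnnotationTask("syntax tree annotation", 2, 150);

        System.out.println("Can accept hard task at start: " + contributor.canAccept(hard));
        contributor.complete(easy);
        contributor.complete(easy);
        System.out.println("Level after two easy tasks: " + contributor.level());
        System.out.println("Can accept hard task now: " + contributor.canAccept(hard));
    }
}
```

Deriving the level from the point total, rather than storing it separately, keeps the two values from drifting apart. A production service would presumably persist the points and apply the check on the server side when a task is accepted.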
Keywords/Search Tags: Data Annotation, Crowdsourcing Platform, Elasticsearch Framework, Single Sign-On, Distributed Architecture