Research On Vertical Crowdsourcing System For The Domain Knowledge Base Construction

Posted on: 2018-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Chen
GTID: 2348330518475628
Subject: Computer Science and Technology
Abstract/Summary:
As science and technology develop, domain literature and terminology dictionaries keep accumulating, and transforming unstructured domain literature into a structured knowledge base is becoming a central research problem in knowledge engineering. At present, however, the construction of domain-specific knowledge bases often lacks suitable information extraction and labeling tools. Existing manual and automatic extraction schemes suffer from high labor costs or poor results and cannot meet the needs of knowledge base construction. A vertical crowdsourcing system that combines group processing with automatic extraction is therefore of great significance for building domain knowledge bases.

This thesis focuses on two questions: how to convert existing literature and terminology information into knowledge tuples, and how to use domain literature to fill and enrich the knowledge base incrementally. To achieve these goals, we design and develop a vertical crowdsourcing system for domain knowledge base construction. The system takes domain terminology dictionaries and documents as input and supports multiple people collaborating to build the domain knowledge base.
The main contributions are as follows:
1) To obtain enough knowledge documents, we implement a simple crawler framework that automatically crawls domain documents and, based on existing domain dictionaries, crawls all relevant InfoBox information from Baidu Encyclopedia.
2) To extract domain tuple information, we propose a multi-strategy tuple extraction scheme based on HIT's LTP tools.
3) To collect users' collaborative data efficiently, we propose a macro-task-based crowdsourcing task scheduling scheme that follows the characteristics of domain documents, and we optimize the quality control strategy using a spectral method.
4) We design a vertical crowdsourcing system for constructing a geological science and technology knowledge base. It provides document information management, online collaborative editing, crowdsourcing task management, and online knowledge base retrieval.
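The abstract does not say which spectral method is used for quality control in contribution 3. As a rough illustration only, the sketch below shows one common spectral approach to aggregating binary crowd labels: treat the answers as a worker-by-task matrix with entries +1, -1, or 0 (no answer), and take the leading singular vectors, found here by power iteration. The signs of the task-side vector give label estimates and the worker-side vector gives a reliability score. The function name `spectral_aggregate`, the binary label encoding, and the use of the majority vote to fix the sign are all assumptions made for this example, not details from the thesis.

```python
import math


def spectral_aggregate(labels, iters=200):
    """Estimate binary task labels and worker scores from crowd answers.

    labels[w][t] is +1 or -1 for worker w's answer on task t, or 0 if
    the worker did not answer. Hypothetical helper for illustration.
    """
    n_workers = len(labels)
    n_tasks = len(labels[0])

    def worker_side(v):
        # u = A v: project the task vector onto each worker's answers.
        return [sum(labels[w][t] * v[t] for t in range(n_tasks))
                for w in range(n_workers)]

    # Majority vote per task, used as the starting point and to fix the sign.
    majority = [sum(labels[w][t] for w in range(n_workers))
                for t in range(n_tasks)]
    v = [float(m) for m in majority]
    if not any(v):
        v = [1.0] * n_tasks

    # Power iteration on A^T A converges to the top right singular vector.
    for _ in range(iters):
        u = worker_side(v)
        v_new = [sum(labels[w][t] * u[w] for w in range(n_workers))
                 for t in range(n_tasks)]
        norm = math.sqrt(sum(x * x for x in v_new))
        if norm == 0:
            break
        v = [x / norm for x in v_new]

    # A singular vector is defined only up to sign; assume the crowd is
    # right more often than wrong and align with the majority vote.
    if sum(m * x for m, x in zip(majority, v)) < 0:
        v = [-x for x in v]

    task_labels = [1 if x >= 0 else -1 for x in v]
    worker_scores = worker_side(v)  # large positive = agrees with consensus
    return task_labels, worker_scores


if __name__ == "__main__":
    answers = [
        [1, -1, 1, 1, -1],    # careful worker
        [1, -1, 1, -1, -1],   # one mistake
        [1, -1, -1, 1, -1],   # one mistake
        [-1, 1, -1, -1, 1],   # answers everything backwards
        [1, 1, 1, 1, 1],      # always answers +1
    ]
    estimated, scores = spectral_aggregate(answers)
    print(estimated)
    print([round(s, 2) for s in scores])
```

In this toy input, a worker who consistently answers backwards receives a negative score and a worker who always gives the same answer receives a score near zero, which is the kind of signal a quality control strategy could use to weight or filter contributors.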
Keywords/Search Tags:Domain Knowledge Base, Vertical Crowdsourcing, System Design, Task Scheduling, Crowdsourcing Quality Control Strategy