
Crowdsourcing For Synonyms Proofreading And Acquisition In Chinese Large-scale Semantic Knowledge Base

Posted on: 2015-01-27 | Degree: Master | Type: Thesis
Country: China | Candidate: H H Wang
GTID: 2298330431493880 | Subject: Computer technology
Abstract/Summary:
Large-scale Chinese semantic knowledge bases are an indispensable basic resource for Chinese natural language processing, and they are widely used in information retrieval, automatic summarization, question answering, and related areas. Building such a knowledge base by traditional methods, however, demands a great deal of manpower and time. To make construction more efficient and to explore new ways of building resources, this work recasts one part of that construction, synonym proofreading and acquisition, as a crowdsourcing task. The task design was then refined and its soundness verified through a series of experiments. We also evaluated the results of the synonym crowdsourcing experiment by comparing them with other dictionary resources and by computing word semantic similarity.

First, this paper designed a dictionary resource integration tool that merges the valuable dictionary resources collected so far through automatic integration followed by manual proofreading. This produced the initial version of the large-scale Chinese semantic knowledge base.

Second, to improve the completeness of the information in the knowledge base, we designed synonym proofreading and acquisition as a crowdsourcing task with a quality control mechanism. Experiments were used to refine the task design, verify the quality control mechanism, and collect the crowdsourced synonym results.

Third, this paper examined the results of the synonym crowdsourcing experiment in two ways. To analyze their strengths and weaknesses, we compared them with ‘Synonym cilin’, which was compiled by linguistic experts. To quantify their quality, we calculated the word similarity of the crowdsourced synonyms using a similarity algorithm based on ‘HowNet’.

Finally, the paper summarized the study and proposed further optimizations based on the existing results of the synonym crowdsourcing experiment. We also tried deploying the crowdsourcing task on a more open platform for testing and explored the application prospects of crowdsourcing technology in the construction of Chinese semantic knowledge bases.
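The comparison of crowdsourced synonyms against an expert-compiled resource, as described in the abstract, can be illustrated with a minimal sketch. The abstract does not specify how the comparison was carried out, so everything here is an assumption made for illustration: the function names `reference_pairs` and `compare_with_reference`, the pair-overlap measure, and the toy English word lists (which stand in for real dictionary entries and are not taken from the thesis).

```python
from itertools import combinations


def reference_pairs(groups):
    """Expand reference synonym groups into a set of unordered word pairs."""
    pairs = set()
    for group in groups:
        # Sorting makes each pair canonical, so (a, b) and (b, a) coincide.
        for a, b in combinations(sorted(set(group)), 2):
            pairs.add((a, b))
    return pairs


def compare_with_reference(crowd_pairs, groups):
    """Split crowdsourced pairs into those the reference confirms and those it lacks."""
    ref = reference_pairs(groups)
    crowd = {tuple(sorted(p)) for p in crowd_pairs}
    confirmed = crowd & ref   # pairs also found in the reference
    novel = crowd - ref       # pairs absent from the reference (new or wrong)
    agreement = len(confirmed) / len(crowd) if crowd else 0.0
    return confirmed, novel, agreement


# Toy data, invented for illustration only (hypothetical, not from the thesis).
toy_groups = [["happy", "glad", "joyful"], ["big", "large"]]
toy_crowd = [("glad", "happy"), ("large", "big"), ("big", "huge"), ("happy", "big")]

confirmed, novel, agreement = compare_with_reference(toy_crowd, toy_groups)
print(sorted(confirmed))
print(sorted(novel))
print(agreement)
```

Pairs that land in `novel` are not necessarily errors: some may be valid synonyms that the reference lacks, which is why a second check such as a semantic similarity score is useful alongside the overlap.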
Keywords/Search Tags: Natural Language Processing, Construction of Semantic Knowledge Base, Crowdsourcing, Synonym Discrimination, Semantic Similarity