Related Entity Finding and Homepage Finding

Posted on: 2014-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Zhou
Full Text: PDF
GTID: 2248330398470981
Subject: Electronic and communication engineering
Abstract/Summary:
Related Entity Finding (REF) is an entity retrieval task of TREC (Text Retrieval Conference) and a promising research topic. Given a topic, REF requires extracting, from the Internet and related databases, the entities that answer the topic, together with the homepage of each entity. This thesis reviews the state of the art at home and abroad and several cutting-edge algorithms, and studies and analyzes each stage of the implementation: keyword extraction and expansion, text retrieval, paragraph segmentation and relevance computation, named entity recognition, entity ranking, and supporting document finding. The main work covers the following aspects:
(1) Processing of the entire page text is replaced by processing of short text passages. This removes a large amount of text content and reduces the size of the returned text, which improves the processing efficiency of the system.
(2) Based on the structural features of Wikipedia, a dictionary built on Wikipedia categories is constructed from the synonyms and hypernyms found in Wikipedia. The entity extraction component is adapted to the entity types of this year's REF task and to finer-grained features, which improves the accuracy of entity extraction.
(3) A word-density-based algorithm is added to check the results of the document-centered model (DCM), and it achieves fairly good results. The parameters in the DCM formula are also adjusted according to last year's answers, which improves the model.
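The abstract does not give the formulas behind aspects (1) and (3), so the following is only a minimal sketch of the general idea, not the thesis's method: split a page into short passages, measure how densely the topic keywords occur in each passage, and keep only the dense ones. The function names, the two-sentence passage size, and the 0.1 threshold are all hypothetical choices made for illustration.

```python
import re


def split_passages(text, max_sentences=2):
    """Split text into short passages of at most max_sentences sentences.

    The passage size is an assumed value, not one taken from the thesis.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


def keyword_density(passage, keywords):
    """Return the fraction of tokens in the passage that are topic keywords."""
    tokens = re.findall(r"\w+", passage.lower())
    if not tokens:
        return 0.0
    wanted = {k.lower() for k in keywords}
    return sum(t in wanted for t in tokens) / len(tokens)


def filter_passages(text, keywords, threshold=0.1):
    """Keep only passages whose keyword density reaches the threshold.

    The threshold is a hypothetical default and would need tuning on real data.
    """
    return [
        p for p in split_passages(text)
        if keyword_density(p, keywords) >= threshold
    ]
```

For example, calling `filter_passages(page_text, ["Boeing", "aircraft"])` on a retrieved page would discard passages that never mention the topic keywords, so that later stages such as named entity recognition run on much less text.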
Keywords/Search Tags: TREC, REF, Text Search, Correlation, Related Entity Extraction, Stanford Tools, Wikipedia