English Entity Answer Extraction And Homepage Finding

Posted on:2011-05-26

Degree:Master

Type:Thesis

Country:China

Candidate:Y B Xu

Full Text:PDF

GTID:2208330332978733

Subject:Computer application technology

Abstract/Summary:

Entity answer finding is a key component of question answering (Q&A) systems and information extraction, and an important task in entity search. The entity task of TREC 2009 requires extracting relevant answer entities and their homepages from the Internet and a related data set, using the properties, type, and context of entities. How to use natural language information effectively to retrieve texts, passages, and answers has therefore become a core issue. This thesis studies the key steps in implementing English entity answer extraction: query expansion, passage segmentation, text and passage relevance calculation, named entity recognition, answer entity extraction, table-based answer extraction, and homepage finding. The main points are as follows:

1. A method of entity answer extraction for the TREC entity task is put forward. The method considers the text, passage, and entity relevance of a candidate answer. Text relevance is the similarity between the webpage title and the query. Passage relevance is the similarity between the sentences of a paragraph and the query. Entity relevance is a score based on the distribution density of entities and query words in the passage. The combined score of a candidate entity is a linear combination of these three scores, and the entity with the highest score is extracted as the final answer. Experimental results on the TREC 2009 entity task show that the method works well, reaching an NDCG of 0.30.

2. A method for extracting entity answers from tables in the TREC task is provided. Entity recognition has low precision on table elements because they lack context, so the table title and the table elements are combined, using the features of the table and the labels (tags) of the webpage, to extend the context available for entity recognition. In addition, taking into account the probability and statistics of entity recognition in the relevant text, entity recognition is applied to all elements of the table and combined with the score calculation of the entity answer extraction method, which achieves a better result.

3. An entity homepage recognition method based on AdaBoost is proposed. A number of entities and their corresponding homepages were collected manually. Features related to links and webpage content are defined and extracted to form the training data set. Candidate homepages for an entity are retrieved through Google search, and AdaBoost is then used to recognize the homepage. The method shows very good results.

4. A prototype system is designed and implemented, and it has been tested on the Entity Track of TREC 2009.
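The scoring in point 1 can be illustrated with a minimal sketch. The abstract only states that three relevance scores are combined linearly and that the top-scoring entity is chosen; everything else here is an assumption made for illustration: bag-of-words cosine similarity as the similarity measure, the best-matching sentence as the passage score, an inverse-distance formula for the density score, the weights (0.3, 0.3, 0.4), and all function names.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens (simple regex tokenizer, an assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity between two strings."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * \
        math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def text_relevance(title, query):
    """Similarity between the webpage title and the query."""
    return cosine(title, query)


def passage_relevance(sentences, query):
    """Similarity of the best-matching sentence to the query (assumed rule)."""
    return max((cosine(s, query) for s in sentences), default=0.0)


def entity_relevance(passage, entity, query):
    """Assumed density score: mean of 1 / (1 + distance) from each
    query-word occurrence to the nearest mention of the entity."""
    toks, ent, q = tokens(passage), tokens(entity), set(tokens(query))
    if not ent:
        return 0.0
    ent_pos = [i for i in range(len(toks) - len(ent) + 1)
               if toks[i:i + len(ent)] == ent]
    q_pos = [i for i, t in enumerate(toks) if t in q]
    if not ent_pos or not q_pos:
        return 0.0
    return sum(1.0 / (1 + min(abs(p - e) for e in ent_pos))
               for p in q_pos) / len(q_pos)


def answer_score(title, sentences, entity, query, weights=(0.3, 0.3, 0.4)):
    """Linear combination of the three scores; the weights are hypothetical."""
    w_text, w_passage, w_entity = weights
    passage = " ".join(sentences)
    return (w_text * text_relevance(title, query)
            + w_passage * passage_relevance(sentences, query)
            + w_entity * entity_relevance(passage, entity, query))


def best_entity(candidates, title, sentences, query):
    """Return the candidate entity with the highest combined score."""
    return max(candidates,
               key=lambda e: answer_score(title, sentences, e, query))


# Toy example (invented data, not from the thesis).
query = "airlines that are members of Star Alliance"
title = "Star Alliance member airlines"
sentences = [
    "Lufthansa is one of the member airlines of Star Alliance.",
    "The weather in Frankfurt was mild and nobody talked about Boeing.",
]
print(best_entity(["Lufthansa", "Boeing"], title, sentences, query))
```

In this toy passage both candidates share the same text and passage scores, so the choice rests on the density term: "Lufthansa" appears close to the query words and is selected over "Boeing". In practice the weights would be tuned on held-out queries rather than fixed by hand.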

Keywords/Search Tags:

TREC Entity Track Task, Text Relevance, Passage Relevance, Entity Relevance, Entity Answer Extraction, Table Entity Answer Extraction, Homepage Recognition

Related items

1	The Design, Realization And Research For A Campus-Objected Entity And Social Search Engine
2	Named Entity Recognition Algorithm Based On BERT And Semantic Relevance
3	Domain-Specific Entity Linking Model Based On Entity Synonym Detection And Relevance Estimation
4	Research Of Related Entity Extraction And Homepages Finding
5	Research And Application Of Domain Oriented Entity Relationship Extraction Technology
6	Study On Related Entity Finding In Web
7	Research On Key Technologies And Application Of Entity Knowledge Extraction
8	Research And Implementation Of Question Answer System Based On Information Extraction
9	Research And Application Of CRF Named Entity And Entity Relationships Based On Recognition
10	Non-ferrous Metal Retrieval Key Technology Research Entity In The Field