The Design And Implementation Of Entity Information Discovery Methods Based On Cooperative Search

Posted on: 2015-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: C C Sun
GTID: 2348330509960652
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology, massive amounts of data are constantly being produced online. This brings great convenience to daily life, but it also makes it easy to get lost in a vast sea of data and difficult to find the required information in limited time. Search engines provide a way to obtain the needed information accurately and quickly. To improve search engine efficiency and retrieve more of the target information, it is necessary to search collaboratively and to identify the entities within massive data. Entity information discovery and identification have therefore become one of the most important research subjects in the field of information access.

Building on previous studies, this thesis investigates entity information discovery in an attribute-oriented collaborative iterative search system. Using named entity recognition, iterative search methods, text similarity calculation, and other strategies, entity information is discovered by processing the Web pages retrieved under given search criteria. The main work is as follows:

(1) We propose a method that uses semantic distance as the criterion for constructing a domain lexicon. The lexicon is then added to a Chinese word segmentation system as a user dictionary. This effectively reduces the incorrect segmentation of proper nouns and helps ensure the precision of entity information recognition.

(2) We propose an entity information discovery method based on iterative search. A brute-force method, association rules, and manual intervention are combined to generate the iterative search criteria, which considerably improves the precision of entity information discovery.

(3) We propose an entity information discovery method that combines entity information matching, iterative search, text similarity calculation, and manual intervention.

Analysis of the experimental results on the "Xue Manzi" and "IBM Company" tasks shows that the proposed method achieves higher precision and recall.
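The abstract describes combining iterative search with text similarity calculation, but does not give the algorithm. The following is only a minimal sketch of that general idea, not the thesis's actual method. It assumes cosine similarity over term-frequency vectors as the similarity measure, a toy in-memory list of strings in place of a real search engine, and a simple regular-expression tokenizer in place of a Chinese word segmenter. The names `tokenize`, `cosine_similarity`, `iterative_discovery`, `threshold`, and `max_rounds` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a stand-in for a real word segmenter."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def iterative_discovery(seed_query, corpus, threshold=0.2, max_rounds=3):
    """Hypothetical loop: search, keep similar pages, expand the query, repeat."""
    profile = Counter(tokenize(seed_query))  # running description of the entity
    query_terms = set(profile)
    accepted = []
    for _ in range(max_rounds):
        # Toy "search": any not-yet-accepted page sharing a query term.
        hits = [p for p in corpus
                if p not in accepted and query_terms & set(tokenize(p))]
        # Keep only pages that are similar enough to the current profile.
        new_pages = [p for p in hits
                     if cosine_similarity(profile, Counter(tokenize(p))) >= threshold]
        if not new_pages:
            break  # nothing new was discovered, so stop iterating
        for page in new_pages:
            accepted.append(page)
            profile.update(tokenize(page))
        query_terms = set(profile)  # next round searches with the expanded terms
    return accepted


corpus = [
    "IBM company builds mainframe computers",
    "mainframe computers run banking software",
    "cooking pasta with tomato sauce",
]
pages = iterative_discovery("IBM company", corpus)
```

In this sketch the second page is only reached in the second round, through terms learned from the first page. The similarity threshold is the only guard against query drift here; the abstract mentions manual intervention as an additional control, which this sketch does not model.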
Keywords/Search Tags: Cooperative Search, Web Data Mining, Entity Information Discovery, Iterative Search