Semantic Research And Implementation On Agricultural Vertical Search Engine

Posted on: 2013-03-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y M Hu
GTID: 1228330377451759
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
There are a large number of agricultural web sites on the internet, which cover almost all kinds of agricultural information, such as agricultural technology, supply and demand information, market information, and agricultural news and policies. However, agricultural information on the web has no uniform representation and is therefore heterogeneous, distributed and redundant, forming so-called isolated information islands. To address this issue, a search model for agriculture (sounong) was developed with the support of the National Natural Science Foundation of China (Complex Adaptive Model on Agricultural Vertical Search). However, the data processing of this model is based on keyword matching of texts, so semantics cannot be fully understood.

According to the characteristics of agricultural resources on the internet, domain ontology is used as the semantic knowledge base and applied to the data preprocessing, indexing and user search of the search engine, so that semantic information is added to the search engine's data processing. The improvements in data preprocessing include the extraction of spatial properties in agricultural information, the entity resolution of geographical names and the relation extraction of price dynamics; the improvements in indexing include the semantic annotation and semantic extension of documents; the improvements in user query include the search strategy for regular users in the semantic environment and the semantic query extension based on user models.

The main research contents are summarized as follows:

1. To solve the problems of diverse expression and implicit expression, the spatial properties in agricultural information are extracted and judged based on domain ontology and a web search engine, so that semantic information is added to the extraction of spatial properties.
2. To solve the problem of entity resolution for agricultural geographic names, the spatial property extraction algorithm and Markov Logic Networks are integrated, so that semantic information is added to the entity resolution of agricultural geographic names.
3. To solve the problem of extracting price-dynamics relations from unstructured texts, a relation extraction method based on Conditional Random Fields is proposed in this paper, and the extraction results are merged based on domain ontology.
4. According to the characteristics of agricultural web resources, methods of document semantic annotation and core-word semantic extension for the search engine are proposed. First, documents are semantically annotated based on theme inference and core-word extraction, combining domain ontology and syntactic analysis; second, the extracted core words are semantically extended through domain ontology.
5. For user query, a double-index search mechanism is proposed for the semantic annotation of documents in the index based on domain ontology; in addition, a query extension method based on a user model constructed through domain ontology is proposed for registered users.
6. The design of the Agriculture Semantic Vertical Search Engine is addressed at the end of this paper.
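
As a rough illustration of the ontology-based query extension mentioned in item 5, the sketch below expands each query term with synonyms and narrower concepts taken from a small in-memory ontology. It is not the method of the dissertation: the ontology contents, the `TOY_ONTOLOGY` structure and the `expand_query` function are hypothetical names chosen for this example, and the user model described in the abstract is left out.

```python
from typing import Dict, List

# Hypothetical toy ontology (an assumption for illustration only):
# each concept maps to its synonyms and its narrower concepts.
TOY_ONTOLOGY: Dict[str, Dict[str, List[str]]] = {
    "vegetable": {"synonyms": ["veggie"], "narrower": ["tomato", "cabbage"]},
    "tomato": {"synonyms": ["love apple"], "narrower": []},
    "price": {"synonyms": ["cost"], "narrower": []},
}


def expand_query(
    query: str,
    ontology: Dict[str, Dict[str, List[str]]] = TOY_ONTOLOGY,
) -> List[str]:
    """Return the query terms plus ontology synonyms and narrower concepts.

    Terms are lower-cased and split on whitespace; order is preserved and
    duplicates are dropped. Terms missing from the ontology pass through.
    """
    expanded: List[str] = []
    for term in query.lower().split():
        candidates = [term]
        entry = ontology.get(term)
        if entry is not None:
            candidates += entry["synonyms"] + entry["narrower"]
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query("vegetable price"))
```

A keyword-only engine would match just the literal words of the query; with an expansion step of this kind, a query for "vegetable price" can also retrieve documents that mention only "tomato" or "cost". A real system would load the concepts from the domain ontology and would likely weight expanded terms lower than the original ones.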
Keywords/Search Tags: Vertical Search Engine, Name Property Extraction, Entity Resolution, Relation Extraction, Semantic Annotation, Concepts Mapping, Semantic Extension, User Model, Query Extension