
The Research of Index Technology Based on Semantic Web Documents

Posted on: 2011-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: H Q Jiang
GTID: 2178360305954033
Subject: Control Science and Engineering
Abstract/Summary:
Current Semantic Web search engines share the interface style of traditional search engines: they use the keyword search mode familiar to internet users and provide retrieval services for Semantic Web Documents. Their results, however, also resemble those of traditional search engines. A query returns thousands of Semantic Web Documents that contain the keyword, and the user must filter them to find the desired information, which increases the user's burden. One problem a Semantic Web search engine should therefore solve is to provide an index model that is based on semantic content and realizes knowledge integration.

In this paper, we propose an index model that makes full use of semantic information to realize knowledge integration in the Semantic Web environment, using Sniper, a semantic search engine based on knowledge integration, as the test background. The index model is composed of three modules: the index of semantic data, which supports knowledge integration; the index of paths, which is based on ontology paths and realizes semantic query expansion for multiple query keywords; and the index of entity clustering, which clusters semantically similar entities to improve the query performance of the semantic search engine.

The main contributions are as follows.

(1) We explain the classification of Semantic Web Documents and the method for parsing them, based on an analysis of their characteristics, and store the parsed semantic data in a storage model.

(2) We propose the index of semantic data, which supports knowledge integration, based on an analysis of the data in the storage model after ontology mapping.
The index of semantic data contains the mapping information of entities and their basic information.

(3) We propose the inverted index of paths and the PS-Tree index, based on a study of the paths in a domain ontology. Both realize semantic expansion for multiple query keywords. The inverted index of paths is constructed by indexing entities and paths, and provides the paths for the query keywords. The PS-Tree improves path query performance in two ways: an inverted index is built over the PS-Tree forest, and the PS-Tree itself is a range tree for searching. Experiments show that the PS-Tree index is better than the inverted index of paths in both index file size and query response time.

(4) We propose the index of entity clustering. It first classifies entities according to their semantic distance in the domain ontology, then applies hierarchical clustering to the classification result, and finally builds an index over the clusters. Grouping similar entities into one cluster raises the likelihood that semantically related query keywords fall in the same cluster. The index over the clustering result makes it possible to quickly locate the cluster that contains the query keywords and transfer that cluster's semantic data from external memory into main memory, which speeds up query processing. Experimental results show that the index of entity clustering reduces the number of data I/O operations and improves the query performance of the semantic search engine.

This index model is one of the core modules of Sniper, the semantic search engine based on knowledge integration. The index of paths based on the domain ontology is not limited to this particular application; it is of general significance for research on the semantic expansion of multiple keywords.
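As a rough illustration of the inverted index of paths, the following sketch maps each entity to the ontology paths that pass through it and intersects the posting lists of several query keywords. It is only a minimal sketch under stated assumptions: the toy paths, the tuple representation, and the function names `build_path_index`, `paths_for_keywords`, and `expand_query` are hypothetical and are not taken from the thesis or from Sniper. The PS-Tree index is not modelled here.

```python
from collections import defaultdict

def build_path_index(paths):
    """Map each entity to the ids of the ontology paths that pass through it."""
    index = defaultdict(set)
    for path_id, path in enumerate(paths):
        for entity in path:
            index[entity].add(path_id)
    return index

def paths_for_keywords(index, keywords):
    """Return ids of paths that contain every keyword (posting-list intersection)."""
    postings = [index.get(k, set()) for k in keywords]
    if not postings:
        return set()
    return set.intersection(*postings)

def expand_query(paths, index, keywords):
    """Semantic expansion: add every entity on a path that links all keywords."""
    expanded = set(keywords)
    for path_id in paths_for_keywords(index, keywords):
        expanded.update(paths[path_id])
    return expanded

# Hypothetical ontology paths (class, property, class), for illustration only.
paths = [
    ("Professor", "teaches", "Course"),
    ("Professor", "advises", "Student"),
    ("Student", "enrolledIn", "Course"),
]
index = build_path_index(paths)
```

With this toy data, `paths_for_keywords(index, ["Professor", "Course"])` selects only the first path, and `expand_query` then adds the connecting property `teaches` to the keyword set.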
Experiments show that the index model provides strong support for fast queries in the semantic search engine.
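The idea behind the index of entity clustering can be sketched as follows. This is an assumption-laden toy, not the thesis's algorithm: the `PARENT` class hierarchy is invented, semantic distance is approximated by the number of edges between two entities through their nearest common ancestor, and plain single-linkage agglomerative clustering with a distance threshold stands in for the classification and hierarchical clustering steps. The entity-to-cluster map plays the role of the index that tells the engine which clusters to load for a set of query keywords.

```python
# Hypothetical class hierarchy (child -> parent), for illustration only.
PARENT = {
    "Professor": "Person", "Student": "Person",
    "Course": "Event", "Seminar": "Event",
    "Person": "Thing", "Event": "Thing",
}

def ancestors(entity):
    """Return the chain from an entity up to the root, entity first."""
    chain = [entity]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def distance(a, b):
    """Assumed semantic distance: edges between a and b via their nearest common ancestor."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(chain_a):
        if node in chain_b:
            return steps_a + chain_b.index(node)
    return float("inf")

def cluster_entities(entities, threshold):
    """Single-linkage agglomerative clustering: merge clusters closer than threshold."""
    clusters = [{e} for e in entities]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(distance(a, b) for a in clusters[i] for b in clusters[j])
                if d <= threshold:
                    clusters[i] |= clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters

def build_cluster_index(clusters):
    """Map each entity to the id of the cluster that holds it."""
    return {e: cid for cid, cluster in enumerate(clusters) for e in cluster}

def clusters_to_load(index, keywords):
    """Ids of the clusters whose data must be read to answer the keywords."""
    return {index[k] for k in keywords if k in index}

clusters = cluster_entities(["Professor", "Student", "Course", "Seminar"], threshold=2)
cluster_index = build_cluster_index(clusters)
```

In this sketch, semantically close keywords such as `Professor` and `Student` resolve to a single cluster, so only one block of semantic data would need to be read; keywords from distant parts of the hierarchy resolve to several clusters.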
Keywords/Search Tags:Semantic Web Document, Index of Semantic Data, Inverted Index of Path, PS-Tree Index, Index of Entity Clustering