Semantic Based Information Retrieval From Semi-structured Documents

Posted on: 2006-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: X D Yan
Full Text: PDF
GTID: 2178360212967470
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the World Wide Web, the amount of information on the Internet has grown enormously, and much of it is semi-structured. This raises a new question: how to acquire information efficiently and accurately so that it satisfies the user's actual information need. So far, most research has focused on extracting semantic information from semi-structured documents, and only a little work has addressed search over that semantic information (i.e., semantic search).
In this paper, building on the classic vector space retrieval model, we propose a semantic-based information retrieval model for semi-structured documents that accurately retrieves the semantic information extracted from those documents by information extraction methods. Related work usually applies traditional full-text retrieval technology to semantic search. Full-text retrieval performs well for document retrieval, but it cannot achieve accurate results on semantic data. The model proposed in this paper extends the state-of-the-art document retrieval model, the Vector Space Model. Based on a tri-level index structure over the structured semantic information, we design and implement a semantic similarity algorithm, so that the model can satisfy the diversity and precision of users' information needs.
Semantic information is usually organized according to a schema predefined by a domain expert. An average user might not be able to understand the retrieved semantic information. In this paper, we define knowledge summary for semantic information as the process of translating semantic information described by an ontology into a natural-language representation. Using a divide-and-conquer strategy, we propose a knowledge summary algorithm based on nested templates.
Finally, based on the research mentioned above, we have implemented a prototype information retrieval system for semi-structured documents. Then integrate it...
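To make the idea of vector space retrieval over structured semantic records concrete, the following is a minimal Python sketch. It is not the thesis's algorithm: the field names, the per-field weights in FIELD_WEIGHTS, the whitespace tokenizer, and the functions score and search are all assumptions introduced for illustration. It only shows how cosine similarity can be computed per field of an extracted record and then combined, instead of treating the record as one block of full text.

```python
import math
from collections import Counter

# Hypothetical per-field weights; the abstract does not specify any.
FIELD_WEIGHTS = {"title": 2.0, "author": 1.0, "body": 0.5}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, kept deliberately simple)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score(query, record):
    """Weighted sum of per-field cosine similarities for one record."""
    q = Counter(tokenize(query))
    total = 0.0
    for field, text in record.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        total += weight * cosine(q, Counter(tokenize(text)))
    return total


def search(query, records):
    """Return records with a non-zero score, best match first."""
    scored = [(score(query, r), r) for r in records]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored]
```

A call such as search("semantic search", records), where each record is a dict mapping field names to extracted text, returns the matching records ranked by the combined score. Raw term frequencies are used here for brevity; a real system would more likely use TF-IDF weights and an index rather than a linear scan.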
Keywords/Search Tags:Semi-structured document, Semantic view, Knowledge summary