Conceptual Graph Based Text Retrieval In Specified Domain

Posted on: 2009-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: B S Wu
Full Text: PDF
GTID: 2178360242476752
Subject: Computer software and theory
Abstract/Summary:
Text retrieval is an important part of information retrieval. So far, almost all search engines use keyword-based retrieval methods, whose core is the mechanical matching of keywords. One resulting problem is low recall and precision, which leads to unsatisfactory performance. Concept-based retrieval uses natural language processing to extract conceptual information from documents and to interpret user queries more fully, which can provide better retrieval results and compensate for the shortcomings of keyword-based retrieval.

This thesis focuses on the indexing of documents and user queries and on the matching algorithm in concept-based retrieval. Our work is as follows:

First, we extend the definition of Sowa's conceptual graph and propose a new indexing form for texts, the Recursive Conceptual Graph, which is better suited to the automatic analysis of natural language. This formalism not only emphasizes the concepts in a text but also specifies the semantic relations among them, which makes it an indexing method at the semantic level. We also put forward a matching algorithm that calculates the similarity between the conceptual graphs of documents and those of user queries, by which retrieval results can be ranked.

Second, we set up a conceptual structure for the specified domain of "yacht", which supports both the conceptual indexing of texts and the similarity calculation between the conceptual graphs of documents and those of user queries. We accomplish this in two steps: we extract the concepts from the titles of 200 texts in the "yacht" domain and organize them into a conceptual taxonomy; we then summarize the semantic relations among these concepts and add them to the conceptual structure with a marker.

Third, we implement our retrieval model on the computer. We adopt several techniques to raise retrieval efficiency, such as using XML to represent the conceptual graph indexes of texts and using a hash table to speed up the calculation of concept similarity.

Finally, we conduct an experiment comparing our retrieval model with the Boolean model on a text collection in the "yacht" domain. In the experimental results, our model's performance (recall and precision) is much better on most of the user queries.
Keywords/Search Tags:text retrieval, retrieval model, domain specified conceptual structure, recursive conceptual graph, matching algorithm