Ontology Based Cross Language And Full Text Information Retrieval

Posted on: 2007-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: F Wu
Full Text: PDF
GTID: 2178360185467618
Subject: Software engineering
Abstract/Summary:
As domestic enterprises become more international and use multiple languages, they are paying increasing attention to Cross Language Information Retrieval (CLIR). Some CLIR systems exist at present, but they perform poorly because they rely on traditional retrieval strategies. In a traditional CLIR system, literal matching is the core of the search process, so the computer cannot grasp the underlying meaning of the query. Traditional CLIR also tends to return a large amount of irrelevant information, which users must spend time filtering out. These factors lead to low precision and weak recall, so enterprises urgently need intelligent CLIR. For this reason, we present a new CLIR model based on domain ontology. The model not only accomplishes cross-language information retrieval but also captures the underlying meaning of the query, so it returns information that is closely related to the query.

We studied many search engines in depth and chose Lucene as our retrieval foundation. Lucene is an open source, high-performance, full-featured text search engine library written entirely in Java, and it is suitable for multiple languages. However, because it is a full-text search engine, Lucene cannot grasp the underlying meaning of the query. To address this problem, we used the strength of ontology in describing things to expand the query, so that the computer can search well by capturing the query's underlying meaning. We built an intelligent search model for the tourism domain. The document set comes from Sina and Yahoo. Our method is evaluated using the average precision/recall curve. In contrast with traditional CLIR, the...
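
The ontology-based query expansion described above can be illustrated with a minimal sketch. It is not the thesis implementation: the toy tourism ontology, the function names (find_concepts, expand_query, to_boolean_query) and the field name "content" are all hypothetical, and plain Python dictionaries stand in for an ontology that the thesis would more likely manage with Protégé and Jena. The idea shown is that a query term is mapped to an ontology concept, and the labels of that concept and its related concepts, in both languages, are added to the query before it is handed to a full-text engine.

```python
# Toy tourism ontology (hypothetical data, not taken from the thesis).
# Each concept carries labels per language and a list of related concepts.
ONTOLOGY = {
    "Hotel": {
        "labels": {"en": ["hotel"], "zh": ["酒店", "宾馆"]},
        "related": ["Accommodation"],
    },
    "Accommodation": {
        "labels": {"en": ["accommodation", "lodging"], "zh": ["住宿"]},
        "related": [],
    },
    "ScenicSpot": {
        "labels": {"en": ["scenic spot", "attraction"], "zh": ["景点"]},
        "related": [],
    },
}


def find_concepts(term, ontology):
    """Return names of concepts that list `term` as a label in any language."""
    term = term.lower()
    return [
        name
        for name, concept in ontology.items()
        if any(term in labels for labels in concept["labels"].values())
    ]


def expand_query(term, ontology, languages=("en", "zh")):
    """Expand one term with labels of its concept and directly related concepts."""
    expanded = [term.lower()]
    for name in find_concepts(term, ontology):
        for concept_name in [name] + ontology[name]["related"]:
            for lang in languages:
                for label in ontology[concept_name]["labels"].get(lang, []):
                    if label not in expanded:
                        expanded.append(label)
    return expanded


def to_boolean_query(terms, field="content"):
    """Join terms into an OR query string, quoting each term as a phrase."""
    return " OR ".join(f'{field}:"{t}"' for t in terms)


if __name__ == "__main__":
    terms = expand_query("hotel", ONTOLOGY)
    print(to_boolean_query(terms))
```

The resulting string follows the field:"phrase" OR ... form that Lucene's classic query parser accepts, so in principle it could be passed to a Lucene index whose analyzer handles both English and Chinese text. A term with no matching concept is left unchanged, which falls back to ordinary literal matching.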
Keywords/Search Tags: Lucene, Jena, Protégé, CLIR (Cross Language Information Retrieval), Information Retrieval, Ontology