
Study And Implementation Of Cross-language Information Retrieval Technology

Posted on: 2012-12-22, Degree: Master, Type: Thesis
Country: China, Candidate: Y M Niu, Full Text: PDF
GTID: 2248330395455408, Subject: Computer software and theory
Abstract/Summary:
In information retrieval, the diversity of languages used in today's massive body of information, together with the fact that different people use different languages, creates language barriers in text-based search. Driven by user demand for integrated access to multilingual information, cross-language information retrieval (CLIR) technology has grown rapidly in recent years and will play an increasingly important role in the future. Since most current CLIR techniques rely on query translation, much of the research in this area concentrates on improving translation accuracy.
After studying the key techniques of CLIR, this thesis implements a Chinese-English cross-language information retrieval system based on the Apache Lucene platform. The thesis presents the theory of CLIR, current techniques, query translation methods, and information retrieval approaches. An information retrieval model based on query translation and semantic mapping is proposed to resolve ambiguity among candidate translation terms. An improved indexing flow and results-ranking algorithm prioritize matching search results.
Finally, following a detailed analysis of the structure and modules of Lucene, the design and implementation of a Chinese-English cross-language retrieval system called CLIRS is given. The effectiveness of the cross-language information retrieval model and the functionality of each module are verified.
Experimental results show that the cross-language translation model and results-ranking algorithm used by CLIRS achieve better results in bidirectional Chinese-English cross-language information retrieval.
Keywords/Search Tags:Cross-language Information Retrieval, Query Translation, Semantic Mapping, Lucene
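The abstract mentions resolving ambiguity among candidate translation terms during query translation but does not give the algorithm. As a rough illustration only, the following Python sketch shows one common dictionary-based approach: each query term is looked up in a bilingual dictionary, and the combination of candidates that co-occur most strongly is kept. The dictionary entries, the co-occurrence scores, and the names `BILINGUAL_DICT`, `COOCCURRENCE`, `coherence`, and `translate_query` are all hypothetical; this is not the thesis's semantic-mapping model and does not use Lucene.

```python
from itertools import product

# Hypothetical toy English-to-Chinese dictionary: each source term maps to
# several candidate translations (illustrative entries only).
BILINGUAL_DICT = {
    "bank": ["银行", "河岸"],      # financial institution / river bank
    "interest": ["利息", "兴趣"],  # money paid on a loan / curiosity
}

# Hypothetical co-occurrence scores between target-language terms. A real
# system would estimate these from a target-language corpus.
COOCCURRENCE = {
    frozenset(["银行", "利息"]): 0.90,
    frozenset(["银行", "兴趣"]): 0.10,
    frozenset(["河岸", "利息"]): 0.01,
    frozenset(["河岸", "兴趣"]): 0.05,
}


def coherence(candidates):
    """Sum the pairwise co-occurrence scores of one candidate combination."""
    total = 0.0
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            pair = frozenset([candidates[i], candidates[j]])
            total += COOCCURRENCE.get(pair, 0.0)
    return total


def translate_query(terms):
    """Pick, for each source term, the candidate that makes the translated
    query most coherent. Terms missing from the dictionary pass through
    unchanged."""
    options = [BILINGUAL_DICT.get(term, [term]) for term in terms]
    best = max(product(*options), key=coherence)
    return list(best)


if __name__ == "__main__":
    print(translate_query(["bank", "interest"]))
```

The exhaustive search over all combinations is only practical for short queries; it is used here to keep the sketch small, not as a recommendation for a production system.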