Research On The Language Model Based Information Retrieval System

Posted on: 2005-02-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J L Zhang
GTID: 1118360122493283
Subject: Computer software and theory
Abstract/Summary:
The language model based IR paradigm opens a promising and challenging direction for IR technology. Although this paradigm has many advantages over traditional IR models, many problems remain to be explored. To overcome the disadvantages of current methods, we approach the problem from both the theoretical and the application perspectives and make the following contributions:

(1) We propose a trigger language model based IR system. We first compute the mutual information between words in a training corpus, then obtain the collection of triggered words for each query word in order to determine the exact meaning of the word in its specific textual context. The related parameters are introduced into the document language model to form the trigger language model based IR system.

(2) We propose two relevance feedback strategies under the Kullback-Leibler IR framework: the mixture language model feedback method and the probability of vocabulary importance feedback method. Both approaches combine naturally with the language model based IR paradigm and help improve the performance of the IR system.

(3) We propose a topic-based approach to language modeling for ad-hoc information retrieval. An improved two-stage k-means clustering method is applied to the document collection, and the resulting clusters are treated as the topic information contained in the collection. By combining the aspect model with text clustering technology, we derive a more accurate document language model for ad-hoc information retrieval.

(4) We introduce "AFFIRM", an object-oriented framework for text information retrieval applications, which applies design patterns to the important parts of an IR system such as the index, the IR model and the feedback approaches. This paper describes the main design patterns that contribute to this object-oriented architecture, revealing the framework's structure and the forces that shaped it. The framework fixes a basic architecture and thus makes it easier to construct text information retrieval applications.

At the time of writing, we found no academic publications on language model based IR systems from other research groups in mainland China. This paper resolves some existing problems in the domain and contributes to further exploration of the theory and application of language model based IR systems.
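As a rough illustration of the first step in contribution (1), the sketch below ranks candidate trigger words for a query word by mutual information estimated from a small corpus. It is not the dissertation's implementation: the function name `trigger_words`, the use of pointwise mutual information, and the choice to count co-occurrence at document level are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def trigger_words(docs, query_word, top_k=3):
    """Rank words by pointwise mutual information (PMI) with query_word.

    Assumption: two words co-occur when they appear in the same document.
    `docs` is a list of token lists. Hypothetical helper, for illustration only.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for doc in docs:
        vocab = sorted(set(doc))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    scores = {}
    for (a, b), n_ab in pair_df.items():
        if query_word not in (a, b):
            continue
        other = b if a == query_word else a
        p_ab = n_ab / n_docs
        p_a = word_df[a] / n_docs
        p_b = word_df[b] / n_docs
        # PMI(a, b) = log( P(a, b) / (P(a) * P(b)) )
        scores[other] = math.log(p_ab / (p_a * p_b))

    # Highest PMI first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


corpus = [
    ["stock", "market", "price"],
    ["stock", "market", "trade"],
    ["stock", "soup", "recipe"],
    ["soup", "recipe", "salt"],
]
print(trigger_words(corpus, "stock"))
```

In this toy corpus the financial words score higher than the cooking words for the query word "stock". How the resulting trigger pairs are weighted inside the document language model is described in the dissertation itself and is not reproduced here.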
Keywords/Search Tags:Information Retrieval, Trigger Language Model, Relevance Feedback, Mixture Language Model, Topic based Language Model, Software Framework