The Application of Summarization in the Kullback-Leibler Retrieval Framework

Posted on: 2012-02-12 | Degree: Master | Type: Thesis
Country: China | Candidate: P Jiang | Full Text: PDF
GTID: 2218330368481857 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology, obtaining accurate information quickly and conveniently has become a serious problem that urgently needs to be solved. Text information retrieval systems have become an indispensable tool for obtaining useful information. The retrieval model is the mathematical foundation of information retrieval technology and one of its main research directions.

Statistical language models have proved highly effective as a tool for natural language processing. Retrieval models formed by combining statistical language modeling with information retrieval have brought great progress to the field.

While information retrieval is an effective way to acquire required information, automatic summarization, a branch of natural language processing, can reduce the burden of reading and help people extract the main relevant information. Because summaries are simple and clear, they support information retrieval and further processing, and they are an effective means of information mining. To satisfy users' information needs, researchers have combined information retrieval technology with automatic summarization. The main research objects of this paper are IR models based on the statistical language retrieval framework and summarization.

Specifically, the main contents of this paper are as follows:

This paper briefly introduces the principles of information retrieval and reviews the classical information retrieval models. It then analyzes the statistical language model, the models extended from it, and the smoothing methods they use.

This paper discusses the classification of summaries and the corresponding summary generation methods, and analyzes the advantages and disadvantages of each method.
Because the retrieval process encounters many different types of documents, a robust summary generation method is needed. This paper therefore uses surface features of the document, such as term frequency, sentence position, and the title, to calculate the weight of each sentence, and composes the summary from the selected important sentences.

This paper further investigates how to use summarization in the statistical language modeling framework and proposes a corresponding retrieval model, which is then compared with methods proposed by other researchers. Experiments on the TREC collections show that the proposed model significantly outperforms the simple language model.

Query expansion is a useful technique for feedback. Based on the K-L divergence retrieval framework, we propose a new query expansion strategy that uses summarization to expand the query.
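
The abstract does not give the thesis's formulas, so the following Python sketch is only an illustration of the two ideas described above: weighting sentences by surface features to build an extractive summary, and ranking documents in a K-L divergence framework with a query model expanded from summary terms. Every name, the feature weights, the choice of Dirichlet smoothing, and the linear interpolation parameter `alpha` are assumptions made for this example, not the author's actual method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize(title, sentences, k=1, w_tf=1.0, w_pos=1.0, w_title=1.0):
    """Return the k highest-weighted sentences, in document order.

    The three features and their weights are hypothetical choices.
    """
    doc_tf = Counter(t for s in sentences for t in tokenize(s))
    total = sum(doc_tf.values()) or 1
    title_terms = set(tokenize(title))
    scored = []
    for i, sentence in enumerate(sentences):
        toks = tokenize(sentence)
        if not toks:
            continue
        # Mean relative frequency of the sentence's terms in the document.
        tf = sum(doc_tf[t] for t in toks) / (len(toks) * total)
        # Earlier sentences receive a higher position score.
        pos = 1.0 / (i + 1)
        # Fraction of title terms that also appear in the sentence.
        overlap = len(title_terms & set(toks)) / (len(title_terms) or 1)
        scored.append((w_tf * tf + w_pos * pos + w_title * overlap, i))
    top = sorted(scored, reverse=True)[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]


def dirichlet_lm(doc_tokens, collection_tf, collection_len, mu=2000.0):
    """Return p(w|d) under Dirichlet smoothing (an assumed smoothing method)."""
    tf = Counter(doc_tokens)
    n = len(doc_tokens)

    def prob(w):
        p_c = collection_tf.get(w, 0) / collection_len
        return (tf[w] + mu * p_c) / (n + mu)

    return prob


def expand_query(query_tokens, summary_tokens, alpha=0.5):
    """Interpolate the query model with a model estimated from summary terms."""
    q, s = Counter(query_tokens), Counter(summary_tokens)
    q_len, s_len = sum(q.values()), sum(s.values())
    if s_len == 0:
        return {w: c / q_len for w, c in q.items()}
    return {w: (1 - alpha) * q[w] / q_len + alpha * s[w] / s_len
            for w in set(q) | set(s)}


def neg_kl_score(query_model, doc_prob):
    """Sum of p(w|q) * log p(w|d), rank-equivalent to -KL(query || document)."""
    score = 0.0
    for w, p_q in query_model.items():
        p_d = doc_prob(w)
        if p_q > 0 and p_d > 0:  # skip words absent from the whole collection
            score += p_q * math.log(p_d)
    return score
```

In this sketch a document would be scored by calling `neg_kl_score(expand_query(query, summary_terms), dirichlet_lm(doc, collection_tf, collection_len))`, where `summary_terms` come from `summarize` applied to feedback documents. The query-entropy term of the K-L divergence is dropped because it is the same for every document and does not change the ranking.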
Keywords/Search Tags:information retrieval model, statistical language model, summarization, document expansion