Semantic Positional Language Retrieval Models With A Proximity Information

Posted on:2015-06-23

Degree:Master

Type:Thesis

Country:China

Candidate:X L Gong

Full Text:PDF

GTID:2298330431498600

Subject:Computer Science and Technology

Abstract/Summary:

Over the past few decades, many classic models have emerged in the field of information retrieval, such as the Boolean model, the vector space model, and the probabilistic model. In 1998, Ponte and Croft first applied statistical language models to information retrieval and proposed the query likelihood language model, which has developed rapidly in recent years. Consequently, many scholars have joined this line of research. On the basis of extensive experiments, hidden Markov models, statistical translation models, and the risk minimization framework for information retrieval were proposed in turn. However, most of these retrieval models are based on the frequency of words in the document and do not consider the positional relationships of words within the document. To address this issue, Lv and Zhai proposed positional language models. The biggest advantage of this model is that it takes the positional relationships of words in the document into account, but it also has some defects. Yu and Wang then made some improvements, proposing a new model named positional language models with semantic information, and applied it to information retrieval successfully. The retrieval stage of this model directly applies the interpolation (Jelinek-Mercer) smoothing method and does not consider the positional relationships of the query terms in the document. This thesis therefore builds on their work. Recent studies show that using the position information of matched query terms in documents can improve the precision of query results. How to better express and model the position information of query terms in a document is one of the problems in improving retrieval efficiency.
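As an illustration of the interpolation (Jelinek-Mercer) smoothing mentioned above, the following is a minimal sketch of query likelihood scoring in its standard textbook form, not the thesis's own implementation. The function name `jm_query_log_likelihood`, the token-list inputs, and the default `lam=0.5` are assumptions made for this example.

```python
import math
from collections import Counter

def jm_query_log_likelihood(query, doc, collection, lam=0.5):
    """Log query likelihood under Jelinek-Mercer smoothing (illustrative sketch).

    query, doc, collection are lists of tokens; doc and collection are
    assumed to be non-empty. lam is the weight given to the collection model.
    """
    doc_tf = Counter(doc)
    col_tf = Counter(collection)
    score = 0.0
    for term in query:
        p_doc = doc_tf[term] / len(doc)          # maximum-likelihood document model
        p_col = col_tf[term] / len(collection)   # background collection model
        p = (1.0 - lam) * p_doc + lam * p_col    # linear interpolation
        if p == 0.0:
            # Term unseen in both the document and the collection.
            return float("-inf")
        score += math.log(p)
    return score
```

Note that this score depends only on term frequencies: shuffling the tokens of `doc` leaves it unchanged, which is the limitation that position-aware models aim to address.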
This thesis further considers term proximity information on the basis of the semantic positional language model (SPLM). We use a Dirichlet prior as the smoothing method for computing proximity and present a semantic positional language retrieval model with proximity information. Specifically, the main work and innovations of this thesis are as follows:

1) First, we consider different kinds of kernel functions. Ultimately, this thesis resolves the problems concerning the use of the Gaussian function to measure positional relationships in the original model. We then describe how to combine the proximity computation model with the language model.

2) Ranking search results is a fundamental problem in information retrieval. Based on statistical probability and algorithms of linear complexity, we propose a proximity SPLM retrieval model for information retrieval. Following the idea of combining proximity information with the language model, we give a way of combining proximity information with the SPLM model using the Dirichlet smoothing method. Furthermore, we systematically compare the performance of our retrieval model with that of the SPLM model, and analyze the efficiency of the Dirichlet prior method against the JM smoothing method used in the SPLM model.

3) Our experiments show that our retrieval model performs better than the SPLM model in information retrieval. We also give a sensitivity analysis of the model's parameters and compare different proximity strategies. Finally, we analyze the different types of proximity in terms of algorithmic complexity.
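To make the ideas of a Gaussian kernel and Dirichlet prior smoothing more concrete, here is a minimal sketch of a generic positional language model in the style of Lv and Zhai. It is not the proximity SPLM model proposed in the thesis, whose exact formulas are not given in this abstract. The function names and the default values `sigma=2.0` and `mu=10.0` are assumptions made for this example.

```python
import math

def gaussian_kernel(i, j, sigma):
    """Weight that a term occurrence at position j contributes to position i."""
    return math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))

def propagated_count(term, doc, i, sigma):
    """Kernel-weighted count of `term` as seen from position i of `doc`."""
    return sum(gaussian_kernel(i, j, sigma)
               for j, w in enumerate(doc) if w == term)

def positional_prob(term, doc, i, collection_prob, sigma=2.0, mu=10.0):
    """Dirichlet-smoothed probability of `term` at position i (illustrative).

    collection_prob is the background probability p(term | collection),
    supplied by the caller; mu is the Dirichlet prior parameter.
    """
    c = propagated_count(term, doc, i, sigma)
    # Total propagated mass at position i, summed over every token in the doc.
    z = sum(gaussian_kernel(i, j, sigma) for j in range(len(doc)))
    return (c + mu * collection_prob) / (z + mu)
```

With Dirichlet smoothing, the influence of the background model shrinks as the propagated mass `z` around a position grows, whereas Jelinek-Mercer interpolation uses the same fixed weight everywhere.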

Keywords/Search Tags:

semantic positional language models, Dirichlet smoothing, proximity model, retrieval model

Related items

1	Research In Chinese Information Retrieval Based On Positional Language Models
2	Positional Language Models With Semantic Information
3	Research On Information Retrieval Based On Language Model And Reranking For Retrieval Results
4	Semantic Based Retrieval For Heterogeneous 3D CAD Models
5	Semantics-based language models for information retrieval and text mining
6	Research On Information Retrieval Method Based On Semantic Similarity Of Central Sentences
7	Research Of Medical Records Semantic Retrieval Method Based On LDA And LSA
8	The Research And Implementation Of XML Information Retrieval Based On Language Models
9	Investigation Of Categorical Semantic Information Processing In The Brain And Natural Language Processing Models
10	Using Statistical Language Modeling For Ad Hoc Information Retrieval