A Study of Document-Context Models in Information Retrieval

Posted on:2012-11-01

Degree:Ph.D

Type:Thesis

University:Hong Kong Polytechnic University (Hong Kong)

Candidate:Wu, Ho Chung

GTID:2458390008991164

Subject:Computer Science

Abstract/Summary:

In this thesis we study new retrieval models that simulate "local" relevance decision-making at every term location in a document; these local relevance decisions are then combined into a "document-wide" relevance decision for the document. The local relevance decision for a term t occurring at the k-th location in a document is made by considering its document-context, which is the window of terms centred on t at the k-th location. Therefore, different relevance scores (preferences) are obtained for the same term t at different locations in a document, depending on its document-contexts. This differs from traditional models, in which term t receives the same score regardless of its location in the document.

We study a hybrid document-context model that combines various existing effective models and techniques. It estimates the relevance decision preferences of document-contexts as log-odds and combines the estimated preferences using different types of aggregation operators that comply with the relevance decision principles. The model is evaluated using retrospective experiments to reveal its potential. Besides retrospective experiments, we also use the top 20 documents from the initial ranked list to perform relevance feedback experiments with a probabilistic document-context model, and the results are promising.

We also show that when the size of the document-contexts is shrunk to unity, the document-context model simplifies to a basic ranking formula that directly corresponds to TF-IDF term weights. Thus TF-IDF term weights can be interpreted as making relevance decisions. This helps to establish a unifying perspective on information retrieval as relevance decision-making and to develop advanced TF-IDF-related term weights for future, more elaborate retrieval models.

Lastly, we develop a new relevance feedback algorithm by splitting the ranked document list into multiple lists of document-contexts. The relevance of the documents is not judged sequentially; this is called active feedback. We show that our new relevance feedback algorithm obtains better results than the conventional relevance feedback algorithm, and does so more reliably than a maximal marginal relevance (MMR) method that does not use document-contexts.
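
The idea of a document-context can be illustrated with a minimal sketch. It is not the thesis's model: the function names, the half_width parameter, and above all the local_score function are assumptions made for illustration. The thesis estimates local preferences as log-odds, whereas the toy score below is simply the fraction of window tokens that are query terms. The sketch only shows how one window is taken per query-term occurrence and how the local scores are then combined by an aggregation operator.

```python
from typing import Callable, List, Sequence, Set, Tuple


def document_contexts(
    tokens: Sequence[str], query_terms: Set[str], half_width: int
) -> List[Tuple[int, List[str]]]:
    """Return (location, window) for every occurrence of a query term.

    The window is centred on location k and clipped at the document edges.
    """
    contexts = []
    for k, token in enumerate(tokens):
        if token in query_terms:
            lo = max(0, k - half_width)
            hi = min(len(tokens), k + half_width + 1)
            contexts.append((k, list(tokens[lo:hi])))
    return contexts


def local_score(context: Sequence[str], query_terms: Set[str]) -> float:
    """Hypothetical stand-in for a local relevance preference.

    Assumption: fraction of window tokens that are query terms.
    The thesis uses a log-odds estimate instead.
    """
    return sum(1 for t in context if t in query_terms) / len(context)


def document_score(
    tokens: Sequence[str],
    query_terms: Set[str],
    half_width: int,
    aggregate: Callable[[List[float]], float] = sum,
) -> float:
    """Combine the local scores into one document-wide score."""
    scores = [
        local_score(window, query_terms)
        for _, window in document_contexts(tokens, query_terms, half_width)
    ]
    # A document with no query-term occurrence has no contexts to score.
    return aggregate(scores) if scores else 0.0
```

Passing aggregate=max instead of the default sum gives a different aggregation operator over the same local scores. With half_width set to 0 each window contains only the query term itself, so under this toy score the sum is just the number of query-term occurrences; this is only loosely analogous to the TF-IDF correspondence described in the abstract, which depends on the log-odds estimate that the sketch does not model.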

Keywords/Search Tags:

Relevance, Document, Models, Retrieval, Term, Feedback algorithm

Related items

1	Research On Relevance Feedback And Long-term Learning In Content Based 3D Model Retrieval
2	Studies On Affinity Propagation Based Pseudo-Relevance Feedback And Document Expansion For Spoken Document Retrieval
3	Study On Relevance Feedback In Image Semantic Retrieval
4	Research On The Relevance Feedback Based On Log Learning For Image Retrieval
5	Relational information retrieval: Using relevance feedback and parallelism to improve accuracy and performance
6	Research On Pseudo Relevance Feedback Based On Document Similarity
7	Research On The Relevant Feedback Algorithm In Information Retrieval
8	Research On Multi-modal Web Image Retrieval
9	Modeling Topic-based Semantics For Information Retrieval Models
10	The Research For 3D Model Retrieval And Feedback System