Research On The Method For Query Term Weighting In Information Retrieval

Posted on:2012-11-23

Degree:Master

Type:Thesis

Country:China

Candidate:X L Yan

Full Text:PDF

GTID:2178330335472276

Subject:Computer application technology

Abstract/Summary:

With the development of the Internet, the amount of resources accessible to people has grown enormously, far beyond what humans can process manually. This has created an urgent need for technology that can locate information quickly and precisely, and information retrieval (IR) is the field that emerged to meet this need. It covers every aspect of handling information, from representation and storage to organization and access. This thesis focuses on query representation for information retrieval. It has been observed that short queries usually perform better than their corresponding long versions when submitted to the same retrieval engine. This is mainly because most current retrieval models treat all terms in the query as equally important. As a result, documents that match unimportant terms are ranked higher than they should be, other documents are ranked correspondingly lower, and retrieval performance suffers. This thesis addresses this drawback of the traditional approach by distinguishing the importance of the different query terms that represent the user's information need, and uses this information to improve retrieval performance. The central framework of the method is the hidden Markov model, which is discussed in detail in later chapters. Through extensive experiments, we show the advantage of integrating this model with a traditional IR model and identify the optimal configuration. Experimental results show that the method assigns most terms to their correct weighting level, and that even when these weighting levels are simply mapped linearly to real-valued weights in the retrieval model, a consistent and statistically significant (t-test) improvement in final retrieval performance is observed. This shows that our method can effectively capture the weighting information embedded in the sentence structure, and suggests that it has further potential.
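
The linear mapping from weighting levels to real-valued weights mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the level labels are written by hand as a stand-in for the output of the hidden Markov model, and the function names, the weight range `[w_min, w_max]`, and the toy `log(1 + tf)` scoring function are all assumptions made for illustration.

```python
import math
from collections import Counter


def levels_to_weights(levels, num_levels, w_min=0.2, w_max=1.0):
    """Map discrete importance levels 0..num_levels-1 linearly onto [w_min, w_max].

    The range [w_min, w_max] is a hypothetical choice, not taken from the thesis.
    """
    if num_levels < 2:
        return [w_max for _ in levels]
    step = (w_max - w_min) / (num_levels - 1)
    return [w_min + step * level for level in levels]


def weighted_score(query_terms, weights, doc_tokens):
    """Toy retrieval score: weighted sum of log(1 + term frequency).

    This stands in for a real retrieval model; only the idea of
    per-term weights is taken from the abstract.
    """
    tf = Counter(doc_tokens)
    return sum(w * math.log(1 + tf[t]) for t, w in zip(query_terms, weights))


# Hypothetical query and hand-assigned levels (0 = least important, 2 = most).
query_terms = ["side", "effects", "aspirin", "information"]
levels = [1, 1, 2, 0]
weights = levels_to_weights(levels, num_levels=3)
uniform = [1.0] * len(query_terms)

doc_a = "aspirin can cause stomach upset".split()
doc_b = "general information and more information".split()

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name,
          "uniform:", round(weighted_score(query_terms, uniform, doc), 3),
          "weighted:", round(weighted_score(query_terms, weights, doc), 3))
```

With uniform weights, the toy score favors `doc_b`, which only repeats the unimportant term "information". With the level-based weights, `doc_a`, which matches the important term "aspirin", ranks higher. This is the effect the abstract attributes to treating all query terms as equally important.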

Keywords/Search Tags:

Information Retrieval, Term Weighting, Hidden Markov Model, Concept Importance

Related items

1	An Information Retrieval Graph Model Based On Term Importance
2	Research On Markov Graph Model In Information Retrieval
3	The Information Retrieval Model Based On Markov Concept
4	A Multi-layer Markov Network Retrieval Model Fusing The Importance Of Term
5	Hidden Markov model of risky term structure: An application to Brazil
6	The Research Of Compliance Testing Technology Of Traffic Terminology And Standards
7	Incorporating Intra-query Term Dependencies In An Aspect Query Language Model
8	The Study Of Concept Mining In Information Retrieval System Based On Concept Lattice
9	Research On Geometric Similarity Of Machine Parts By Hidden Markov Model
10	A Study Of Concept-based Information Retrieval Model