Research On Query Expansion Of Information Retrieval

Posted on: 2011-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Li
Full Text: PDF
GTID: 2178360305477863
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of Internet technology, the amount of information on the Internet is growing exponentially. This abundance enriches people's lives but also makes it hard for them to find what they need. How to handle such large volumes of information and retrieve the relevant data has therefore become an important research question, and search engines emerged in response. Query expansion has been studied extensively in information retrieval as a way of handling word mismatch between queries and documents and of compensating for poorly expressed queries. These techniques reconstruct or expand the terms of the original query and are an effective way to improve queries in information retrieval. Because of its theoretical importance and practical value, query expansion has become a hot topic in the information retrieval field.

The main work of this thesis is as follows.

First, the thesis introduces the research background, the development of information retrieval and query expansion, and the content of the research. It then describes the theory of information retrieval and query expansion.

Second, the thesis analyzes the effectiveness of several retrieval models used in query expansion, including the Boolean model, the vector space model, and the probabilistic model, and proposes a novel model based on web page structure. In this model, the text of a page is divided into three blocks (title, bold text, and body), and each block is given a different weight according to its position in the document and its importance. Term weights are readjusted accordingly to better distinguish relevant documents from irrelevant ones and to improve the retrieval system's performance.

Finally, the thesis builds a novel query expansion model: pseudo-relevance feedback based on users' query behaviors and web page structure. Using the improved vector space model proposed in the previous chapter, the algorithm judges whether a document matches the user's query intention and interests from how long the user spends clicking and browsing it, or from the presence of certain querying behaviors. It then automatically extracts terms related to the original query from the relevant documents and collects them from the database as expansion terms. Experimental results show that the retrieval performance of the algorithm is markedly better than that of the other methods compared.
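
The two ideas in the abstract, block-weighted term weights and behavior-based pseudo-relevance feedback, can be illustrated with a small sketch. This is not the thesis's implementation: the abstract gives no weight values, threshold, or formulas, so the block weights, the dwell-time threshold, and the function names `block_weighted_tf` and `expansion_terms` are all assumptions made for illustration. Tokenization is a plain whitespace split, with no stop-word removal or stemming.

```python
from collections import Counter

# Hypothetical block weights; the abstract does not state the actual ratios.
BLOCK_WEIGHTS = {"title": 3.0, "bold": 2.0, "body": 1.0}
# Hypothetical dwell-time threshold (seconds) for treating a viewed page as relevant.
DWELL_THRESHOLD = 30.0


def block_weighted_tf(doc):
    """Return term weights: each occurrence counts once, scaled by its block's weight."""
    weights = Counter()
    for block, text in doc.items():
        for term in text.lower().split():
            weights[term] += BLOCK_WEIGHTS.get(block, 0.0)
    return weights


def expansion_terms(query, docs, dwell_times, k=3):
    """Pick the k heaviest non-query terms from documents the user dwelt on."""
    query_terms = set(query.lower().split())
    pooled = Counter()
    for doc_id, doc in docs.items():
        # Dwell time stands in for the "user behavior" relevance signal.
        if dwell_times.get(doc_id, 0.0) >= DWELL_THRESHOLD:
            pooled.update(block_weighted_tf(doc))
    candidates = [(t, w) for t, w in pooled.items() if t not in query_terms]
    # Sort by descending weight, breaking ties alphabetically for determinism.
    candidates.sort(key=lambda tw: (-tw[1], tw[0]))
    return [t for t, _ in candidates[:k]]


if __name__ == "__main__":
    docs = {
        "d1": {"title": "query expansion", "bold": "feedback",
               "body": "feedback improves retrieval"},
        "d2": {"title": "cooking pasta", "bold": "", "body": "boil water"},
    }
    dwell_times = {"d1": 45.0, "d2": 5.0}
    print(expansion_terms("query", docs, dwell_times, k=2))
```

In this toy run only `d1` passes the dwell-time threshold, so expansion terms are drawn from it alone, and a term that appears in the title or in bold outranks one that appears only in the body.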
Keywords/Search Tags:information retrieval, query expansion, retrieval model, recall ratio, precision ratio