Research On Document Representation Model Based On Query And Content

Posted on:2011-06-25

Degree:Master

Type:Thesis

Country:China

Candidate:Z Zhou

GTID:2178360308977370

Subject:Computer application technology

Abstract/Summary:

With the rapid development of Internet technology, the volume of online information is growing exponentially. As the gap between this vast body of digital information and people's ability to obtain what they need becomes increasingly prominent, how to find relevant information quickly and accurately has become a research hot spot in the information field. In information retrieval, the quality of the document representation model is one of the important factors affecting retrieval performance. According to comprehensive information theory, epistemological information is the trinity of syntactic, semantic and pragmatic information. Current mainstream document representation models primarily use syntactic and semantic information but lack pragmatic information, which is the bottleneck for improving retrieval performance.
The thesis begins with an overview of the classic information retrieval models and how they represent documents, covering work both at home and abroad, and then discusses the theory of comprehensive information and epistemological information. It then reviews how pragmatic information is currently applied in query expansion, ranking algorithms and document representation, with emphasis on the document organization method based on query sets. The thesis analyses the defects of this method and, to address them, introduces the concept of a Stability Criterion for Query Sample Space. It proposes a document representation model based on users' query behavior and documents' content, in which the pragmatic information from users' implicit feedback is integrated with the semantic and syntactic information from documents to dynamically adjust the keyword weights in the index database; the model can consequently improve recall and precision in information retrieval. Experimental results show that the new model expresses documents' topic information well and significantly improves retrieval accuracy.
The thesis also proposes a document representation model based on query co-occurrence and content co-occurrence, which aims to mine the deeper information carried by co-occurring words; the extraction of co-occurrence words and a formal description of the new model are given. Finally, a site search engine for the news network of **university is developed. It is based on the Lucene architecture, tracks users' profiles in real time, and dynamically adjusts retrieval results according to the migration of the collective profile.

Keywords/Search Tags:

Information Retrieval, Document Representation Model, User Query Log, Implicit User Feedback

Related items

1	Research On Personalized Document Retrieval Technology
2	Research On Query Expansion Algorithm Based On User Interest
3	Research And Implementation On User-Oriented Query Expansion
4	The Research Of Intelligent Database Selection Based On User Model
5	Study On User Study Process Oriented Query Expansion Methods
6	A Study Of Dynamic User Intention For Recommendation With Implicit Feedback
7	Document Ranking Methods For Supporting Implicit Temporal Queries In Information Retrieval
8	Understanding implicit feedback and document preference: A naturalistic user study
9	Information Retrieval And Query Recommendation For Information Precise Service
10	Research On Semantic Processing Technology Based Information Retrieval Model