
The Design and Implementation of an Information Retrieval System Based on Time Range

Posted on: 2009-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: Q Sun
Full Text: PDF
GTID: 2178360272458963
Subject: Computer application technology
Abstract/Summary:
In the 1990s, the rise of the Internet accelerated the spread of information and knowledge. In recent years, as computers have become widespread and hardware performance has improved sharply, the volume of text information has expanded rapidly. Large-scale information retrieval systems greatly help people find the information they need, and the technologies behind them have become a focus of study. These include index structures and their construction algorithms, index compression and maintenance methods, document scoring models, query feedback and expansion, and top-k and high-performance query processing algorithms. Together they provide a solid foundation for the development of information retrieval systems.
As time passes, however, information keeps accumulating as historical data, and people are becoming increasingly interested in it. Demand for mining historical data is growing significantly, especially with the development of Web 2.0 in recent years. Internet users constantly update information in various communities and on their own blogs, which has greatly increased the quantity of data and accelerated research in this area. Researchers have become aware of the issue and have proposed some solutions.
This paper surveys the basic principles of information retrieval systems and details how to implement the major components of an IR system. A new index structure and an associated query algorithm are proposed to efficiently support retrieval over an arbitrary time frame in a frequently updated text environment. The paper also demonstrates a retrieval system targeted at a college community.
This paper makes the following contributions: it proposes a high-performance index organization for retrieval over an arbitrary time frame; it improves the index compression algorithm within the new index structure; it analyzes the features of the retrieval model and employs a simplified NRA-Okapi model to effectively support top-k queries in a text retrieval system; it evaluates the above methods on the TREC 2006 Genomics Ad-hoc corpus; it designs and implements a retrieval system targeted at a college community;...
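The abstract does not describe the proposed index structure or the simplified NRA-Okapi model, so the following is only a minimal sketch of the general idea of time-range retrieval with top-k ranking, not the thesis's method. It assumes integer timestamps, fixed-width time buckets, whitespace tokenization, and a standard Okapi BM25 score swapped in for the thesis's model; the class name `TimeBucketedIndex` and all parameters are hypothetical.

```python
import heapq
import math
from collections import Counter, defaultdict


class TimeBucketedIndex:
    """Toy inverted index whose postings are partitioned by time bucket.

    Illustrative assumption only: this is not the index from the thesis.
    """

    def __init__(self, bucket_size=10):
        self.bucket_size = bucket_size  # width of one time bucket (hypothetical unit)
        # term -> bucket id -> list of (doc_id, timestamp, term_frequency)
        self.postings = defaultdict(lambda: defaultdict(list))
        self.doc_len = {}  # doc_id -> number of tokens

    def add(self, doc_id, timestamp, text):
        tokens = text.lower().split()
        self.doc_len[doc_id] = len(tokens)
        bucket = timestamp // self.bucket_size
        for term, tf in Counter(tokens).items():
            self.postings[term][bucket].append((doc_id, timestamp, tf))

    def search(self, query, start, end, k=3, k1=1.2, b=0.75):
        """Return up to k (score, doc_id) pairs with start <= timestamp <= end."""
        if not self.doc_len:
            return []
        n_docs = len(self.doc_len)
        avg_len = sum(self.doc_len.values()) / n_docs
        scores = defaultdict(float)
        first, last = start // self.bucket_size, end // self.bucket_size
        for term in set(query.lower().split()):
            buckets = self.postings.get(term)
            if not buckets:
                continue
            # Document frequency over the whole collection, not just the range.
            df = sum(len(plist) for plist in buckets.values())
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Only buckets overlapping [start, end] are scanned.
            for bucket in range(first, last + 1):
                for doc_id, ts, tf in buckets.get(bucket, ()):
                    if start <= ts <= end:  # edge buckets may hold out-of-range docs
                        norm = k1 * (1 - b + b * self.doc_len[doc_id] / avg_len)
                        scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
        # Keep only the k best-scoring documents.
        return heapq.nlargest(k, ((s, d) for d, s in scores.items()))


index = TimeBucketedIndex(bucket_size=10)
index.add("d1", 5, "time range retrieval")
index.add("d2", 15, "retrieval of text in a time range")
index.add("d3", 42, "index compression")
print(index.search("time range", start=0, end=20))
```

Partitioning postings by time lets a query skip every bucket outside the requested range, at the cost of scanning one posting list per overlapping bucket. A real system would also need to handle frequent updates and compression, which this sketch leaves out.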
Keywords/Search Tags: Information Retrieval, Query Processing, top-k, NRA-Okapi, Time Range, Retrieval Model