
A Stable Information Retrieval Algorithm And Its Application In Peer To Peer Network

Posted on: 2004-02-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z F Yang
Full Text: PDF
GTID: 1118360185496989
Subject: Computer Science and Technology
Abstract/Summary:
Information in electronic form is growing explosively as human knowledge accumulates and network applications become popular. It is easier than ever for people to acquire information, but they cannot use it effectively without good search facilities. Information retrieval (IR) technology helps people search large amounts of text data. Statistical methods are heavily used in IR, and they often depend on certain parameters; optimizing these parameters is tedious, ad hoc work. First, this dissertation seeks a stable content and Web information retrieval algorithm that achieves good performance on several test collections and retrieval tasks. Its performance must also vary smoothly when its parameters change by small amounts. The retrieval algorithms and processes found in this dissertation showed good performance in the Web Track of TREC 2002. They also perform well on the TREC 2001 test collection and topics with the same parameter configuration. Second, the application of a genetic algorithm to improving retrieval performance and to the parameter optimization process is investigated. If relevance judgments from NIST are presented to the genetic algorithm, it converges, and the convergence point is almost the best solution. When relevance judgments are absent, a voting algorithm is designed to take the place of the NIST relevance assessors. It generates the pseudo-relevance judgments needed by the trec_eval program, the standard TREC evaluation tool. The objective function used in the evolution process is tuned according to various considerations. The average result of the algorithm is satisfactory. Third, the performance of the stable information retrieval algorithm in a P2P network is investigated. A new search framework based on a P2P network is presented and simulated. In the simulated P2P network, the stable retrieval algorithm achieves almost the same performance as a single-database solution.
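The abstract does not say how the voting algorithm decides which documents count as pseudo-relevant. As a rough illustration only, the sketch below assumes one plausible rule: several retrieval runs (for example, runs with different parameter settings) each vote for their top-ranked documents, and a document is judged pseudo-relevant when enough runs agree. The function names, the `top_k` and `min_votes` parameters, and the topic and document identifiers are all hypothetical and not taken from the dissertation. The output lines follow the four-column qrels layout that trec_eval reads (topic, iteration, document id, relevance).

```python
from collections import Counter


def vote_pseudo_qrels(runs, top_k=10, min_votes=2):
    """Assumed voting rule, not the dissertation's actual algorithm.

    runs: list of dicts, each mapping a topic id to a ranked list of doc ids.
    Returns a dict mapping each topic id to the set of doc ids that appear
    in the top_k results of at least min_votes runs.
    """
    topics = set()
    for run in runs:
        topics.update(run)

    pseudo = {}
    for topic in sorted(topics):
        votes = Counter()
        for run in runs:
            # Each run casts one vote per distinct document in its top_k.
            for doc in set(run.get(topic, [])[:top_k]):
                votes[doc] += 1
        pseudo[topic] = {doc for doc, n in votes.items() if n >= min_votes}
    return pseudo


def format_qrels(pseudo):
    """Render pseudo-judgments as qrels lines: topic, iteration, docno, rel."""
    lines = []
    for topic in sorted(pseudo):
        for doc in sorted(pseudo[topic]):
            lines.append(f"{topic} 0 {doc} 1")
    return lines


# Hypothetical ranked lists from three runs for one topic.
example_runs = [
    {"551": ["d1", "d2", "d3"]},
    {"551": ["d2", "d4", "d1"]},
    {"551": ["d5", "d2", "d6"]},
]
example_qrels = format_qrels(vote_pseudo_qrels(example_runs, top_k=2, min_votes=2))
print("\n".join(example_qrels))
```

With pseudo-judgments in this form, a genetic algorithm could score each candidate parameter setting by running trec_eval against the generated file instead of the NIST judgments, which is the role the abstract assigns to the voting step.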
Keywords/Search Tags:Information Retrieval, Text Retrieval, Vector Space Model, Genetic Algorithm, P2P