Information Retrieval System Based On Document Query

Posted on: 2006-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Hang
GTID: 2208360152992712
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, people rely more and more on the Web as an enormous knowledge base. Search engines, the interface through which users reach networked information, have been developing continuously since they first emerged. Meanwhile, as computers have become widespread and more reading is done on screen, digital libraries and their related technologies have sprung up and kept growing. Taking a query from a document and submitting it to a search engine has become one of the common patterns of modern information seeking and knowledge acquisition. At present, however, these two ways of obtaining information, document browsing and web search, are used separately. Combining the document browser with the search engine would let users obtain information in a more timely and effective way.

Nevertheless, existing search engines do not satisfy users' needs well. The main reason is that a query carries so little information that retrieval precision is seriously reduced. Query expansion is an effective way to address query ambiguity. This thesis summarizes and analyzes the theory and technology of query expansion, both domestic and international. Then, targeting the search scenarios that arise in the course of a user's information seeking, it proposes an algorithm for extracting a query's context information from a single document, and implements an effective interactive query expansion system based on single-document context.

The main research contributions of the dissertation are:

(1) Embedding an information search system in the document browser. Using a DLL built with Visual C++, a plug-in is designed that embeds the interactive query expansion system, based on single-document context, into a word browser.

(2) An algorithm for extracting query information from the document context. The algorithm combines global analysis with local analysis to extract the context information of the user's marked query from a single document. In the global analysis step, keywords are extracted from the whole document to reflect the user's research preference. In the local analysis step, the query is disambiguated by extracting keywords from the text surrounding the marked query.

(3) A UI for interactive query expansion. To improve search precision, a friendly UI is designed to help the user choose or correct the query expansion keywords.

(4) A parallel, multi-threaded meta search engine. To make use of the information on the network, a parallel multi-threaded meta search engine is proposed, and the main problems in its design and implementation are discussed.

The system is tested on English electronic documents from different knowledge fields. The tests indicate that the user interface, precision, and recall of information retrieval are considerably improved by the techniques above.
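The combination of global and local analysis described in contribution (2) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm, whose details the abstract does not give. It assumes plain term frequency as the keyword score, a single-word marked query, a fixed token window for the local step, and a tiny illustrative stopword list. The function names (`global_keywords`, `local_keywords`, `expansion_candidates`) are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "on", "is",
             "are", "by", "for", "with", "that", "it", "as", "at", "from"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def global_keywords(document, k=3):
    """Global analysis: the k most frequent terms of the whole document."""
    return [w for w, _ in Counter(tokenize(document)).most_common(k)]


def local_keywords(document, query, window=5, k=3):
    """Local analysis: the k most frequent terms within `window` tokens
    of each occurrence of the (single-word) marked query."""
    tokens = tokenize(document)
    q = query.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == q:
            lo, hi = max(0, i - window), i + window + 1
            counts.update(t for t in tokens[lo:hi] if t != q)
    return [w for w, _ in counts.most_common(k)]


def expansion_candidates(document, query, k=3, window=5):
    """Merge local keywords, then global keywords, dropping duplicates
    and the query term itself."""
    seen, merged = {query.lower()}, []
    for w in local_keywords(document, query, window, k) + global_keywords(document, k):
        if w not in seen:
            seen.add(w)
            merged.append(w)
    return merged
```

In an interactive setting such as the one the abstract describes, the list returned by `expansion_candidates` would be shown to the user to accept or correct, not appended to the query automatically.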
Keywords/Search Tags: Information Retrieval, Query Expansion, Context Information of Query, Interactive Query Expansion, Meta Search Engine
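The parallel multi-threaded meta search named in contribution (4) can be sketched in the same hedged way. The abstract does not say how the thesis dispatches queries or merges results, so the example assumes one thread per engine and merging by summed reciprocal rank. The engines are hypothetical stub functions returning fixed URL lists; no real search API is called.

```python
from concurrent.futures import ThreadPoolExecutor


def meta_search(query, engines):
    """Send `query` to every engine in its own thread and merge the ranked URL lists.

    `engines` maps an engine name to a callable that returns a ranked list of URLs.
    Merging by summed reciprocal rank is an assumption of this sketch.
    """
    scores = {}
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        for name, fut in futures.items():
            try:
                results = fut.result()
            except Exception:
                continue  # a failing engine should not sink the whole search
            for rank, url in enumerate(results, start=1):
                scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest merged score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))


# Hypothetical stub engines standing in for real search back ends.
def engine_a(query):
    return ["http://example.com/1", "http://example.com/2"]


def engine_b(query):
    return ["http://example.com/2", "http://example.com/3"]


def broken_engine(query):
    raise RuntimeError("engine unavailable")
```

Calling `meta_search("query expansion", {"a": engine_a, "b": engine_b, "x": broken_engine})` ranks the URL returned by both engines first and ignores the failing engine.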