Design And Implementation Of A Document-oriented Full-text Retrieval System

Posted on: 2014-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Guo
GTID: 2268330422962227
Subject: Computer technology
Abstract/Summary:
With the deepening of informatization, almost all documents are now saved as electronic documents, which are more convenient to store, carry, and query. However, retrieval systems based on a full-text database need a database to store the documents, while other full-text retrieval systems cannot parse some common file types and their performance remains to be improved.

For the document-oriented full-text retrieval system, I first summarize and analyze the system requirements, then choose a Browser/Server architecture and divide the system into four functional modules: the user interface module, the document index management module, the document retrieval module, and the result presentation module. Second, I give a detailed analysis of the similarity scoring algorithm based on term frequency and inverse document frequency, and point out its defects in document similarity scoring. To optimize the results, I propose two measures: whole-word matching and the adjacency of lexical items. I then describe the implementation of each functional module. Document index management is implemented in two parts: management of the document index table in the database, and document index operations on disk. For the document retrieval module, I present its implementation process together with the improved algorithm. Lastly, I design a set of documents and other data to test the performance of the improved algorithm and the functions of the system.

The experiments show that the document-oriented full-text retrieval system meets users' functional needs and that the improved algorithm has a positive effect on the retrieval results to some degree.
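The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a TF-IDF score might be combined with the two proposed measures. The TF-IDF variant (square-root term frequency, logarithmic inverse document frequency), the bonus definitions, and the names `rank`, `adjacency_weight`, and `whole_word_weight` are all assumptions made for illustration, not the thesis's actual algorithm.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf_score(query_terms, doc_tokens, doc_freq, num_docs):
    """Sum sqrt(tf) * idf over the query terms (one common textbook variant)."""
    counts = Counter(doc_tokens)
    score = 0.0
    for term in query_terms:
        tf = counts[term]
        if tf == 0:
            continue  # term absent from this document
        idf = math.log(num_docs / doc_freq[term]) + 1.0
        score += math.sqrt(tf) * idf
    return score


def adjacency_bonus(query_terms, doc_tokens):
    """Count places where two consecutive query terms are also adjacent in the document."""
    query_pairs = set(zip(query_terms, query_terms[1:]))
    return sum(1 for pair in zip(doc_tokens, doc_tokens[1:]) if pair in query_pairs)


def whole_word_bonus(query, doc_text):
    """Return 1 if the raw query occurs in the document on word boundaries, else 0."""
    pattern = r"\b" + re.escape(query.lower()) + r"\b"
    return 1 if re.search(pattern, doc_text.lower()) else 0


def rank(query, docs, adjacency_weight=0.5, whole_word_weight=1.0):
    """Return document indices ordered from best to worst match (assumed weights)."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    query_terms = tokenize(query)
    scored = []
    for index, (text, tokens) in enumerate(zip(docs, tokenized)):
        total = tf_idf_score(query_terms, tokens, doc_freq, len(docs))
        if total > 0:
            # Bonuses only apply to documents that match at least one term.
            total += adjacency_weight * adjacency_bonus(query_terms, tokens)
            total += whole_word_weight * whole_word_bonus(query, text)
        scored.append((total, index))
    # Higher score first; ties broken by original document order.
    return [index for _, index in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

With plain TF-IDF, two documents containing the same query terms equally often receive the same score regardless of word order. In this sketch the bonuses separate them, so a document containing the query terms side by side ranks above one where they are scattered.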
Keywords/Search Tags: Full-text Retrieval, Document, Similarity, Scoring, Rank