Design and Implementation of a Multi-Source Document Full-Text Search System

Posted on: 2010-02-14  Degree: Master  Type: Thesis
Country: China  Candidate: Y F Fang  Full Text: PDF
GTID: 2178360275979984  Subject: Education Technology
Abstract/Summary:
Full-text search is a fast and efficient information retrieval technology. It greatly improves the efficiency with which people find specific information in large volumes of complex data. Although text processing technology has been widely developed and applied, several questions remain open: how to extract text effectively; how to extract metadata from a document; and how to improve the precision and recall of text search. This study covers the design and implementation of a multi-source document full-text search system. Its main concern is how to represent, organize, index, and query entire documents according to the user's query request, so that relevant information can be retrieved from the document database. The central stages are text analysis, building the index database, querying to obtain information, processing the search results, and matching related information. The main research work is as follows: (1) Analysis and summary of the technologies and theory needed to build a multi-source document full-text search system, with a detailed study and description of Chinese word segmentation, full-text indexing, user-oriented retrieval requirements, and content-based metadata description. (2) Design and analysis of the structure of the multi-source document full-text retrieval system. An effective solution model is proposed for the problems of Chinese full-text content analysis, multi-source document conversion, and Chinese word segmentation. The study covers the architecture of the full-text retrieval system, the modular functional design, the index structure, and the database design, with a focus on the analysis and design of the word segmentation, indexing, and retrieval modules. (3) Research on the key technologies of the multi-source document full-text search system. Queries entered by the user are analyzed thoroughly to ensure the quality of the query results and to return the most desired results to the user, while the search results can also benefit from flexible word segmentation. (4) Implementation of the multi-source document full-text search system. The mature Java language and the Struts framework were selected to develop the system in layers, combining Unified Modeling Language (UML) design methods and flow charts to program and implement the system's functions. The distinguishing features of the thesis are multi-source document format conversion and document analysis, and optimization of the metadata extraction algorithm. The recall and precision of the search system are effectively improved.
Keywords/Search Tags:Multi-source documents, Index, Meta-data, Search
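
The segmentation, indexing, and retrieval stages named in the abstract can be illustrated with a minimal Java sketch. It is not the thesis's implementation, which the abstract does not describe in detail. The class `InvertedIndexSketch`, its methods, and the sample documents are hypothetical, and overlapping character bigrams stand in for a real Chinese word segmenter. The `precision` and `recall` helpers use the standard definitions of the two measures the abstract reports improving.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: a tiny in-memory inverted index.
class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain it
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    static boolean isHan(char c) {
        return Character.UnicodeScript.of(c) == Character.UnicodeScript.HAN;
    }

    // Assumption: overlapping bigrams for runs of Han characters,
    // lowercased alphanumeric words for everything else.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        StringBuilder han = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isHan(c)) {
                flushWord(word, tokens);
                han.append(c);
            } else {
                flushHan(han, tokens);
                if (Character.isLetterOrDigit(c)) word.append(Character.toLowerCase(c));
                else flushWord(word, tokens);
            }
        }
        flushWord(word, tokens);
        flushHan(han, tokens);
        return tokens;
    }

    private static void flushWord(StringBuilder word, List<String> out) {
        if (word.length() > 0) out.add(word.toString());
        word.setLength(0);
    }

    private static void flushHan(StringBuilder han, List<String> out) {
        if (han.length() == 1) out.add(han.toString()); // lone character
        for (int i = 0; i + 2 <= han.length(); i++) out.add(han.substring(i, i + 2));
        han.setLength(0);
    }

    // Indexing: record the document id under every term it contains.
    void add(int docId, String text) {
        for (String t : tokenize(text)) {
            postings.computeIfAbsent(t, k -> new TreeSet<>()).add(docId);
        }
    }

    // Retrieval: documents that contain every term of the query (boolean AND).
    TreeSet<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String t : tokenize(query)) {
            Set<Integer> docs = postings.getOrDefault(t, new TreeSet<>());
            if (result == null) result = new TreeSet<>(docs);
            else result.retainAll(docs);
        }
        return result == null ? new TreeSet<>() : result;
    }

    // Fraction of retrieved documents that are relevant.
    static double precision(Set<Integer> retrieved, Set<Integer> relevant) {
        if (retrieved.isEmpty()) return 0.0;
        Set<Integer> hits = new TreeSet<>(retrieved);
        hits.retainAll(relevant);
        return (double) hits.size() / retrieved.size();
    }

    // Fraction of relevant documents that were retrieved.
    static double recall(Set<Integer> retrieved, Set<Integer> relevant) {
        if (relevant.isEmpty()) return 0.0;
        Set<Integer> hits = new TreeSet<>(relevant);
        hits.retainAll(retrieved);
        return (double) hits.size() / relevant.size();
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        // Sample texts use Unicode escapes for the Chinese characters.
        index.add(1, "\u5168\u6587\u68c0\u7d22\u7cfb\u7edf"); // "full-text retrieval system"
        index.add(2, "\u4e2d\u6587\u5206\u8bcd\u6280\u672f"); // "Chinese word segmentation technology"
        index.add(3, "Full-text search \u5168\u6587\u7d22\u5f15"); // mixed English and Chinese
        System.out.println(index.search("\u5168\u6587"));             // [1, 3]
        System.out.println(index.search("\u5168\u6587\u68c0\u7d22")); // [1]
    }
}
```

A bigram tokenizer needs no dictionary and misses few matches, but it also matches character pairs that straddle word boundaries, which costs precision. That trade-off is one reason a production system would normally use dictionary-based or statistical segmentation instead.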