
Military Retrieval System Design And Implementation

Posted on: 2010-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: P Zheng
GTID: 2208360275991336
Subject: Software engineering
Abstract/Summary:
With the development of technology and the economy, the electronic data held by libraries, news publishers, and corporations is growing rapidly, and people face an explosion of information. As an important technology in the field of information management, Full-Text Retrieval (FTR) provides technical support for obtaining information precisely. Military information construction has likewise produced a large body of literature. To make good use of it, the earlier system stored the literature in an Oracle database and indexed and searched it with Oracle Text. Although Oracle Text can search the literature, it has several disadvantages. First, because the index and the database are tightly coupled, building the index consumes a lot of database resources, which reduces the efficiency of other database operations. Second, when information is kept in different databases, searching across them is not possible. Finally, search precision suffers because Chinese word segmentation in Oracle Text is inexact. This thesis therefore designs a military library retrieval system based on existing technologies. The literature is still stored in the Oracle database, but indexing and searching are handled by dedicated search and query service modules instead of the Oracle Text index mechanism. Of these, the search module stores its index data in the file system, which decouples the index from the database. In addition, inverted-index and incremental-index mechanisms are introduced to improve efficiency. The search service module extends the basic search operations and implements searching across several databases.
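The inverted-index, incremental-index, and multi-database ideas can be illustrated with a minimal Python sketch. It is not the thesis implementation: the names `InvertedIndex`, `add_or_update`, and `search_all` are hypothetical, the index is held in memory rather than in the file system, and whitespace tokenization stands in for Chinese segmentation.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy in-memory inverted index with incremental updates (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.doc_terms = {}               # document id -> terms currently indexed

    def add_or_update(self, doc_id, text):
        """Index one new or changed document without rebuilding the whole index."""
        # Assumption: whitespace tokenization stands in for a real segmenter.
        new_terms = set(text.lower().split())
        old_terms = self.doc_terms.get(doc_id, set())
        for term in old_terms - new_terms:   # terms that disappeared from the document
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]
        for term in new_terms - old_terms:   # terms that are new in the document
            self.postings[term].add(doc_id)
        self.doc_terms[doc_id] = new_terms

    def remove(self, doc_id):
        """Drop a deleted document from every posting list it appears in."""
        for term in self.doc_terms.pop(doc_id, set()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        """Return the ids of documents that contain every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = None
        for term in terms:
            docs = self.postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return result


def search_all(indexes, query):
    """Query one index per database and merge the hits as (source, doc_id) pairs."""
    hits = []
    for source in sorted(indexes):
        for doc_id in sorted(indexes[source].search(query)):
            hits.append((source, doc_id))
    return hits


db_a, db_b = InvertedIndex(), InvertedIndex()
db_a.add_or_update(1, "radar signal processing")
db_b.add_or_update(7, "signal corps history")
db_a.add_or_update(1, "radar antenna design")  # incremental update of document 1
print(search_all({"db_a": db_a, "db_b": db_b}, "signal"))  # [('db_b', 7)]
```

Because only the difference between the old and new term sets is applied, an update touches a handful of posting lists instead of triggering a full rebuild, which is the point of incremental indexing.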
The search service module also provides an algorithm for computing the relevance of search results, and results are sorted by relevance. The thesis further designs and implements a text conversion service framework and Chinese word segmentation, which preprocess the literature so that it can be indexed. The text conversion framework extracts the content of documents; it defines a common interface for conversion algorithms, so that new converters can be plugged in. To improve the precision of Chinese segmentation, the Chinese lexical analysis uses a Cascaded Hidden Markov Model (CHMM) to identify new words. A data collection module keeps the index in the file system consistent with the literature in the database: it collects updated information from the database as needed, using an active database mechanism. In addition, the system provides a management service for maintaining information about libraries and users.
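The plug-in style of the text conversion framework might look roughly like the following Python sketch. The abstract does not give the actual interface, so `TextConverter`, `ConversionService`, and the extension-based lookup are assumptions made for illustration; the HTML converter is deliberately simplistic and keeps all character data, including script and style content.

```python
from abc import ABC, abstractmethod
from html.parser import HTMLParser


class TextConverter(ABC):
    """Hypothetical common interface that every format-specific converter implements."""

    @abstractmethod
    def to_text(self, data: bytes) -> str:
        ...


class PlainTextConverter(TextConverter):
    def to_text(self, data: bytes) -> str:
        return data.decode("utf-8")


class _TagStripper(HTMLParser):
    """Collects the character data between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


class HtmlConverter(TextConverter):
    def to_text(self, data: bytes) -> str:
        parser = _TagStripper()
        parser.feed(data.decode("utf-8"))
        parser.close()
        # Collapse runs of whitespace left behind by the removed tags.
        return " ".join(" ".join(parser.chunks).split())


class ConversionService:
    """Registry that picks a converter by file extension (an assumed lookup rule)."""

    def __init__(self):
        self._converters = {}

    def register(self, extension, converter):
        self._converters[extension.lower()] = converter

    def convert(self, filename, data):
        extension = filename.rsplit(".", 1)[-1].lower()
        try:
            converter = self._converters[extension]
        except KeyError:
            raise ValueError(f"no converter registered for .{extension}") from None
        return converter.to_text(data)


service = ConversionService()
service.register("txt", PlainTextConverter())
service.register("html", HtmlConverter())
print(service.convert("report.html", b"<p>Hello <b>world</b></p>"))  # Hello world
```

Supporting a new document format then only requires writing one more `TextConverter` subclass and registering it; the indexing code that calls `convert` does not change.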
Keywords/Search Tags: Full-Text Retrieval, inverted index, Lucene, Chinese segmentation