Digital Library Information Search Key Technologies

Posted on:2011-08-13

Degree:Master

Type:Thesis

Country:China

Candidate:J Li

Full Text:PDF

GTID:2208360305497636

Subject:Software engineering

Abstract/Summary:


Information retrieval is one of the most important technologies in a digital library (DL). This thesis discusses the key information retrieval techniques for digital libraries. We first introduce the background of the study, the theoretical basis of the relevant techniques, and the overall design of the full-text search engine. The main content of the thesis is the detailed design and implementation of each part of the full-text search engine. The first part covers programming the Spider using HTTP and Java threads. This module uses breadth-first search (BFS) to traverse the hyperlinks in web pages, stores the task queue in a SQL DBMS, and accesses the DBMS through JDBC. The second part covers the implementation of the Indexer, built on Lucene's Chinese word segmentation and its API. This module integrates the text extraction capabilities of the HTMLParser and TextMining tools and can handle many file types, including HTML, TXT, Word, and PDF. The third part covers the implementation of the Searcher. Its functions include English search, Chinese exact search, multi-keyword search, secondary search, and relevant search. This module automatically analyzes the query string and the main content of the index files, and highlights matching text in the query results. The resulting full-text search engine does not need to rely on vocabulary-based segmentation, facilitates accurate retrieval, and improves retrieval speed and accuracy.
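
The breadth-first traversal described for the Spider can be illustrated with a minimal Java sketch. It is not the thesis's implementation: the class name BfsCrawlSketch, the method crawl, and the in-memory link map are hypothetical, and the task queue here lives in memory, whereas the thesis stores it in a SQL DBMS accessed through JDBC and fetches pages over HTTP.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: BFS over a link graph held in memory.
// A real spider would fetch each page over HTTP and parse its hyperlinks;
// here the "links" map stands in for that step (an assumption).
final class BfsCrawlSketch {

    // Returns pages in the order a breadth-first spider would visit them,
    // stopping after maxPages pages.
    static List<String> crawl(String seed, Map<String, List<String>> links, int maxPages) {
        Deque<String> queue = new ArrayDeque<>(); // task queue (in memory, not a DBMS)
        Set<String> seen = new HashSet<>();       // avoids queueing a URL twice
        List<String> order = new ArrayList<>();

        queue.add(seed);
        seen.add(seed);
        while (!queue.isEmpty() && order.size() < maxPages) {
            String url = queue.poll();            // FIFO order gives breadth-first traversal
            order.add(url);
            for (String next : links.getOrDefault(url, List.of())) {
                if (seen.add(next)) {             // add() returns false if already seen
                    queue.add(next);
                }
            }
        }
        return order;
    }
}
```

Calling crawl with a seed URL, a map from each page to its outgoing links, and a page limit returns the visit order. Because the queue is first-in, first-out, every page one link away from the seed is visited before any page two links away.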

Keywords/Search Tags:

Bot, Spider, Full-text Search, Information Retrieval, Keywords Search, Lucene


Related items

1	Design And Implementation Of Search Engine In Digital Library Of A University
2	The Research Of Full-text Search Engine Key Technology Based On Lucene
3	Research And Design Of Search Within Application System Based On Lucene
4	Research And Implementation Of Enterprise Information Fulltext Search System Based On Lucene
5	Research And Implementation Of Website Search Technology Based On Ajax/lucene
6	Research And Implementation Of Website Search Technology Based On Ajax/Lucene
7	Research And Application Of Full-text Search Based On Lucene
8	Research And Implementation Of A Chinese Full-Text Information Retrieval Technology Based-on Lucene Search Engine
9	Study And Implementation On Full-text Search Engine Based On LUCENE Under The Large Amount Of Data
10	Research And Application Of Full-Text Retrieval System Based On Lucene