
Design And Implementation Of University Digital Library System Based On Hadoop

Posted on: 2015-03-15, Degree: Master, Type: Thesis
Country: China, Candidate: Y Y Dong, Full Text: PDF
GTID: 2308330473950813, Subject: Software engineering
Abstract/Summary:
With the rapid development of computer technology, we have entered the information age. Electronic publications are prevalent and network data is growing explosively. This vast amount of data of multiple types makes information retrieval difficult and severely restricts data utilization. Against this background, the digital library, as a novel form of data organization and management, has attracted increasing attention. A digital library is a very large-scale knowledge center that is easy to use and free of time and space limits. Taking digital data as its management object and supported by distributed storage, information retrieval, and internet technologies, a digital library adopts a unified knowledge system to store and transmit multimedia information.
This thesis focuses on the design and implementation of a university-oriented digital library system based on Hadoop, an important part of a modern library system. First, in the requirement analysis step, based on library management theory and the practical application background of digital libraries, we define the system scope and content through an analysis of excellent domestic and foreign library systems and practical research. Then, in the system design stage, we introduce the system structure, business process, and data model. In detail, the system adopts a B/S three-layer architecture, models the business process with process graphs, and builds the metadata model on a relational database. Next, we adopt the MVC framework to develop the system, which includes four important parts: metadata extraction, data storage, index strategy, and system functions for users. Specifically, we extract metadata from electronic documents based on heuristic rules, build a hierarchical storage architecture for all electronic files, and create indexes based on full-text methods and metadata. On the Hadoop platform, we adopt Lucene to create a full-text index for documents and use metadata as the index for other files. Within the system functions, we also design a query optimization technique based on download and query records.
At the end of this thesis, we test the system and analyze the test results. The system can not only help a university manage its current digital information, such as teaching videos, electronic journals, electronic documents, and research data, and improve search precision, but also scan existing books into digital data and integrate network resources to implement a unified information resource platform and improve the efficiency of knowledge acquisition.
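The two index types mentioned in the abstract (a full-text index for documents and a metadata-based index for other files) can be illustrated with a minimal sketch. This is not the thesis's implementation and does not use Lucene or Hadoop; it is a toy in-memory inverted index, and the class name `ToyLibraryIndex`, its methods, and the sample items are all hypothetical names chosen for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyLibraryIndex:
    """Hypothetical stand-in for the two index types named in the abstract."""

    def __init__(self):
        # term -> set of ids of the items that contain the term
        self.postings = defaultdict(set)

    def add_document(self, item_id, full_text):
        """Index a text document by every term in its body (full-text index)."""
        for term in tokenize(full_text):
            self.postings[term].add(item_id)

    def add_media(self, item_id, metadata):
        """Index a non-text file (e.g. a video) by its metadata values only."""
        for value in metadata.values():
            for term in tokenize(str(value)):
                self.postings[term].add(item_id)

    def search(self, query):
        """Return the sorted ids of items that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)


# Made-up sample items, for illustration only.
index = ToyLibraryIndex()
index.add_document("doc-1", "Hadoop stores large files across a cluster")
index.add_media("video-1", {"title": "Hadoop lecture", "format": "mp4"})
print(index.search("hadoop"))          # ['doc-1', 'video-1']
print(index.search("hadoop cluster"))  # ['doc-1']
```

Both kinds of item share one posting structure here, so a single query can return documents and media files together; a real system built on Lucene would add analysis, ranking, and persistent storage that this sketch omits.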
Keywords/Search Tags: Digital Library, Metadata, Full Text Index, Lucene, Hadoop