The Study And Implementation Of Massive Data Storage And Organization In Digital Library

Posted on: 2012-04-09; Degree: Master; Type: Thesis
Country: China; Candidate: C H Shen; Full Text: PDF
GTID: 2178330332976236; Subject: Computer software and theory
Abstract/Summary:
With the rapid growth of multimedia resources, efficient data storage and organization have become important concerns in distributed digital library service systems. Drawing on the characteristics of massive data and the access habits of users, this thesis studies and implements a data storage and organization mechanism that is suitable for large-scale data sets and meets the demand for distributed, concurrent access to massive heterogeneous data in the digital library. The main contributions of this thesis are as follows. First, based on the characteristics of massive resources, CADAL's logs, and an analysis of the replication mechanisms of distributed file systems, it proposes a digital library architecture that supports efficient replica maintenance and meets service requirements for high performance, high availability, high reliability, and scalability. Second, it proposes a distributed index framework that combines local and global indexes, together with a concurrent unified search scheme. Building on these, it constructs an information retrieval architecture on top of the file system that supports pluggable, unified queries, in order to achieve efficient, accurate, flexible, and reliable resource search in the digital library. Third, using document correlation clustering, it proposes an approach to integrating massive numbers of small files, whose basic idea is to merge the data and build an index file. Compared with storing the files directly, this significantly improves data storage and access performance and increases the system's I/O rate.
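
The small-file idea in the third contribution (merge the data, then build an index file) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `merge_small_files` and `read_file`, the sample file names, and the in-memory index layout are all assumptions made for illustration, and the grouping that the thesis derives from document correlation clustering is taken as already given.

```python
import io


def merge_small_files(files):
    """Concatenate small files into one blob and record where each one lives.

    `files` maps a (hypothetical) file name to its bytes. The returned index
    maps each name to an (offset, length) pair inside the merged blob.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        # Record the current write position before appending the file.
        index[name] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), index


def read_file(blob, index, name):
    """Fetch one original file from the merged blob using the index."""
    offset, length = index[name]
    return blob[offset:offset + length]


# Assumed example group: pages of one book, which a correlation-based
# clustering step would plausibly place together.
pages = {
    "book1/p1.txt": b"first page",
    "book1/p2.txt": b"second",
    "book1/p3.txt": b"",
}
blob, index = merge_small_files(pages)
```

Reading a page then costs one index lookup and one slice of the merged data, instead of opening a separate small file for every request.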
Keywords/Search Tags: Digital Library, Massive Data, Distributed Storage, Distributed Information Retrieval, Small File