
Research And Implementation Of Uniform Data Access And Management In Data Grid

Posted on: 2004-07-22
Degree: Master
Type: Thesis
Country: China
Candidate: B Huang
Full Text: PDF
GTID: 2168360152457110
Subject: Computer Science and Technology
Abstract/Summary:
Scientific computing will increasingly become data-centric computing as the volume of scientific datasets grows explosively. In this setting, many heterogeneous data sources reside in many administrative domains, so they are very difficult for users to access with current techniques. A Data Grid denotes a network of massive storage resources, ranging from archival systems to caches to databases, that are linked across a distributed network. One of the core problems that any Data Grid project must address is the heterogeneity of the storage systems on which data are stored. This diversity shows up in how datasets are named and accessed in the different systems. A Data Grid should therefore provide a new data access method under which users, and even most package developers, need neither know nor care whether data come from the grid or from the local file system, nor whether they reside in object databases, ROOT files, or ASCII files.

I have studied uniform data access and management in the Data Grid and implemented a Data Grid system named GridDaen, in which users can transparently access many kinds of data resources. After analyzing several well-known Data Grid projects and studying key technologies for data access and management in the Data Grid, I introduce GridDaen in this thesis and describe the design and implementation of its data access and management functions.
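
The idea of transparent access, where a caller neither knows nor cares where data physically reside, can be illustrated with a minimal Python sketch. Every name here (`DataBackend`, `LocalFileBackend`, `InMemoryBackend`, `UniformAccessor`, the logical paths) is hypothetical and chosen only for illustration; this is not GridDaen's actual interface, and the in-memory backend merely stands in for a remote grid resource.

```python
from abc import ABC, abstractmethod
import os
import tempfile


class DataBackend(ABC):
    """Hypothetical storage backend; not GridDaen's real interface."""

    @abstractmethod
    def read(self, physical_path: str) -> bytes:
        ...


class LocalFileBackend(DataBackend):
    """Reads data from the local file system."""

    def read(self, physical_path: str) -> bytes:
        with open(physical_path, "rb") as f:
            return f.read()


class InMemoryBackend(DataBackend):
    """Stand-in for a remote grid resource (assumption for the demo)."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def read(self, physical_path: str) -> bytes:
        return self._objects[physical_path]


class UniformAccessor:
    """Maps logical names to (backend, physical path) pairs."""

    def __init__(self):
        self._catalog = {}

    def register(self, logical_name, backend, physical_path):
        self._catalog[logical_name] = (backend, physical_path)

    def read(self, logical_name) -> bytes:
        # The caller supplies only the logical name; the storage type
        # and location are resolved here.
        backend, physical_path = self._catalog[logical_name]
        return backend.read(physical_path)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        local_path = os.path.join(tmp, "run1.txt")
        with open(local_path, "wb") as f:
            f.write(b"local data")

        accessor = UniformAccessor()
        accessor.register("/physics/run1", LocalFileBackend(), local_path)
        accessor.register(
            "/physics/run2", InMemoryBackend({"obj-42": b"grid data"}), "obj-42"
        )
        # Both reads use the same call, whatever the underlying storage.
        return accessor.read("/physics/run1"), accessor.read("/physics/run2")
```

Calling `demo()` reads one dataset from a local file and one from the simulated grid resource through the same `read` call, which is the property the abstract asks of a uniform access method.
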
Several implementation scenarios are also described, including global naming, a uniform view, local data access, access control, shared access, a set of conventional uniform interfaces, and a common architecture that supports these scenarios.

In this thesis, I outline an evolutionary path toward incorporating distributed, heterogeneous data resources that belong to many diverse administrative domains. GridDaen implements a uniform view that shields the distributed, heterogeneous properties of datasets by naming them jointly with a three-level naming method and organizing them into collections. It provides a set of uniform APIs for convenient, transparent data access. A two-level cache and a data replication mechanism improve data access efficiency and provide time transparency and load balancing. Role-based access control simplifies authorization management. Two kinds of data replication in the cache support shared access to data. Replica consistency is maintained by allowing users to write only to the master replica, after which GridDaen broadcasts the update to the other machines to refresh the slave replicas. I also describe other parts of the implementation, such as shared access, task scheduling, server state management, and the access interfaces for local storage resources.
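
The replica consistency rule described above (only the master replica is writable, and each write is then broadcast to refresh the slave replicas) can be sketched in a few lines of Python. This is a single-process illustration under stated assumptions: the class names `Replica` and `ReplicaSet`, the version counter, and the host names are hypothetical, and the broadcast is a plain loop rather than a network operation. It is not GridDaen's implementation.

```python
class Replica:
    """One copy of a dataset on a given host (hypothetical model)."""

    def __init__(self, host, is_master=False):
        self.host = host
        self.is_master = is_master
        self.content = b""
        self.version = 0  # assumed counter, used only to show refreshes


class ReplicaSet:
    """One master replica plus the slave replicas refreshed from it."""

    def __init__(self, master_host, slave_hosts):
        self.master = Replica(master_host, is_master=True)
        self.slaves = [Replica(host) for host in slave_hosts]

    def write(self, replica, content):
        # Writes are accepted on the master replica only.
        if not replica.is_master:
            raise PermissionError("only the master replica is writable")
        replica.content = content
        replica.version += 1
        self._broadcast()

    def _broadcast(self):
        # Stand-in for the broadcast step: copy the master's state
        # to every slave replica.
        for slave in self.slaves:
            slave.content = self.master.content
            slave.version = self.master.version


replicas = ReplicaSet("master.example", ["s1.example", "s2.example"])
replicas.write(replicas.master, b"v1")
```

After the write, both slave replicas hold the same content and version as the master, while `replicas.write(replicas.slaves[0], ...)` raises `PermissionError`.
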
Keywords/Search Tags: Data Grid, Uniform Access, Integrated System, United View, Replicating Data, Data Transfer