
Research On Data Grid For Mass Information Process And Its Key Technologies

Posted on: 2010-07-30  Degree: Doctor  Type: Dissertation
Country: China  Candidate: G Liu  Full Text: PDF
GTID: 1118330338985377  Subject: Computer software and theory
Abstract/Summary:
With the development of information technology, an ever-growing volume of information enriches people's ideas and broadens their field of vision. At the same time, it raises several difficult problems. There exists a large amount of heterogeneous data resources that differ widely in size and network environment. The information also lacks a uniform format, so transforming it into operable, standardized data with a fixed format according to users' differing requirements is a formidable task. Because information is updated rapidly, data change and synchronization must also be addressed. Guaranteeing the consistency of information under limited bandwidth would give mass information process applications access to the newest and most accurate data, and would improve the ability to make use of information.
This thesis presents the Data Grid for Mass Information Process (MIPDG) in response to the problems above. As a new data management architecture, MIPDG proposes a new mode of mass information process center and provides a uniform, standard description mode for different data formats, so that automatic mapping and automatic association can be implemented. The replica creation strategy, the replica coherence algorithm and the data transfer algorithm effectively address the problem of sharing information resources coherently. MIPDG provides application-level access support for mass information processing, which reduces the complexity of mass information process applications. It provides a high-performance, large-capacity, high-speed, widely distributed data sharing platform for the sharing and utilization of information.
The thesis analyses the architecture of MIPDG and its key technologies in detail, with a focus on high performance, ease of use and extensibility.
The contributions of this thesis include the following:
1) Taking the characteristics of mass information processing into account, MIPDG designs multiple distributed information center nodes. It also establishes a uniform data access model based on data centers, which overcomes the access bottleneck caused by differences in the capacity, network bandwidth and utilization of the data resources themselves.
2) Based on object-oriented (OO) design, it defines mass information process metadata and implements a flexible mechanism for it. In addition, it designs a service-based five-layer data mapping model using metadata catalogue management, which implements transparent and extensible mapping management of massive data. As a result, a uniform and convenient access mode is provided for data with different storage methods and different formats.
3) The model puts forward a data transfer strategy called DRFT (Distribute Reliable File Transfer), which automatically partitions the data transfer process and manages it through a job scheduler. The goal of scheduling data transfers automatically, without manual work, is thereby achieved. The model also discusses bandwidth assignment and proposes three data transfer scheduling strategies. Moreover, the best-fit strategy is optimized so that it not only uses the transfer bandwidth effectively but also achieves a steadier transfer speed.
4) A Strategy of Dynamic Replica Creation Based on Clustering (DRCC) and an activity-based multi-phase consistency maintenance algorithm are proposed. These two algorithms overcome the low data access efficiency caused by limited network bandwidth. As a result, they reduce the average completion time and improve the utilization of grid resources and the performance of the grid environment.
The feasibility of the whole grid system is validated by setting up an MIPDG system in an experimental environment.
The experimental results show that our replica creation strategy, coherence algorithm and data transfer algorithm are effective and correct. The average job response time and the number of replica updates are both improved. The results will be a valuable reference for future applications of MIPDG and its strategies and algorithms.
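The abstract names a best-fit strategy for assigning transfer bandwidth but does not spell it out. The following is only a minimal sketch of the generic best-fit idea, not the thesis's optimized DRFT scheduler: each transfer request goes to the link whose free bandwidth fits it most tightly. The function name `best_fit_assign`, the link names and the bandwidth figures are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple


def best_fit_assign(
    requests: List[Tuple[str, float]],
    capacity: Dict[str, float],
) -> Dict[str, Optional[str]]:
    """Generic best-fit placement of transfer jobs onto links.

    requests: (job name, required bandwidth) pairs, handled in order.
    capacity: free bandwidth per link (hypothetical units, e.g. Mbit/s).
    Returns a mapping from job name to chosen link, or None if nothing fits.
    """
    free = dict(capacity)  # work on a copy so the caller's dict is untouched
    plan: Dict[str, Optional[str]] = {}
    for job, need in requests:
        # Links that still have enough free bandwidth, keyed by leftover.
        candidates = [(bw - need, link) for link, bw in free.items() if bw >= need]
        if not candidates:
            plan[job] = None  # no link can carry this job right now
            continue
        _, link = min(candidates)  # smallest leftover is the tightest fit
        free[link] -= need
        plan[job] = link
    return plan


if __name__ == "__main__":
    links = {"A": 100.0, "B": 40.0}
    jobs = [("j1", 30.0), ("j2", 80.0), ("j3", 20.0), ("j4", 50.0)]
    print(best_fit_assign(jobs, links))
```

Placing small jobs on the tightest link keeps the larger link free for large transfers, which is the usual motivation for best fit over first fit. How the thesis optimizes the strategy for steadier transfer speed is not stated in the abstract and is not modeled here.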
Keywords/Search Tags: Mass Information Process, Data Mapping, Metadata, Replica Creation, Replica Consistency, Data Transfer Scheduler