Research on a Mass Storage System Based on Scalable Objects

Posted on: 2007-05-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Liu
GTID: 1118360242961898
Subject: Computer system architecture
Abstract/Summary:
Information storage is an enduring need of human society. As computer technology develops and its applications spread, the volume of stored information is growing explosively, and current network storage systems can no longer meet the demand. Object-Based Storage (OBS) is an emerging technology that is expected to become the next generation of network storage. It aims to achieve high throughput in a simple way, using existing components, processing and network technology. Through an object-based interface that carries both data and attributes, OBS combines the high speed of a block-based interface with the convenient data sharing of a file-based interface. It separates the logical and physical views of stored data: the logical view is unchanged, while the physical view is offloaded to the Object-based Storage Device (OSD). It also divides a file into a series of data objects that are striped across one or more OSDs. However, although the object introduces a new concept for storage systems, in existing object storage systems it is only a variable-length data unit, so its meaning as a rich object is limited.

The Based on Scalable Object Mass Storage System (BSO-MSS) takes advantage of OBS and extends the concept of the object. Objects include not only user data but also directories, files and storage devices, forming a hierarchical object storage architecture with distributed object storage and hierarchical management. BSO-MSS builds a compatible object access mode on the storage object, which harmonizes block-based, file-based and object-based interfaces. In this way, BSO-MSS can provide a unified logical view, data sharing, active service, parallel access, unified storage and easy management. It also offers scalability and performance beyond those of earlier systems.

A Generalized Stochastic Petri Net (GSPN) model is set up to evaluate the performance of BSO-MSS.
The simulation results show that system performance improves as the numbers of storage objects (SOs) and clients increase. In addition, iozone is used to benchmark BSO-MSS against Lustre. The results show that the write performance of BSO-MSS is superior to that of Lustre, while its read performance is only slightly better. The benchmark also validates the GSPN model of BSO-MSS.

For the first time, the storage system is combined with Cellular Automata (CA). CA theory is used to analyze the dynamic behavior of BSO-MSS and to construct a general model framework, BSO-MSSCA, which provides the foundation for analyzing two cases. Taking an SO as a cell, the SO-based load assignment model simulates a simple dynamic process of load balancing and generalizes the evolution of BSO-MSS. The DO-based access behavior model analyzes the effect of the access frequency of Data Objects (DOs) on the storage system, and then adjusts the access frequency of data objects according to their characteristics, through learning and activity, to improve system stability. From the SO-based load assignment model and the DO-based access behavior model, it follows that BSO-MSS is a self-managing object storage system characterized by activity, sharing, parallelism and relativity.

In large distributed storage systems, high-performance metadata service, load balancing and scalability are critical research topics. In the metadata server (MDS), metadata is divided into directory objects and file objects. The former are locating metadata, which provide file location and access control; the latter are descriptive metadata, which describe the file's data. Each MDS is responsible for all of the directory objects and for its own file objects, which makes full use of the cache, improves the cache hit rate, reduces the number of disk I/Os, and allows MDSs to be added dynamically.
Meanwhile, the hash value of a key composed of the directory object ID and the file name is used as an index into the local metadata lookup table (LMLT) to obtain the ID of the corresponding metadata server. No metadata migration is needed when a directory's permissions are changed, a directory or file is renamed, a directory is moved, or permissions are modified. Using a Bloom filter, the LMLT of each MDS can be compressed into a compact summary for fast metadata lookup. The metadata service also uses a master/slave/standby chain structure, which ensures high reliability and availability without increasing hardware cost, and also enables access migration for hot spots to achieve load balance.

The SO is an important unit of BSO-MSS. It differs from an OSD in that it is identified by its "interface" and "state" and consists of data, attributes and methods; the T10 OSD protocol is modified accordingly. Because data objects are usually named in a one-dimensional space, traditional file systems manage large numbers of data objects very inefficiently. A linear hashing lookup algorithm is therefore adopted, in which the load factor controls splitting and merging. Compared with the tree structures of traditional file systems, the hash algorithm has O(1) lookup time complexity.

When the Ext2 file system is used for object storage, at least two disk accesses are needed to reach the data. Instead, the combination of block address and block length forms the extended attributes of an object, which are stored on disk along with the data object. Whatever the size of the data object, only two disk accesses are needed.

The workload in BSO-MSS depends on many factors, such as the length of the request queue, CPU processing capability, memory size, network bandwidth, disk bandwidth and disk capacity. Considering the differences among SOs as well as the influence of the network, the flexible workload layout strategy assigns a weight to each SO so that SOs with greater weight take on more of the workload.
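The weighted layout idea above can be illustrated with a minimal sketch. The abstract does not say how weights are derived or how requests are dispatched, so everything here is an assumption made for illustration: the names `StorageObject`, `place` and `distribute` are hypothetical, the weight is a single precomputed number, and the dispatch rule (send each request to the SO with the lowest load-to-weight ratio) is one simple choice, not necessarily the one used in BSO-MSS.

```python
from dataclasses import dataclass


@dataclass
class StorageObject:
    """Hypothetical stand-in for an SO; not the thesis's data structure."""
    name: str
    weight: float       # assumed capability score (CPU, memory, bandwidth, ...)
    load: float = 0.0   # work assigned so far


def place(sos, cost=1.0):
    """Assign one request to the SO whose load-to-weight ratio would stay lowest."""
    target = min(sos, key=lambda so: (so.load + cost) / so.weight)
    target.load += cost
    return target


def distribute(sos, n_requests):
    """Dispatch n equal-cost requests and report the resulting load per SO."""
    for _ in range(n_requests):
        place(sos)
    return {so.name: so.load for so in sos}


if __name__ == "__main__":
    sos = [
        StorageObject("so-a", 1.0),
        StorageObject("so-b", 2.0),
        StorageObject("so-c", 3.0),
    ]
    print(distribute(sos, 60))
```

With weights 1, 2 and 3 and 60 equal-cost requests, this rule assigns 10, 20 and 30 requests respectively, so the load each SO carries is proportional to its weight.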
Based on workload characteristics derived from attribute statistics collected in the SOs, the number of SOs can be selected adaptively, with system response time as the cost, and data is stored in stripes of different sizes. The results show that BSO-MSS achieves higher performance, better scalability and better load balance.
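The metadata lookup described earlier in the abstract (a key built from the directory object ID and the file name, with each MDS's lookup table summarized by a Bloom filter) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis's implementation: the class `BloomFilter`, the functions `metadata_key` and `candidate_servers`, the key encoding, the filter size and the use of SHA-256 are all hypothetical choices.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; size and hash count are arbitrary illustrative values."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: bytes) -> bool:
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def metadata_key(dir_object_id: int, file_name: str) -> bytes:
    """Assumed key layout: directory object ID and file name joined by '/'."""
    return f"{dir_object_id}/{file_name}".encode("utf-8")


def candidate_servers(filters, dir_object_id, file_name):
    """Return the IDs of MDSs whose summary may hold the metadata for this key."""
    key = metadata_key(dir_object_id, file_name)
    return [mds_id for mds_id, bf in filters.items() if bf.might_contain(key)]


if __name__ == "__main__":
    filters = {0: BloomFilter(), 1: BloomFilter()}
    filters[1].add(metadata_key(42, "report.txt"))
    print(candidate_servers(filters, 42, "report.txt"))
```

Because a Bloom filter can return false positives but never false negatives, the returned list always contains the MDS that holds the entry, and any candidate still has to be confirmed against that server's actual lookup table.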
Keywords/Search Tags: Mass Storage, Object-Based Storage, Scalability, Cellular Automata, Metadata Management, Object Placement