Research On The Key Techniques Of Capability-Aware Active Storage In Heterogeneous Environments

Posted on: 2017-06-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Li
GTID: 1318330512954962
Subject: Computer system architecture
Abstract/Summary:
Active storage uses the surplus processing capability of storage devices to offload part of an application's computation and management tasks to the devices themselves, reducing the impact of the I/O bottleneck by processing data where it is stored. In the current era of big data, many data-intensive applications access data at the petabyte (PB) scale, so active storage is important for relieving their I/O bottlenecks. However, because processors, storage media, and operating systems differ across devices, current active storage systems are heterogeneous, whereas existing execution frameworks and data placement schemes for active storage are still designed for homogeneous systems. Finding an execution framework and corresponding data placement schemes suited to heterogeneous active storage systems is therefore important for building efficient active storage systems. In this dissertation, we study heterogeneous active storage from four aspects:
1. We propose a heterogeneous active storage framework based on the Java Virtual Machine (JVM). The framework uses the JVM to run Java code that implements user-defined active storage functions, which solves the problem of executing active storage on heterogeneous platforms. It extends the standard OSD model with four new interfaces, for downloading, associating, triggering, and executing code, so that active storage code is downloaded from clients to the OSD and executed there on demand. Experiments on the JVM-based framework show that it supports cross-platform active storage and can substantially improve system performance.
2. We propose a storage-capability-aware skewed data distribution (SDD) scheme for active storage.
This data distribution scheme addresses the problem of heterogeneous storage devices waiting for one another during active storage because of differences in their data processing performance. Unlike traditional active storage systems, which distribute data evenly, SDD places a different amount of data on each server according to that server's performance. We propose a cost model that measures the time overhead of different storage devices during active storage in order to find the ideal data placement, and we transform the data placement optimization problem into four linear programming problems under certain constraints. The scheme addresses the performance problems of high-performance computing applications in hybrid active storage systems. We implemented a prototype of SDD in a parallel I/O system and verified its effectiveness with a typical data processing application. The results show that this data placement scheme can markedly improve the performance of the whole active storage system.
3. We propose a computation-capability-aware data placement scheme for active storage, a new data distribution scheme for heterogeneous active storage systems. First, the scheme evaluates the data processing capability of each storage node in a heterogeneous cluster, including both computing and storage capability, using a newly proposed holistic metric, the processing ratio. Second, it distributes data across the storage nodes according to each node's processing ratio, to avoid load imbalance among heterogeneous servers. We implemented a prototype of the capability-aware data placement scheme in a parallel I/O system and evaluated its performance with typical applications.
The experimental results show that the capability-aware data placement scheme can significantly improve the performance of the active storage system.
4. We propose Adaptive Active Storage Scheduling (AASS), which dynamically schedules each active storage task to execute on either the storage nodes or the original computing nodes. In a large-scale active storage system, different active storage tasks have different characteristics, and not all of them are suited to execution on the storage nodes. To determine the execution mode of a task, AASS introduces an execution time model that estimates the task's time overhead when it is executed on the computing nodes and on the storage nodes, respectively. During execution, AASS uses this model to predict the task's execution time on the different nodes and adaptively moves the task to the computing nodes or the storage nodes, whichever yields the shorter execution time, so as to reduce the overall execution time of the task.
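The scheduling decision described in item 4 can be illustrated with a minimal sketch. The abstract does not give the actual execution time model, so the one below is an assumption made purely for illustration: a task run on a storage node pays only computation time, while a task run on a computing node pays network transfer time plus computation time. The names `Task`, `Node`, `time_on_storage`, `time_on_compute`, and `choose_target`, and all parameters, are hypothetical and do not come from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of one active storage task."""
    input_bytes: float   # amount of data the task must process
    ops_per_byte: float  # assumed computation intensity of the task


@dataclass
class Node:
    """Hypothetical node profile (an assumption, not the dissertation's)."""
    ops_per_sec: float   # processing rate of the node


def time_on_storage(task: Task, storage: Node) -> float:
    # Assumed model: data is processed in place, so only compute time counts.
    return task.input_bytes * task.ops_per_byte / storage.ops_per_sec


def time_on_compute(task: Task, compute: Node, net_bytes_per_sec: float) -> float:
    # Assumed model: data is first shipped over the network, then processed.
    transfer = task.input_bytes / net_bytes_per_sec
    processing = task.input_bytes * task.ops_per_byte / compute.ops_per_sec
    return transfer + processing


def choose_target(task: Task, storage: Node, compute: Node,
                  net_bytes_per_sec: float) -> str:
    # Pick whichever side has the shorter predicted execution time.
    t_storage = time_on_storage(task, storage)
    t_compute = time_on_compute(task, compute, net_bytes_per_sec)
    return "storage" if t_storage <= t_compute else "compute"
```

Under these assumed numbers, a light task (`Task(1e9, 1.0)`) with a storage node of `Node(1e9)`, a computing node of `Node(1e10)`, and a 1e8 bytes/s network is predicted to finish sooner on the storage node, whereas a computation-heavy task (`Task(1e9, 100.0)`) is predicted to finish sooner on the computing node despite the transfer cost.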
Keywords/Search Tags: Heterogeneous Active Storage, Platform Framework, Storage Capability-Aware, Computation Capacity-Aware, Data Placement Scheme