
Design Of Energy-efficient Reconfigurable System Architectures For Data-intensive Computing

Posted on: 2018-02-28    Degree: Doctor    Type: Dissertation
Country: China    Candidate: X Wang    Full Text: PDF
GTID: 1368330590955504    Subject: Computer Science and Technology
Abstract/Summary:
In the past decade, with the rapid development of the mobile Internet, cloud computing, and big data technologies, data have grown explosively across many domains, and data-intensive computing is gradually replacing computation-intensive computing as the mainstream computing paradigm. Because data-intensive workloads exhibit a low computation-to-memory-access ratio and irregular data access patterns, conventional computing architectures run into a severe "memory wall" when handling data-intensive applications. At the same time, the limited power budget of data centers makes it impossible to meet the ever-growing demand for data processing simply by scaling out computing resources without bound. Improving the energy efficiency of computing systems on data-intensive workloads is therefore the key to overcoming the "power wall" in data centers. To address the "memory wall" and "power wall" problems, we study energy-efficient reconfigurable systems for data-intensive computing. FPGA-based reconfigurable computing systems can exploit customized parallel computing resources, high-bandwidth on-chip memory, and low power consumption to improve computing efficiency for data-intensive applications.

In this dissertation, we first analyze the performance bottlenecks of data-intensive computing on different system architectures by comparing the characteristics of data-intensive applications with the structural features of mainstream computing systems. We then study energy-efficient reconfigurable systems for two typical data-intensive computing scenarios: sparse data computation and highly concurrent data-interactive computation.

For sparse data computation, we use large-scale graph data analysis as an example to discuss the architecture design and optimization of an energy-efficient reconfigurable system. First, we propose a performance analysis model for sparse data computation. Compared with the Roofline model, our model accounts not only for computation capacity and memory bandwidth but also for the effect of memory access latency on overall performance. The model shows that large-scale graph analysis is bounded not only by memory bandwidth but, more importantly, by memory latency. Then, for algorithm-level optimization, we propose a fine-grained partitioned edge-streaming model, which uses on-chip memory to reduce the performance impact of random vertex accesses while streaming edge data and update data from external memory to improve memory performance; it also mitigates load imbalance across the on-chip processing engines (PEs) by reordering graph partitions. For architecture-level optimization, we propose a two-level shuffle network that reduces the size of the intermediate result buffer, and apply graph compression to reduce the volume of external memory accesses. We further design and implement an energy-efficient large-scale graph data analysis system based on reconfigurable computing. Compared with state-of-the-art FPGA implementations, our system is 1.18 times faster while using fewer on-chip resources and less external memory bandwidth, and achieves a 3.62 times improvement in performance-to-bandwidth ratio.
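To make the partitioned edge-streaming model concrete, the sketch below shows one superstep in plain Python. It is a minimal illustration under assumed names (edge_streaming_pass, gather, apply_update), not the dissertation's FPGA pipeline: random vertex accesses are confined to small per-partition tables that stand in for on-chip memory, while edges and the generated updates are handled as sequential streams grouped by destination partition.

# Minimal software sketch of one partitioned edge-streaming pass (hypothetical
# names, not the dissertation's FPGA design). Vertex values live in small
# per-partition tables standing in for on-chip memory; edges and updates are
# processed as sequential streams, as in the model described above.
from collections import defaultdict

def edge_streaming_pass(partitions, edges, gather, apply_update):
    """Run one superstep over a partitioned vertex set.

    partitions:   dict partition_id -> {vertex_id: value}   (the "on-chip" data)
    edges:        iterable of (src, dst) pairs, read sequentially ("off-chip")
    gather:       gather(src_value, acc_or_None) -> accumulated update
    apply_update: apply_update(old_value, accumulated_update) -> new value
    """
    vertex_to_part = {v: p for p, verts in partitions.items() for v in verts}

    # Pending updates grouped by destination partition; in hardware this is the
    # intermediate result buffer that the two-level shuffle network keeps small.
    updates = defaultdict(dict)

    for src, dst in edges:                              # scatter: stream edges
        src_val = partitions[vertex_to_part[src]][src]  # random access stays "on-chip"
        p_dst = vertex_to_part[dst]
        updates[p_dst][dst] = gather(src_val, updates[p_dst].get(dst))

    for p, pending in updates.items():                  # apply phase, per partition
        for v, acc in pending.items():
            partitions[p][v] = apply_update(partitions[p][v], acc)
    return partitions

A vertex program such as PageRank or shortest paths would supply its own gather (for example, summing contributions and treating a None accumulator as zero) and apply_update functions; the streaming structure itself does not change.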
For highly concurrent data-interactive computation, we take web-based cloud computing as an example to discuss the architecture design and optimization of an energy-efficient reconfigurable system. First, we propose a performance analysis model for FPGA-accelerated architectures. The model analysis shows that an SOPC architecture that tightly couples network I/O processing with application data processing achieves higher system performance than the traditional master-slave reconfigurable architecture on highly concurrent data-interactive workloads. Then, for architecture-level optimization, we design and implement a TCP/IP offload engine optimized for massive numbers of connections, which supports up to 100K concurrent TCP sessions at a packet processing rate approaching 10 Gbps. We further propose a dynamic online scheduling mechanism for the SOPC-based cloud computing system to reduce runtime power consumption. Finally, we validate our design with two typical cloud computing applications, a file server and an ECG processing cloud, on a mimic computing platform. The experimental results show that, for the file server application, our system supports more concurrent TCP connections with lower data processing delay than both a commercial server and a master-slave reconfigurable system. For the ECG processing cloud, our system delivers a 38 times performance improvement and a 418 times energy-efficiency improvement over a general-purpose cloud system.

Finally, we summarize software-hardware co-optimization strategies for data-intensive computing. We first use the LogP model to analyze and reveal the essential application and architectural characteristics of data-intensive computing, and we define an analytic expression for system efficiency (sketched below), from which we identify the bottlenecks and optimization directions. On the algorithm side, the strategies include partitioning data to improve locality, optimizing data structures to reduce the number of memory accesses, tailoring data layout and access patterns to the memory hierarchy, and optimizing task scheduling to reduce synchronization overhead. On the architecture side, they include tightly coupled structures, macro-pipelined structures, overlapping computation with I/O, balancing computation and I/O, trading computation for communication, and the judicious use of diverse memory devices. We verify the effectiveness of these optimization methods through the implementation and experimental results of the two energy-efficient reconfigurable systems above. These software-hardware co-optimization strategies can guide the design and optimization of other data-intensive computing applications with similar characteristics.
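As an illustration of the kind of LogP-style efficiency expression referred to above, the following is a hedged sketch in LaTeX; the symbols W and m and the exact form of the communication term are assumptions for illustration, not necessarily the dissertation's formulation.

% Hedged sketch, not the dissertation's exact formula.
% W       : useful computation time per task
% m       : number of messages/remote accesses issued per task
% L, o, g : LogP latency, per-message overhead, and gap
\[
  E \;=\; \frac{T_{\mathrm{compute}}}{T_{\mathrm{compute}} + T_{\mathrm{comm}}}
    \;\approx\; \frac{W}{W + m\,\bigl(o + \max(g, L)\bigr)}
\]
% Efficiency improves either by enlarging W relative to the communication term
% (better locality, fewer memory accesses) or by shrinking o, g, and L
% (tight coupling, overlap of computation and I/O, streaming access).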
Keywords/Search Tags: FPGA, memory wall, data-intensive computing, graph analysis, ECG