Research On Parallel Optimization Technology For Accelerating Data-intensive Algorithms

Posted on: 2020-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhou
Full Text: PDF
GTID: 2518306548995919
Subject: Computer Science and Technology
Abstract/Summary:
Big data and cloud computing play important roles in current computer systems. Massive data processing tasks demand higher performance and efficiency. Data-intensive algorithms, which are used to process big data, usually spend much of their time on I/O, memory access, and data manipulation. Most studies focus on improving the efficiency of data-intensive algorithms on different platforms in order to achieve higher throughput and lower latency when processing massive data. Parallel optimization technologies for data-intensive algorithms mainly aim at optimizing memory access and achieving higher parallelism. Moreover, it is essential to implement these algorithms as efficiently as possible on different platforms and for different kinds of applications.

This thesis introduces optimization methods for typical data-intensive algorithms in the fields of signal processing, image processing, and intelligent applications, respectively. Parallel optimization technologies for typical scenarios with data of different dimensions are investigated.

The decoding algorithm of polar codes is taken as a representative data-intensive algorithm for signal processing. We implemented a parallel GPU-based polar decoder. Schemes for optimizing memory allocation and thread management when processing tasks with complicated control flows on GPUs are introduced. In addition, we reduced the decoding latency caused by irregular memory access and recursive calculations.

For images, which are three-dimensional data, the convolution algorithm, a typical data-intensive algorithm, is optimized to implement convolutional neural networks on ARM-based processors. We built an instruction pipeline model for the ARM-based CPU in order to schedule SIMD instructions efficiently and reduce pipeline stalls. A tiling method is used to increase data reuse and decrease cache misses during computation.

As a typical intelligent application, object detection based on deep neural networks requires higher throughput for fast detection on remote sensing images. The Faster R-CNN detection network is parallelized on the GPU in this thesis. Multiple images can be detected simultaneously, and a higher detection speed is achieved.
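
The tiling idea mentioned for the convolution algorithm can be illustrated with a minimal sketch. This is not the thesis's ARM/SIMD implementation: the function names (conv2d_naive, conv2d_tiled), the single-channel square kernel, and the tile size of 4 are all assumptions chosen for illustration. The sketch only shows how the output is visited in small blocks so that the input rows needed by one block are reused before moving on.

```python
def conv2d_naive(image, kernel):
    """Direct 'valid' 2D convolution (cross-correlation, as used in CNNs)."""
    h, w = len(image), len(image[0])
    k = len(kernel)  # assumption: square k x k kernel
    out_h, out_w = h - k + 1, w - k + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            out[i][j] = acc
    return out


def conv2d_tiled(image, kernel, tile=4):
    """Same computation, but outputs are produced tile by tile.

    Each tile x tile output block touches only a (tile + k - 1)^2 input
    window, which is the data-reuse pattern tiling is meant to exploit.
    The tile size here is a hypothetical value, not one from the thesis.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    out_h, out_w = h - k + 1, w - k + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for ti in range(0, out_h, tile):          # tile origin, rows
        for tj in range(0, out_w, tile):      # tile origin, columns
            i_end = min(ti + tile, out_h)     # clip partial tiles at edges
            j_end = min(tj + tile, out_w)
            for i in range(ti, i_end):
                for j in range(tj, j_end):
                    acc = 0.0
                    for di in range(k):
                        for dj in range(k):
                            acc += image[i + di][j + dj] * kernel[di][dj]
                    out[i][j] = acc
    return out
```

In pure Python the tiled version is not faster; the point is only the loop structure. On a real processor the tile size would be chosen so that the input window of one tile fits in cache or registers.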
Keywords/Search Tags:Data-intensive algorithm, Parallel optimization, SIMD, GPU, Polar code, Embedded processor, Deep neural network