Parallel Sclera Vein Recognition And LDPC Decoder On GPU

Posted on: 2015-03-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Lin
Full Text: PDF
GTID: 1108330464968896
Subject: Computer system architecture
Abstract/Summary:
Low-density parity-check (LDPC) codes have an error-correction capability that can approach the Shannon limit, but decoding them is time-consuming because the decoding algorithm is computationally intensive. Sclera recognition, an emerging biometric identification technology, can achieve better identification results than iris recognition under visible light. However, it cannot be used in real-time applications because of the high computational density of its matching phase. Both LDPC decoding and sclera recognition belong to the class of Irregular Problems on Massive Datasets (IPMD). An IPMD requires repeated computation over different data sets, and within a data set the relationship between the index of a data element and the loop counter is non-linear.

Although IPMD computation can be accelerated with a GPU (Graphics Processing Unit), parallel algorithm design still faces several challenges. First, because of poor spatial locality, the data in a set is difficult to divide into separate sub-blocks. Second, it is not easy to find the optimal mapping between GPU resources and subtasks or combinations of subtasks. Third, the irregular data addresses make coalesced memory access impossible. We first study a performance analysis model for GPU programs and then propose three solutions to these challenges. The proposed solutions and methods are applied to LDPC decoding and sclera recognition.

The main contributions of our research are as follows:

1. We analyze the internal parallelism among the three main components of the GPU (CUDA core, SFU, and LD/ST) and the pipeline model inside each component. Directed acyclic graphs (DAGs) are introduced to represent instruction parallelism. We design a multi-part pipeline model to calculate the number of clock cycles needed to run an application. The model is optimized using nine factors, such as active warps, divergence, and synchronization. Applying the analysis model to LDPC decoding algorithms, we conclude that the sum-product algorithm (SPA) is the best LDPC decoding algorithm for the GPU.

2. We propose an approach to parallel programming for irregular problems on massive datasets. First, multiple datasets should be processed concurrently. Second, the task for one dataset should be confined to a single block. Finally, synchronization instructions should be inserted between correlated instructions, and the code blocks delimited by synchronization instructions should be split evenly. We also point out the requirements of this approach. Using the sclera matching algorithm, we study how to reduce the data size (with the WPL descriptor) and the computing time (with the Y descriptor) of a data set.

3. We combine partitioning, communication, agglomeration, and mapping into a new GPU program design model that includes a task balance model, a synchronization model, and a coalesced memory access model. We apply these models to different phases of sclera matching.

4. We propose two approaches to reduce the latency of global memory access. One is to compress the data by encoding a long bit width as a short one; the other is to satisfy the coalesced memory access requirement by mapping a set of data whose size equals the warp size to one warp. We also apply these two approaches to a GPU-based LDPC decoder.

Applying these approaches to LDPC decoding and sclera matching raises the sclera matching rate from 2 pairs per second to 1,083 pairs per second, which makes sclera recognition feasible in real-time applications. The LDPC decoding throughput on the GPU reaches up to 550 Mbps, which is currently the fastest LDPC decoder on a single GPU.
Keywords/Search Tags: Parallel computing, Graphics Processing Unit (GPU), Sclera recognition, Low Density Parity Check (LDPC) code, Irregular problem
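
The fourth contribution in the abstract mentions compressing data by coding a long bit width into a short one to reduce global memory traffic. The abstract does not describe the actual encoding, so the following is only a minimal sketch under stated assumptions: decoder messages are taken to be floating-point log-likelihood ratios (LLRs), they are quantized to signed 8-bit integers with an illustrative scale factor of 8.0, and four of them are packed into one 32-bit word. The function names `quantize_llr`, `pack4`, and `unpack4` are hypothetical and are not taken from the dissertation.

```python
# Illustrative sketch only: bit widths, scale factor, and names are assumptions,
# not the encoding used in the dissertation.

def quantize_llr(llr, scale=8.0, bits=8):
    """Map a float LLR to a signed integer of `bits` width, saturating at the limits."""
    lo = -(1 << (bits - 1))          # e.g. -128 for 8 bits
    hi = (1 << (bits - 1)) - 1       # e.g.  127 for 8 bits
    q = int(round(llr * scale))
    return max(lo, min(hi, q))


def pack4(values):
    """Pack four signed 8-bit integers into one unsigned 32-bit word (little-endian lanes)."""
    if len(values) != 4:
        raise ValueError("pack4 expects exactly four values")
    word = 0
    for lane, v in enumerate(values):
        if not -128 <= v <= 127:
            raise ValueError("value does not fit in a signed 8-bit lane")
        word |= (v & 0xFF) << (8 * lane)   # two's-complement byte in its lane
    return word


def unpack4(word):
    """Recover the four signed 8-bit integers stored by pack4."""
    out = []
    for lane in range(4):
        byte = (word >> (8 * lane)) & 0xFF
        out.append(byte - 256 if byte >= 128 else byte)  # undo two's complement
    return out


if __name__ == "__main__":
    llrs = [0.5, -1.25, 20.0, -3.0]
    quantized = [quantize_llr(x) for x in llrs]
    word = pack4(quantized)
    print(quantized, hex(word), unpack4(word))
```

Under these assumptions, one 32-bit load fetches four messages instead of one, which is the general idea behind trading precision for memory bandwidth. The appropriate bit width and scale would have to be chosen against the decoder's error-rate requirements.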