Research On Scalable Accelerator Design For Face Detection And Recognition Application

Posted on: 2020-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: Q Fu
Full Text: PDF
GTID: 2518306548991119
Subject: Master of Engineering
Abstract/Summary:
Face detection and face recognition technologies are widely used. With the development of deep learning, deep-learning-based face detection and recognition have surpassed human-level recognition accuracy, but at the cost of a sharp increase in computation. Given the many application scenarios for face detection and recognition, accelerating their inference has become an urgent problem. Using an FPGA platform, this thesis studies parallelization techniques for the forward inference of deep-learning-based face detection and recognition. It first studies the process and characteristics of deep-learning-based facial keypoint detection and face recognition algorithms. A concrete face recognition example is examined in detail, and a fast face alignment algorithm suited to hardware implementation is designed. The thesis then studies quantization of the face detection algorithm, using a low-bitwidth global quantization method that reduces bandwidth usage by 50% with little effect on accuracy. A general matrix multiplication accelerator is selected to accelerate the face-related applications. The thesis improves the accelerator structure and builds a performance model of the hardware. A parallelism search algorithm is designed that adjusts the accelerator structure according to the available hardware resources and the structure of each convolutional neural network, so as to optimize the accelerator's theoretical performance. Finally, the accelerator design is implemented on the FPGA of the Zynq-7020 chip. Experimental results show that the accelerator achieves a throughput of 35 GOPS on this platform. Its performance-to-power ratio is 15 times that of the CPU platform and 5 times that of the GPU platform.
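The abstract does not describe the performance model or the search procedure in detail, so the following Python sketch only illustrates the general idea of a parallelism search under a resource budget. Everything in it is an assumption made for illustration: each layer is treated as a matrix multiplication of shape (M, N, K), the accelerator is modeled as a tm x tn array of multiply-accumulate units, the cycle estimate ignores memory bandwidth and pipeline overhead, and the names `gemm_cycles`, `search_parallelism`, and `mac_budget` are hypothetical rather than taken from the thesis.

```python
from math import ceil


def gemm_cycles(layer, tm, tn):
    """Estimate cycles for one layer on a tm x tn multiply-accumulate array.

    layer is (M, N, K): an M x K matrix times a K x N matrix.
    Each output tile of size tm x tn needs K cycles (compute only;
    memory traffic is ignored in this simplified model).
    """
    m, n, k = layer
    return ceil(m / tm) * ceil(n / tn) * k


def search_parallelism(layers, mac_budget):
    """Exhaustively search (tm, tn) with tm * tn <= mac_budget.

    Returns (tm, tn, total_cycles) for the configuration with the fewest
    estimated cycles over all layers. Ties are broken by fewer
    multiply-accumulate units, then by smaller tm.
    """
    best = None
    for tm in range(1, mac_budget + 1):
        for tn in range(1, mac_budget // tm + 1):
            total = sum(gemm_cycles(layer, tm, tn) for layer in layers)
            candidate = (total, tm * tn, tm, tn)
            if best is None or candidate < best:
                best = candidate
    total, _, tm, tn = best
    return tm, tn, total


if __name__ == "__main__":
    # Hypothetical layer shapes and budget, chosen only for illustration.
    example_layers = [(16, 32, 27), (32, 64, 144)]
    print(search_parallelism(example_layers, mac_budget=64))
```

A real design-space search would also have to account for on-chip memory, data transfer bandwidth, and clock frequency, which this sketch leaves out.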
Keywords/Search Tags: Deep learning, Face detection, Face recognition, FPGA platform, Performance model, Accelerator