FPGA-based Human Action Recognition Algorithm Acceleration And Implementation

Posted on: 2022-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Wu
GTID: 2518306758966149
Subject: Information and Communication Engineering

Abstract/Summary:
In recent years, with the continuous development of deep learning algorithms, convolutional neural networks (CNNs) have been widely used in human activity recognition based on wearable sensors. CNNs are computationally intensive, however, and most are deployed on GPU or CPU platforms. Although a GPU can process data in real time, its hardware deployment cost and power consumption are high, so it is difficult to meet the requirements of embedded applications with limited resources and power budgets, where real-time requirements are also hard to satisfy. Developing a human activity recognition system with high accuracy, high speed, and low power consumption therefore has practical application value.

Based on an analysis of the CNN computing model and the working mechanism of attention, this thesis proposes a ZYNQ-based hardware acceleration scheme for a human activity recognition algorithm, using a software-hardware co-design method. The scheme targets an ARM+FPGA heterogeneous system and is implemented and deployed on Xilinx's Ultra96-V2 embedded development platform. The main contents are as follows:

(1) A human activity recognition model is designed based on a convolutional neural network and an attention mechanism. It is suitable for traditional strongly labeled data sets and can also accurately identify and locate actions in weakly labeled data.

(2) Following the software-hardware co-design approach, the system's computing performance and hardware resource consumption are weighed together and the functions are partitioned between software and hardware. The FPGA implements the computing modules such as convolution, pooling, dot product, and weighting. The ARM handles tasks such as fully connected layer operations, Softmax operations, data reading, and hardware control.

(3) Operation modules such as convolution, max pooling, dot product, and weighting are designed with high-level synthesis, and data is transferred between the PS and PL through DMA and AXI4-Stream. Data fixed-point conversion, operation parameter decomposition, and loop optimization are used to accelerate each computing module. In addition, inter-layer pipeline optimization and multi-task parallel optimization are applied between the CNN modules and to the three-layer dot-product operation modules to improve the overall computing efficiency of the system.

(4) The driver and the host-side software are implemented on the PYNQ framework, and the function and performance of the system are tested.

The results show that the system can effectively identify and locate actions in weakly labeled sensor data. At an operating frequency of 150 MHz, the calculation speed reaches 25.40 frames/s, a speedup of more than 39.2 times over the ARM processor of the Ultra96-V2 embedded platform. The average power consumption is 2.898 W, which meets the design requirements of low latency and low power consumption.
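
The data fixed-point step mentioned in item (3) can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not state the word width or the number of fractional bits, so `WORD_BITS`, `FRAC_BITS`, and all function names below are hypothetical, and plain Python stands in for the HLS design. The sketch quantizes inputs and weights to signed fixed-point integers, runs a 1-D convolution using integer arithmetic only, and converts the accumulated result back to floating point for comparison.

```python
# Illustrative sketch only. The bit widths are assumptions, not values from the thesis.
WORD_BITS = 16  # hypothetical signed word width
FRAC_BITS = 8   # hypothetical number of fractional bits


def to_fixed(x, frac_bits=FRAC_BITS, word_bits=WORD_BITS):
    """Quantize a float to a signed fixed-point integer, saturating on overflow."""
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, round(x * (1 << frac_bits))))


def to_float(q, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def conv1d_float(signal, kernel):
    """Reference 'valid' 1-D convolution (cross-correlation form) in floating point."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + k] * kernel[k] for k in range(len(kernel)))
            for i in range(n)]


def conv1d_fixed(signal, kernel):
    """Same convolution using integer multiply-accumulate on quantized values."""
    qs = [to_fixed(v) for v in signal]
    qk = [to_fixed(v) for v in kernel]
    n = len(qs) - len(qk) + 1
    out = []
    for i in range(n):
        acc = 0  # wide accumulator; each product carries 2 * FRAC_BITS fractional bits
        for k in range(len(qk)):
            acc += qs[i + k] * qk[k]
        out.append(to_float(acc, 2 * FRAC_BITS))
    return out


if __name__ == "__main__":
    signal = [0.1, 0.2, -0.3, 0.4]
    kernel = [0.3, -0.7, 0.11]
    print("float:", conv1d_float(signal, kernel))
    print("fixed:", conv1d_fixed(signal, kernel))
```

Inputs that are exact multiples of 2^-8 (for example 0.5 or -1.25) pass through this sketch without error, while other values pick up a small quantization error. Keeping the accumulator wider than the operands, as done here, is a common way to avoid losing precision during the multiply-accumulate loop.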
Keywords/Search Tags: human activity recognition, convolutional neural network, attention mechanism, high-level synthesis, FPGA, PYNQ