
Design And Implementation Of Neural Network Computing Framework For ZYNQ SoC Embedded Platform

Posted on: 2022-03-03    Degree: Master    Type: Thesis
Country: China    Candidate: X Y Li    Full Text: PDF
GTID: 2518306491453534    Subject: Computer technology
Abstract/Summary:
With the development of deep learning and the rapid spread of intelligent edge devices, the demand for running deep learning algorithms on mobile embedded devices keeps growing. Research results on neural networks are steadily being applied in production, daily life, and services. Bringing neural network technology to resource-limited embedded platforms has become a major trend in the future development of deep learning. However, in the pursuit of higher accuracy, existing neural network computing frameworks have accumulated increasingly complex operations along with richer features. Their large memory and compute requirements make such frameworks difficult to deploy on embedded platforms with limited resources and computing power, so neural network technology struggles to meet the needs of industrial applications and its development in the embedded field is constrained. Building a high-performance, low-power embedded neural network computing framework is therefore key to promoting the application of neural networks on embedded platforms.

This thesis uses the ZYNQ SoC as the development platform. To build a complete neural network application system, the Linux operating system is first ported to the ZYNQ SoC embedded platform and an image-acquisition input module is designed, enabling flexible development and experimental verification in real application scenarios. Second, because existing neural network computing frameworks are complex and occupy large amounts of storage, this thesis extends and improves the open-source neural network framework Darknet, introducing the depthwise separable convolution algorithm and the lightweight network model MobileNet V2. Finally, to address the heavy computation and resource consumption of the GEMM algorithm in the framework, ARM NEON assembly instructions and OpenBLAS are used to optimize it, yielding an embedded neural network computing framework oriented to the ZYNQ SoC.

The contributions and innovations of this thesis are as follows:
1) To apply the framework to real scenes, this thesis designs and implements the image-acquisition input unit. To overcome the difficulty of bare-metal development on the ZYNQ SoC, an operating-system-oriented development approach is adopted: U-Boot, kernel compilation, and a ramdisk root file system are ported for Linux, establishing a complete system for processing and classifying digital images with a neural network computing framework. Experiments show that images collected by this module can be fed to the neural network and classified and recognized in the improved framework while maintaining the original recognition accuracy.
2) To address the limited resources and low computing power of the embedded platform, this thesis improves the lightweight open-source framework Darknet, introduces the depthwise separable convolution algorithm on top of Darknet, and implements the efficient lightweight network model MobileNet V2 based on it. This greatly reduces the complexity and parameter count of convolution computation, lowering storage and compute requirements as much as possible while maintaining accuracy. Image-classification comparison experiments show that the MobileNet V2 model classifies images well while occupying fewer resources and computing efficiently.
3) To address the large amount of convolution computation and its high resource consumption, and exploiting the dual-core ARM Cortex-A9 of the ZYNQ SoC, this thesis optimizes the core algorithm of convolution computation, the GEMM matrix multiplication: ARM NEON assembly-instruction optimization is applied to data loading, matrix multiplication, and other basic operations, and GEMM is further optimized with the OpenBLAS library. Tests show that both methods effectively improve the efficiency of convolution computation and the real-time performance of neural network computation on the embedded platform.
Image-classification experiments with the improved framework on a ZYNQ SoC development board show greatly improved computing speed, a maintained level of recognition accuracy, and effectively reduced storage overhead and compute requirements, basically meeting practical application requirements.
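The savings from depthwise separable convolution can be made concrete by counting multiply-accumulates: a standard convolution is replaced by a per-channel (depthwise) filter plus a 1×1 (pointwise) channel mix. The sketch below in C uses illustrative helper names (not functions from the thesis's framework) to compare the two costs:

```c
/* Multiply-accumulate counts for one convolution layer on an
 * H x W feature map with Cin input channels, Cout output channels,
 * and a K x K kernel (stride 1, "same" padding assumed).
 *
 * Standard conv:        H*W * Cin * Cout * K*K
 * Depthwise separable:  H*W * Cin * K*K   (depthwise pass)
 *                     + H*W * Cin * Cout  (1x1 pointwise pass)   */
long long conv_macs(int h, int w, int cin, int cout, int k) {
    return (long long)h * w * cin * cout * k * k;
}

long long dwsep_macs(int h, int w, int cin, int cout, int k) {
    return (long long)h * w * cin * k * k
         + (long long)h * w * cin * cout;
}
```

For a 56×56 feature map with 128 input and output channels and a 3×3 kernel, the separable form needs roughly one eighth of the standard convolution's multiply-accumulates, which is the kind of reduction MobileNet V2 relies on.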
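The GEMM kernel that the NEON and OpenBLAS optimizations target is, at its core, a triple loop. The following is a hedged sketch of the unoptimized scalar baseline (the thesis's actual routine may differ in details); the i-k-j loop order streams the rows of B and C sequentially, which is also the access pattern that NEON vectorization and tuned BLAS kernels exploit:

```c
/* C += alpha * A * B for row-major matrices:
 * A is M x K (leading dimension lda), B is K x N (ldb), C is M x N (ldc).
 * Scalar baseline of the GEMM inner kernel; illustrative, not the
 * thesis's optimized implementation. */
void gemm_nn(int M, int N, int K, float alpha,
             const float *A, int lda,
             const float *B, int ldb,
             float *C, int ldc) {
    for (int i = 0; i < M; ++i) {
        for (int k = 0; k < K; ++k) {
            /* Hoist the A element out of the inner loop so the j loop
             * touches only contiguous rows of B and C. */
            float a = alpha * A[i * lda + k];
            for (int j = 0; j < N; ++j)
                C[i * ldc + j] += a * B[k * ldb + j];
        }
    }
}
```

With OpenBLAS, the same computation can instead be delegated to `cblas_sgemm`, which supplies hand-tuned kernels for ARM targets; that substitution is one way to realize the OpenBLAS optimization the thesis describes.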
Keywords/Search Tags:ZYNQ SoC, Neural Network Framework, Convolutional Computing Optimization, ARM NEON, OpenBLAS