
VLSI Architecture Design And System Implementation Of Convolutional Neural Network Accelerator

Posted on: 2024-08-10    Degree: Master    Type: Thesis
Country: China    Candidate: M Zou    Full Text: PDF
GTID: 2568307106995969    Subject: Electronic information
Abstract/Summary:
Convolutional Neural Networks (CNNs) are abstract, biomimetic techniques modeled on the nervous system of the animal brain. They achieve unprecedented accuracy and efficiency in tasks such as object recognition, detection and scene understanding. As ever higher processing speed and accuracy are pursued, CNN hierarchies become more complex and their computational cost grows, placing increasing demands on the hardware platforms on which the networks run. A series of hardware acceleration platforms for CNNs has been created to address the high computational load, high bandwidth requirements and high power consumption that neural networks impose on server hardware platforms in current end-use environments. Among these platforms, the Application Specific Integrated Circuit (ASIC) allows the most efficient deployment of a hardware architecture: power consumption can be minimized and performance per watt further improved. On this basis, this thesis investigates a Convolutional Neural Network Accelerator (CNNA) built on a Very Large Scale Integration (VLSI) architecture. The main focus of the research is optimizing the circuit structure of the CNNA to control circuit area and improve performance while effectively reducing total power consumption. The main research elements and results of this thesis are as follows:

(1) Design and implementation of a hardware accelerator for CNNs with a VLSI architecture. By analyzing the inference computation process, the input/output characteristics and the parallel-processing feasibility of each CNN layer, a hardware accelerator for CNNs based on the VLSI architecture is proposed. The accelerator supports regular convolution, max pooling, fully connected and non-linear activation computations (a software reference model of these computations is sketched below). In addition, it adds data cache space and a dedicated 3-Dimensional Direct Memory Access (3D-DMA) channel to improve the efficiency of data transfers between the accelerator and the outside world and to reduce the power consumed by data access.

(2) Proposal and implementation of an efficient data access method. The approach uses sliding-window fetching to achieve efficient storage and data access: it reduces repeated data accesses during convolution and pooling computations, improves data access efficiency, and lowers the power consumed by the computing unit during data access (see the read-count sketch below).

(3) Construction of a dedicated verification platform for the CNNA based on the Universal Verification Methodology (UVM). The platform verifies the functionality of the CNNA system and collects coverage metrics. The results show that the CNNA meets all functional requirements, with 100% functional coverage of each module and code coverage above 99.7%. The total area is 458909.45 µm² and the total power consumption is 59.86 mW at an operating clock frequency of 400 MHz and a supply voltage of 1 V, with no timing violations in logic synthesis. The accelerator designed in this thesis takes 1.68 s to process 300 RGB images of 1080×720 resolution, which is 4.27% of the time taken by the pure on-board ARM Cortex-A53. Its total power consumption is 4.23% of that of the GeForce RTX 2080 and 66.03% of that of the ZCU104.
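The layer types listed in contribution (1) can be summarized with a small software reference model. The sketch below is a minimal NumPy model of the four supported computations (regular convolution, max pooling, fully connected, non-linear activation); it is not the RTL implementation described in the thesis, and the choice of ReLU as the activation, the layer shapes and the stride handling are assumptions made here for illustration only.

import numpy as np

def conv2d(x, w, b, stride=1):
    # Regular convolution. x: (C_in, H, W), w: (C_out, C_in, K, K), b: (C_out,)
    c_out, c_in, k, _ = w.shape
    h_out = (x.shape[1] - k) // stride + 1
    w_out = (x.shape[2] - k) // stride + 1
    y = np.zeros((c_out, h_out, w_out))
    for co in range(c_out):
        for i in range(h_out):
            for j in range(w_out):
                patch = x[:, i*stride:i*stride+k, j*stride:j*stride+k]
                y[co, i, j] = np.sum(patch * w[co]) + b[co]
    return y

def max_pool(x, k=2):
    # Max pooling over non-overlapping k x k windows.
    c, h, w = x.shape
    y = np.zeros((c, h // k, w // k))
    for i in range(h // k):
        for j in range(w // k):
            y[:, i, j] = x[:, i*k:(i+1)*k, j*k:(j+1)*k].max(axis=(1, 2))
    return y

def relu(x):
    # Non-linear activation (ReLU assumed for illustration).
    return np.maximum(x, 0.0)

def fully_connected(x, w, b):
    # Fully connected layer on the flattened feature map.
    return w @ x.reshape(-1) + b

# Tiny end-to-end example with random data (shapes chosen arbitrarily).
x = np.random.randn(3, 8, 8)
y = relu(conv2d(x, np.random.randn(4, 3, 3, 3), np.zeros(4)))   # (4, 6, 6)
y = max_pool(y)                                                  # (4, 3, 3)
logits = fully_connected(y, np.random.randn(10, 4 * 3 * 3), np.zeros(10))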
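The sliding-window fetching idea in contribution (2) can be illustrated by counting external-memory reads. The sketch below assumes a line-buffer style organization in which each input pixel is fetched from external memory once and then reused by every window that overlaps it; this buffering scheme and the read-count model are assumptions, since the abstract does not specify the accelerator's internal organization.

def external_reads_naive(h, w, k, stride=1):
    # Every output pixel re-fetches its full K x K window from external memory.
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    return h_out * w_out * k * k

def external_reads_sliding_window(h, w, k, stride=1):
    # K rows are held in on-chip buffers; each input pixel is fetched from
    # external memory only once and reused by every overlapping window.
    return h * w

# For a 224 x 224 input channel and 3 x 3 windows, the sliding-window scheme
# cuts external reads by roughly a factor of k * k.
print(external_reads_naive(224, 224, 3))            # 443556
print(external_reads_sliding_window(224, 224, 3))   # 50176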
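Contribution (1) also mentions a dedicated 3D-DMA channel. The sketch below shows one common way such a channel can be programmed: a three-level descriptor of counts and strides that walks a rectangular tile of a feature map in external memory. The descriptor fields, field names and the example geometry are hypothetical; the thesis's actual register layout is not given in the abstract.

from dataclasses import dataclass

@dataclass
class Dma3dDescriptor:
    base: int       # start address in external memory
    counts: tuple   # (planes, rows, elements) to transfer
    strides: tuple  # byte strides (plane_stride, row_stride, elem_stride)

def addresses(d):
    # Generate the external-memory addresses touched by one 3D transfer.
    for p in range(d.counts[0]):
        for r in range(d.counts[1]):
            for e in range(d.counts[2]):
                yield d.base + p * d.strides[0] + r * d.strides[1] + e * d.strides[2]

# Example: copy a 3-channel 4 x 4 tile out of a channel-major 16 x 1080 x 720
# feature map with 2-byte elements (all numbers are assumptions).
tile = Dma3dDescriptor(
    base=0x8000_0000,
    counts=(3, 4, 4),
    strides=(1080 * 720 * 2, 720 * 2, 2),
)
addrs = list(addresses(tile))   # 3 * 4 * 4 = 48 addresses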
Keywords/Search Tags: Convolutional Neural Network, Hardware Acceleration, VLSI, Efficient Data Access