
The Research And Design Of Sparse Arithmetic Intelligent Processing Array

Posted on: 2024-07-22
Degree: Master
Type: Thesis
Country: China
Candidate: C L Li
Full Text: PDF
GTID: 2568307079454374
Subject: Information and Communication Engineering
Abstract/Summary:
With the rise of AI applications in many fields, AI algorithms represented by neural networks have evolved towards complex structures and large parameter counts, which place greater and more complex performance demands on hardware platforms. To meet the performance and functional needs of hardware platforms in application scenarios, two emerging technologies, Computing-in-Memory (CIM) and sparse computing, have been widely studied. They optimize the computation process from two perspectives: highly parallel computing and the skipping of invalid operations. However, existing solutions suffer from low generality and poor speedup, and there are gaps in the research on combining the two techniques. To address these problems, this thesis first analyzes and compares the advantages and disadvantages of CIM and sparse computing in terms of power consumption and computing power, then designs optimization strategies for existing sparse computing and for the data update problem in CIM operation, and finally proposes a hybrid computing architecture based on task allocation between large and small cores. The main innovations are as follows.
(1) A bit-level sparse algorithm is designed that exploits the richer sparsity available at the bit level to improve the speedup of sparse operations. Based on this algorithm, sparse hardware supporting fp32 and specified bit widths of 0 to 23 bits is designed to compensate for the limited accuracy of CIM macros.
(2) A CIM scheme based on ping-pong CIM macros is designed, together with the corresponding data mapping and data flow strategies. The ping-pong operation hides the CIM data update time, and the data flow scheme improves the time and hardware utilization of computing resources by reconfiguring and utilizing the operation results of multiple CIM macros.
(3) A task allocation scheme for large and small cores is designed. It schedules computation tasks to CIM performance cores (large cores) or sparse computing efficiency cores (small cores) according to their parallelism, turning off unsuitable computation resources to achieve higher computation efficiency.
Finally, to validate the effectiveness of this work, the target architecture was implemented as a circuit and functional simulations were performed in Vivado. Two sets of hardware ablation experiments were designed to measure the change in performance when the sparse computation and ping-pong CIM strategies are enabled and disabled. Experimental results show that the designed sparse computing scheme achieves a performance improvement of 25% to 65% under different neural network sparsity tests, and the ping-pong CIM strategy achieves a computational efficiency increase of 10% to 30% across different neural network layers. The proposed hardware is then evaluated on VGG-16 in a resource and power simulation at the TSMC 28 nm process, with a peak computing power of 819.2 GOPS, an area of 2.1631 mm², and an average power consumption of 91.6133 mW, achieving a 2.6× energy efficiency improvement over similar CIM designs under normalized comparison.
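The abstract does not spell out the bit-level sparse algorithm, so the following is only a minimal sketch of the general idea, assuming a bit-serial shift-and-add multiplier on integer weights. The function names (`bit_sparse_multiply`, `sparse_dot`, `count_steps`) and the 8-bit width are hypothetical and are not taken from the thesis. The sketch shows why bit-level sparsity is "richer" than value-level sparsity: a value-level scheme can skip only weights that are exactly zero, whereas a bit-level scheme can skip every zero bit inside a nonzero weight.

```python
from typing import List, Tuple


def bit_sparse_multiply(activation: int, weight: int) -> Tuple[int, int]:
    """Multiply by shift-and-add, doing work only for nonzero weight bits.

    Returns (product, number of shift-add steps performed).
    Illustrative model only; not the thesis's hardware algorithm.
    """
    sign = -1 if weight < 0 else 1
    magnitude = abs(weight)
    product = 0
    steps = 0
    bit_position = 0
    while magnitude:
        if magnitude & 1:  # a zero bit contributes nothing and is skipped
            product += activation << bit_position
            steps += 1
        magnitude >>= 1
        bit_position += 1
    return sign * product, steps


def sparse_dot(activations: List[int], weights: List[int]) -> Tuple[int, int]:
    """Dot product built from bit_sparse_multiply; returns (result, total steps)."""
    total = 0
    total_steps = 0
    for a, w in zip(activations, weights):
        product, steps = bit_sparse_multiply(a, w)
        total += product
        total_steps += steps
    return total, total_steps


def count_steps(weights: List[int], bit_width: int = 8) -> Tuple[int, int, int]:
    """Shift-add steps needed under three hypothetical schemes.

    dense:        every bit of every weight is processed
    value_sparse: zero-valued weights are skipped entirely
    bit_sparse:   only the set bits of each weight magnitude are processed
    """
    dense = len(weights) * bit_width
    value_sparse = sum(bit_width for w in weights if w != 0)
    bit_sparse = sum(bin(abs(w)).count("1") for w in weights)
    return dense, value_sparse, bit_sparse


if __name__ == "__main__":
    weights = [0, 1, 8, 127, -16]
    activations = [3, -2, 5, 1, 4]
    print(sparse_dot(activations, weights))
    print(count_steps(weights, bit_width=8))
```

For the example weights, the dense scheme needs 40 steps, value-level skipping needs 32, and bit-level skipping needs 10. These counts are properties of this toy model and say nothing about the speedups reported in the thesis.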
Keywords/Search Tags:Neural Networks, Computing in Memory, Sparse Computing, Circuit Design