
Matrix Vector Multiplication Component Design For Deep Learning

Posted on: 2020-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: C Liu
Full Text: PDF
GTID: 2428330602951382
Subject: Engineering
Abstract/Summary:
Deep learning algorithms based on neural networks have been one of the fastest-growing fields of computing in recent years. From image, video, and audio recognition to machine translation, business analytics, and autonomous driving, many emerging high-performance and embedded applications rely on deep learning. Deep learning models involve an enormous amount of computation, yet the algorithms currently run mostly on general-purpose CPUs and GPUs, whose throughput and energy efficiency are comparatively low. In recent years, academia and industry have therefore proposed a variety of novel processor architectures for deep learning. Almost all of these accelerators are built around large-scale matrix multiply-add units, and their storage hierarchies and interconnect structures are designed around the computational and data-transfer characteristics of deep neural networks, yielding large improvements in both throughput and power consumption.

The matrix-vector multiplication component studied in this thesis is the core functional unit of a deep learning accelerator: it occupies most of the accelerator's area and is the main contributor to its throughput. The design and optimization of this component therefore play a key role in implementing a deep learning accelerator. The specific research work is as follows.

Logic design of the matrix-vector multiplication component. The instruction decoding is defined according to the top-level requirements; the design supports three matrix-vector operation instructions. The multiplier and the adder array are the main parts of the matrix-vector multiplication, and an appropriate implementation is chosen for each according to the characteristics of the underlying algorithms. The multiplier consists of radix-4 Booth encoding, a compression tree, and a parallel prefix adder; the structures of the compression tree and the parallel prefix adder are optimized to improve the multiplier's performance. The adder array combines a 4-2 compression tree with a parallel prefix adder to sum 32 16-bit signed numbers. (Illustrative behavioral sketches of these structures are given after the abstract.)

Pipeline design, functional verification, and logic synthesis of the matrix-vector multiplication component. Because the latency of the matrix-vector multiplication is large, it is pipelined: after careful logic partitioning, the operation is divided into five pipeline stages with well-balanced logic delay per stage. The completed design is verified by implementing the multiplier and the 32-operand 16-bit signed summation with an independently written algorithm, feeding the same randomly generated input data to both implementations, and comparing the results. The synthesized netlist is then used for the subsequent physical implementation.

Physical implementation of the matrix-vector multiplication component. A hierarchical physical design approach is adopted. Based on a structural analysis of the component, a reasonable sub-module partitioning and reasonable hardened sub-module sizes are determined. At the top level, the macro placement is optimized and buffers are inserted to reduce long-wire delay. With these methods the placement and routing of the component are completed, followed by timing analysis and physical verification.
The resulting deep learning matrix-vector component has an area of 1300 μm × 3600 μm, an operating frequency of 1.1 GHz, and a power consumption of 1.3 W.
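
The multiplier described above recodes its operand with radix-4 Booth encoding before the partial products are reduced. As an illustration only (the thesis works at the gate level in hardware), the following Python sketch models the recoding of a 16-bit two's complement operand into digits in {-2, -1, 0, 1, 2} and checks the recoded product against ordinary multiplication on random inputs, in the spirit of the random-input comparison used for functional verification; the function names are ours, not the thesis's.

    import random

    def booth_radix4_digits(b, nbits=16):
        # Recode an nbits two's complement multiplier into radix-4 Booth
        # digits in {-2, -1, 0, 1, 2}, one digit per pair of bits.
        bits = b & ((1 << nbits) - 1)
        digits, prev = [], 0                     # prev is bit b[-1] = 0
        for i in range(0, nbits, 2):
            b0 = (bits >> i) & 1
            b1 = (bits >> (i + 1)) & 1
            digits.append(prev + b0 - 2 * b1)    # b[2i-1] + b[2i] - 2*b[2i+1]
            prev = b1
        return digits

    def booth_multiply(a, b, nbits=16):
        # Sum the Booth partial products; this plain sum stands in for the
        # compression tree and final parallel prefix adder of the real design.
        return sum(d * a * (4 ** i)
                   for i, d in enumerate(booth_radix4_digits(b, nbits)))

    # Random-input comparison against an independent reference (here, Python's
    # own multiplication), mirroring the verification flow described above.
    for _ in range(10000):
        a = random.randint(-2**15, 2**15 - 1)
        b = random.randint(-2**15, 2**15 - 1)
        assert booth_multiply(a, b) == a * b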
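
The adder array sums 32 signed 16-bit operands with a 4-2 compression tree followed by a parallel prefix adder. The sketch below is a word-level behavioral model under assumptions not stated in the abstract: a 21-bit result width (wide enough to hold any sum of 32 signed 16-bit values) and a 4-2 compressor built from two chained 3:2 carry-save stages; the final ordinary addition stands in for the parallel prefix adder.

    import random

    def csa(a, b, c):
        # 3:2 carry-save stage: a + b + c == s + k (k is the shifted carry vector).
        return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1

    def compress_4_2(a, b, c, d):
        # 4-2 compressor modeled as two chained 3:2 stages: a + b + c + d == s + k.
        s1, k1 = csa(a, b, c)
        return csa(s1, k1, d)

    def sum32_signed16(values, width=21):
        # Reduce the operands to two carry-save vectors with 4-2 compressors,
        # then perform one carry-propagate addition (the parallel prefix adder's
        # role in hardware). Assumes the operand count is a power of two (32 here).
        mask = (1 << width) - 1
        ops = [v & mask for v in values]          # sign-extend into `width` bits
        while len(ops) > 2:
            nxt = []
            for i in range(0, len(ops), 4):
                s, k = compress_4_2(*ops[i:i + 4])
                nxt += [s & mask, k & mask]
            ops = nxt
        total = sum(ops) & mask
        # Reinterpret the two's complement result as a signed value.
        return total - (1 << width) if total >> (width - 1) else total

    vals = [random.randint(-2**15, 2**15 - 1) for _ in range(32)]
    assert sum32_signed16(vals) == sum(vals)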
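
The abstract optimizes the structure of the parallel prefix adder but does not say which prefix network is chosen. As one possible illustration, the following word-level Python model implements a Kogge-Stone-style prefix carry computation at an assumed 32-bit width (our choices, not necessarily the thesis's), again checked against ordinary addition on random inputs.

    import random

    def prefix_add(a, b, width=32):
        # Word-level model of a parallel prefix adder (Kogge-Stone-style network):
        # compute all carries in log2(width) combine steps, then XOR to get the sum.
        mask = (1 << width) - 1
        a &= mask
        b &= mask
        g = a & b                    # bitwise generate
        p = a ^ b                    # bitwise propagate
        dist = 1
        while dist < width:
            # Combine each prefix with the prefix `dist` positions below it.
            g |= p & (g << dist)
            p &= p << dist
            dist <<= 1
        carries = (g << 1) & mask    # carry into each bit position (carry-in = 0)
        return ((a ^ b) ^ carries) & mask

    for _ in range(10000):
        x = random.randint(0, 2**32 - 1)
        y = random.randint(0, 2**32 - 1)
        assert prefix_add(x, y) == (x + y) & 0xFFFFFFFF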
Keywords/Search Tags: Multiplier, Adder, Radix-4 Booth Encoding, Parallel Prefix Adder, Matrix-Vector Multiplication, Deep Learning