
Design Optimization And Physical Implementation Of AI Processor Based On Multi Weight And Multi Thread Execution Model

Posted on: 2022-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: W X Chen
Full Text: PDF
GTID: 2518306605469944
Subject: Master of Engineering
Abstract/Summary:
In the 21st century, continued advances in science and technology have steadily lowered the cost of intelligent chips. More people can afford intelligent devices, the former shortage of raw data no longer exists, and new application scenarios have emerged as a result. These scenarios demand higher neural network inference accuracy, so the network models running on the chip become more and more complex; at root, the number of feature parameters and the amount of computation grow exponentially. Against the big-data background of the Internet of Things, however, existing artificial intelligence chips are difficult to deploy in real-time scenarios such as smart homes and vehicles. These practical scenarios expose new bottlenecks in chip development, mainly low energy efficiency and low resource utilization. To address them, academia and industry have proposed a wide variety of ASICs in recent years. Existing architectures, however, raise computing power by increasing the operating frequency and enlarging the compute and storage unit arrays, and they suffer from low compute-unit utilization, high implementation cost, limited communication bandwidth, poor scalability, and high power consumption.

To solve these problems, the research team proposed an intelligent computing architecture based on the multi-weight, multi-thread (MWMT) execution model in preliminary work. This thesis optimizes the customized ASIC chip for the MWMT execution model at several levels.

First, based on the repeated scheduling of weights and data in convolutional neural networks, we propose a guided-subordinate cooperative array structure. It focuses on improving the time-shared transfer and multiplexing of control signals during computation, and it also improves the reuse of data and weights, which reduces data scheduling. With the chip's functionality unchanged, the area is reduced by 7%, which lowers the cost of the chip.

Second, we study data scheduling between the computing kernel and external storage devices and find that, because of the limitations of the dedicated interface protocol, the control of data interaction is redundant and complex. After investigation, we designed a DMA module for the customized chip in this thesis that lets the kernel computing unit access external DDR storage directly, and we replaced Xilinx's dedicated interface protocol with the AXI protocol, which is more widely used in industry. This improves the efficiency of data interaction and reduces the overall system delay by 8%.

Finally, we analyze the data scheduling requirements of convolution, pooling, and fully connected layers in different neural networks, together with their compatibility with the bandwidth of the external DDR storage interface protocol. Building on the batch processing design, we propose a multi-mode inference design. For the ASIC chip in this thesis, combined with reorganized batch processing, bandwidth utilization increases from 18.75% to 93.75%, which further improves system efficiency.

The work starts from the team's earlier MWMT execution model ASIC chip and covers data flow analysis, the study of neural network structures, bus bandwidth utilization, RTL implementation of the optimized structures, back-end synthesis optimization, and FPGA verification. Its purpose is to alleviate the bottlenecks of dedicated AI chips, such as limited bandwidth, high cost, high power consumption, and poor scalability. The thesis proposes the guided-subordinate cooperative computing array architecture, analyzes neural network structures, and proposes the multi-mode design, which improves system efficiency and offers a reference for future work on algorithms and architectures in the field of artificial intelligence.
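The reported bandwidth figures can be read as a simple packing ratio: 18.75% is 3/16 and 93.75% is 15/16, exactly five times larger. The sketch below illustrates that arithmetic only. The 128-bit bus width, the 24-bit payload, and the function names are hypothetical values chosen to reproduce the two ratios; the abstract does not state the actual widths or how the batch is packed.

```python
from fractions import Fraction

def bus_utilization(payload_bits, bus_bits, batch=1):
    """Fraction of one bus word occupied by `batch` payloads packed together."""
    used = payload_bits * batch
    if used > bus_bits:
        raise ValueError("batch does not fit in one bus word")
    return Fraction(used, bus_bits)

def max_batch(payload_bits, bus_bits):
    """Largest number of payloads that fit in one bus word."""
    return bus_bits // payload_bits

# Hypothetical widths (assumptions, not taken from the thesis).
BUS_BITS = 128
PAYLOAD_BITS = 24

single = bus_utilization(PAYLOAD_BITS, BUS_BITS)
packed = bus_utilization(PAYLOAD_BITS, BUS_BITS, max_batch(PAYLOAD_BITS, BUS_BITS))

print(f"one payload per word: {float(single) * 100:.2f}%")
print(f"packed batch per word: {float(packed) * 100:.2f}%")
```

Under these assumed widths, one payload per bus word uses 3/16 of the word, and packing five payloads uses 15/16, which matches the reported improvement in ratio terms.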
Keywords/Search Tags: MWMT, DMA, AXI, Multimodal Inference