With the continuous development of artificial intelligence theory and technology, deep neural network models have made important progress in image processing, object detection, speech recognition, semantic segmentation, and other fields, and have been widely applied in real life. To date, most deep learning methods and deep neural network models are computed on GPUs, whose high parallelism can handle the complex computation involved in deep learning. However, their high power consumption makes it difficult to apply deep learning in embedded, small, and mobile devices with limited size and power budgets. In recent years, with advances in electronic component technology, the storage and computing capacity of the field-programmable gate array (FPGA) has improved rapidly, and FPGAs are now able to carry out the computations required by deep neural networks. Compared with GPUs, FPGAs offer lower power consumption and higher energy efficiency. Consequently, research combining deep learning with FPGA-based heterogeneous computing is gradually being carried out in practical applications and has become an important direction of technical development.

This thesis first surveys the foundations of deep neural networks and related model-compression techniques, studies the development of heterogeneous platforms and the advantages of different devices, and examines the application and development of deep neural network models on heterogeneous platforms.

Then, to address the shortcomings of traditional model-compression methods, a new weight-based pruning method is proposed. The method computes the importance of each layer and assigns each layer a different pruning rate, thereby achieving weighted pruning. The pruning itself uses a geometric-median criterion, which avoids dependence on the magnitudes of the weight parameters during pruning. Experiments show that the new method transfers well across different models.

Finally, acceleration on an FPGA heterogeneous platform is studied. According to the structure and computational characteristics of deep neural network models, the large volume of convolution computation is offloaded to the FPGA, and the overall architecture and computation flow are designed. A new deep acceleration optimization method is proposed that makes full use of hardware resources through storage optimization, dataflow optimization, and deep-computation optimization, and the effectiveness of this optimization scheme is verified experimentally. The results show that the proposed weighted pruning and deep acceleration methods give deep neural networks an important advantage in power- and size-constrained scenarios and are of great significance for the deployment and application of deep neural networks in practical engineering.
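The geometric-median pruning criterion mentioned above can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the thesis's exact procedure: each convolutional filter is flattened to a vector, the geometric median is approximated by the filter with the smallest total distance to all others, and the filters nearest that median (the most "replaceable" ones) are pruned. The function name and the per-layer pruning rate are hypothetical.

```python
import numpy as np

def prune_by_geometric_median(filters: np.ndarray, prune_rate: float) -> np.ndarray:
    """Return indices of filters to KEEP in one layer.

    filters: array of shape (n_filters, k), one flattened filter per row.
    Filters closest to the (approximate) geometric median are pruned,
    so the criterion does not depend on weight magnitude alone.
    """
    n = filters.shape[0]
    # Pairwise Euclidean distances between filters, shape (n, n).
    dists = np.linalg.norm(filters[:, None, :] - filters[None, :, :], axis=-1)
    # A filter's total distance to all others; the minimizer approximates
    # the geometric median of the filter set.
    total_dist = dists.sum(axis=1)
    n_prune = int(round(n * prune_rate))
    # Prune the filters with the smallest total distance (nearest the median).
    order = np.argsort(total_dist)
    pruned = set(order[:n_prune].tolist())
    return np.array([i for i in range(n) if i not in pruned])

rng = np.random.default_rng(0)
layer_filters = rng.normal(size=(8, 27))      # e.g. 8 filters of a 3x3x3 conv layer
kept = prune_by_geometric_median(layer_filters, prune_rate=0.25)
print(len(kept))  # 6 filters survive a 25% pruning rate
```

In the weighted scheme described above, `prune_rate` would differ per layer according to the computed layer importance.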
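The convolution offloading described above typically relies on tiling the output feature map so that each tile's input window and weights fit in on-chip buffers. The following software sketch shows only that loop structure; the tile size, single-channel shapes, and function name are illustrative assumptions rather than the thesis's actual hardware design, and on an FPGA the inner loops would be pipelined or unrolled with the tile's inputs staged into block RAM.

```python
import numpy as np

def tiled_conv2d(x: np.ndarray, w: np.ndarray, tile: int = 3) -> np.ndarray:
    """2D valid convolution (no kernel flip) with output-tile loop ordering.

    Outer loops walk output tiles; inner loops compute one tile, mirroring
    how an accelerator would process one on-chip buffer's worth at a time.
    """
    k = w.shape[0]
    oh = x.shape[0] - k + 1
    ow = x.shape[1] - k + 1
    out = np.zeros((oh, ow), dtype=x.dtype)
    for ti in range(0, oh, tile):          # walk output tiles
        for tj in range(0, ow, tile):
            for i in range(ti, min(ti + tile, oh)):   # one tile's outputs
                for j in range(tj, min(tj + tile, ow)):
                    out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

x = np.ones((8, 8))
w = np.ones((3, 3))
y = tiled_conv2d(x, w)
print(y.shape, y[0, 0])  # (6, 6) 9.0 -- all-ones 3x3 kernel over all-ones input
```

The storage and dataflow optimizations mentioned above would then govern how each tile's inputs and partial sums move between off-chip memory and these on-chip buffers.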