
Deep Learning Accelerator Design And Implementation Based On FPGA

Posted on: 2017-01-11  Degree: Master  Type: Thesis
Country: China  Candidate: Q Yu  Full Text: PDF
GTID: 2308330485951832  Subject: Computer system architecture
Abstract/Summary:
In recent years, with advances in computing power and scientific theory, machine learning has begun to enter public life, and the benefits of machine learning applications have gradually been accepted. From Taobao's item recommendation system to driverless cars and the human-machine Go matches of AlphaGo, machine learning has shown the remarkable power of science and technology and has improved daily life. Deep Learning, an emerging branch of machine learning, grew out of further study of artificial neural networks and is an organic combination of biological science and computer science. It shows excellent ability in solving complex learning problems and is receiving significant attention from industry.

However, to solve more abstract and more complicated learning problems, the networks have grown increasingly large in scale; for example, the Google "cat" system has about one billion neural connections. High-performance implementations of deep learning networks have therefore become a research hotspot. As a common means of accelerating algorithms, the FPGA (Field-Programmable Gate Array) offers high performance, low power consumption, programmability, and small size. In this paper, we use an FPGA to design a pipelined accelerator for the common computations of Deep Learning. The main work includes:

1) Analyzing the prediction and training processes of Deep Neural Networks and Convolutional Neural Networks, and extracting the common computational primitives and characteristics used to design the accelerator. The algorithms covered include the feedforward algorithm, the local pre-training algorithm, and the global training algorithm (an illustrative sketch of the feedforward primitive is given after this list).

2) Designing the processing element under the constraints of on-chip resources and memory bandwidth, including a feedforward module and a weight-update module. The module is configurable for different neural network sizes and is fully pipelined for high throughput.

3) Analyzing the overall architecture and data access paths of the FPGA-based accelerator, and implementing the Linux hardware driver and the user-level application programming interface (a hypothetical usage sketch is also given after this list).

4) Summarizing the factors that affect the performance and energy consumption of the FPGA accelerator through a large number of comparative experiments. Several datasets are used to compare performance, power, and energy consumption against CPU and GPU implementations, and the advantages and disadvantages of the FPGA accelerator are analyzed.
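As a point of reference for item 1, the feedforward primitive of a fully connected layer reduces to a matrix-vector multiply followed by a nonlinear activation; this is the kind of computation the accelerator's feedforward module pipelines in hardware. The C sketch below is illustrative only and is not taken from the thesis: the function names and the choice of a sigmoid activation are assumptions.

#include <math.h>
#include <stddef.h>

/* Illustrative feedforward primitive for one fully connected layer:
 * y = sigmoid(W*x + b).  Function names and the sigmoid activation are
 * assumptions; the abstract does not fix a particular activation. */
static float sigmoid(float v)
{
    return 1.0f / (1.0f + expf(-v));
}

void fc_layer_forward(const float *W,  /* weights, n_out x n_in, row-major */
                      const float *b,  /* biases, n_out */
                      const float *x,  /* input activations, n_in */
                      float *y,        /* output activations, n_out */
                      size_t n_in, size_t n_out)
{
    for (size_t i = 0; i < n_out; ++i) {
        float acc = b[i];
        for (size_t j = 0; j < n_in; ++j)
            acc += W[i * n_in + j] * x[j];  /* multiply-accumulate chain */
        y[i] = sigmoid(acc);                /* nonlinear activation */
    }
}

In a hardware implementation, the inner multiply-accumulate loop is the part that is typically unrolled and pipelined across DSP slices, while the activation is commonly approximated by a lookup table or a piecewise-linear unit.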
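For item 3, the thesis states that a Linux hardware driver and a user-level API are provided, but the abstract does not specify the interface. The following C fragment is a purely hypothetical sketch of how a user-space program might drive such an accelerator through a character device using standard POSIX calls; the device path /dev/dla0, the ioctl codes, and the configuration struct are invented for illustration.

#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/ioctl.h>

/* Hypothetical user-space view of the accelerator driver.  The device
 * node, ioctl numbers, and configuration struct are assumptions; the
 * abstract does not publish the actual driver API. */
struct dla_layer_cfg {
    unsigned int n_in;   /* number of input neurons  */
    unsigned int n_out;  /* number of output neurons */
};

#define DLA_IOC_CONFIG  _IOW('d', 1, struct dla_layer_cfg)
#define DLA_IOC_RUN     _IO('d', 2)

int run_layer(const struct dla_layer_cfg *cfg,
              const float *weights, size_t wbytes,
              const float *input, size_t ibytes,
              float *output, size_t obytes)
{
    int fd = open("/dev/dla0", O_RDWR);   /* hypothetical device node */
    if (fd < 0)
        return -1;
    ioctl(fd, DLA_IOC_CONFIG, cfg);       /* set layer dimensions */
    write(fd, weights, wbytes);           /* stream weights to the FPGA */
    write(fd, input, ibytes);             /* stream input activations */
    ioctl(fd, DLA_IOC_RUN);               /* start the pipeline */
    read(fd, output, obytes);             /* read back the results */
    close(fd);
    return 0;
}

Error handling is omitted for brevity; a real driver interface would also have to manage DMA buffers and synchronization between the host and the FPGA fabric.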
Keywords/Search Tags: Deep Learning, artificial neural network, FPGA, prediction, training, accelerator, low power