
Research On Acceleration Technology For Deep Learning Inference Based On Multi-core And Many-core Platforms

Posted on: 2020-06-25  Degree: Master  Type: Thesis
Country: China  Candidate: K Q Zhu  Full Text: PDF
GTID: 2428330611493633  Subject: Engineering
Abstract/Summary:
Deep learning has displaced traditional methods in many fields, such as image object detection and recognition and speech recognition. The essence of intelligence, however, is computation, so high-throughput deep learning applications depend on strong computing power. Developing architecture platforms, optimizing their computing efficiency, and improving their intelligent processing capability are essential to advancing these applications. Multi-core and many-core architectures are effective platforms for high-throughput machine learning inference. Beyond the x86 and GPU platforms, many computing techniques for multi-core and many-core architectures remain to be studied. The domestic Phytium processor can be configured, through hierarchical expansion, as a multi-core design (fewer than 64 cores) or a many-core design (64 cores or more), serving either as a multi-core main processor or as a domain-specific many-core accelerator. The current 64-core Phytium 2000+ high-performance processor is well suited to research on multi-core and many-core intelligent optimization techniques. In addition, energy-efficient multi-core DSP chips based on the VLIW architecture are also good platforms for intelligent computing. This thesis targets high-throughput domestic Phytium processors and low-power multi-core DSP chips, studies their adaptation to intelligent processing and the corresponding algorithm optimization techniques, and explores a technical approach toward a new generation of domestic autonomous intelligent computing systems.

First, this thesis studies inference optimization for framework-free deep learning applications on the Phytium platform. The hardware structure and available parallel resources of the Phytium processor are analyzed comprehensively, and, drawing on the characteristics of inference algorithms, several optimization techniques are proposed. Their effectiveness is evaluated on several typical applications: after optimization, an RNN-LSTM-based application runs 10.2 times faster, and the overall performance of the platform reaches 2.9 times that of a mainstream high-performance x86 platform. The thesis then studies inference optimization for framework-based deep learning applications on the Phytium platform; experiments show that the framework's performance is greatly improved. Finally, targeting current mainstream multi-core DSP chips, the thesis investigates deep learning inference under the VLIW architecture. After a comprehensive analysis of the hardware structure and parallel resources, several application- and platform-level optimization techniques are proposed. Experiments show that the DSP chip's energy efficiency is 7.79 times that of a high-performance x86 chip and 3.56 times that of an embedded ARM chip.
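The thesis does not reproduce its optimization code here, but the core idea behind multi-core inference acceleration — splitting a batch of inputs across cores so each core runs the same model on its own slice — can be sketched in a few lines. The sketch below is illustrative only (the function names and layer are hypothetical, not from the thesis); it uses a single fully connected layer as a stand-in for one inference step and verifies that the data-parallel result matches the serial one:

```python
# Illustrative sketch (not the thesis's code): data-parallel inference on a
# multi-core CPU. A batch of inputs is split into chunks, one per worker
# thread; NumPy releases the GIL inside its BLAS matrix products, so the
# chunks can execute on separate cores.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def dense_relu(x, w):
    # One fully connected layer with ReLU, standing in for a model's
    # per-sample inference step.
    return np.maximum(x @ w, 0.0)

def parallel_infer(batch, w, workers=4):
    # Split the batch along axis 0 and run each chunk in its own thread.
    chunks = np.array_split(batch, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda c: dense_relu(c, w), chunks))
    return np.concatenate(parts)

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
batch = rng.standard_normal((32, 64))

serial = dense_relu(batch, w)
parallel = parallel_infer(batch, w)
assert np.allclose(serial, parallel)  # same result, computed chunk-wise
```

Real deployments on a 64-core processor would additionally pin threads to cores and tune chunk sizes to cache capacity, which is where platform-specific optimization of the kind studied in the thesis comes in.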
Keywords/Search Tags:Phytium processors, Multi-core VLIW, Deep learning, Parallel acceleration, Framework optimization