Construction And Optimization Of Deep Learning Operator Library Based On Loongson Platform

Posted on: 2022-02-17 | Degree: Master | Type: Thesis
Country: China | Candidate: N Yin | Full Text: PDF
GTID: 2518306542461974 | Subject: Circuits and Systems
Abstract/Summary:
The general-purpose processor is a basic component of the information industry and an important part of China's effort to localize its technology. China's self-developed general-purpose processors, such as Loongson, have matured, but their software ecosystem still needs improvement. Many software stacks, including AI-related machine learning frameworks and deep learning operator libraries, are not yet supported on the Loongson platform. Artificial intelligence technology is developing rapidly, and many companies engaged in AI research and development are emerging, among them Loongson partners such as iFLYTEK Co., Ltd. Loongson, which is at the forefront of the artificial intelligence industry chain, is duty bound to adapt machine learning frameworks and deep learning operator libraries to its platform. The goal is not only to support the framework and operator library on the Loongson platform but also to make them practical to use there, so the core of this thesis has two parts: construction and optimization. The main work is as follows:
(1) A machine learning framework and a deep learning operator library are built on the Loongson platform for the first time, so that an artificial intelligence development framework on the Loongson platform is realized from scratch. The most popular machine learning framework, TensorFlow, and the commonly used deep learning operator library oneDNN are ported to the Loongson platform, and their functions work normally.
(2) The deep learning operator library is optimized by embedding OpenBLAS as a submodule. OpenBLAS, a high-performance open-source implementation of general matrix multiplication (GEMM), is embedded into the deep learning operator library so that the library's original GEMM processing is redirected to the OpenBLAS GEMM interface function. This plays an important role in optimizing the matrix multiplication and convolution operators. After optimization, the performance of the matrix multiplication operator is greatly improved. In addition, this thesis uses SIMD instructions to make a preliminary optimization of the pooling, softmax, and other operators. The original loop processes only one pair of variables per iteration; after SIMD optimization, the loop processes four pairs per iteration, which greatly improves the efficiency of loop processing and the performance of the related functions.
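The step from one pair per iteration to four pairs per iteration can be illustrated with a minimal sketch. It is not the thesis's code: the function names `add_scalar` and `add_simd4` are hypothetical, the operation (elementwise addition) is chosen only for simplicity, and the 128-bit vector type uses the GCC/Clang vector extension as a portable stand-in for Loongson's own SIMD instructions, which the abstract does not name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Four packed floats (128 bits). Assumption: GCC/Clang vector extension,
   used here in place of platform-specific Loongson SIMD intrinsics. */
typedef float v4f __attribute__((vector_size(16)));

/* Baseline loop: one pair of elements per iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    assert(n == 0 || (a && b && out));
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Vectorized loop: four pairs per iteration, then a scalar tail
   for the remaining n % 4 elements. */
void add_simd4(const float *a, const float *b, float *out, size_t n)
{
    assert(n == 0 || (a && b && out));
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        v4f va, vb, vr;
        /* memcpy performs unaligned-safe loads and stores. */
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        vr = va + vb;                 /* four additions in one operation */
        memcpy(out + i, &vr, sizeof vr);
    }
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Both functions produce the same results; the vectorized version simply needs about a quarter as many loop iterations. Any actual speedup depends on the target hardware and compiler.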
Keywords/Search Tags: Loongson Platform, Basic Linear Algebra Subprograms, Deep Learning Operator Library, Machine Learning Framework, SIMD Instruction Set