Research Of Compilation Optimization Technology Based On CNN Algorithm Behavior Analysis For Reconfigurable Architecture

Posted on: 2022-10-16 | Degree: Master | Type: Thesis
Country: China | Candidate: J Q Shi
GTID: 2518306554450454 | Subject: Computer application technology
Abstract/Summary:
The rapid development of artificial intelligence technology has played a significant role in promoting the transformation of China's economy toward high-end manufacturing and high-tech fields. However, typical artificial intelligence algorithms such as convolutional neural networks (CNN) are increasingly compute-intensive and memory-intensive, which poses great challenges to the processing capacity of computing chips. Reconfigurable architecture combines the flexibility of a general-purpose processor with the energy efficiency of dedicated hardware, making it an effective means of handling such intensive applications. However, reconfigurable processors still face the "programming wall" problem, which includes programming difficulty and low resource utilization. Therefore, through behavior analysis of the CNN algorithm on reconfigurable architecture, this thesis proposes a domain-specific, efficient compilation optimization method for reconfigurable architecture.

(1) To mine the parallel information in the CNN algorithm, the behavior pattern of the CNN algorithm on a reconfigurable structure was studied and analyzed in depth. First, the LLVM (Low Level Virtual Machine) compiler was used to analyze the structural information of the CNN algorithm. Then, based on this structural information, a polyhedral model was constructed to describe the software features of the CNN algorithm. Finally, combined with the hardware characteristics of the reconfigurable architecture, the feature vectors of the CNN algorithm under the reconfigurable architecture were constructed. The experimental results show that, on 4 and 16 processing elements (PEs), the CNN algorithm based on behavior analysis achieves 99.31% and 99.61% of the speedup of traditional partitioning, respectively, so behavior pattern analysis of the CNN algorithm can effectively mine its parallel information.

(2) Because existing approaches rely on expert experience and on a partition strategy tailored to a single problem, a thread division method based on parallel knowledge is proposed. First, the feature vectors of the CNN algorithm were used to construct a sample set from the best-performing programs optimized by experts. Then, support vector machines were used to learn the parallel knowledge, including whether a program can be parallelized, the number of PEs that the partition supports, and the mapping range of the PEs under the maximum partition number. Finally, the parallel knowledge was used to guide the parallel division of the CNN algorithm on the reconfigurable structure. The experimental results show that, on 4 and 16 PEs, the average speedup of the CNN algorithm is 1.27 and 4.65, respectively, so the thread division method based on parallel knowledge can divide the CNN algorithm in parallel.

(3) Thread partitioning based on parallel knowledge does not fully consider data locality, which leads to low parallel efficiency. To address this, a thread combination optimization method based on fuzzy clustering is proposed. First, the algorithm mapping scheme for the reconfigurable architecture was used to construct the performance evaluation function for combinatorial optimization. Then, the K-means clustering method was applied, with the performance evaluation function as the clustering condition, to cluster the existing threads according to the hardware resources. Finally, the data dependences between the clustered threads were analyzed and the mapping range of the threads was adjusted to reduce the number of data movements between PEs. The experimental results show that, compared with the thread partition method, the thread combination optimization method increases the speedup by 35.25% and 35.62% on 4 and 16 PEs, respectively. The thread combination optimization method based on fuzzy clustering can therefore improve the parallel efficiency of the CNN algorithm.

(4) To address the programming difficulty of reconfigurable architecture processors, an automatic compilation method was designed and implemented. First, the PEs were divided into four roles: data extraction, data distribution, data summary and data processing. Data extraction, distribution and summary were completed by specific PEs, while data processing was realized through the mapping rules of the assembly instruction groups for each computation pattern. Then, the computation process of the CNN algorithm was analyzed and its computation patterns were extracted. Finally, according to the computation patterns, an efficient assembly instruction set was designed to realize automatic compilation. The experimental results show that the automatic compilation method achieves 52.32% of the execution time of the CNN algorithm under manual partitioning. The automatic compilation method converts high-level language into assembly instructions and reduces the programming difficulty of the CNN algorithm on the reconfigurable architecture.

The proposed compilation optimization method was implemented in the LLVM 9.0 compiler and on a reconfigurable array processor based on the BEE4 development platform. The experimental results show that the domain-specific, efficient compilation optimization method for reconfigurable architecture can transform high-level language into assembly instructions supported by the reconfigurable array processor. Compared with the LLVM and OpenMP parallel programming tools, the speedup for processing the LeNet-5 network on 16 PEs is improved by 2.84% and 3.63%, respectively. In summary, this thesis proposes a domain-specific, efficient compiler optimization method for reconfigurable architecture, which provides a new channel for exploring efficient, automatic speculative parallelization in compiler systems for reconfigurable architecture, and a new idea for improving the parallelism of high-density applications on reconfigurable architecture.
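The thread combination step in (3) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the per-thread feature tuples, the dependence list, and the names `kmeans` and `cross_pe_transfers` are hypothetical, and plain Euclidean distance stands in for the thesis's performance evaluation function, which the abstract does not specify. The sketch groups threads into as many clusters as there are PE groups and then counts the data dependences that cross a cluster boundary, as a rough proxy for data movement between PEs.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (stand-in for a real evaluation function)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int, iterations: int = 20) -> List[int]:
    """Plain K-means; returns the cluster index assigned to each point."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Deterministic initialisation: the first k points act as centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each thread joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels


def cross_pe_transfers(labels: List[int], deps: List[Tuple[int, int]]) -> int:
    """Count dependences whose producer and consumer sit in different clusters."""
    return sum(1 for src, dst in deps if labels[src] != labels[dst])


# Hypothetical data: each thread is described by the (row, column) of the
# feature-map tile it touches; DEPENDENCIES lists (producer, consumer) pairs.
THREAD_FEATURES: List[Point] = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (8, 8), (8, 9), (9, 8), (9, 9),
]
DEPENDENCIES = [(0, 1), (1, 2), (3, 4), (4, 5)]

if __name__ == "__main__":
    clustered = kmeans(THREAD_FEATURES, k=2)
    round_robin = [i % 2 for i in range(len(THREAD_FEATURES))]
    print("clustered mapping:", clustered)
    print("transfers (clustered):", cross_pe_transfers(clustered, DEPENDENCIES))
    print("transfers (round-robin):", cross_pe_transfers(round_robin, DEPENDENCIES))
```

On this toy input the clustered mapping keeps threads that touch neighbouring tiles together, so fewer dependences cross a cluster boundary than under a round-robin mapping. A real flow would replace the distance function with a hardware-aware cost and derive the dependences from compiler analysis.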
Keywords/Search Tags: Reconfigurable architecture, Convolutional neural network, Behavior analysis, Compiler optimization, Parallelization