Implementation And Application Of Top-K Algorithm Based On Deep Learning Processor

Posted on:2021-01-21

Degree:Master

Type:Thesis

Country:China

Candidate:L L Niu


GTID:2428330632453253

Subject:Computer technology

Abstract/Summary:


The accumulation of massive amounts of data has put tremendous pressure on data storage and analysis, yet the valuable information in that raw data is not directly visible. Top-K queries can screen massive data sets, but traditional Top-K query algorithms perform poorly at this scale and no longer meet performance requirements. As deep learning processor technology has developed, processor performance has continued to improve, providing powerful computing support for training and inference of deep learning models; this computing power is the key to improving Top-K query performance. Taking the DianNao series of processors as its research background, this thesis combines the hardware characteristics of deep learning processors to design and implement a Top-K query algorithm on a deep learning processor. The main research contents are:

(1) Design and implementation of a single-core Top-K query algorithm on a deep learning processor. The DianNao series processors natively support the tensor instructions commonly used in deep learning, which can complete a batch of data comparisons within one clock cycle and thus greatly increase data-processing parallelism. A single-core Top-K implementation is designed around the maxpool and minpool instructions of the deep learning processor. Compared with a CPU implementation, at medium and large data scales the Top-K performance on the deep learning processor is about 20 times that of the CPU.

(2) Design and implementation of a multi-core Top-K query algorithm on a deep learning processor. Using vector instructions to filter data greatly increases filtering speed. To exploit the deep learning processor fully, Top-K is further optimized for instruction-level parallelism, which raises single-core Top-K query performance to about 1.6 times that of the unoptimized version and multi-core Top-K performance to 1.2 times. For Top-1000 queries on 1 million data items, the optimized multi-core Top-K performance is about 36 times that of the CPU and about 2 times that of the GPU, which shows that accelerating Top-K queries with a deep learning processor is feasible.

(3) Extension of the TensorFlow framework and application of Top-K to the Faster-RCNN network. The Top-K operator is first registered and packaged to replace the original CPU implementation. The model is then processed according to the characteristics of the Faster-RCNN offline model for deep learning processors, and the Faster-RCNN network is run. Comparing the accuracy of the Faster-RCNN model before and after operator replacement shows that the network still runs correctly after the replacement, which verifies the usability of the Top-K operator on the deep learning processor.
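The abstract names max-pooling instructions and vector-based filtering as building blocks but does not give the algorithm itself. The following is a minimal CPU-side sketch of one plausible way the two could combine, not the thesis's actual implementation. The function names `maxpool_1d` and `top_k_pool_filter`, the default window size, and the thresholding scheme are all assumptions made for illustration. The idea: the k-th largest window maximum can never exceed the k-th largest element overall, so it is a safe threshold for discarding most of the input before the final selection.

```python
import heapq


def maxpool_1d(values, window):
    """Emulate a 1-D max-pooling pass: one maximum per non-overlapping window."""
    return [max(values[i:i + window]) for i in range(0, len(values), window)]


def top_k_pool_filter(values, k, window=64):
    """Return the k largest values in descending order.

    Hypothetical scheme (an assumption, not taken from the thesis):
      1. max-pool the input to get one maximum per window;
      2. take the k-th largest window maximum as a threshold;
      3. keep only elements >= threshold (stands in for a vector compare);
      4. select the top k from the smaller candidate set.
    """
    if k <= 0:
        return []
    if k >= len(values):
        return sorted(values, reverse=True)

    # Shrink the window until the pooling pass yields at least k maxima.
    window = max(1, window)
    while window > 1 and (len(values) + window - 1) // window < k:
        window //= 2

    pooled = maxpool_1d(values, window)

    # At least k elements (the window maxima themselves) are >= threshold,
    # so every true top-k element survives the filter below.
    threshold = heapq.nlargest(k, pooled)[-1]
    candidates = [v for v in values if v >= threshold]

    return heapq.nlargest(k, candidates)
```

On a deep learning processor the pooling and comparison steps would map to batch tensor or vector instructions; here they are ordinary Python loops, so the sketch shows only the logic and says nothing about the speedups reported in the abstract.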

Keywords/Search Tags:

Top-K query, multi-core, deep learning processor, performance optimization


Related items

1	Software Optimization And Hardware Accelerator Design On NoC-based Multi-core Processor
2	Research And Design Of Multi-core Processor System Based On FPGA
3	Research Of High-performance, Low-power Multi-core Processors
4	Research On Acceleration Technology For Deep Learning Inference Based On Multi-core And Many-core Platforms
5	Performance Analysis And Optimization Of Genetic Algorithms On Multi-core Systems
6	Research On High Performance Optimization Algorithms Of Database Query Based On CMP
7	The Research On Mechanisms Of Optimizing Memory Access In Multi/Many-Core Architecture
8	Research On Collaborative Optimization For Multi-core Processor Resources
9	Research Of Parallel 3D-FFT On Multi-core Processor
10	The Multi-core Query Optimization Of XML Database