Porting and Optimization of a Template Matching Algorithm for the Feiteng DSP

Posted on: 2021-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: J T Hu
Full Text: PDF
GTID: 2428330602999576
Subject: Electronic and communication engineering
Abstract/Summary:
The Feiteng FT-M6678 (hereinafter M6678) is a multi-core, high-performance digital signal processor (DSP) with fully independent intellectual property rights. The M6678 adopts the Harvard architecture, which stores instructions and data separately, together with the new KeyStone multi-core architecture.

Image template matching algorithms play an important role in computer vision, target detection and tracking, video compression, and video surveillance, and implementing and optimizing fast, stable template matching has long been a research focus in image processing. Template matching based on correlation coefficients is one of the most important algorithms in image matching. It is both memory-access-intensive and computation-intensive, which leaves considerable room for performance optimization on a specific target architecture. At present, a number of classic image processing algorithms, including correlation template matching, have not been implemented for the M6678 DSP architecture.

To promote the application of domestic DSP chips in image processing and artificial intelligence, this thesis ports the correlation-coefficient template matching algorithm to the M6678 platform and, combining the characteristics of the algorithm with those of the target architecture, optimizes its performance in terms of parallelism and locality. The test results show that the optimized program is significantly faster and makes full use of the platform's distinctive computing resources. The work can serve as a reference for implementing and optimizing other image processing algorithms on the platform.

The main work of this thesis on implementing and optimizing correlation template matching on the Feiteng DSP platform is as follows:

1. The template matching algorithm and its complexity are analyzed, along with the support offered by the M6678's underlying development environment, and the correlation template matching algorithm is ported to and implemented on the M6678 platform.
2. Data-level and instruction-level parallelism optimizations for the M6678 are studied. Branch elimination and branch extraction remove redundant control flow that would otherwise hinder SIMD (Single Instruction, Multiple Data) vectorization, and the core computation is rewritten by hand using the vector intrinsics provided by the compilation environment. Loop unrolling, statement reordering, and related methods improve instruction-level parallelism so as to take full advantage of hardware features of the M6678 core such as its multiple functional units and multiple-instruction issue.
3. An image partitioning (segmentation) optimization is proposed. Dividing the image to be matched into blocks reduces redundant computation and cache pressure and improves data locality and the cache hit ratio. Several loop transformations further improve data locality, and data prefetching improves memory access efficiency and hides memory access latency.

Performance was measured before and after optimization. Vectorization and locality optimization bring the most obvious gain, a speedup of 1.98 times. With the remaining optimizations, the overall speedup reaches 2.01 times. The thesis also compares the program's performance on the TI-C6678 and FT-M6678 platforms. The results show that, after optimization for the FT-M6678's architectural features, the program performs better on the FT-M6678 than on the TI-C6678, which verifies the effectiveness of the porting and optimization work in this thesis.
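For readers unfamiliar with the algorithm, the following is a minimal Python sketch of correlation-coefficient template matching. It is not the thesis's M6678 implementation, which the abstract describes only at a high level. The function names, the list-of-lists image representation, and the `tile` parameter are assumptions made for illustration. The second search function visits candidate placements block by block, a generic loop-tiling pattern that only gestures at the kind of locality optimization the abstract mentions. The thesis's actual partitioning scheme may differ.

```python
import math


def correlation_coefficient(image, template, top, left):
    """Zero-mean normalized correlation between the template and the
    image window whose upper-left corner is (top, left)."""
    th, tw = len(template), len(template[0])
    n = th * tw
    window = [image[top + r][left + c] for r in range(th) for c in range(tw)]
    tmpl = [template[r][c] for r in range(th) for c in range(tw)]
    mean_w = sum(window) / n
    mean_t = sum(tmpl) / n
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(window, tmpl))
    den = math.sqrt(
        sum((w - mean_w) ** 2 for w in window)
        * sum((t - mean_t) ** 2 for t in tmpl)
    )
    # A flat window or template has zero variance; report no correlation.
    return num / den if den > 0 else 0.0


def match_template(image, template):
    """Score every placement in row-major order; return (score, top, left)."""
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    best = (-2.0, -1, -1)  # any real score lies in [-1, 1]
    for top in range(rows):
        for left in range(cols):
            score = correlation_coefficient(image, template, top, left)
            if score > best[0]:
                best = (score, top, left)
    return best


def match_template_tiled(image, template, tile=2):
    """Same search, but placements are visited tile by tile (loop tiling).
    The tile size is a hypothetical tuning parameter."""
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    best = (-2.0, -1, -1)
    for top0 in range(0, rows, tile):
        for left0 in range(0, cols, tile):
            for top in range(top0, min(top0 + tile, rows)):
                for left in range(left0, min(left0 + tile, cols)):
                    score = correlation_coefficient(image, template, top, left)
                    if score > best[0]:
                        best = (score, top, left)
    return best
```

Calling `match_template(image, template)` with a template cut from the image returns a score of 1.0 at the position it was cut from, provided no other window is a positive linear rescaling of the template. In pure Python the tiled traversal brings no speed benefit. It only shows the changed iteration order that, on a cache-based DSP core, keeps neighbouring windows resident while they are reused.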
Keywords/Search Tags:Feiteng DSP, template matching algorithm, SIMD vectorization, image segmentation optimization