JM Decoder Parallel Optimization Based On NEON Engine

Posted on: 2014-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Chen
Full Text: PDF
GTID: 2248330398976050
Subject: Software engineering
Abstract/Summary:
The latest ARM Cortex-A series processors are based on the ARMv7 architecture and are the first to integrate the NEON Media Processing Engine. The NEON engine provides an advanced SIMD (Single Instruction, Multiple Data) instruction set on the ARM platform. A SIMD instruction processes multiple data elements at once, achieving data-level parallelism. The NEON engine can therefore effectively speed up multimedia applications such as audio codecs, video codecs and image processing.
On general ARM platforms, multimedia applications such as audio applications, video applications and 2D/3D games are usually implemented with the conventional ARM instruction set because of restrictions on hardware cost and power consumption. On ARM processors that integrate the NEON engine, however, NEON instructions can be used to optimize the parts that have high computational complexity and are suitable for data parallelism, so that data are processed in parallel rather than serially. NEON can thus accelerate applications, which is of considerable practical value on ARM platforms with demanding hardware cost and power consumption constraints.
Currently, there is still little applied research on NEON, especially on its technical features, instruction functionality, application methods and practical effect. In addition, few NEON-based libraries are available. Therefore, in this paper we studied in depth the NEON technology of the ARM Cortex-A9 processor with built-in NEON, including its technical features, instruction functionality and application methods. We then applied NEON optimization to the JM decoder, the official implementation of H.264, and show how four of its modules were optimized: inter-frame motion compensation, the deblocking filter, the inverse transform and intra prediction.
Finally, we tested the performance of the optimized JM decoder, and we hope the results will serve as a reference for researchers and developers who intend to optimize with NEON technology. The test platform was a PandaBoard development board running an Ubuntu system based on Linux 3.4.0. Several test sequences at different resolutions were decoded. The tests measured the performance gain of each optimized module and the overall decoding rate. The results showed that, of the optimized modules, the inverse transform benefited most. In addition, the overall decoding rate increased by almost 1x after optimization (that is, it nearly doubled), and the gain did not diminish at larger resolutions. This shows that NEON technology is clearly effective for accelerating multimedia applications on the ARM platform.
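To make the idea of data-level parallelism concrete, the following is a minimal sketch in C, not code from the thesis or from the JM decoder. It shows one step that commonly follows an inverse transform in a video decoder: adding eight 16-bit residuals to eight 8-bit predicted pixels and clamping the results to 0..255. The function names `add_residual_8` and `add_residual_8_scalar` are hypothetical. When compiled for an ARM target with NEON enabled, the eight pixels are handled by a handful of NEON intrinsics; on other targets the sketch falls back to the scalar loop so that it still builds and gives the same results.

```c
#include <assert.h>
#include <stdint.h>

/* Use NEON intrinsics only when the compiler targets ARM with NEON enabled. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Clamp an integer to the 8-bit pixel range 0..255. */
static inline uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Scalar reference: one pixel per loop iteration (serial computation). */
void add_residual_8_scalar(uint8_t *out, const uint8_t *pred,
                           const int16_t *res)
{
    for (int i = 0; i < 8; i++) {
        out[i] = clip_u8((int)pred[i] + (int)res[i]);
    }
}

/* Hypothetical helper: reconstruct 8 pixels at once where NEON is available. */
void add_residual_8(uint8_t *out, const uint8_t *pred, const int16_t *res)
{
    assert(out && pred && res);
#if HAVE_NEON
    /* Load 8 predicted pixels and widen them from 8 to 16 bits. */
    int16x8_t p = vreinterpretq_s16_u16(vmovl_u8(vld1_u8(pred)));
    /* Load 8 residuals. */
    int16x8_t r = vld1q_s16(res);
    /* Saturating add, then narrow to unsigned 8 bits with saturation,
       which performs the 0..255 clamp for all eight lanes together. */
    vst1_u8(out, vqmovun_s16(vqaddq_s16(p, r)));
#else
    /* Portable fallback so the sketch builds on non-NEON targets. */
    add_residual_8_scalar(out, pred, res);
#endif
}
```

Calling `add_residual_8(out, pred, res)` with three 8-element arrays fills `out` with the reconstructed pixels. The saturating add and saturating narrow are chosen so that the NEON path matches the scalar reference for every input, which makes the scalar loop a convenient correctness check when optimizing.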
Keywords/Search Tags: ARM Cortex-A9, NEON, JM Decoder, H.264, Multimedia Application Acceleration