
Software Adaptation And Optimization Of H.264 Video Decoding System Based On Loongson 2K1000B

Posted on: 2021-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: F Yang
Full Text: PDF
GTID: 2518306557490344
Subject: IC Engineering
Abstract/Summary:
The Loongson 2K1000B is an embedded SoC for industrial control and mobile intelligent terminals, and video decoding is one of its important application scenarios. Its video subsystem includes a software decoder that runs on the CPU and a hardware video processing unit (VPU). The CPU decoder is generally written in a high-level programming language. Although its decoding efficiency is lower than that of a hardware decoder, it offers good format compatibility and flexible algorithm upgrades, and it can serve as an important supplement to the video decoding system for low-resolution content. The VPU decoder has high decoding efficiency, but it requires a large amount of physically contiguous memory, so its memory management must be adapted and optimized to improve the utilization of system resources. This thesis takes the widely used H.264 video format as its research object and adapts and optimizes the software of the H.264 video decoding system on the Loongson 2K1000B platform.

For the CPU decoder, the performance of the existing FFmpeg-based H.264 decoder is first evaluated, and performance analysis tools such as Perf are used to identify the main bottleneck modules: entropy decoding, motion compensation, loop filtering, and inverse quantization. Then, according to the computational characteristics of these modules, the two most time-consuming ones, motion interpolation and loop filtering, are optimized with Loongson multimedia SIMD instructions. On this basis, shift instructions, unpacking instructions, and the rich register resources of the Loongson platform are fully used to optimize data loading and intermediate data access, further improving the efficiency of the CPU decoder.

For the VPU decoder, the decoder's original memory reservation and allocation methods are first analyzed. The Contiguous Memory Allocator (CMA) mechanism is then adapted to optimize VPU memory reservation, so that the VPU no longer needs to monopolize contiguous physical memory for long periods. On this basis, the original allocation method, which requests and allocates memory in fixed-size blocks, is replaced with dynamic allocation according to the actual needs of the decoder, improving the true utilization of VPU memory.

Finally, the optimizations were tested on the Loongson 2K Pi development board, which is equipped with the 2K1000B and runs Linux. For the CPU decoder, the test results show that the Loongson SIMD optimization increases the overall decoding frame rate by more than 30%, and 720p HD video can be decoded at 24 frames per second. For the VPU decoder, the CMA-based memory reservation method keeps memory consumption within 64 MB and does not hold the memory for long periods. The optimized dynamic memory allocation method takes less than 250 milliseconds, and the true memory utilization of the VPU is higher than 95%.
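As an illustration of why motion interpolation suits SIMD optimization, the following is a minimal scalar sketch in C of H.264 horizontal half-sample luma interpolation, which applies the standard 6-tap filter (1, -5, 20, 20, -5, 1) with rounding and clipping. It is not the thesis's code and uses no Loongson instructions; the function names `clip_u8` and `hpel_row_h` are hypothetical and chosen only for this example. Each output sample is computed independently of the others, which is the property a SIMD version would exploit by processing several samples per instruction.

```c
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit sample range [0, 255]. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/*
 * Horizontal half-sample luma interpolation for one row (hypothetical helper).
 * dst[x] is the half-sample position between src[x] and src[x + 1].
 * The caller must guarantee 2 valid samples to the left of src[0]
 * and 3 valid samples to the right of src[width - 1].
 */
static void hpel_row_h(uint8_t *dst, const uint8_t *src, int width)
{
    for (int x = 0; x < width; x++) {
        /* 6-tap filter; the taps sum to 32, hence the shift by 5. */
        int sum = src[x - 2] - 5 * src[x - 1] + 20 * src[x]
                + 20 * src[x + 1] - 5 * src[x + 2] + src[x + 3];
        /* Round to nearest, then clip. A negative sum relies on an
           arithmetic right shift, which common compilers provide. */
        dst[x] = clip_u8((sum + 16) >> 5);
    }
}
```

In a flat region the filter reproduces the input value, and across a sharp edge it can overshoot the 8-bit range, which is why the clipping step is needed. A vectorized version would typically widen the 8-bit samples to 16 bits (the role of unpacking instructions) before the multiply-accumulate and pack them back afterwards.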
Keywords/Search Tags:Loongson2K1000B, H.264 decoder, software decoder, SIMD instruction, hardware decoder, memory management