With the development of science and technology, 8K Ultra High Definition (UHD) video is gradually entering public view and will play an important role in industrial production, medical care, video surveillance, and other fields. On Feb 28, 2019, China stressed the need to vigorously promote the development of ultra-high-definition video under the goal of "4K first and 8K second". In addition, breakthroughs in 5G technology have improved network quality, which makes it possible to deploy the more efficient Versatile Video Coding (VVC) standard. Driven by the combined effect of market, policy, and technology, the research and development of an 8K UHD video encoder has quickly been put on the agenda. As the latest generation of video coding standards, VVC raises the compression rate by 50% but also brings higher algorithmic complexity, so software implementations suffer from low throughput, high latency, and low coding efficiency. Hardware acceleration has therefore become an important means of improving the efficiency of video coding. Motion Estimation (ME) is a key part of VVC and contributes substantially to the video compression rate. However, most existing hardware designs do not support VVC, and their data throughput and matching accuracy are too low to meet the requirements of ultra-high-definition video.

This paper therefore focuses on the hardware design of motion estimation in VVC. The main contents are as follows. Firstly, based on the basic theory of motion estimation, the performance requirements of 8K@30fps video are evaluated. By reducing idle cycles and balancing the pipeline stages, a hardware architecture pipelined at CU32 granularity is designed. On this basis, a hardware architecture supporting Integer-pixel Motion Estimation (IME) and Fractional-pixel Motion Estimation (FME) is designed and implemented in Verilog. In IME, a hardware architecture combining coarse search and full search is designed, which improves IME throughput. In FME, an 8-pixel parallel interpolation architecture is proposed to improve the utilization of hardware resources. In addition, spatial motion vector prediction is optimized, and a hardware architecture for rough motion vector prediction is designed, which reduces the data dependence on the downstream rate-distortion optimization module, improves coding efficiency, and reduces the complexity of the hardware design. The Universal Verification Methodology (UVM) is then used to simulate the hardware, ensuring functional correctness and verification completeness.

According to the performance analysis, motion estimation requires 2480 clock cycles to process one Coding Tree Unit (CTU). The DC synthesis results show that, in a 12 nm process at 8-bit depth, the maximum operating frequency is 800 MHz, the gate count is 1039.1 Kgates, and the power consumption is 94.2 mW. In a 28 nm process at 8-bit depth, the maximum operating frequency is 700 MHz, the gate count is 1232.7 Kgates, and the power consumption is 151.6 mW. Real-time encoding of 8K@30fps UHD video is thus achieved, providing a reliable hardware solution for motion estimation.
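
The real-time claim can be sanity-checked with a short cycle-budget calculation. The sketch below is not taken from the paper: it assumes that "8K" means 7680x4320 luma samples, that a single motion estimation engine processes one CTU at a time with no extra stall cycles, and it tries both 64x64 and 128x128 CTUs because the abstract does not state the CTU size. The function name `required_clock_hz` is hypothetical. Only the 2480 cycles per CTU, the 30 fps target, and the 700 MHz and 800 MHz synthesis frequencies come from the text.

```python
import math


def required_clock_hz(width, height, fps, ctu_size, cycles_per_ctu):
    """Minimum clock rate needed to process every CTU of every frame in real time."""
    # Partial CTUs at the right and bottom picture edges still occupy a full slot.
    ctus_per_row = math.ceil(width / ctu_size)
    ctu_rows = math.ceil(height / ctu_size)
    ctus_per_second = ctus_per_row * ctu_rows * fps
    return ctus_per_second * cycles_per_ctu


# Assumption: 8K is taken as 7680x4320; the abstract does not give the dimensions.
WIDTH, HEIGHT, FPS = 7680, 4320, 30
CYCLES_PER_CTU = 2480                      # from the performance analysis
F_MAX_HZ = {"12 nm": 800e6, "28 nm": 700e6}  # from the synthesis results

# Assumption: the CTU size is not stated, so both common sizes are tried.
for ctu_size in (64, 128):
    need = required_clock_hz(WIDTH, HEIGHT, FPS, ctu_size, CYCLES_PER_CTU)
    for process, f_max in F_MAX_HZ.items():
        verdict = "meets" if need <= f_max else "misses"
        print(f"CTU {ctu_size}x{ctu_size}: needs {need / 1e6:.1f} MHz, "
              f"{process} at {f_max / 1e6:.0f} MHz {verdict} the budget")
```

Under these assumptions the required clock rate is about 607.1 MHz for 64x64 CTUs and about 151.8 MHz for 128x128 CTUs, both below the reported 700 MHz and 800 MHz maximum frequencies, which is consistent with the stated real-time 8K@30fps result.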