The High Efficiency Video Coding (HEVC) standard was first proposed by the JCT-VC in April 2010. Its main goal is to deliver significantly better compression than existing standards, on the order of a 50% bit-rate reduction at equal perceptual video quality. To achieve this goal, HEVC must use more complex coding algorithms, which results in high computational complexity. Based on the theory of HEVC intra prediction, this paper analyzes LCU partitioning and prediction mode decision and proposes two optimization algorithms: an improved partitioning method for HEVC coding units, aimed at high-quality video scenes, and a fast intra prediction mode selection algorithm based on mode grouping, aimed at time-critical scenes. These algorithms reduce computational complexity and improve intra-coding efficiency. Moreover, intra prediction in HEVC is designed to run serially, so per-pixel calculations are delayed by data loading and the 35 prediction modes must be processed one after another, which increases intra prediction coding time. To address this, parallel schemes for reference pixel smoothing and for fast prediction mode selection are proposed on DPR-CODEC (Dynamic Programmable Reconfigurable Array Processor, a programmable and reconfigurable array processor designed for video codecs). The proposed schemes save the clock cycles that a single processing unit would spend on data loading and reduce the total mode prediction time. The main contributions are as follows:
1. An improved partitioning method for HEVC coding units. To reduce the high computational complexity of HEVC intra prediction, this paper proposes an improved statistics-based rate distortion optimization (RDO) method.
Based on a statistical analysis of the probability distribution of the rate distortion cost under different quantization parameters, threshold equations are derived for each depth of largest coding unit (LCU) partitioning. These thresholds are used to terminate coding unit partitioning early and thereby reduce computational complexity. Experimental results show that, compared with the HEVC HM10.0 test model, the proposed algorithm saves 26.7% of encoding time on average with negligible loss of coding efficiency (only a 0.5% bit-rate increase and a 0.0019 dB Y-PSNR (Y peak signal-to-noise ratio) loss), which improves overall encoding efficiency.
2. A fast intra prediction mode selection algorithm based on mode grouping. To further reduce the high computational complexity of rough mode decision (RMD) and rate distortion optimization (RDO) caused by the increased number of prediction modes in HEVC intra prediction, this paper proposes a fast intra prediction mode selection algorithm based on mode grouping. Exploiting the strong correlation between the prediction mode ranked first in the candidate mode set and the optimal prediction mode, the algorithm markedly reduces computational complexity. Experimental results show that, compared with the HEVC HM10.0 test model, the proposed algorithm saves 41.8% of encoding time on average with negligible loss of coding efficiency (only a 0.78% bit-rate increase and a 0.12 dB Y-PSNR loss), which improves overall encoding efficiency.
3. A parallel scheme for reference pixel smoothing. The HEVC test model is designed for single-processor systems, so its reference pixel smoothing runs serially. As a result, each filter operation is delayed by loading the data of the neighboring pixels it depends on. This paper therefore loads the data of all pixels at one time and then applies the reference pixel smoothing filter to them in a single unified step.
This parallel design achieves a speedup of 14.43 over the serial implementation.
4. A parallel scheme for fast prediction mode selection. Given the computational efficiency and resource constraints of DPR-CODEC, only the most probable directions are selected as candidate prediction modes, based on the strong correlation between image texture direction and prediction direction. In the parallel scheme, the 16 PEs in a cluster each compute one of 16 pixels in parallel, and each PE completes the calculations for 12 modes. This scheme eliminates much of the time that serial prediction mode calculation spends waiting on other pixels, and it parallelizes prediction mode selection across multiple pixels. Simulation results show that this intra prediction mode selection design achieves a speedup of 7.60 over the serial implementation and improves operational efficiency.
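The difference between serial and load-once reference pixel smoothing (item 3) can be illustrated with a minimal sketch. It assumes the three-tap [1, 2, 1]/4 smoothing filter that HEVC applies to intra reference samples; the abstract itself does not state the filter taps. The function names `smooth_serial` and `smooth_batched` are hypothetical, and the Python code only models the data flow, not the DPR-CODEC hardware or its timing.

```python
def smooth_serial(ref):
    """Filter one sample at a time, fetching its two neighbours on demand.

    Models the serial flow: every output waits for its own neighbour loads.
    The first and last reference samples are left unfiltered.
    """
    out = list(ref)
    for i in range(1, len(ref) - 1):
        out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
    return out


def smooth_batched(ref):
    """Load the whole reference line once, then filter all interior samples.

    Models the load-once flow: three shifted views of the same loaded data
    feed every filter operation, so no output waits on a separate load.
    """
    if len(ref) < 3:
        return list(ref)  # nothing to filter between the two end samples
    left, centre, right = ref[:-2], ref[1:-1], ref[2:]
    interior = [(a + 2 * b + c + 2) >> 2 for a, b, c in zip(left, centre, right)]
    return [ref[0]] + interior + [ref[-1]]


if __name__ == "__main__":
    reference_line = [10, 20, 40, 80, 80]  # illustrative sample values
    print(smooth_serial(reference_line))
    print(smooth_batched(reference_line))
```

Both functions return the same filtered line. In the batched form each interior output depends only on data that is already loaded, so the filter operations are independent of one another and could in principle be distributed across parallel processing units.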