
Research On Monocular And Stereo Video Coding Based On Mesh Model And Related Technologies

Posted on: 2010-12-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D B Guo
GTID: 1118360272482642
Subject: Communication and Information System
Abstract/Summary:
With H.264/AVC-based video coding techniques maturing, many researchers believe that compression based on removing statistical redundancy is approaching its limit. Future video coding techniques should look for solutions in computer vision, computer graphics, and the human visual system. Mesh model-based video coding is one such technique: it uses computer vision and computer graphics methods to represent an image sequence structurally. More than ten years ago it was a research hotspot for video communication at very low bitrates.

Mesh model-based video coding still has many open problems, such as high computational complexity, poor robustness, and the lack of effective solutions to the motion occlusion and stereo occlusion problems. Previous studies in this field were limited to applications with simple backgrounds and simple motion, such as videoconferencing.

To address these problems, the main contributions of this dissertation are:

1. Basic data structures are built for the Delaunay triangular mesh (DTM) generation algorithm using triangular element classes: the vertex class, the segment class, the triangle class, and the DTM class. They speed up mesh generation by one third without reducing the approximation precision of the DTM.

2. Based on an analysis of the poor robustness and instability of two existing DTM generation algorithms, a new criterion, the minimum sum of squared differences (MSSD) criterion in gray level, is derived for generating content-adaptive DTMs. The classical algorithm with nodal proximity constraints in temporally active regions, referred to here as the optic-flow method for short, is optimized and improved for generating motion-adaptive DTMs.

3. A four-stage fast motion estimation algorithm based on nodal trajectories is proposed, in which nodes in motion-occluded regions are removed and new nodes are added in uncovered background to keep the nodal trajectories valid. The sizes of the regions about to be occluded or uncovered are inferred from the number of added or deleted nodes, or from mesh model failure, and an adaptive GOP is constructed accordingly. Finally, a mesh-based hybrid video coding scheme is presented. In an implementation built on the H.263 reference model, experimental results show that the mesh-based scheme outperforms the advanced motion estimation mode of H.263 in compression efficiency for videos with complex backgrounds and complex motion.

4. A new stereo disparity estimation algorithm using the maximum a posteriori (MAP) criterion is proposed. It introduces prior knowledge into the normalized correlation and MSE methods to improve matching performance.

5. For videoconferencing applications, a fast four-stage disparity estimation algorithm based on nodal trajectories is proposed, in which illumination compensation between views and detection of globally occluded boundary regions are studied. Virtual viewpoint synthesis is also investigated. Experimental results show that the proposed algorithm outperforms comparable algorithms in both speed and precision because of its fast convergence. In addition, meshes offer advantages in speed and simplicity for virtual viewpoint synthesis.

6. To mark occluded regions explicitly on the disparity map, the disparity space is computed first, and dynamic programming is then used to search for the optimal disparity curve. Each point on the optimal disparity curve must be in one of three states: the matching state or one of two occlusion states. To guarantee that the disparity curve passes through ground control points (GCPs), a segment-wise dynamic programming algorithm is proposed: the disparity space image is divided into ground control regions and non-ground control regions. In a ground control region, the search path is forced to pass through the GCPs. In a non-ground control region, the optimal path is found by dynamic programming. To ensure the reliability of the GCPs, four criteria are presented for selecting a point as a GCP. Experimental results show that the new algorithm improves the precision of occlusion detection and matching to a certain extent, and is more reliable and faster than conventional dynamic programming algorithms.
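
The three-state dynamic programming search with ground control points can be illustrated with a small sketch. This is not the dissertation's implementation, only a minimal example under stated assumptions: a single rectified scanline pair, a squared gray-level difference as the matching cost, one fixed `occlusion_cost` for both occlusion states, and GCPs applied as hard constraints inside one dynamic programming pass rather than by dividing the disparity space image into regions. The function name `match_scanline`, its parameters, and the sample values are hypothetical.

```python
import math

MATCH = "match"
LEFT_ONLY = "left_only"    # left pixel has no partner in the right view
RIGHT_ONLY = "right_only"  # right pixel has no partner in the left view


def match_scanline(left, right, occlusion_cost=20.0, gcps=None):
    """Align two scanlines and return (disparities, total_cost).

    disparities[x] is x_left - x_right for a matched left pixel, or None
    when the left pixel is labelled as occluded. gcps maps a left index
    to the right index it is forced to match (assumed to be reliable).
    """
    gcps = gcps or {}
    n, m = len(left), len(right)
    # cost[i][j]: cheapest labelling of the first i left and j right pixels.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(n + 1):
        forced = gcps.get(i - 1)  # None unless left pixel i-1 is a GCP
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            # Matching state: allowed freely, or only at the forced partner.
            if i > 0 and j > 0 and (forced is None or forced == j - 1):
                diff = left[i - 1] - right[j - 1]
                candidates.append((cost[i - 1][j - 1] + diff * diff, MATCH))
            # Occlusion state 1: a GCP pixel may not be labelled occluded.
            if i > 0 and forced is None:
                candidates.append((cost[i - 1][j] + occlusion_cost, LEFT_ONLY))
            # Occlusion state 2: a right pixel is skipped.
            if j > 0:
                candidates.append((cost[i][j - 1] + occlusion_cost, RIGHT_ONLY))
            if candidates:
                cost[i][j], move[i][j] = min(candidates, key=lambda c: c[0])

    if math.isinf(cost[n][m]):
        raise ValueError("ground control points admit no monotonic path")

    # Backtrack from the end of both scanlines to recover the state sequence.
    disparities = [None] * n
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == MATCH:
            disparities[i - 1] = i - j  # (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif step == LEFT_ONLY:
            i -= 1
        else:
            j -= 1
    return disparities, cost[n][m]


# The right view is the left view shifted by one pixel, so the first left
# pixel and the last right pixel have no partner.
left = [5, 50, 90, 200, 30, 70]
right = [50, 90, 200, 30, 70, 120]
print(match_scanline(left, right))
# → ([None, 1, 1, 1, 1, 1], 40.0)
```

Passing `gcps={0: 0}` forces the first left pixel onto the first right pixel even though the unconstrained optimum labels it as occluded, which is the role the abstract assigns to GCPs. Because every move advances along at least one scanline, the recovered path always preserves pixel ordering.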
Keywords/Search Tags: Video/Stereo video coding, Mesh-based coding, Adaptive GOP, Dynamic programming, Maximum a posteriori