Research On Embedded Audio-video Synchronization Coding In H.264

Posted on: 2013-01-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X N Li
Full Text: PDF
GTID: 1118330371982979
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of information and computer technologies, communication has shifted from traditional single-medium methods to multimedia communication that combines audio, text, images and video. Multimedia communication is the product of communication technology and multimedia technology; it integrates computer interaction, multimedia composition and distribution over communication networks. It breaks with the traditional single-medium communication system, provides users with comprehensive data services, and has become one of the most promising directions in communication technology. In a multimedia communication system, delay and jitter are unavoidably introduced while multimedia data are transmitted, packetized and switched, and these change the relative timing relationships among the media. The problem is more serious for multimedia data that have been compression-coded or that converge on one terminal through different channels. Therefore, one of the most important problems in multimedia communication is keeping the media synchronized. Multimedia synchronization has recently attracted growing attention as one of the key quality of service (QoS) issues in multimedia communication.

For multimedia with audio and video streams, a synchronization system consists of five parts: data acquisition, compression coding, network transmission, decoding and reconstruction, and synchronized display. Research on audio-video synchronization mainly focuses on synchronization control of the audio and video signals during data acquisition, sending, transmission and receiving.
Finally, synchronized display is achieved at the user terminal.

Currently, synchronization between audio and video is mainly realized with time stamps. The time-stamp method is based on an ideal decoder, which assumes that the channel buffer never overflows or underflows and that the code streams are processed instantaneously. A real decoder cannot meet these assumptions. Lip synchronization algorithms have also been proposed internationally to handle audio-video synchronization in video conferencing and video telephony. However, implementing such an algorithm is complex; for example, the mouth cannot be located automatically and manual intervention is required.

To solve this problem, the research group supervised by Prof. Hexin Chen proposed a synchronization coding theory based on embedding audio into video. The group has carried out a considerable amount of work to develop the theory and has achieved notable results. In audio-video synchronization control, the audio signals are treated as hidden information and embedded into the video. The video with the embedded audio is then compressed and encoded, and at the decoder the audio signals are extracted according to the embedding rules. The theory achieves good compression coding performance and fully synchronized transmission of audio and video. Moreover, it overcomes the asynchronous reception caused by channel delay and by coding the two media separately. In the first stage, the group successfully applied the theory to the MPEG-2 and AVS standards.
Building on these earlier achievements, this thesis studies synchronization coding by embedding audio into video in the H.264 video compression standard.

Supported by the NSFC Project of International Cooperation and Exchange "Ubiquitous Computing based on synchronous coding of the video by embedding audio into video" and the Natural Science Foundation of Jilin Province of China project "Trust Computing based on Opportunity mode in Ubiquitous environment", this thesis presents audio-video synchronization techniques in detail, compares current synchronization schemes and analyzes them comprehensively. It also presents a systematic analysis of the core techniques and important modules of the H.264 video compression standard, which provides a sound basis for constructing an audio-video synchronization coding theory in H.264.

This thesis analyzes intra/inter prediction coding, CAVLC entropy coding and motion estimation in H.264 and proposes several embedded synchronization coding schemes. According to the module in which the audio is embedded, the schemes fall into three types:

(1) Audio-video synchronization coding based on mode selection

Intra/inter prediction coding is an important part of the H.264 video standard. Each coding method consists of several coding modes, and mode selection is the core of each coding process. By analyzing the mode selection schemes of intra/inter prediction coding and exploiting the diversity of inter coding modes, this thesis proposes two audio-video synchronization coding schemes based on information hiding. Embedding the audio signals into the video stream as hidden information realizes synchronization coding of audio and video and achieves synchronized transmission.

Both schemes exploit the diversity of inter prediction coding modes. Different coding modes carry different audio signals, and audio data are embedded into the video stream by the choice of inter prediction coding mode.
These two schemes realize audio-video synchronization coding and decoding. In the first scheme, the coding mode is selected according to the embedded audio signals. However, this method cannot guarantee that the selected coding mode is the optimal one, which leads to extra embedding cost and a higher coding bit rate. The second scheme modifies the selection function of the first scheme, in which the audio signals alone determine the coding mode. It first groups the coding modes, then selects a coding mode group according to the audio signals, and finally selects the optimal coding mode within that group by rate-distortion optimization. The coding mode obtained this way is closer to the true optimum, and the method has less effect on video quality, embedding cost and coding bit rate.

(2) Audio-video synchronization coding based on CAVLC

Using the trailing coefficient and the last nonzero coefficient other than the trailing coefficient, this thesis proposes two audio-video synchronization coding schemes based on CAVLC. In the first scheme, the thesis analyzes the coding characteristics of the sign bit of the trailing coefficient in CAVLC entropy coding and proposes an embedding method based on the trailing coefficient. The sign bit of the trailing coefficient is coded with a fixed length, and the trailing coefficient lies in the high-frequency component of the 4×4 block. The proposed method, which modifies the trailing coefficient sign bit to embed the audio signals, therefore does not increase the coding bit rate and has little effect on video quality. The second scheme uses the last nonzero coefficient other than the trailing coefficient to embed the audio signals. Since the embedding algorithm changes the amplitude of that nonzero coefficient by at most ±1, the effect on video quality and coding bit rate is slight.

(3) Audio-video synchronization coding based on motion estimation

The thesis analyzes 1/4-pixel motion estimation.
Experiments show that using different 1/4-pixel points when searching for the optimal matching point has only a slight effect on the overall motion estimation. By modifying the 1/4-pixel points, the thesis proposes an audio-video synchronization coding scheme based on motion estimation. According to the parity of the horizontal component MVx and the vertical component MVy of the 1/4-pixel motion vector MV, the 1/4-pixel points are divided into two search groups, and the audio signals are embedded into the video by the choice of search group. The experiments show that the method realizes audio-video synchronization coding with slight effects on video bit rate and quality.

To verify the feasibility of the proposed schemes, the thesis implements them in H.264/AVC using the JM11.0 reference C code and tests them on different video sequences. The schemes are evaluated by indicators such as video quality, embedding cost and change in bit rate. The experimental results show that the proposed schemes achieve audio-video synchronization coding with low embedding cost and good quality of the audio and video signals.
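
As an illustration of the motion-estimation scheme (3) above, the following Python sketch shows one way parity-based search groups could carry one audio bit per motion vector. It is a minimal sketch under stated assumptions, not the thesis's implementation: the abstract does not give the exact grouping rule, so the rule used here (parity of MVx + MVy), the 3×3 candidate window around the half-pixel result, and all function names (mv_group, quarter_pel_candidates, embed_bit, extract_bit) are hypothetical. The cost callable stands in for whatever matching cost the encoder uses.

```python
from typing import Callable, List, Tuple

# A motion vector (mvx, mvy) expressed in quarter-pixel units.
MV = Tuple[int, int]


def mv_group(mv: MV) -> int:
    """Assumed grouping rule: group index = parity of mvx + mvy.

    The abstract only says the groups depend on the parity of the two
    components; this particular rule is an illustrative assumption.
    """
    return (mv[0] + mv[1]) & 1


def quarter_pel_candidates(center: MV) -> List[MV]:
    """Return the center and its 8 quarter-pixel neighbours (assumed window)."""
    cx, cy = center
    return [(cx + dx, cy + dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def embed_bit(center: MV, bit: int, cost: Callable[[MV], float]) -> MV:
    """Pick the lowest-cost candidate whose group equals the audio bit.

    The audio bit restricts the quarter-pixel search to one group; the
    encoder then chooses the best match inside that group only.
    """
    allowed = [mv for mv in quarter_pel_candidates(center) if mv_group(mv) == bit]
    return min(allowed, key=cost)


def extract_bit(mv: MV) -> int:
    """Decoder side: the embedded bit is the group of the decoded vector."""
    return mv_group(mv)
```

With this grouping, every candidate window contains points from both groups (the center and the four diagonal neighbours in one, the four edge neighbours in the other), so a bit can always be embedded, and the decoder recovers it from the decoded motion vector alone without side information.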
Keywords/Search Tags: H.264/AVC, synchronization coding, mode selection, CAVLC entropy coding, motion estimation