
The Research On Sign Language Video Encoding Under Energy Constraints

Posted on: 2015-03-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X L Chen
GTID: 1268330428481229
Subject: Control theory and control engineering
Abstract/Summary:
Sign language is a highly structured, linguistically complete natural language that expresses vocabulary and grammar visually and spatially through a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Compared with head-and-shoulder video, sign language video is more complex, which makes research on it challenging.

Research on sign language video encoding is currently limited and is mostly based on Rate-Distortion (R-D) theory, with the aim of minimizing the distortion of the decoded sign language video. R-D theory, however, mainly addresses the relationship between rate and distortion under a rate constraint. With the rapid development of wireless communication, the growth of wireless channel bandwidth, and the popularity of the H.264 Advanced Video Coding standard, the constraint on rate has become progressively weaker. Meanwhile, the limited processing capability of mobile devices and the power constraint that battery operation places on the microprocessor have become the major restrictions on the development of mobile sign language communication.

This dissertation conducts in-depth research on sign language video encoding. The work aims to achieve the optimal balance among encoding power, encoding rate and encoding distortion by exploiting the selective visual attention mechanism of the deaf community, Power-Rate-Distortion (P-R-D) theory, and a region-of-interest power allocation method.

In general, the research of this dissertation can be summarized as follows:

(1) The factors that affect the complexity of a sign language video encoder are analyzed first. Based on the results of this analysis, a novel computational resource allocation algorithm is proposed. The algorithm allocates the encoder's computational resources adaptively, according to the available battery power and the video content. Experimental results show that the proposed algorithm substantially reduces the computational resources required while maintaining video coding quality.

(2) A scheme is proposed that allocates the computational resources of the sign language video encoder adaptively, according to the available battery power and the visual system of deaf viewers. At the frame level, encoding levels, which determine the number of reference frames and the search range, are selected adaptively according to the battery power and the frame complexity. At the macroblock (MB) level, the candidate partition modes and the quantization parameter are then adjusted adaptively according to the relative priority of each MB. Experimental results show that the proposed algorithm achieves a better peak signal-to-noise ratio (PSNR) in the face and hand regions, which improves the intelligibility of the sign language video, and that the computational complexity of the encoder is reduced further.

(3) An analytic P-R-D model is proposed to obtain optimized tradeoffs among power consumption, bit rate and distortion for sign language video encoding. In particular, the numbers of MBs coded in the different MB coding modes are controlled through an optimization process according to the distinct P-R-D performance of each mode. Both analytic and simulation results show the applicability of the scheme to mobile sign language video encoding.

(4) A novel algorithm is proposed to track the hand while it occludes the face in sign language video. The algorithm is based on the image force field transformation. First, the frames with a hand occluding the face and the frames with only a face are transformed into force field images. The force field images are then partitioned into sub-images, and a histogram is calculated for each sub-image. For each sub-image, the histogram of the face-only frame is subtracted from the histogram of the frame with a hand occluding the face to obtain the difference histogram. Finally, the difference histogram of each sub-image is compared with a threshold to obtain the position of the hand. Experimental results show that the proposed algorithm is capable of tracking the hand in real time.
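The steps of the hand tracking algorithm in item (4) can be illustrated with a minimal NumPy sketch. It is not the dissertation's implementation: the abstract does not give the force field formula, the histogram settings or the thresholding rule, so the sketch assumes an inverse-square force field, a fixed grid of sub-images, shared histogram bins, and the sum of absolute bin differences as the value compared with the threshold. The function names (`force_field_magnitude`, `block_histograms`, `hand_block_scores`, `locate_hand`) and all parameter defaults are hypothetical.

```python
import numpy as np


def force_field_magnitude(img):
    """Magnitude of an inverse-square force field over a grayscale image.

    Assumed formulation (not stated in the abstract): every pixel attracts
    every other pixel with strength intensity / distance**2.
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    # Offsets between any two pixel positions: -(h-1)..h-1 and -(w-1)..w-1.
    dy, dx = np.mgrid[-(h - 1):h, -(w - 1):w].astype(float)
    dist3 = (dy ** 2 + dx ** 2) ** 1.5
    dist3[h - 1, w - 1] = 1.0  # zero offset: numerator is 0, so no self-force
    shape = (3 * h - 2, 3 * w - 2)  # large enough for a linear convolution
    f_img = np.fft.rfft2(img, s=shape)
    fy = np.fft.irfft2(f_img * np.fft.rfft2(dy / dist3, s=shape), s=shape)
    fx = np.fft.irfft2(f_img * np.fft.rfft2(dx / dist3, s=shape), s=shape)
    # Keep the part aligned with the input pixels. The sign convention of the
    # vectors does not matter here because only the magnitude is used.
    return np.hypot(fy, fx)[h - 1:2 * h - 1, w - 1:2 * w - 1]


def block_histograms(field, grid, bins, value_range):
    """Histogram of each sub-image; returns an array (rows, cols, bins)."""
    rows, cols = grid
    bh, bw = field.shape[0] // rows, field.shape[1] // cols
    hists = np.empty((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            # Pixels beyond rows*bh or cols*bw (if any) are ignored.
            block = field[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            hists[r, c] = np.histogram(block, bins=bins, range=value_range)[0]
    return hists


def hand_block_scores(face_only, occluded, grid=(4, 4), bins=16):
    """Per-sub-image size of the difference histogram (occluded - face-only)."""
    f_face = force_field_magnitude(face_only)
    f_occ = force_field_magnitude(occluded)
    top = max(f_face.max(), f_occ.max())
    value_range = (0.0, top if top > 0 else 1.0)  # shared bins for both frames
    diff = (block_histograms(f_occ, grid, bins, value_range)
            - block_histograms(f_face, grid, bins, value_range))
    # Assumption: summarise each difference histogram by its L1 norm.
    return np.abs(diff).sum(axis=2)


def locate_hand(face_only, occluded, threshold, grid=(4, 4), bins=16):
    """Grid coordinates (row, col) of sub-images whose score exceeds threshold."""
    scores = hand_block_scores(face_only, occluded, grid, bins)
    return np.argwhere(scores > threshold)
```

Calling `locate_hand(face_frame, occluded_frame, threshold=20.0)` on two equally sized grayscale arrays returns the grid cells in which the force field histogram changed most; the threshold value here is only a placeholder and would need tuning on real sign language footage. The FFT-based convolution is a design choice made for this sketch so that the force field costs O(N log N) rather than O(N²) in the number of pixels, which matters for the real-time goal stated above.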
Keywords/Search Tags: Video Encoding, Power Aware, Visual Attention Mechanism, Power-Rate-Distortion, Sign Language Video