
Wearable-vision-based Human Computer Interaction

Posted on: 2011-03-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Q Li
Full Text: PDF
GTID: 1118360308955598
Subject: Computer application technology
Abstract/Summary:
Wearable-vision-based human computer interaction is a novel research topic spanning computer vision, human computer interaction (HCI), wearable computing, and cognitive psychology. Unlike conventional computer-centered HCI, it exploits a wearable, human-centered, and cooperative approach to visual computing and perception, and thereby provides a natural and efficient interface for interaction and perception among human, computer, and environment. This thesis focuses on wearable-vision-based HCI, covering virtual-touchpad-based interaction, adaptive object tracking, and interactive visual perception.

A wearable interaction system has been developed as a testbed platform. The system consists of a wearable computer, a miniature stereo vision machine, a wireless microphone, and a head-mounted display. The miniature stereo vision machine synchronously captures gray-scale images and dense depth maps and transfers them to the wearable computer over an IEEE 1394 port, while the wireless microphone captures the wearer's audio input. Combining stereo vision and speech, the system supports several natural and flexible interaction modalities.

Under wearable computing environments, natural and efficient gesture interaction still faces many difficulties and challenges. Inspired by the touchpad, a stereovision-based virtual touchpad interaction method is presented, built on two key techniques: pointing gesture tracking and touch-event detection for the index fingertip. The method distinguishes deliberate interaction gestures from unconscious ones, and thus solves the Midas touch problem of visual interaction in a natural way. To track pointing gestures robustly in complex environments, multiple cues such as contour, depth, and Local Binary Patterns (LBP) are combined under the ICONDENSATION framework. The 3D fingertip location is formulated as the intersection of the hand plane with the fingertip's projection line, which compensates for localization errors caused by missing depth data. Experimental results demonstrate the effectiveness and robustness of the proposed approach.

To cope with dynamic changes in illumination, viewpoint, camera jitter, and partial occlusion, an adaptive object tracking method based on co-training and particle filtering is presented. Histograms of oriented gradients (HOG) and LBP features describe the object's appearance and are used to build two separate SVM classifiers, which are updated online within a co-training framework that mitigates error accumulation. To shrink the search state space, a dynamical model and an importance sampling function improve the precision and efficiency of the sampling procedure, and a correction term lowers the weights of false-positive samples, improving both object tracking and classifier updating. Extensive experimental results verify the effectiveness and robustness of the proposed adaptive tracking method.
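The plane-line intersection used above for 3D fingertip localization can be stated concretely. The sketch below is a minimal geometric illustration, not the thesis's implementation: the function name and arguments are hypothetical, and in the actual system the hand plane would be fitted to the dense depth map of the hand region while the ray would come from the calibrated stereo camera's intrinsics.

```python
import numpy as np

def locate_fingertip(plane_point, plane_normal, cam_center, pixel_ray):
    """Intersect the fingertip's projection ray with the fitted hand plane.

    plane_point  : a 3-D point on the hand plane (e.g. the palm centroid)
    plane_normal : unit normal of the hand plane
    cam_center   : camera optical center (origin of the projection line)
    pixel_ray    : unit direction of the ray through the fingertip pixel
    """
    denom = np.dot(plane_normal, pixel_ray)
    if abs(denom) < 1e-8:
        return None  # ray is (nearly) parallel to the hand plane
    t = np.dot(plane_normal, plane_point - cam_center) / denom
    if t <= 0:
        return None  # intersection lies behind the camera
    return cam_center + t * pixel_ray
```

Because the plane is estimated from the many depth pixels of the whole hand, the intersection stays well defined even when the depth value at the fingertip pixel itself is missing.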
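The co-training update for the adaptive tracker can likewise be sketched. The code below is schematic and rests on several assumptions: scikit-learn SVMs stand in for the thesis's classifiers, HOG/LBP feature extraction is omitted, both classifiers are assumed to be seeded with hand-labelled samples from the first frames, and the confidence threshold is illustrative. The particle filter and the false-positive correction term are not shown.

```python
import numpy as np
from sklearn.svm import SVC  # both views need SVC(probability=True)

def cotrain_update(svm_hog, svm_lbp, hog_feats, lbp_feats,
                   buf_hog, buf_lbp, conf_thresh=0.9):
    """One co-training round over N candidate patches.

    hog_feats, lbp_feats : (N, d) arrays of features for the same patches
    buf_hog, buf_lbp     : (feature_list, label_list) training buffers,
                           seeded with labelled samples from the first frame

    Each (already fitted) classifier pseudo-labels the patches it is
    confident about; those labels extend the *other* view's buffer, so the
    two views correct each other instead of accumulating their own errors.
    """
    for svm, feats, other_feats, other_buf in (
            (svm_hog, hog_feats, lbp_feats, buf_lbp),
            (svm_lbp, lbp_feats, hog_feats, buf_hog)):
        proba = svm.predict_proba(feats)
        sure = proba.max(axis=1) > conf_thresh   # confident pseudo-labels only
        other_buf[0].extend(other_feats[sure])
        other_buf[1].extend(proba[sure].argmax(axis=1))
    for svm, (x, y) in ((svm_hog, buf_hog), (svm_lbp, buf_lbp)):
        if len(set(y)) > 1:                      # SVC.fit needs both classes
            svm.fit(np.asarray(x), np.asarray(y))
```

In a real tracker this round would run once per frame on the patches proposed by the particle filter, with the combined classifier scores serving as particle weights.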
Human-in-the-loop operation, a defining characteristic of wearable computing, makes it possible to carry out visual computing in a human-machine cooperative fashion. This thesis therefore describes a multimodal interactive visual perception method that trains and retrains an object perception model under the wearer's guidance. When the uncertainty of a perception result exceeds a given threshold, the computer turns to the wearer for verification. For multimodal labeling, an object is encircled by visually tracking a pointing gesture while its name is obtained through speech recognition; the complementary properties of the two modalities improve the naturalness and efficiency of object labeling. To reduce human intervention, an adaptive object tracker collects samples automatically. Based on active learning, the interactive machine learning method selects the most informative samples from the incoming sample stream to build the perception model, and a feature descriptor invariant to illumination, brightness, and contrast strengthens the model's discriminative power. Experimental results demonstrate the effectiveness of the proposed multimodal interactive visual perception method.
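The uncertainty-triggered query step lends itself to a short sketch. Margin-based uncertainty sampling, shown below, is one standard way to realize "most informative samples"; the function, its parameters, and the thresholds are illustrative assumptions rather than the thesis's actual criterion.

```python
import numpy as np

def select_queries(model, samples, uncertainty_thresh=0.4, budget=5):
    """Uncertainty sampling: score each sample by how close the classifier's
    top two class probabilities are (small margin = high uncertainty) and
    return the indices of the most informative ones to show the wearer.

    `model` is any classifier exposing predict_proba (e.g. the perception SVM).
    """
    proba = model.predict_proba(samples)
    top2 = np.sort(proba, axis=1)[:, -2:]    # two highest class scores
    margin = top2[:, 1] - top2[:, 0]         # small margin -> uncertain
    uncertain = np.where(margin < uncertainty_thresh)[0]
    return uncertain[np.argsort(margin[uncertain])[:budget]]
```

The selected samples would be presented to the wearer, who labels them by encircling the object with a pointing gesture and naming it by speech; the verified labels then extend the training set before the perception model is refit.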
Keywords/Search Tags: wearable vision, human computer interaction, virtual touchpad, object tracking, interactive visual perception