
Multimodal interface integrating eye gaze tracking and speech recognition

Posted on: 2016-05-12
Degree: M.S.
Type: Thesis
University: The University of Toledo
Candidate: Mahajan, Onkar
Full Text: PDF
GTID: 2478390017976918
Subject: Electrical engineering
Abstract/Summary:
Currently, the most common method of interacting with a computer is through a mouse and keyboard. HCI research includes the development of interactive interfaces that go beyond the desktop Graphical User Interface (GUI) paradigm; interfaces based on gesturing, facial expressions, speech, and other forms of human communication have also been the focus of intense study.

Eye Gaze Tracking (EGT) is another type of human-computer interface that has proven useful in several industries, and the rapid introduction of new models by commercial EGT companies has led to more efficient and user-friendly interfaces. Unfortunately, the cost of these commercial trackers has made it difficult for them to gain popularity.

In this research, a low-cost multimodal interface is developed to overcome this issue and help users adapt to new input modalities. The system recognizes input from the eyes and from speech. The eye gaze detection module is based on Opengazer, an open-source gaze tracking application, and is responsible for estimating the gaze point coordinates. The images captured during calibration are converted to grayscale and averaged to form a single image per calibration target; each averaged image is associated with the position of the user's pupil and the corresponding point on the screen. These images are then used to train a Gaussian Process that estimates the gaze point. The voice recognition module detects voice commands from the user and converts them into mouse events. The interface can be operated in two distinct modes: one uses eye gaze as a cursor-positioning tool and voice commands to perform mouse click events; the other uses dwell-based gaze interaction, in which fixating on a point for a predetermined amount of time triggers a click event. Both modules run concurrently when multimodal input is used.

Several modifications were made to improve the stability and accuracy of gaze estimation, within the constraints of the open-source gaze tracker. The multimodal implementation was evaluated in terms of tracking accuracy and stability of the estimated gaze point.
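To make the calibration-to-estimation pipeline concrete, here is a minimal sketch of a Gaussian Process gaze mapping. It is not Opengazer's actual implementation: it assumes scikit-learn's GaussianProcessRegressor in place of Opengazer's own GP code, and assumes the averaged grayscale calibration patches are already available as fixed-size arrays.

```python
# Sketch only: scikit-learn GP in place of Opengazer's own implementation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def train_gaze_model(calib_images, screen_points):
    """calib_images: averaged grayscale eye patches (one 2-D array per
    calibration target); screen_points: the matching (x, y) screen coords."""
    X = np.array([img.ravel() for img in calib_images], dtype=float)
    y = np.array(screen_points, dtype=float)           # shape (n, 2)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=50.0),
                                  normalize_y=True)
    gp.fit(X, y)                                       # learn image -> point map
    return gp

def estimate_gaze(gp, eye_image):
    """Predict the on-screen gaze point for a new grayscale eye patch."""
    return gp.predict(eye_image.ravel().reshape(1, -1))[0]
```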
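The dwell-based mode can likewise be sketched as a small state machine: if successive gaze estimates stay within a small radius for the configured dwell time, a click fires. The radius and time thresholds below are illustrative assumptions, not values from the thesis.

```python
import math
import time

class DwellClicker:
    """Fire a click when gaze stays within `radius` px for `dwell_s` seconds.
    Both thresholds are illustrative defaults."""
    def __init__(self, radius=40.0, dwell_s=1.0):
        self.radius, self.dwell_s = radius, dwell_s
        self.anchor = None           # centre of the current fixation
        self.start = None            # when the fixation began

    def update(self, x, y, on_click):
        now = time.monotonic()
        if self.anchor and math.dist(self.anchor, (x, y)) <= self.radius:
            if now - self.start >= self.dwell_s:
                on_click(*self.anchor)                  # dwell complete: click
                self.anchor, self.start = None, None    # re-arm
        else:
            self.anchor, self.start = (x, y), now       # new fixation begins
```

Re-arming after each click keeps a sustained fixation from firing repeated clicks, which is the usual guard against the "Midas touch" problem in dwell-based interaction.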
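Finally, converting recognized voice commands into mouse events amounts to a dispatch table. The command vocabulary and the use of pyautogui below are assumptions for illustration; the thesis specifies neither.

```python
import pyautogui  # assumed here for synthesizing mouse events

# Hypothetical command vocabulary; the thesis does not list its exact set.
COMMANDS = {
    "click":        pyautogui.click,
    "double click": pyautogui.doubleClick,
    "right click":  pyautogui.rightClick,
}

def on_speech_result(text):
    """Map a recognized utterance to a mouse event at the current cursor
    position, which the gaze module keeps updated."""
    action = COMMANDS.get(text.strip().lower())
    if action:
        action()
```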
Keywords/Search Tags: Gaze, Tracking, Multimodal, Interface