
On Facial Expression Recognition Based on Machine Learning Techniques

Posted on: 2015-08-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Ongalo Phoebe Nasimiyu Fedha
Full Text: PDF
GTID: 1488304322470564
Subject: Computer Science and Technology
Abstract/Summary:
For several decades our options for interacting with the digital world have been limited to the mouse, keyboard and joystick. These devices limit machines' ability to respond to human needs, and have exposed a gap between humans and machines, creating room for renewed interest in affective technology. Affective computing refers to the development of systems and devices that recognize, interpret, process, and simulate human affect. It brings together individuals from diverse backgrounds such as engineering, computer science, psychology, cognitive science, neuroscience, sociology, education, psychophysiology and value-centered design to push the boundaries of what can be achieved to improve the human affective experience. Emotion is fundamental to human experience; it influences perception, learning, communication and decision-making. However, many technologists have largely ignored emotion, creating often frustrating experiences for people, in part because affect has been misunderstood and is hard to measure.

As machines and people begin to co-exist and cooperatively share a variety of tasks, we see a trend shifting from text to full-body video-conferencing as virtual communication grows rapidly in availability and complexity. Even as this richness of communication options improves our ability to converse with others not available at the precise moment, the sense that something is missing continues to plague users of current technologies, presenting a need for affective communication channels between humans and machines. Humans are experts at expressing, recognizing and distinguishing facial expressions without much effort, yet for computers this is an uphill task. To be effective in the human world, machines must respond to human emotional states. In real-life situations, much of the communication that uses facial expressions needs no words at all to explain the situation at hand.
Many people infer a great deal from perceived facial expressions, as conveyed by statements like "You seem happy today." Facial expressions can give a true picture of one's feelings; while you may say that you are feeling fine, one look at your face may tell people otherwise. These expressions make more than a statement about a person's emotional state, and mostly occur to reinforce or contradict what has been said. Automatic facial expression recognition aims at perceiving and understanding the emotional states of humans based on information found in the face. These expressions reveal part of a person's inner feelings that can be extracted without intrusion, making the ability to understand them all the more interesting.

Naturally, facial expressions may visually suggest certain emotions to an observer while in reality expressing a totally different emotion, because some portion of one expression resembles another. To avoid this confusion we need a well-designed, state-of-the-art smart human-machine interface that interprets facial behavioral change, giving insight into the variety of information that can be derived from a face. Automatic facial expression recognition is a technique that could be harnessed to solve this problem. It is considered a high-end task in computer vision, ranging in complexity from locating the face in images to recognizing isolated objects belonging to a few well-established classes. Our research builds upon theories that advance the basic understanding of affect and its role in human experience. We believe it is for the same reason that researchers are attempting to understand the relations that exist between affective signals and mental states, and how they can be extracted and tapped to bridge the gap that separates man from machines. Despite these efforts, computers still remain disconnected from the human world.
Computer systems are not intelligent enough to pay attention to people and their surrounding environments. This raises the need to build systems with perceptual intelligence that can realize which aspects of the surrounding environment are important and use them to interpret the who, where, what, when and why of a situation. In this way machines become intelligent enough to adapt their behavior to suit human needs, rather than humans adapting to theirs. The answer to these questions lies in an automated measurement system that can detect a human face, preprocess it, extract features and classify objects into different categories. Facial recognition accuracy depends heavily on how input images have been compensated for pose, illumination and facial expression.

The aim of this thesis is to analyze and extract behavior referenced by facial features that could be used to convey user emotions or the internal state of a person, in an effort to solve the human-machine problem. Our goal is to design a robust, high-performance facial expression recognition system that can work in real time, with the ability to observe, interpret, and generate affect features that react appropriately to natural human behavior. Building such systems is no easy task, first because of various unpredictable facial variations that are further complicated by external environmental conditions. Added to this is the difficulty of choosing suitable facial feature descriptors capable of extracting discriminative facial information under the influence of illumination, pose and other factors. Facial variations arising from pose, age, gender, race and occlusion also exert a profound influence on the robustness of facial systems. Facial muscles assume the appearance of the emotion a person wishes to convey; a suitable facial feature descriptor therefore largely determines a system's performance, providing characteristics upon which people can draw conclusions regarding a person, their ideas or their status.
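The detect-preprocess-extract-classify decomposition described above can be sketched as a minimal pipeline. Every stage implementation below is a hypothetical placeholder (central-crop detection, zero-mean normalization, raw-pixel features, nearest-prototype classification), not the algorithms used in this thesis; it only illustrates how the stages compose.

```python
import numpy as np

def detect_face(image):
    # Placeholder detector: assume the face occupies the central crop.
    h, w = image.shape
    return image[h // 4: 3 * h // 4, w // 4: 3 * w // 4]

def preprocess(face):
    # Compensate illumination crudely: zero mean, unit variance.
    face = face.astype(float)
    return (face - face.mean()) / (face.std() + 1e-8)

def extract_features(face):
    # Placeholder features: the flattened, normalized face.
    return face.ravel()

def classify(features, prototypes):
    # Nearest-prototype classifier over expression labels.
    labels = list(prototypes)
    dists = [np.linalg.norm(features - prototypes[l]) for l in labels]
    return labels[int(np.argmin(dists))]
```

A real system would replace each stub with the corresponding algorithm (e.g. watershed-based localization, morphological enhancement, PCA features, a neural network classifier), but the data flow between stages stays the same.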
Secondly, the contextual events that signify facial expressions and emotional states still remain ambiguous and vague to computers.

Our computational approach addresses the curse of dimensionality by mapping the image into a low-dimensional coordinate system that preserves the perceptual quality of objects, enabling feature extraction. The approach entails highlighting objects of interest with the aid of computer vision technologies: we map and track facial muscle movements, then decode and interpret these movements, having first leveled illumination and eliminated noise effects before feature extraction and classification. A major part of our contribution is to determine optimal ways of combining different algorithms in an attempt to improve facial recognition accuracy. We have developed a system for evaluating and extracting appropriate facial features to be used in measuring the facial expression recognition problem. Previous studies have used texture and geometric features for expression recognition; however, few of them explore the performance differences between features arising from image quality. The first key step is to recognize a human face with acceptable levels of false positives, and this has been achieved appropriately. Following this, enhancement is carried out to provide better-quality data for machine interpretation before feature extraction. For emotion recognition, meaningful patterns must be extracted from the gathered data. The practical approach used in this thesis is to simulate emotions using image processing, computer vision and machine learning techniques, enabling facial expression detection that enriches and facilitates interactivity between human and machine and gives rise to seven labels: angry, fear, sad, surprise, neutral, happy and disgust.

The first algorithm is based on watershed segmentation, an approach that accurately replicates the shape patterns of objects most needed in machine learning.
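The low-dimensional mapping mentioned above is commonly realized with principal component analysis. The following is a minimal SVD-based PCA sketch, not the thesis's implementation; the dataset shape and component count are illustrative.

```python
import numpy as np

def pca_fit(X, k):
    """X: (n_samples, n_pixels) matrix of flattened face images.
    Returns the mean face and the top-k principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are eigenvectors of the sample covariance matrix,
    # ordered by decreasing singular value (hence decreasing variance).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mean, Vt[:k]

def pca_project(X, mean, components):
    # Map images into the low-dimensional coordinate system.
    return (X - mean) @ components.T
```

Each projected coordinate vector is a compact feature representing the image as a point in the principal-component space, which is what subsequent classification consumes.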
Segmentation is fundamental to image analysis; it entails dividing an image into regions of similar attributes with the aim of discovering objects of interest. To eliminate the drawbacks associated with watershed segmentation, we adopt contrast enhancement between brighter and darker image regions at scales corresponding to different structuring elements and image details, based on the morphological top-hat and bottom-hat transform operators. With this knowledge incorporated, watershed segmentation is used to separate facial features from the background, highlighting the regions of interest (eyes, nose and mouth) required for subsequent high-level machine learning. The enhanced facial features are extracted using principal component analysis: a small set of eigenvectors with the top eigenvalues is used to build the principal components. Each principal component is considered a feature representing a point in a high-dimensional space. The image features obtained are used to train a neural network, from which a recognition rate of 98% was achieved.

Using a different approach, we quantify machine perception of facial features based on the internal structural information found within an object by mathematically reconstructing it from a series of projections. The Radon transform was used to capture a series of line integrals across the image at different angles, capturing directional local features lying within the object. The extracted intensity profiles were filtered by applying the Fourier Slice Theorem before the contributions from each line of response were summed and used to arrive at an estimate of the facial shape. The method is invariant to rotation, lighting and noise; in addition, it is possible to reconstruct the image even when presented with incomplete data.
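The top-hat/bottom-hat contrast enhancement described above can be sketched in a few lines: the top-hat transform recovers bright structures smaller than the structuring element, the bottom-hat transform recovers the corresponding dark structures, and adding the former while subtracting the latter stretches local contrast. This is a generic sketch of the standard operators, not the thesis's multi-scale scheme; the 3x3 structuring element size is an illustrative assumption.

```python
import numpy as np
from scipy import ndimage

def tophat_bottomhat_enhance(img, size=3):
    img = img.astype(float)
    opening = ndimage.grey_opening(img, size=(size, size))
    closing = ndimage.grey_closing(img, size=(size, size))
    tophat = img - opening      # bright details smaller than the element
    bottomhat = closing - img   # dark details smaller than the element
    # Brighten small bright features, darken small dark features.
    return img + tophat - bottomhat
```

Running the watershed on an image enhanced this way gives cleaner catchment basins around high-contrast features such as the eyes, nose and mouth.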
To determine the most efficient feature extraction technique, experiments were carried out separately using the wavelet transform and the discrete cosine transform. Specifically, the Haar discrete wavelet transform was used to compress image objects into smaller, more manageable data that facilitate the extraction of enhanced facial features. Filtered back projection is used to reorganize the internal structural information within objects by mathematically reconstructing them from a series of projections. These sets of images were projected over 180 degrees and the low-frequency components extracted to form the significant data used to train the neural network (NN) classifier. Experiments were carried out to assess the validity and efficiency of this algorithm. The results reveal that non-enhanced data gave a recognition rate of 97.2%, while enhanced data gave 99% accuracy in determining emotional state from facial expressions. An approach that combines the discrete cosine transform and principal component analysis used 125 projections to give a recognition rate of 97.79% for non-enhanced data, while enhanced data achieved 98.99% accuracy.

In an alternative method, an adaptive wavelet lifting scheme was used to extract facial features for expression recognition. The lifting scheme makes optimal use of similarities between the high- and low-pass filters to achieve a faster implementation of the wavelet transform. To take advantage of non-linear filters, we incorporated median filters to exploit their potential for compressing signals; this follows from the fact that the two algorithms share a similar structure. Facial images, like any other digital images, contain highly correlated data, and the high-pass component can be discarded without much effect on the visual quality of the image. For this purpose, the Haar lifting scheme combined with principal component analysis was used to exploit the directional correlation between adjacent pixels.
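The lifting formulation of the Haar wavelet mentioned above can be sketched in its basic one-dimensional, single-level form: split the signal into even and odd samples, predict each odd sample from its even neighbor (yielding the detail/high-pass band), then update the even samples so the approximation band preserves the running average. This is the textbook Haar lifting step, not the thesis's adaptive median-filter variant, and it assumes an even-length signal.

```python
import numpy as np

def haar_lift_forward(x):
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    detail = odd - even            # predict: odd sample ~ even neighbor
    approx = even + detail / 2.0   # update: approx = pairwise average
    return approx, detail

def haar_lift_inverse(approx, detail):
    # Undo the lifting steps in reverse order, then re-interleave.
    even = approx - detail / 2.0
    odd = detail + even
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x
```

Because each step is trivially invertible, the transform is computed in place and faster than a filter-bank implementation; discarding the detail band keeps only the highly correlated low-pass content that the PCA stage then compresses further.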
A recognition rate of 99.4% was achieved on the JAFFE database and 97% on the PICS database. This was an improved performance compared with 98% and 98.4% for PCA and DWT-PCA on the same images from the JAFFE database. From these results the feasibility of the algorithms is demonstrated; they proved effective at minimizing noise and other irregularities, demerits associated with conventional methods, providing an alternative perspective on the facial expression recognition problem. Based on the experimental analysis carried out, we show that emotions in machines can be associated with abstract states, empowering them to respond to people's needs.

The motivation behind this research arose from the unobtrusive way the face communicates things that are difficult to capture in written or spoken words. In addition, its potential applications are many: to mention a few, such systems help deter crime and fraud and streamline business processes to save critical resources. Currently, an increasing number of commercial applications are either using or actively considering the use of facial recognition methods, as witnessed in fields like medicine and education, while other fields in need of this technology include clinical psychology, psychiatry, neurology, pain assessment, lie detection, criminology and multimodal human-computer interfaces. Much is yet to be achieved in interpreting facial messages and how they are produced by the neuromuscular system. We believe that the recognition of human facial expressions using computers is key to promoting these fields. A system that can perform these operations accurately and in real time would be a big step toward bridging the gap that exists between man and machines. This probably explains why the human face is not only at the center of security concerns but also the subject of renewed interest in this field. The goal of this research was to apply image analysis and machine learning techniques to examine, extract and measure facial expressions.
Using this theoretical background, we have examined facial expressions in detail, in an effort to understand and capitalize on those features of the face that help distinguish different classes of facial expressions, increasing their significance as a problem-solving method and ultimately leading to increased adoption in many domains. In particular, this thesis provides evidence that the use of enhancement in facial feature extraction not only improves the recognition rate of facial expressions but also improves the scalability and robustness of the resulting system.
Keywords/Search Tags: Facial expression recognition, mathematical morphology, top-hat transform, bottom-hat transform, watershed segmentation, discrete wavelet transform, neural network classifier, discrete cosine transform, wavelet lifting