Research On Visual Perception Models And Coding Algorithms

Posted on: 2009-03-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W L Yang
Full Text: PDF
GTID: 1118360275954637
Subject: Computer software and theory
Abstract/Summary:
Visual perception and coding is one of the basic problems in computational neuroscience. Its objective is to develop novel principles of neurocomputing and systems for visual information processing by drawing on the mechanisms of the visual cortex, based on findings from neurophysiology and cognitive science. The study of the mechanisms and computational principles of visual information processing is of theoretical significance not only for revealing neural computing mechanisms and developing novel computing models, but also for promoting new architectures for information technology, such as artificial vision systems, vision rehabilitation, machine cognition, and novel human-computer interfaces. Moreover, it has a wide range of applications, from pattern recognition, identity validation, and safety surveillance to intelligent human-computer interfaces.

Based on the sparse coding strategy, this dissertation investigates a general computational framework for visual perception, including learning from image sequences the self-organized maps of receptive fields of simple and complex cells in the primary visual cortex, and constructing hierarchical perceptual models for perceiving objects in stimuli and transformations such as translation, rotation, scaling, and motion. The main contributions of this dissertation are as follows.

To represent the statistical characteristics of natural scenes, we apply the independent component analysis (ICA) algorithm to natural image training data to obtain basis functions that are localized, oriented, and bandpass, resembling the receptive fields of simple cells in the primary visual cortex found in neurophysiological experiments. The corresponding coefficients of the independent components are regarded as neuronal responses, which are consistent with a sparse, supergaussian probability distribution. Using the second-order correlation between neighboring responses, we derive a self-organized learning algorithm based on the natural gradient, called NGTICA, which is able to learn spatio-topological maps of the receptive fields of simple cells from natural scenes.

To extract spatio-temporal features, we propose a model based on invariant representation in the visual cortex. By extending the NGTICA algorithm, we obtain the STICA algorithm, adapted to this model, for extracting spatio-temporal features from image sequences and videos. These features reflect certain invariance properties, such as invariance to translation, rotation, scaling, and view angle. Moreover, we elucidate the sparse, supergaussian distribution of the responses of complex cells when these spatio-temporal features act as the receptive fields of neurons.

To perceive objects and the translational motion of stimuli, we model the 'what' and 'where' visual pathways. We propose a three-layer perceptual network and develop two corresponding algorithms, OPA and TPA, for object perception and translational motion perception respectively. Computer experiments show that the proposed model and perception algorithms are able to perceive objects and translational motion with high accuracy and strong robustness against additive noise.

We propose a rotation perception model for perceiving rotational transformations from sequences of stimuli, and develop an algorithm, called RPA, that takes the correlation between responses as an invariant measure. Further, we propose a generalized model that can perceive a given type of motion by using the corresponding receptive fields in the model.

For the head pose estimation problem, an important preprocessing step in face recognition, we propose a novel ICA-based model inspired by the mechanism of visual perception. The receptive fields of neurons are learned from multi-view facial images by STICA. We further propose a corresponding perception algorithm based on neuronal firing rate. Computer experiments verify the performance of the proposed algorithm. Further analysis of the experimental data shows that, when neurons are stimulated by facial images from different views, the responses form a manifold in the high-dimensional subspace spanned by the multi-view bases. This result establishes a sound foundation for the perception algorithm. Taking into account that facial images are influenced by lighting, expression, view, and age, we apply tensor factorization (TF) to extract a multi-factor representation of faces, and propose a TF-based model for facial view estimation together with an algorithm that uses the correlation between the tensor representation and the view factor as its measure. Computer simulation results verify that the TF-based model gives better results than the ICA-based one.
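
The natural-gradient ICA step that the abstract builds on can be illustrated with a minimal sketch. This is not the dissertation's NGTICA or STICA algorithm: it is the standard natural-gradient update W <- W + lr * (I - E[tanh(y) y^T]) W for supergaussian sources, applied to synthetic Laplacian signals rather than natural image patches. The function names, the tanh nonlinearity, the learning rate, the iteration count, and the mixing matrix are all assumptions chosen for illustration.

```python
import numpy as np


def whiten(x):
    """Center the data (rows = channels) and whiten it to identity covariance."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    v = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T  # symmetric whitening matrix
    return v @ x, v


def natural_gradient_ica(z, n_iter=500, lr=0.1):
    """Batch natural-gradient ICA with a tanh score function.

    The tanh nonlinearity is an assumption suited to sparse (supergaussian)
    sources; it is not taken from the dissertation.
    """
    n, n_samples = z.shape
    w = np.eye(n)
    for _ in range(n_iter):
        y = w @ z
        # Natural-gradient step: (I - E[phi(y) y^T]) W
        w = w + lr * (np.eye(n) - np.tanh(y) @ y.T / n_samples) @ w
    return w


def excess_kurtosis(y):
    """Per-row excess kurtosis; positive values indicate a supergaussian signal."""
    y = y - y.mean(axis=1, keepdims=True)
    m2 = (y ** 2).mean(axis=1)
    m4 = (y ** 4).mean(axis=1)
    return m4 / m2 ** 2 - 3.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sources = rng.laplace(size=(2, 5000))        # synthetic sparse sources
    mixing = np.array([[1.0, 0.6], [0.4, 1.0]])  # hypothetical mixing matrix
    z, v = whiten(mixing @ sources)
    w = natural_gradient_ica(z)
    responses = w @ z
    print("excess kurtosis of responses:", excess_kurtosis(responses))
    # Each row of w @ v @ mixing should be dominated by a single entry
    # if the sources were separated.
    print("overall transform:\n", w @ v @ mixing)
```

In the setting described in the abstract, the columns of the data matrix would be natural image patches and the rows of the inverse of the learned transform would play the role of basis functions; the kurtosis check corresponds to the claim that the responses are sparse and supergaussian.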
Keywords/Search Tags: Visual Perception, Neural Coding, Spatio-temporal Feature, Object and Motion Perception, Head Pose Perception, Independent Component Analysis, Tensor Factorization, Natural Gradient, Visual Cortex, Receptive Fields