
Research And Application On Computational Model Of Visual Attention

Posted on: 2010-02-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J W Chen
Full Text: PDF
GTID: 1118360275994527
Subject: Artificial Intelligence
Abstract/Summary:
Visual attention is regarded as an essential cognitive process of the human visual system. Human vision relies on the visual attention mechanism to select the relevant parts of a scene, on which higher-level tasks can then be processed. Since information from only a small region of the visual field progresses through the cortical visual hierarchy, visual tasks can be handled effectively with limited processing resources. Computational models of visual attention are based on the biological model of visual attention and mimic this ability of the visual system. Research on computational models of visual attention not only helps in understanding the working mechanism of the human visual system, but also has important applications in image analysis and understanding, object detection, information retrieval, robot vision, video communication, and other areas.

This dissertation addresses the visual attention mechanism and its computational methods. The cognitive neuroscience theories and the neural mechanism of visual attention are analyzed. According to the requirements of computer vision, an architecture for a dynamic visual attention model is established, based on the biophysics and neurophysiology theories of human visual processing. This architecture is composed of three main parts: feature processing, attentional capture, and attentional control. A hierarchical feature-processing system based on two pathways is proposed. Depth features and motion features are extracted to measure the third spatial dimension and the time scale of complex environments. The integration between the two pathways in the brain is simulated by an integrate-and-fire neural network (IFNN), which is employed to compute the focus of attention. An approach to attentional control for the dynamic visual attention model is developed to mimic the sustained attention mechanism.
According to these theories and techniques, a visual attention model for dynamic scenes based on object selection is implemented.

The cognitive neuroscience theories and neural mechanism of visual attention are analyzed. To meet the requirements of computer image processing, the latest studies of biological vision are summarized and the insights they offer for computer vision are provided. Following the idea of bionics, the architecture of the dynamic visual attention model is established, which relates computer image information processing to biophysics and neurophysiology. The model is composed of three main parts: feature processing, attentional capture, and attentional control.

Three major problems are solved in feature processing: feature selection, saliency computation, and the hierarchical processing of features. For feature selection, the features are classified into two types: spatial features and non-spatial features. The spatial features, which include depth features and motion features, are extracted to measure the third spatial dimension and the time scale of complex environments. The non-spatial stimulus features include intensity, color, orientation, and others. Saliency computation depends on the computation of feature contrast. According to the differences in hierarchy and function among the features, the processing of spatial and non-spatial features simulates the two-pathway processing in the brain. The saliency map of each feature is created by competition among or integration of its sub-features.

For attentional capture, approaches to feature integration and attention focusing are developed based on the two-pathway theory. The non-spatial features (including intensity, color, and orientation) serve as the perceptual information about an object, which is transmitted in the "what" pathway. The perception of spatial and motion information, related to the "where" pathway, is represented by depth features and motion features.
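The grouping of features into the two pathways described above can be illustrated with a small sketch. It is not the dissertation's implementation: the function names, the min-max normalization, and the choice of a pixel-wise mean for "integration" and a pixel-wise maximum for "competition" are all assumptions made for illustration only.

```python
# Hypothetical grouping of feature maps into the two pathways named in the abstract.
WHAT_FEATURES = ("intensity", "color", "orientation")  # non-spatial features
WHERE_FEATURES = ("depth", "motion")                   # spatial features


def normalize(feature_map):
    """Rescale a 2-D map (list of rows) to [0, 1]; a flat map becomes all zeros."""
    flat = [v for row in feature_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in feature_map]


def combine(maps, mode="integration"):
    """Merge maps pixel by pixel: mean for integration, max for competition (assumed rules)."""
    maps = [normalize(m) for m in maps]
    merge = max if mode == "competition" else (lambda vals: sum(vals) / len(vals))
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[merge([m[r][c] for m in maps]) for c in range(cols)] for r in range(rows)]


def pathway_maps(features, mode="integration"):
    """Return one combined map for the "what" pathway and one for the "where" pathway."""
    what = combine([features[name] for name in WHAT_FEATURES], mode)
    where = combine([features[name] for name in WHERE_FEATURES], mode)
    return what, where


# Toy 1x2 feature maps, invented for the example.
features = {
    "intensity": [[0, 2]],
    "color": [[0, 4]],
    "orientation": [[1, 1]],
    "depth": [[3, 0]],
    "motion": [[1, 0]],
}
what_map, where_map = pathway_maps(features)
what_comp, _ = pathway_maps(features, mode="competition")
```

In this toy input the two pathways favor different locations, which is the situation in which a later integration stage has to weigh the "what" and "where" evidence against each other.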
An integrate-and-fire neural network (IFNN) is employed to simulate the three-way relationship between the two inputs and the response, achieving a dynamic and modulatory property. The correlation between the two input sets is implemented by the IFNN, which produces a certain amount of gain when the two inputs are consistent. When the stimuli from the two pathways are correlated, the interspike interval is shortened. After the interspike interval is calculated, the focus of attention is allocated to the likely target position in the original image.

According to the neural mechanism of visual attention, an approach to attentional control for the dynamic visual attention model is developed. An arousal signal is used to measure the strength of new stimuli in a scene, and it serves as the parameter for judging sustained attention. A tracking algorithm is proposed to mimic the sustained attention mechanism in dynamic scenes. If the arousal signal does not exceed the threshold, the movement of the focus of attention follows the tracking process. If the arousal signal exceeds the threshold, a location enhancement method is applied to enhance the saliency of the new stimuli.

The computational model of visual attention for dynamic scenes is the target of this research. The theories and approaches are combined to implement the model, which has strong applicability and important practical value. The experimental results also show that the computational theories and techniques applied in the system are valid and effective.
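The gain effect attributed to the IFNN above, where consistent inputs from the two pathways shorten the interspike interval, can be illustrated with a single leaky integrate-and-fire unit. This is a minimal sketch under stated assumptions, not the network used in the dissertation: the multiplicative gain term, the parameter values, and the function names are hypothetical.

```python
def interspike_interval(what, where, gain=1.0, tau=20.0, threshold=1.0,
                        dt=0.1, t_max=500.0):
    """Mean interspike interval (ms) of one leaky integrate-and-fire unit.

    Returns None if the unit fires fewer than two spikes within t_max.
    """
    # Assumed drive: the two inputs plus a gain term that is large only
    # when both inputs are strong at the same location.
    drive = what + where + gain * what * where
    v, spikes = 0.0, []
    for k in range(int(t_max / dt)):
        v += dt * (-v + drive) / tau   # forward-Euler step of the leaky integrator
        if v >= threshold:
            spikes.append((k + 1) * dt)
            v = 0.0                    # reset the potential after a spike
    if len(spikes) < 2:
        return None
    return (spikes[-1] - spikes[0]) / (len(spikes) - 1)


def focus_of_attention(what_map, where_map, **params):
    """Pick the location whose unit fires with the shortest interspike interval."""
    best, best_isi = None, float("inf")
    for loc, what in what_map.items():
        isi = interspike_interval(what, where_map.get(loc, 0.0), **params)
        if isi is not None and isi < best_isi:
            best, best_isi = loc, isi
    return best


# Toy inputs: (0, 0) has the stronger "what" signal, but only (3, 4)
# is supported by both pathways.
what_in = {(0, 0): 0.9, (3, 4): 0.8}
where_in = {(0, 0): 0.1, (3, 4): 0.8}
focus = focus_of_attention(what_in, where_in)
```

With these toy values the focus falls on the location where both pathways agree, even though the other location has the larger single-pathway input.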
Keywords/Search Tags: Visual attention, Visual saliency, Salient location, Focus of attention, Feature extraction