Key Technology Research Into Visual Perception Inspired Object Discovery

Posted on: 2016-07-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Ma
GTID: 1108330509954717
Subject: Aviation Aerospace Manufacturing Engineering
Abstract/Summary:
Object discovery aims to find the objects present in images by analyzing unlabeled data and searching for recurring patterns. It is one of the basic problems of computer vision research and has a wide range of civil and military applications, which has made it a very active research area. With the development of information technology, image collections keep growing in scale, and how to complete the object discovery task quickly and accurately on such large collections has become a focus of attention. How to quickly narrow down the search scope is therefore one of the important research questions in object discovery. In addition, because existing image description methods have difficulty accurately describing the regions of real interest in an image, the performance of existing object discovery techniques is limited.

This dissertation takes object discovery as its main topic. By studying the mechanism of human visual attention, it introduces the powerful human capabilities for image understanding and pattern recognition into object discovery research and proposes a new approach to object discovery. The main research work and innovations include:

1) This dissertation proposes a computational visual attention model based on real eye tracking data, which imitates human visual attention under free-viewing conditions. Concretely, a free-viewing eye tracking database was constructed. A framework for the computational visual attention model based on a Markov chain is then proposed, and the relationship between the real eye tracking data and the transition probabilities of the Markov chain is defined. A support vector regression model is trained on the real eye tracking data to predict the transition probabilities of the Markov chain from extracted image features. Finally, the saliency map of the image is obtained by estimating the equilibrium distribution of the Markov chain. Experiments verify that the proposed model can detect objects of interest, paving the way for the subsequent object discovery research.

2) A visual attention based visual vocabulary is proposed. First, affine invariant regions in the images are detected, and the saliency of these regions is estimated with the proposed computational visual attention model. The salient regions, which are also the ones that attract human attention, are selected to construct the visual vocabulary through vector quantization, and the non-salient regions are discarded. Based on the proposed vocabulary, category recognition is achieved using a Naive Bayes classifier and a support vector machine, respectively. The proposed method solves a problem of traditional visual vocabulary construction, namely that building a more effective vocabulary always requires more data, which in turn increases the computational cost. Experimental results show that the proposed visual attention based visual vocabulary improves the accuracy of category recognition.

3) A visual attention based Bag-of-visual-word image representation method is proposed. First, the saliency map of the image is obtained with the proposed computational visual attention model. Then, the visual words occurring in the image are weighted according to the saliency values at their locations, and the image is represented by these weighted visual words. The proposed method solves the poor performance of the traditional Bag-of-visual-word image representation, which treats all regions of an image equally and therefore cannot distinguish between target and background. Based on this representation, object discovery is achieved with the k-means algorithm and the probabilistic latent semantic analysis model, respectively. Experimental results show that the proposed method boosts the performance of object discovery.

4) A computational visual attention model is proposed that simulates human visual attention while people look for a specific category of objects, and this model is used to achieve visual perception inspired object discovery and localization. Through analysis of the human visual attention mechanism and real eye movement data, an object discovery method is proposed that directly simulates human visual attention during the search for a specific category of objects. First, an eye tracking database recording the eye movements of people looking for a specific category of objects was created. Then, to address the key problem that such a model has difficulty extracting the category features of objects, an algorithm based on the probabilistic latent semantic analysis model is proposed that can learn the generic categories of objects and their locations. Finally, a feedforward neural network is constructed, and the computational visual attention model that mimics human visual attention during the search for a specific category of objects is trained on this eye tracking database. Object discovery and localization are achieved with this model.
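
Two of the steps described above can be illustrated with a minimal sketch: reading saliency off the equilibrium distribution of a Markov chain over image regions, and weighting a Bag-of-visual-word histogram by saliency. It is not the dissertation's implementation. The transition matrix `P` is a hand-written toy (in the dissertation the transition probabilities are predicted by support vector regression from image features), power iteration is one assumed way of estimating the equilibrium distribution, and the function names, the word assignments in `word_ids`, and the vocabulary size are hypothetical.

```python
import numpy as np


def stationary_distribution(P, tol=1e-10, max_iter=10_000):
    """Estimate the equilibrium distribution of a row-stochastic matrix P
    by power iteration (an assumed solver, chosen for simplicity)."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)          # start from the uniform distribution
    for _ in range(max_iter):
        nxt = pi @ P                  # one step of the chain
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi


def saliency_weighted_bow(word_ids, saliency, vocab_size):
    """Histogram of visual words in which each occurrence counts with the
    saliency at its location instead of counting as 1."""
    hist = np.bincount(word_ids, weights=saliency, minlength=vocab_size)
    total = hist.sum()
    return hist / total if total > 0 else hist


# Toy chain over three image regions (hypothetical values, rows sum to 1).
P = np.array([
    [0.2, 0.3, 0.5],
    [0.1, 0.4, 0.5],
    [0.1, 0.2, 0.7],
])
pi = stationary_distribution(P)      # per-region saliency

# Hypothetical assignment of each region to a visual word (vocabulary of 4).
word_ids = np.array([2, 0, 2])
hist = saliency_weighted_bow(word_ids, pi, vocab_size=4)

print("saliency per region:", pi)
print("weighted histogram: ", hist)
```

In this toy example the third region receives most of the incoming transition probability, so it ends up with the largest saliency and its visual word dominates the weighted histogram, while an unweighted histogram would treat all three regions equally.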
Keywords/Search Tags:Object discovery, Computational visual attention model, Markov chain, Bag-of-visual-word, Saliency map