Sparse Coding Theory Of Visual Perception And Its Application

Posted on: 2007-06-15 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Q Y Li | Full Text: PDF
GTID: 1118360185954199 | Subject: Computer software and theory
Abstract/Summary:
Of the sensory information handled by the human brain, 80% is processed by the visual system. Over the past two decades, visual information processing mechanisms have been studied intensively in psychology, neuroscience, and computer science. It has long been hypothesized that the early visual system is adapted to the statistics of its input. Such an adaptation is thought to be the result of the combined forces of evolution and neural learning during development. How the biological visual system responds to input visual patterns, however, is still not well understood. From the viewpoint of information theory, Attneave, Barlow, and Olshausen & Field put forward sparse coding theory. They argued that coding input stimuli sparsely is an important constraint on neurons in the primary visual cortex (V1), because it lets those neurons encode as much information as possible with limited computing resources. Sparse coding theory establishes a quantitative link between the information processing mechanisms of visual neurons and the statistics of input visual stimuli, and provides an efficient tool for understanding neural information processing. It has therefore attracted increasing interest in neuroscience, neural networks, and artificial intelligence.

Based on the sparse coding model and theory, this thesis explores new efficient coding models and theories motivated by the information processing mechanisms of the human brain, and applies them to an image retrieval task. The main contributions of this dissertation are as follows:

First, a task-oriented sparse coding model. Recent results in psychology and neuroscience show that information processing by simple cells in V1 is not purely data-driven; it is also influenced by perception tasks. We therefore design a task-oriented sparse coding model (TOSC), based on the sparse coding model of Olshausen & Field. TOSC addresses the question of what information visual neurons should code, called the "what problem".
To improve the discriminability of the coding coefficient space, we add supervised information, in the form of a discriminant distance, to the cost function of the sparse coding model. Optimizing this new cost function yields the TOSC model. Experiments on a classification problem (distinguishing natural scenery images from building images) verify the effectiveness of TOSC.

Second, a two-layer feedback sparse coding model. Recent results demonstrate that information processing in the primary visual cortex is not just local feature extraction; on the contrary, it is dynamic, interactive, and plastic, and it is adjusted according to visual reasoning and perception tasks. This thesis extends the single-layer, feedforward sparse coding model and puts forward a two-layer feedback sparse coding model (TLF-SC) based on the multi-layer perceptron network. In the TLF-SC model, neuron responses are shaped not only by the sparse coding rule, which keeps the responses statistically independent, but also by the higher-layer perception task. Our simulation results show that ICL neurons in the TLF-SC model resemble simple cells in V1 while also adapting to the perception task. Furthermore, the TLF-SC model shows good classification performance.

Third, an attention-guided sparse coding model. We found that the percentage of activated simple cells in the sparse coding model is still high: more than 70% of the simple cells in a group responding to the same stimulus. However, the computing resources of the neural system are limited, and input stimuli are not equally important. We therefore integrate an attention mechanism with the sparse coding model and propose the attention-guided sparse coding model (AGSC). Our simulation experiments show that the AGSC model further improves sparseness, and that it filters out insignificant information while preserving the main information.

Fourth, an ICA coefficient texture feature.
Texture is the most popular low-level visual feature in content-based image retrieval (CBIR). This thesis puts forward an approach that learns statistical texture bases from a texture image set using the sparse coding model. These texture bases have the same response characteristics as Gabor filters, and their response coefficients show good statistical independence. Based on the coefficients of these bases, we propose a representation scheme and an extraction algorithm for the ICA coefficient texture feature. Our experiments on the Brodatz and VisTex sets show that the ICA coefficient texture feature has two characteristics: low redundancy across feature dimensions and high relevance to the image set. Consequently, it outperforms the Gabor texture feature for texture image retrieval.
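
The abstract does not give the cost functions of TOSC, TLF-SC, or AGSC, so the following is only a minimal sketch of the baseline idea they build on: representing an input patch as a sparse combination of basis functions. It assumes an L1 sparseness penalty and uses ISTA (iterative soft thresholding) for inference, both chosen here for simplicity rather than taken from the thesis. The dictionary `D` is random rather than learned, and the names `sparse_code`, `cost`, and `active_fraction` are hypothetical helpers.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal step for the L1 penalty)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def cost(x, D, a, lam):
    """Reconstruction error plus L1 sparseness penalty (assumed form)."""
    r = x - D @ a
    return 0.5 * float(r @ r) + lam * float(np.abs(a).sum())

def sparse_code(x, D, lam=0.1, n_iter=200):
    """Infer coefficients a minimizing cost(x, D, a, lam) with ISTA."""
    # Step size 1/L, where L is the squared largest singular value of D.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)          # gradient of the reconstruction term
        a = soft_threshold(a - grad / L, lam / L)
    return a

def active_fraction(a, tol=1e-8):
    """Fraction of coefficients that are non-zero (i.e. 'activated' units)."""
    return float(np.mean(np.abs(a) > tol))

# Illustrative data only: a random unit-norm dictionary and a synthetic patch.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)            # normalize each basis function
a_true = np.zeros(128)
a_true[rng.choice(128, size=5, replace=False)] = rng.standard_normal(5)
x = D @ a_true + 0.01 * rng.standard_normal(64)

lam = 0.1
a = sparse_code(x, D, lam=lam)
print("cost:", cost(x, D, a, lam))
print("active fraction:", active_fraction(a))
```

A helper like `active_fraction` is one way to quantify the share of activated units that the third contribution refers to. A supervised or attention-based variant would add further terms to `cost`, which this sketch leaves out because the abstract does not define them.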
Keywords/Search Tags: Sparse Coding, Efficient Coding Hypothesis, Perception Learning, Attention, Natural Image, Principal Component Analysis, Independent Component Analysis, Simple Cell, Content-based Image Retrieval, Texture Feature, Gabor Filter