Key Technologies And Applications Of Visual Data Recognition Based On Sparse Representation And Self-similarity

Posted on: 2016-01-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Sun
Full Text: PDF
GTID: 1108330503952917
Subject: Computer application technology
Abstract/Summary:
As information and multimedia technology develop rapidly, visual data such as digital images and videos are growing at an explosive rate, which has significantly influenced many aspects of people's social life. Because the amount of visual data on the Internet is so large, replacing humans with computers for the intelligent processing and understanding of these data has become an urgent need. Visual data recognition is the key component of intelligent vision systems. Although some progress has been made in research on visual data recognition, the resulting intelligent vision systems still have serious difficulty adapting to a wide range of environmental changes, such as illumination variation, viewpoint changes, and occlusions. Therefore, finding an effective way to represent visual data is the key to visual data recognition.

Sparse representation and texture description are two of the most popular approaches to visual data representation, and they are the main research subjects of this dissertation. The crucial assumption behind sparse coding for visual data representation is that visual data can be sparsely represented in a certain transform domain, an assumption partially supported by recent research on the mechanisms of the human visual system. Recently, sparse representation based learning algorithms have drawn much attention from the computer vision and pattern recognition community and have led to state-of-the-art results in a variety of visual data recognition tasks, but there is still plenty of room for improvement. In addition, because textures occur in many kinds of visual data, investigating effective texture descriptions is a fundamental research problem in visual data recognition. Despite many research achievements in texture description, current texture description methods still need improvement in discrimination and in robustness to a wide range of environmental changes.

The focus of this dissertation is on the development of methods for representing visual data effectively in the context of recognition, using sparse coding, local pattern encoding and lacunarity analysis. The main contributions of this dissertation are as follows:

1. A novel reweighted l2,1 minimization based dictionary learning approach, called Reweighted l2,1 minimization based dictionary learning for Structured Sparse Coding (RL21-SSC), is proposed for structured sparse coding. Instead of a traditional l2 regularization, the RL21-SSC model uses a reweighted one to exploit class-specific structured sparsity patterns, which reduces the bias on large coefficients. In addition, the RL21-SSC method can detect the subspace of the data from each class spanned by atoms of the dictionary, which enforces distinct structured sparsity patterns on the sparse codes of samples from different classes. Experimental results on face recognition, scene classification and action recognition demonstrate the competitive performance of the RL21-SSC method in comparison with recent dictionary learning methods.

2. By adapting subspace ensemble learning to supervised sparse coding, an effective supervised dictionary learning model, called Ensemble Classifier based Dictionary Learning (ECDL/EasyDL), is proposed, which unifies compact dictionary learning and ensemble classifier training. An efficient numerical scheme is developed to solve the EasyDL model, in which the dictionary and classifiers are updated simultaneously instead of being learned sequentially. The resulting sparse codes have strong discrimination for recognition and depend only weakly on the peculiarities of the training data. Experiments on a variety of image recognition tasks show the improvement of the EasyDL method over several state-of-the-art approaches.

3. A novel discriminative structured sparse coding method, called Collaborative HIerarchicaL Discriminative Dictionary Learning (CHILD-DL), is proposed, which simultaneously learns a structured dictionary with hierarchical group sparsity and trains a linear classifier. Because the CHILD-DL model introduces a within-class collaborative hierarchical sparsity promoting functional, the coded samples from the same category are encouraged to share the same sparsity pattern at the group level but not necessarily at the singleton level. Joint dictionary and classifier learning further strengthens the discriminability of the sparse codes. Experimental results on face recognition, object recognition and scene classification demonstrate the effectiveness of the CHILD-DL method in comparison with existing discriminative dictionary learning approaches.

4. Based on the concept of lacunarity in fractal geometry, a static texture descriptor, called Pattern Lacunarity Spectrum (PLS), is proposed, which characterizes the self-similar behavior of the spatial distributions of multi-scale local binary patterns using lacunarity analysis. The PLS descriptor was applied to texture classification, and the experimental results on four benchmark static texture datasets demonstrate its excellent performance.

5. A novel lacunarity analysis based dynamic texture descriptor, called Space-Time Pattern Lacunarity Spectrum (ST-PLS), is proposed, which characterizes dynamic texture by describing the irregularities of the spatial and temporal distributions of local space-time patterns extracted by effective local pattern encoding schemes. The experimental results on several benchmark datasets demonstrate the power of the ST-PLS descriptor in comparison with existing ones.
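
As a rough illustration of the lacunarity analysis behind the PLS and ST-PLS descriptors, the sketch below computes the standard gliding-box lacunarity of a binary map, i.e. the second moment of the box mass divided by the squared first moment, and stacks the values over several pattern codes and box sizes. It is not the dissertation's implementation: the function names (`box_masses`, `lacunarity`, `lacunarity_spectrum`), the assumption that the local patterns are already available as a 2-D integer code map, and the choice of box sizes are all hypothetical and made only for illustration.

```python
import numpy as np


def box_masses(binary_map, r):
    """Number of ones in every r x r gliding box (stride 1), 1 <= r <= min(H, W)."""
    m = np.asarray(binary_map, dtype=np.float64)
    # Integral image with a zero border so box sums are four lookups.
    ii = np.zeros((m.shape[0] + 1, m.shape[1] + 1))
    ii[1:, 1:] = m.cumsum(axis=0).cumsum(axis=1)
    return ii[r:, r:] - ii[:-r, r:] - ii[r:, :-r] + ii[:-r, :-r]


def lacunarity(binary_map, r):
    """Gliding-box lacunarity: E[S^2] / (E[S])^2 for box mass S."""
    s = box_masses(binary_map, r)
    mean = s.mean()
    if mean == 0:
        return float("nan")  # undefined for an empty map
    return float((s ** 2).mean() / mean ** 2)


def lacunarity_spectrum(code_map, codes, box_sizes):
    """Assumed layout: one row per pattern code, one column per box size."""
    code_map = np.asarray(code_map)
    return np.array(
        [[lacunarity(code_map == c, r) for r in box_sizes] for c in codes]
    )
```

A perfectly uniform map has lacunarity 1 at every box size, while sparse or clumped pattern distributions give larger values, which is what makes the quantity usable as a texture feature. For video, the same ratio could in principle be computed over space-time boxes, although how the ST-PLS descriptor does this is not specified in the abstract.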
Keywords/Search Tags: visual data recognition, sparse representation, dictionary learning, structured sparsity, supervised learning, texture description, lacunarity analysis