Research On Some Key Techniques Of Image Semantic Understanding

Posted on: 2017-01-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Zhao
GTID: 1108330503957536
Subject: Electronic Science and Technology
Abstract/Summary:
Human visual perception and recognition of images is a process that moves from the image to abstraction, guided by a large amount of prior knowledge. Correspondingly, research on image understanding can be divided into three levels: low-level, middle-level and high-level understanding. This thesis studies the main technical challenge at each of the three levels, which together form three key techniques of image semantic understanding: effective representation of image visual features, reasonable segmentation of multiple regions, and enlargement of the contextual information used in image annotation. New data models and forms of representation are proposed, and new algorithms are designed for each key technique. Specifically, the major contributions of the thesis are as follows:

(1) To address the problem of describing image content with visual features, a spatial fuzzy link color histogram (SFLCH) that takes human visual perception into account is proposed. SFLCH captures both the color distribution and the spatial information of pixels to characterize a color image. The method is simple to implement and insensitive to image distortion. Moreover, the concept of “color complexity” is defined as the degree of pixel color variation in local image regions, so that the influence of human visual perception is incorporated into image classification and recognition. A weighted similarity measure based on this visual complexity is constructed to improve the effectiveness of the image visual features and obtain better image classification results.

(2) To address the sensitivity of the FCM algorithm to initial conditions and noise, a spatial FCM algorithm with an initialization scheme based on salient region colors is presented.
The pixels closest to the salient colors in the image are selected as the initial cluster centers, and the degree of color difference between adjacent pixels is defined to avoid selecting noise as salient colors. This improves the accuracy of the clustering by reducing the sensitivity of FCM to initial conditions and noise. Experiments on a simple color data set show that the representative region colors found agree with actual visual perception and that every region contains an initial cluster center. A fuzzy factor is defined to incorporate local spatial information. It varies with the spatial location of each pixel, so the integration of spatial information is self-adaptive, which improves the robustness of the algorithm and the segmentation accuracy.

(3) To improve the stability of the segmented regions and achieve automatic image segmentation, a multi-region image segmentation method based on unsupervised graph cuts is proposed. The original image data is implicitly mapped into a high-dimensional feature space by a nonlinear kernel in the data term, which improves the region segmentation and reduces the computational cost. At the same time, a gradient constraint is introduced into the smoothness term to reduce over-segmentation. The multi-label energy function is minimized iteratively using a multi-label switching algorithm, so that the labeling of each pixel is optimally configured. The proposed method produces segmentation results that accord with human visual perception. No seed pixels need to be set in advance for the segmented regions; multi-region segmentation is carried out automatically.
The resulting segmentations are suitable for semantic understanding.

(4) Without contextual information, pixel labeling can produce inconsistent semantic annotations. To make full use of contextual information in object recognition and scene understanding, a multi-granular context conditional random field (MGCCRF) model is presented that combines contextual information at a variety of scales. It is implemented efficiently by extending the pairwise clique to multi-granular context windows. In the fine-granular context window, label consistency among similar features is obtained from the probability of label transfer between two adjacent pixels. At the same time, the spatial relationships among different classes are described explicitly in the coarse-granular context window. To train the MGCCRF model, a piecewise training method with a bound optimization algorithm is designed to improve performance. The proposed model is competitive and effective in both quantitative and qualitative labeling performance.

The innovations of this thesis are as follows:

(1) SFLCH is proposed. It combines the fuzzy extraction of color variables with spatial information describing the distribution of pixel colors in different regions, and a similarity measure based on visual complexity is constructed. It improves the effectiveness of the visual features.

(2) A spatial FCM algorithm with an initialization scheme based on salient region colors is presented. With the improved “initialization” operator, the algorithm avoids selecting noise as salient colors and incorporates adaptive spatial information into the clustering computation to improve the accuracy of image segmentation.

(3) An unsupervised graph cuts method is presented. The data term is nonlinearly mapped by a kernel function, which improves the effectiveness of the segmentation.
A gradient constraint is introduced into the smoothness term to reduce over-segmentation. In addition, the initial parameters are set by an unsupervised method without user interaction, achieving automatic image segmentation.

(4) MGCCRF is proposed. It applies the granulation idea of granular computing to expand the windows in which contextual information is captured. As a result, the accuracy of pixel labeling and the ability to recognize objects are improved. A national invention patent application has been filed for this method: “Pixel Semantic Labeling with Multi-granular Contextual Information” (Application number: 201510430264.5).
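
As background for contribution (2), the following is a minimal sketch of standard fuzzy c-means (FCM), the baseline that the spatial FCM algorithm extends. It is not the thesis's method: the salient-color initialization and the adaptive spatial fuzzy factor are omitted, and the function name `fcm`, its parameters and the sample colors are assumptions made for illustration. The initial centers are passed in explicitly, which is the step the thesis replaces with its salient-color scheme.

```python
import math

def fcm(points, centers, m=2.0, iters=50, eps=1e-9):
    """Plain fuzzy c-means on a list of equal-length feature tuples.

    points  : feature tuples, e.g. RGB colors
    centers : initial cluster centers supplied by the caller
    m       : fuzzifier, must be > 1
    Returns (centers, u), where u[i][k] is the membership of point i in
    cluster k as computed in the last iteration.
    """
    centers = [tuple(float(v) for v in c) for c in centers]
    dim = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        u = []
        for p in points:
            d = [max(math.dist(p, c), eps) for c in centers]
            w = [x ** (-2.0 / (m - 1.0)) for x in d]
            s = sum(w)
            u.append([x / s for x in w])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for k in range(len(centers)):
            wk = [row[k] ** m for row in u]
            total = sum(wk)
            new_centers.append(tuple(
                sum(w * p[j] for w, p in zip(wk, points)) / total
                for j in range(dim)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < 1e-6:  # stop once the centers no longer move
            break
    return centers, u

# Hypothetical toy data: two reddish and two bluish pixels.
colors = [(250, 10, 10), (240, 20, 15), (10, 10, 245), (20, 5, 250)]
centers, u = fcm(colors, centers=[colors[0], colors[2]])
# Hard labels are obtained by taking the cluster with the largest membership.
labels = [row.index(max(row)) for row in u]
```

Because plain FCM treats every pixel independently, a noisy pixel or a poorly chosen initial center can shift the result, which is the sensitivity that the initialization scheme and the spatial fuzzy factor described above are meant to reduce.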
Keywords/Search Tags: Image Semantic Understanding, Fuzzy logic, Spatial information, Vision perception, Region segmentation, Contextual information, Pixel labeling