Semantic segmentation aims to identify the semantic classes of objects in an image and to locate their boundaries. It is one of the fundamental problems in computer vision. Despite decades of research, semantic segmentation systems still do not perform as well as expected on complex natural scenes. As a pixel-level classification problem, semantic segmentation seeks regions that are visually homogeneous and semantically consistent, and research continues to address three difficulties: identifying semantic classes, segmenting complex objects, and localizing boundaries. The goal of this dissertation is to propose algorithms that address these difficulties at their root, improve segmentation accuracy, and make the segmentation system more adaptable to complex scenes. The proposed semantic segmentation system is applied to content-based semantic image retrieval, where it yields high-accuracy retrieval results that agree more closely with human understanding. The main contributions of this dissertation are as follows:

1. Within the framework of traditional machine learning, an image segmentation method based on unreliable depth is proposed. First, the color image is pre-segmented using the mean shift algorithm with an adaptive color bandwidth. Then, the probability boundaries of the color image are fused with the depth probability boundaries to estimate reliable probability boundaries. Finally, the pre-segmentation result is corrected using the reliable probability boundaries. The method uses depth information effectively, reduces the over-segmentation caused by color changes, and makes it possible to separate objects of similar color that occlude each other.

2. Using deep learning as a tool, several semantic segmentation algorithms based on deep neural networks are proposed. First, from the viewpoint of making effective use of image context, a deep
convolutional Markov random field method using depth is proposed. This model establishes long-range contextual dependencies among color, position, and depth, which improves semantic label compatibility and the continuity of predicted objects. Second, combining the advantages of traditional methods and deep learning, an RGB-D based regularized fully convolutional neural network is proposed. Because shallow hand-crafted features are used in place of features extracted by deep neural networks, redundancy in the information representation is reduced. The method effectively reduces the number of layers in the semantic segmentation network and improves both the efficiency and the accuracy of semantic segmentation. Finally, to localize the boundaries of semantic objects accurately, a widening residual refine edge reserved neural network for semantic segmentation is proposed. The model uses a wide residual cross-layer structure to integrate low-level structural features with high-level semantic features. The residual refine feature pyramid fuses multi-resolution features and enhances the segmentation of multi-scale objects.

In a deep learning based semantic segmentation system, the extracted features represent not only the visual content but also the semantics. Applying them to content-based image retrieval can narrow the semantic gap. As an application of the semantic segmentation system proposed in this dissertation, semantic features are combined with a simple cosine similarity to implement content-based semantic image retrieval. To improve the efficiency and accuracy of retrieval, a two-step retrieval strategy is designed and implemented. First, a hash-like layer is added to the semantic segmentation network, and a binary hash code of the image is obtained by applying an appropriate threshold. A coarse search can then be performed using the
Hamming distance to obtain a candidate subset. Second, within this subset, the features extracted by the semantic segmentation network are used for fine retrieval, so that content-based image retrieval is both efficient and accurate.
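
The two-step retrieval strategy can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names (`binarize`, `hamming`, `two_step_retrieve`), the threshold of 0.5, the Hamming radius, and the toy vectors standing in for hash-layer activations and network features are all assumptions made for illustration.

```python
import math


def binarize(activations, threshold=0.5):
    """Threshold hash-layer activations into a binary code (threshold is assumed)."""
    return tuple(1 if a > threshold else 0 for a in activations)


def hamming(code_a, code_b):
    """Number of bit positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def two_step_retrieve(query, database, max_hamming=1, top_k=2, threshold=0.5):
    """Coarse search by Hamming distance, then fine ranking by cosine similarity."""
    query_code = binarize(query["hash"], threshold)

    # Step 1 (coarse): keep only images whose binary code is close to the query's.
    candidates = [
        item for item in database
        if hamming(query_code, binarize(item["hash"], threshold)) <= max_hamming
    ]

    # Step 2 (fine): rank the surviving candidates by feature similarity.
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(query["feat"], item["feat"]),
        reverse=True,
    )
    return [item["id"] for item in ranked[:top_k]]


# Toy data: "hash" mimics hash-layer activations, "feat" mimics semantic features.
database = [
    {"id": "a", "hash": [0.9, 0.1, 0.8, 0.2], "feat": [1.0, 0.0, 0.0]},
    {"id": "b", "hash": [0.8, 0.2, 0.7, 0.6], "feat": [0.6, 0.8, 0.0]},
    {"id": "c", "hash": [0.1, 0.9, 0.2, 0.8], "feat": [1.0, 0.0, 0.0]},
]
query = {"hash": [0.7, 0.3, 0.9, 0.1], "feat": [0.9, 0.1, 0.0]}

result = two_step_retrieve(query, database)
print(result)
```

In this toy example, image "c" is discarded in the coarse step because its code differs from the query's in every bit, so the more expensive cosine comparison runs only on "a" and "b". That reduction in comparisons is the efficiency argument behind the two-step design.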