
Visual Cognition Mechanism-based Semantics Acquisition In Natural Image

Posted on: 2017-04-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B F Nan
Full Text: PDF
GTID: 1108330485950017
Subject: Control Science and Engineering
Abstract/Summary:
Obtaining the semantics of images is necessary for simulating the visual cognition mechanism of humans and other higher organisms, and for enabling computers to perceive, recognize and understand image scenes. Early in the process of visual organization and perception, human vision physically locates the local regions that become the objects in the final description of the scene. These local regions carry certain semantic information, such as the contours of an object or region and local detail. The visual perception system then automatically focuses on certain regions based on region contrast, that is, the visual difference in shape or local region features among the regions of a cluttered scene. Finally, the cognition system obtains the semantics of the whole scene, centred on the salient object or region.

Therefore, this thesis starts from the lowest level of pixel visual content and uses superpixel segmentation to extract local regions that adhere well to object or region boundaries. It then builds a saliency detection model that combines local region features to detect the salient object or region in the image. Finally, with the salient object or region and the related information as prior knowledge, it builds an automatic image annotation model using deep neural networks to produce semantic descriptions as natural and fluent as a human's. The main contributions of this thesis are as follows:

1) For local region extraction, a new superpixel segmentation method, SLICO-t, is proposed. It is based on SLICO and fuses texture features that reflect the natural boundary and outer contour of the object or region into the segmentation process. In addition, a search strategy that searches a circular area around the seed pixel is adopted, so that the segmented superpixels more closely approximate the local region or the outer contour of the object, while segmentation remains relatively fast and yields superpixels of regular size and shape. Experiments on the public BSDS500 database quantitatively compare superpixel segmentations of given sizes, and the results show that the proposed SLICO-t method is superior to SLICO in Boundary Recall, by about 8% or 9%.

2) To build a saliency detection model that combines global and local region information, a Local Texture-based Region Sparse Histogram model is first proposed to describe local region features. It integrates the advantages of local region texture patterns, color features and shape information. The saliency detection method built on this model separates the detected salient object or region from the background scene clearly and completely, and the salient object or region retains a relatively complete outer contour, shape features and local detail texture information. Quantitative comparison and analysis on public databases show that the proposed method is superior to five other state-of-the-art saliency detection methods on measures such as Precision, F-measure and Mean Absolute Error (MAE).

3) For automatic image annotation and semantic content acquisition, a double-mapping image semantic annotation method is proposed that uses saliency features as prior knowledge. In the first stage, saliency visual features serve as prior knowledge for perceiving the salient object or region. Then, on the basis of the perceived salient object or region, local region features from the whole image are reused for a further mapping. In this double-mapping process, two kinds of visual features are used for training and learning; it is a self-learning process based on neural networks. In encoding the image and the semantic information of the description, the order-preserving mapping method, which has proven successful, is used, so that the latent relationship between image and semantic description is revealed relatively accurately. Training, validation and testing on three public databases (Flickr8k, Flickr30k and MSCOCO) show quantitatively that the proposed method outperforms currently published methods in Recall@K (K = 1, 5, 10). The obtained semantic content is also more in accordance with the human visual cognition mechanism, with natural and fluent descriptions.

In addition, the research results will benefit image region extraction, image segmentation and wider related areas such as image understanding.
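
For readers unfamiliar with the evaluation measures named in the abstract, the following Python sketch shows how MAE, F-measure and Recall@K are commonly computed. It is not taken from the thesis: the function names, the flat-list input layout, the binarization threshold of 0.5 and the weight beta_sq = 0.3 (a value often used in the saliency literature) are all assumptions made for illustration.

```python
from typing import List, Sequence, Set


def mean_absolute_error(saliency: Sequence[float],
                        ground_truth: Sequence[float]) -> float:
    """Average per-pixel absolute difference between a saliency map and a
    ground-truth mask, both given as flat sequences of values in [0, 1]."""
    if not saliency or len(saliency) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(s - g) for s, g in zip(saliency, ground_truth))
    return total / len(saliency)


def f_measure(saliency: Sequence[float],
              ground_truth: Sequence[float],
              threshold: float = 0.5,   # assumed binarization threshold
              beta_sq: float = 0.3) -> float:  # assumed precision weight
    """Weighted harmonic mean of precision and recall after thresholding
    the saliency map into a binary foreground mask."""
    predicted = [s >= threshold for s in saliency]
    actual = [g >= 0.5 for g in ground_truth]
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    if true_pos == 0:
        return 0.0  # no overlap: precision or recall is zero or undefined
    precision = true_pos / sum(predicted)
    recall = true_pos / sum(actual)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def recall_at_k(rankings: Sequence[List[int]],
                correct: Sequence[Set[int]],
                k: int) -> float:
    """Fraction of queries for which at least one correct item appears in
    the top-k of that query's ranking (best candidate first)."""
    if not rankings or len(rankings) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for ranking, truth in zip(rankings, correct)
        if any(item in truth for item in ranking[:k])
    )
    return hits / len(rankings)


if __name__ == "__main__":
    # Toy data, invented for this example only.
    sal = [0.9, 0.8, 0.2, 0.1]
    gt = [1.0, 1.0, 0.0, 0.0]
    print("MAE:", mean_absolute_error(sal, gt))
    print("F-measure:", f_measure(sal, gt))
    ranks = [[3, 1, 2], [2, 3, 1]]
    truth = [{1}, {1}]
    for k in (1, 2, 3):
        print(f"Recall@{k}:", recall_at_k(ranks, truth, k))
```

Lower MAE is better, while higher F-measure and Recall@K are better. In image-caption retrieval, Recall@K is typically reported for both directions (image to sentence and sentence to image); the sketch handles one direction at a time.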
Keywords/Search Tags: superpixel segmentation, saliency detection, deep learning, caption generation