Research On Image Annotation Based On Fusion Of Multi-modal Neighbor Relations

Posted on: 2019-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: Q Ji
GTID: 2438330551960785
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the growing popularity of digital cameras and the rapid development of the Internet industry, images and other multimedia data are growing explosively. Compared with text, images and other multimedia are more vivid and easier to understand. Visual images are also applied very widely, in fields such as medical treatment, education, multimedia, and the military. The explosive growth in the number of images has brought new prospects and new challenges for research and applications based on visual images. Although this mass of image data is a great convenience, it also raises many problems: it is very difficult for people to find the information they need in such a huge collection. How to use massive image data efficiently, so as to meet users' needs accurately and quickly, has therefore become one of the hot topics in computer vision in recent years, and image annotation is one of the key technologies for solving this problem. Building on existing annotation algorithms, this thesis presents three novel image annotation algorithms and designs and implements an interactive image annotation system. The main contents are as follows:

(1) A Visual-Semantic Nearest Neighbors (VS-KNN) method is proposed, based on multi-modal fusion that jointly exploits visual and semantic information for image annotation. Because traditional image annotation algorithms ignore the semantic similarity between images, we fuse the visual similarity and the semantic similarity of images during the training phase. This yields more comprehensive image information and effectively improves annotation performance.

(2) A 2PKNN method based on group sparse reconstruction, named 2PKNN-GSR, is proposed. Analysis of traditional image annotation algorithms shows that the predicted labels are incomplete, insufficient, and noisy as a description of the full semantic content of an image. To address these problems, the thesis uses group sparse reconstruction to improve annotation performance.

(3) An image annotation method based on multi-modal fusion and group sparse reconstruction, named VS-2PKNN-GSR, is proposed. First, the method obtains the correlation matrix between images and labels using VS-KNN. Then it optimizes the correlation matrix using the group sparse reconstruction of 2PKNN-GSR. Finally, because the labels are themselves related to one another, the correlation is further optimized using a sparse method. In this way, the performance of the annotation algorithm is improved effectively.

(4) To make the effect of the above research more intuitive, an interactive image annotation system is designed and implemented. Users can further screen the labels of a test image according to their own understanding, which gives good interactivity between the system and its users.
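The abstract does not state how VS-KNN combines the two similarities, so the following is only an illustrative sketch of the general idea in item (1), not the thesis's algorithm. It assumes cosine similarity on feature vectors as the visual similarity, Jaccard overlap of label sets as the semantic similarity, a linear fusion weight `alpha`, and a one-step spreading of each visual neighbor's weight through the fused training-set similarity. All function names and the toy data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Label-set overlap, used here as a stand-in for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def fused_similarity(feats, labels, alpha=0.5):
    """Training phase (assumed form): alpha * visual + (1 - alpha) * semantic."""
    n = len(feats)
    return [
        [alpha * cosine(feats[i], feats[j]) + (1 - alpha) * jaccard(labels[i], labels[j])
         for j in range(n)]
        for i in range(n)
    ]

def annotate(test_feat, feats, labels, fused, k=2):
    """Rank candidate labels for one test image, highest score first."""
    # Step 1: the k visually nearest training images.
    visual = [cosine(test_feat, f) for f in feats]
    nearest = sorted(range(len(feats)), key=lambda i: visual[i], reverse=True)[:k]
    # Step 2: spread each neighbor's visual weight to training images that are
    # related to it under the fused (visual + semantic) similarity.
    weights = [sum(visual[i] * fused[i][j] for i in nearest) for j in range(len(feats))]
    # Step 3: every training image votes for its own labels with its weight.
    scores = {}
    for j, w in enumerate(weights):
        for label in labels[j]:
            scores[label] = scores.get(label, 0.0) + w
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data: two-dimensional "features" and hand-written label sets.
train_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
train_labels = [{"sky", "sea"}, {"sky", "beach"}, {"car"}]
fused = fused_similarity(train_feats, train_labels, alpha=0.5)
ranking = annotate([1.0, 0.05], train_feats, train_labels, fused, k=2)
print(ranking)
```

In this sketch, `alpha` controls how much the training-set neighborhood is shaped by appearance versus shared labels. Setting `alpha=1.0` reduces the fused similarity to a purely visual one.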
Keywords/Search Tags:K-nearest neighbor, visual similarity, semantic similarity, sparse reconstruction, image annotation