With the rapid development of mobile Internet technology, more and more image information is stored in digital form on the Internet, and images have become an important carrier of network information after text. Hundreds of millions of images are now uploaded to the network every day. Faced with such a huge amount of image data, how to retrieve the required image resources quickly and accurately is an important and practical research subject. Automatic image annotation is the core of text-based image retrieval: in essence, it uses a labeled image dataset to automatically learn a model of the mapping between the semantic concept space and the visual feature space, and then uses this model to label new images. To address the limitations of traditional visual features, and building on the achievements of deep convolutional neural networks in image processing and on the availability of Internet image data, this dissertation studies how to exploit the feature learning ability of deep convolutional neural networks for image annotation. It focuses on three aspects: single-label annotation, multi-label annotation, and multi-feature fusion annotation. The main work is summarized as follows:

(1) To address the shortage of labeled images in specific application domains, a deep convolutional feature learning method based on transfer learning is proposed, which makes use of image datasets from related fields. The method targets the tendency of deep convolutional neural networks to overfit when the image dataset is small and the sample size is limited. Following the transfer learning approach, we first pre-train the deep network on a large-scale image dataset to learn low-level visual features; the target dataset is then used to fine-tune the network parameters and to learn the mid-level and high-level visual features of the images. The experimental results show that the proposed method makes it
possible to apply deep learning to small-scale image datasets and effectively improves image annotation performance.

(2) To address the problem that categories with high similarity are prone to misclassification, a two-level hierarchical feature learning method is proposed, based on the ideas of transfer learning and fine-grained classification. To minimize the number of misclassified samples, image categories with high similarity are first grouped into the same subset according to their general features; the features that distinguish similar image categories are then extracted using the feature learning ability of the deep convolutional neural network. Finally, we propose an image annotation method based on this two-level hierarchical feature learning, which effectively improves the accuracy of image annotation.

(3) To address the problem that, for multi-label images, global features are difficult to extract and have insufficient representation ability, we propose a multi-label image ranking method based on deep convolutional features, obtained by modifying the loss function of the network. To extend the feature learning ability to the multi-label image annotation task, we use a multinomial logistic loss function to adapt the network to multi-label image data and retrain it. Finally, we rank the multiple labels of an image based on the extracted deep convolutional features, so that the semantic information of the image is described more completely.

(4) To address the problem of how to make full use of multi-source heterogeneous image features in the context of image big data, a multi-feature fusion image annotation method based on multiple kernel learning is proposed. Besides the image itself, an image resource also includes information such as shooting time, location, latitude and longitude, altitude, and the surrounding environment. For the image semantic annotation task, we transform the descriptive information related to the image into one of the basic
features of the image and fuse it with the deep convolutional features through multiple kernel learning. The experimental results show that the proposed method reflects the semantic information of the image more fully and accurately.
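
The fusion step in (4) can be illustrated with a minimal sketch of the building block of multiple kernel learning: a convex combination of base kernels, one per feature source. This is not the dissertation's implementation. The function names, the toy feature values, the RBF kernel choice, and the fixed weights 0.7 and 0.3 are all assumptions made for illustration; in actual multiple kernel learning the weights would be learned jointly with the classifier rather than set by hand.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(features, gamma):
    """Kernel (Gram) matrix for a list of feature vectors."""
    return [[rbf_kernel(x, y, gamma) for y in features] for x in features]


def combine_kernels(kernels, weights):
    """Convex combination sum_m beta_m * K_m of same-sized kernel matrices."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for k, w in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three images, each with a (very short) deep
# convolutional feature vector and a metadata vector, e.g. normalized
# shooting time and location. The values are made up for illustration.
deep_features = [[0.9, 0.1, 0.3], [0.8, 0.2, 0.4], [0.1, 0.9, 0.7]]
meta_features = [[0.50, 0.20], [0.52, 0.21], [0.10, 0.90]]

k_deep = gram_matrix(deep_features, gamma=1.0)
k_meta = gram_matrix(meta_features, gamma=1.0)

# Assumed fixed weights; a real MKL solver would learn these.
fused = combine_kernels([k_deep, k_meta], weights=[0.7, 0.3])
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, a fused matrix of this kind could be passed to any kernel classifier that accepts a precomputed kernel, for example a support vector machine.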