| Content-based video annotation enables more efficient and intelligent search and classification of user videos, so it has become an important research topic in computer vision. However, traditional machine learning requires considerable manpower and material resources to collect and annotate enough video samples to train a good model for video content annotation. This thesis therefore uses transfer learning to transfer knowledge from the Internet image domain to the video domain in order to annotate video events. The thesis first transfers knowledge from a single image source domain to the video target domain and demonstrates the robustness and stability of the transfer algorithm. Then, because a single source domain provides only one-sided knowledge, the single-source transfer method is extended to multiple source domains. Finally, the annotation task on the video target domain is completed. The main research work of this thesis is as follows: (1) To address the heterogeneity between the image domain and the video domain, this thesis proposes a heterogeneous space feature mapping model, through which the features of the image domain and the video domain can be projected into a common feature space. To further reduce the difference between the mapped features, this thesis proposes a subspace alignment model that aligns the image space to the video space and thereby implements single-source-domain knowledge transfer. A probabilistic interpretation shows that the model is robust to a certain amount of noise in the data, and an upper bound for the model is derived to prove its stability. (2) Based on the two models in (1), Heterogeneous Compound Transfer Learning (HCTL) is proposed. This method uses a compound transfer strategy, mapping first and then aligning, to annotate video content. Experimental results show that the average annotation accuracy of HCTL on the Kodak database reaches 35.89%, which is 22.78%, 16.64%, 14.74%, 12.90% and 7.68% higher than that of the DASVM, DAM, DSM, MDA-HS and DCA methods, respectively. On the CCV database it reaches 22.92%, a relative increase of 42.36%, 30.38%, 24.84%, 27.69% and 19.38%, respectively. (3) To address the one-sided knowledge of a single source domain, this thesis proposes a multi-weight multi-source transfer learning method (MW-MSTL) that builds on the single-source-domain knowledge transfer method. The method assigns different weights to different source image groups according to their degree of correlation with the target-domain videos. Finally, based on the smoothness assumption, a target classifier that can classify user videos is trained. Experimental results show that the average annotation accuracy of MW-MSTL on the Kodak database reaches 45.64%, which is 12.66%, 11.24%, 8.54% and 5.97% higher than that of the CP-MDA, DAM, DSM and MDA-HS methods, respectively. On the CCV database it reaches 39.77%, a relative increase of 9.86%, 13.47%, 6.71% and 10.11%, respectively. |
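
The abstract does not give the formulation of the alignment or weighting steps, so the following is only a minimal sketch of the general idea and not the thesis's own models. It assumes the image and video features have already been mapped into a common space of the same dimension, substitutes the classic PCA-based subspace alignment for the alignment step, and uses a hypothetical softmax over the distance between domain means as the source weights. The names `pca_basis`, `subspace_align` and `source_weights` are illustrative.

```python
import numpy as np


def pca_basis(X, k):
    """Return the top-k principal directions of X as a (d, k) matrix."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T


def subspace_align(Xs, Xt, k):
    """Align the source (image) subspace to the target (video) subspace.

    Assumption: Xs and Xt already live in a common d-dimensional space.
    """
    Ps = pca_basis(Xs, k)  # source basis, shape (d, k)
    Pt = pca_basis(Xt, k)  # target basis, shape (d, k)
    M = Ps.T @ Pt          # (k, k) matrix rotating the source basis toward the target
    Zs = (Xs - Xs.mean(axis=0)) @ Ps @ M
    Zt = (Xt - Xt.mean(axis=0)) @ Pt
    return Zs, Zt


def source_weights(sources, Xt):
    """Hypothetical weighting: softmax over negative squared mean distance.

    Source groups whose mean is closer to the target mean get larger weights.
    """
    mu_t = Xt.mean(axis=0)
    d = np.array([np.sum((Xs.mean(axis=0) - mu_t) ** 2) for Xs in sources])
    w = np.exp(-(d - d.min()))  # subtract the minimum for numerical stability
    return w / w.sum()
```

In a pipeline of this shape, a classifier would be trained on the aligned source features `Zs` and applied to `Zt`, and with several source groups the per-source predictions could be combined using the weights. How the thesis actually measures correlation between a source image group and the target videos is not stated in the abstract.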