| An image set is one of the important carriers of multimedia data: a collection of images belonging to the same category, such as a video clip. Image set classification is a fundamental research area in computer vision and pattern recognition. As a significant information processing technology in the domain of image and video understanding, image set classification has a wide range of applications in artificial intelligence scenarios such as smart cities. Image set-based classification algorithms aim to perform representation learning on the data structure and semantic information of image sets, so as to correctly determine the category to which each set belongs. Over decades of development, the research scope of image set classification has been successively enriched in terms of theory, methods, and datasets, ensuring the continued vitality of the field. Although an image set covers larger variations in object appearance than a single static image, caused by changes in pose, viewpoint, illumination, motion, and background, how to reasonably encode this variability and how to effectively measure the similarity between image sets have become two major issues in the research community. In recent years, with the high performance achieved by Riemannian geometry-based methods in the nonlinear representation of visual data, research on image set classification based on Riemannian manifold learning has attracted widespread attention. Specifically, the integration of discriminant analysis theories (e.g., metric learning) with Riemannian shallow learning methods leads to improved classification accuracy. On the other hand, the generalization of the conventional neural network paradigm to the context of Riemannian manifolds provides more effective semantic information, further improving the accuracy and robustness of image set classification. In this dissertation, the author analyzes the deficiencies of the existing Riemannian learning
methods and proposes effective solutions from the aspects of data modeling, network architecture, and objective function. The main contributions include: (1) An image set classification algorithm based on joint characterization by multiple Riemannian manifolds and multi-kernel metric learning is proposed. Considering the complementarity of the statistics contained in different types of Riemannian manifolds, the image set data is modeled from the perspective of multi-manifold joint representation, which provides more comprehensive information than a single geometric model. Then, an attention mechanism-guided multi-kernel metric learning framework is proposed to fuse the extracted structured features into a unified subspace. This design not only alleviates the distortion of structural information during data transformation, but also enhances the discriminability of the generated low-dimensional representations, thus improving the classification accuracy. (2) An image set classification algorithm based on a lightweight symmetric positive definite (SPD) manifold neural network is proposed. Motivated by the limited representational capacity and time-consuming iterative optimization of Riemannian shallow learning methods, a lightweight Riemannian neural network is constructed on the SPD manifolds. With the aid of the designed Riemannian feature rectification layer, pooling layer, and the kernel discriminant analysis (KDA) algorithm [1], multi-stage nonlinear learning and discriminative classification of SPD matrices can be realized. In addition, the unsupervised weight optimization mechanism based on the two-directional two-dimensional principal component analysis ((2D)²PCA) technique [2] simplifies the network design and improves the computational efficiency. (3) An image set classification algorithm based on SPD manifold deep metric learning is proposed. To solve the problem of structural information degradation caused by multi-stage data compression
mapping, on the basis of the original SPD manifold neural network (SPDNet) [32], a novel Riemannian autoencoder model is established on the SPD manifolds. The joint training of the designed metric learning regularization term and the reconstruction error term makes the features generated by the hidden layer more informative. Additionally, these two objective functions provide complementary supervisory information to the cross-entropy loss in characterizing the feature distribution, improving the classification accuracy. (4) An image set classification algorithm based on a deep SPD manifold neural network is proposed. Inspired by the semi-orthogonality of the Stiefel manifold-valued weight matrix, a stacked Riemannian autoencoder (SRAE) is built at the tail of the backbone, with reference to the existing Riemannian network paradigm. Under the successive supervision of multiple reconstruction error terms, the embedding mechanism of the SRAE and of each Riemannian autoencoder (RAE) approaches an identity mapping, which effectively mitigates the information degradation caused by increasing the network depth. In addition, two-stage metric learning is introduced for the designed Riemannian network, which further improves the classification accuracy. |
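
For readers unfamiliar with SPD manifold learning, the basic operations that the contributions above build on can be illustrated with a minimal NumPy sketch. It is not the dissertation's implementation. It assumes that an image set is summarized by a regularized covariance matrix of per-image feature vectors (the abstract does not state which statistics are used), compares two sets with the log-Euclidean distance, and applies one SPDNet-style bilinear mapping with a semi-orthogonal weight followed by eigenvalue rectification. All function names, the ridge value, and the random data are hypothetical and chosen only for illustration.

```python
import numpy as np


def set_to_spd(features, ridge=1e-3):
    """Summarize an image set (n_images x d feature rows) as a d x d SPD matrix.

    Assumption: a covariance descriptor plus a small ridge term, which keeps
    the matrix strictly positive definite even when n_images < d.
    """
    cov = np.cov(features, rowvar=False)
    return cov + ridge * np.eye(cov.shape[0])


def spd_log(spd):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def log_euclidean_distance(a, b):
    """Log-Euclidean distance: Frobenius norm of log(a) - log(b)."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")


def random_semi_orthogonal(d_out, d_in, rng):
    """Random (d_out x d_in) weight with orthonormal rows, d_out <= d_in."""
    q, _ = np.linalg.qr(rng.standard_normal((d_in, d_out)))
    return q.T


def bimap(spd, weight):
    """Bilinear mapping W X W^T; maps a d_in x d_in SPD matrix to d_out x d_out."""
    return weight @ spd @ weight.T


def reeig(spd, threshold=1e-4):
    """Eigenvalue rectification: clamp small eigenvalues from below."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    return (eigvecs * np.maximum(eigvals, threshold)) @ eigvecs.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "image sets" with 8-dimensional features per image.
    set_a = rng.standard_normal((40, 8))
    set_b = rng.standard_normal((55, 8)) + 0.5
    spd_a, spd_b = set_to_spd(set_a), set_to_spd(set_b)

    weight = random_semi_orthogonal(4, 8, rng)
    hidden_a = reeig(bimap(spd_a, weight))

    print("distance between sets:", log_euclidean_distance(spd_a, spd_b))
    print("hidden representation shape:", hidden_a.shape)
```

Because the weight has orthonormal rows, the bilinear mapping keeps its output positive definite while reducing the dimension, which is why the semi-orthogonality of the Stiefel manifold-valued weight matrix matters in such networks. In a trained network the weight would be optimized on the Stiefel manifold rather than sampled at random.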