
Cross-domain Action Recognition Algorithms Via Deep Spatial-Temporal Network

Posted on: 2021-12-12    Degree: Master    Type: Thesis
Country: China    Candidate: L M Guo    Full Text: PDF
GTID: 2518306464980909    Subject: Computer technology
Abstract/Summary:
In recent years, with the development of deep learning and the release of more and more large-scale video action recognition datasets, research on action recognition algorithms has gradually matured. However, these algorithms are usually trained and tested on datasets that share the same data distribution, and they are trained with a large amount of labeled data. In practical applications, it is often necessary to train the action recognition model in one or more application environments (the source domain) and then use it in the current application environment (the target domain). It is impossible to label all the data in the target domain, so very little labeled training data is available there. Cross-domain action recognition with few labeled samples is therefore an urgent problem. To address it, the research work of this thesis consists mainly of the following two parts.

1) Pairwise Deep Adaptive Two-Stream ConvNets for Cross-domain Action Recognition with Few Data (PTC) is proposed. First, a hard-sample selection method based on the sphere boundary method is proposed; it selects the samples that are farthest apart within the same class and the samples that are nearest to each other across different classes. A large number of pairwise samples are then formed from these samples to compensate for the scarcity of labeled data in the target domain. Second, a weight distribution layer assigns discriminative weights to the key frames of the video in the pairwise two-stream network. Third, the maximum mean discrepancy (MMD) is used to measure the distribution difference between domains, while a contrastive loss function is used to optimize the feature distribution and make the features more distinguishable. Finally, an adaptive fusion layer jointly optimizes the RGB network and the optical flow network. Because suitable cross-domain action recognition datasets were lacking, two cross-domain action datasets, SDAI Action I and SDAI Action II, are also constructed in this thesis. Experiments on these two datasets demonstrate that the PTC algorithm has clear advantages in accuracy and convergence speed.

2) Pairwise Attentive Adversarial Spatial-Temporal Network for Cross-domain Few-Shot Action Recognition (PASTN) is proposed. First, an end-to-end pairwise TR3D network is learned to extract spatial-temporal representations of video actions. Second, a double attentive adversarial network is introduced into the pairwise TR3D networks to achieve domain adaptation of the spatial-temporal representations. In addition, a pairwise discriminative loss function is constructed to improve the discriminability of the spatial-temporal features. Experimental results on the SDAI Action I and SDAI Action II datasets show that PASTN achieves better performance in both accuracy and model computing speed.
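The two loss terms named in the PTC description (maximum mean discrepancy between domains and a contrastive loss over sample pairs) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the RBF kernel, the biased MMD estimator, the margin-based form of the contrastive loss, and all function and parameter names (`rbf_kernel`, `mmd2`, `contrastive_loss`, `gamma`, `margin`) are assumptions chosen for illustration, and the abstract does not say how the two terms are weighted or combined.

```python
import numpy as np


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel matrix between the rows of x and the rows of y."""
    # Squared Euclidean distance between every row of x and every row of y.
    sq = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two feature sets.

    source and target are arrays of shape (n_samples, n_features).
    """
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st


def contrastive_loss(f1, f2, same_class, margin=1.0):
    """Margin-based contrastive loss over feature pairs (assumed form).

    same_class[i] is 1 when pair i shares a label and 0 otherwise.
    Same-class pairs are pulled together; different-class pairs are
    pushed apart until their distance reaches the margin.
    """
    same_class = np.asarray(same_class, dtype=float)
    d = np.linalg.norm(f1 - f2, axis=1)
    pull = same_class * d ** 2
    push = (1.0 - same_class) * np.maximum(margin - d, 0.0) ** 2
    return 0.5 * np.mean(pull + push)
```

In a training loop, a term such as `mmd2(source_features, target_features)` would penalize a mismatch between the two domains, while `contrastive_loss` would act on the pairwise samples formed from the selected hard samples.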
Keywords/Search Tags: Cross-domain action recognition, Transfer learning, Few-shot learning, Deep learning, Cross-domain action datasets
Related items