PolSAR Classification And Video Action Recognition Based On Deep Learning

Posted on: 2021-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: G G Wang
GTID: 2518306050471664
Subject: Circuits and Systems
Abstract/Summary:
With the rapid progress of remote sensing technology and the spread of mobile terminals, the volume of image and video data is growing explosively. Automatic interpretation of PolSAR images and real-time action recognition in video are attracting increasing attention from academia and industry. Following the successive breakthroughs of deep convolutional neural networks (CNNs) in image classification and other fields, deep learning algorithms have been introduced into PolSAR image and video understanding. However, PolSAR and video data differ from standard optical images, so applying a CNN model directly to this type of data often fails to achieve ideal results. This thesis therefore designs three neural networks for PolSAR image classification and video action recognition, and verifies the performance of the proposed algorithms on public data sets for the corresponding tasks. The key contents are summarized as follows:

1) A complex-valued convolutional autoencoder (CV-CAE) and spatial pixel-squares refinement (SPF) are proposed. The input image blocks of CV-CAE are cut from the original PolSAR image. Because the traditional pixel-wise segmentation method is time-consuming, a segmentation matrix is constructed to segment the training and test samples quickly. The complex values in the covariance matrix are used as the input of CV-CAE, and a reconstruction loss is used to extract features in an unsupervised manner. The model thus reduces the dependence on labeled data and preserves the integrity of the complex-valued characteristics of PolSAR data. Exploiting the block structure of PolSAR images, SPF refines each block in the preliminary classification result by first voting and then calculating the difference in pixels of each class. Compared with other algorithms on similar tasks, CV-CAE and SPF obtain higher classification accuracy with less labeled data.

2) A motion vector generation network (GMV-Net) is proposed. GMV-Net is introduced to alleviate the problem that the low-resolution motion vector (MV) in compressed video is affected by noise and cannot provide rich, discriminative temporal features. The MV, which contains motion information, is extracted directly from the compressed video as the input sample, so data processing is fast. The generation part of the model is an encoder-decoder with skip connections, so the features of every layer are used to reconstruct the input samples and also to refine the input data. Because the network is built on depthwise separable convolutions, it has fewer layers and feature channels than other similar models. Compared with other action recognition networks, the number of parameters is greatly reduced, which helps GMV-Net converge faster and gives it better real-time performance. The real-time performance and effectiveness of GMV-Net are verified on public data sets.

3) A new model for extracting long-term temporal sequence features (LTS) is proposed. The position encoding in the LTS model is the residual frame (R-frame) vector, which is extracted by a CNN model from the residual frame of the compressed video. The R-frame represents the difference between the current frame and the predicted frame. It contains most of the edge information of each frame, so it uniquely identifies the corresponding frame. The enhanced motion vector (EMV) contains richer temporal information, so the EMV vector is used as the input of LTS to obtain long-term temporal sequence features. A multi-head self-attention mechanism is then adopted in LTS to capture the correlation between the current frame and the other frames in the whole video. This helps to find the temporal relationships among frames while reducing the impact of redundant information. Experiments show that LTS obtains high recognition accuracy and processes videos in real time.
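The block-wise refinement step described for SPF in contribution 1) can be illustrated with a small sketch. This is not the thesis implementation: the function name `refine_blocks`, the `block` size, and the `min_margin` rule (relabel a block only when the winning class leads the runner-up by a minimum pixel count) are assumptions made here to give one plausible reading of "voting first and then calculating the difference of pixels of each class".

```python
from collections import Counter


def refine_blocks(labels, block, min_margin=1):
    """Relabel each block x block tile of a label map to its majority class.

    labels     -- 2-D list of integer class ids (preliminary classification).
    block      -- tile side length in pixels (hypothetical parameter).
    min_margin -- assumed rule: the winning class must lead the runner-up by
                  at least this many pixels, otherwise the tile is unchanged.
    Returns a new 2-D list; the input is not modified.
    """
    rows, cols = len(labels), len(labels[0])
    refined = [row[:] for row in labels]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            # Collect the pixel coordinates of this tile (edge tiles may be smaller).
            cells = [
                (r, c)
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            ]
            # Voting: count how many pixels of each class fall in the tile.
            counts = Counter(labels[r][c] for r, c in cells).most_common(2)
            top_label, top_count = counts[0]
            runner_up = counts[1][1] if len(counts) > 1 else 0
            # Difference check: relabel only when the vote is decisive.
            if top_count - runner_up >= min_margin:
                for r, c in cells:
                    refined[r][c] = top_label
    return refined


preliminary = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [3, 3, 4, 5],
    [3, 3, 4, 5],
]
refined = refine_blocks(preliminary, block=2)
```

In this toy map the isolated pixel of class 2 in the top-left tile is overwritten by the majority class 0, while the bottom-right tile, where classes 4 and 5 are tied, is left as it was.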
Keywords/Search Tags: PolSAR image classification, action recognition, convolution neural network, deep learning