Font Size: a A A

Towards Local Semantic Representation: Remote Sensing Scene Classification Based Deep Learning Approaches

Posted on:2021-11-11Degree:MasterType:Thesis
Country:ChinaCandidate:Q BiFull Text:PDF
GTID:2492306293953019Subject:Photogrammetry and Remote Sensing
Abstract/Summary:PDF Full Text Request
Since the 21st century,the number of high-resolution remote sensing satellites has increased significantly,and it is no longer difficult to acquire massive high-resolution remote sensing images.The major difficulty of remote sensing image interpretation has also changed from how to obtain a large number of high-quality remote sensing imagery in a relatively short time to how to efficiently extract information from massive remote sensing imagery efficiently.Remote sensing image scene classification is an important research field in intelligent information extraction of remote sensing images.Effective remote sensing scene classification approaches can be applied to a series of remote sensing applications such as land resource planning,thematic mapping and military reconnaissance.Since 2012,under the impact of a new wave of artificial intelligence,deep learning approaches especially convolutional neural networks(CNNs)have made rapid progress in the field of image recognition,providing a feasible solution for the intelligent information extraction of remote sensing image.However,compared to natural image scenes,in remote sensing scenes,the spatial arrangement is more complicated and the object distribution is more diverse.Current convolutional neural networks usually extract global semantic information,which is not enough to effectively describe complicated remote sensing scenes.Enhancing the local semantic description capability of deep learning models can highlight the feature responses of many important local regions in remote sensing scenes,and thus can provide the possibility of improving the capability of deep learning models to understand remote sensing scenes.In this thesis,with the goal of enhancing the local semantic representation capability for deep learning,after establishing a local semantic description mathematic model for deep learning,three deep learning remote sensing scene classification approaches for local semantic expression are studied.The major work of this thesis is summarized as follows.(1)A deep learning remote sensing scene classification approach fusing important local semantics and convolution featuresCurrent attention modules can perceive important local regions in the scene that can often determine the global semantic label.But this process of extracting top-down salient features interrupts the process of extracting bottom-up convolutional features in a convolutional neural network.Thus,the capability of current attention modules to improve the local semantic representation still needs to be improved.In this thesis,a residual connected spatial attention module is proposed to fuse the above two kinds of features.At the same time,a simplified dense connection structure is utilized to enhance the capability of feature propagation.The above components form a novel residual attention densely connected convolutional neural network(RADC-Net)as a whole.The model can effectively preserve the low-and middle-level features in a deep learning model,and can fuse important local semantic features and bottom-up convolution features in a trainable manner.(2)A multi-scale feature stacking attention pooling approach for remote sensing scene local semantic representationExisting pooling operations in deep learning approaches usually use average or maximum pooling,which are pre-defined and not trainable.It is difficult to effectively highlight a series of important local semantic information in remote sensing scenes.In response to this problem,this paper introduces a weighted average template in the pooling operation in order to assign higher weight responses to important local regions.To make the whole weighted average pooling trained and optimized along with the deep learning model,we utilize an attention module to extract the spatial weight template for the weighted average pooling operation.On the other hand,to extract more discriminative convolutional features,a series of different-sized dilated convolutions are utilized to extract multi-scale convolution features.The whole framework is organized as a whole and it can be embedded in any current convolutional neural networks to enhance the local semantic representation capability.(3)A deep multiple instance learning approach for remote sensing scene classificationIn order to further explore the relationship between the global semantic label of a scene and the local semantic of each image patch,and to highlight the important local semantic patches,this paper introduces a multiple instance learning process in current deep learning models to enhance the local semantic representation capability.However,current instance aggregation functions in multiple instance learning are also pre-defined and untrainable,which cannot highlight the contribution and role of important instances for the global scene semantic label.Therefore,this thesis utilizes the weights extracted by a spatial attention module and a"channel-space"attention module to describe the contribution of different instances in the scene.The multiple instance learning module and the dense connection structure studied in this thesis form a light-weight multi-instance convolutional neural network named MIDC-Net to improve the local semantic representation capability for remote sensing scenes.The above remote sensing scene classification approaches are validated on three benchmarks,namely UCM,AID and NWPU.They all outperform current baseline approaches and some state-of-the-art approaches.These approaches can effectively highlight the feature reponse of the key local regions which are strongly related to the scene semantic label.They provide a series of possible new solutions for deep learning based remote sensing image processing.In addition,these approaches also achieve an impressive performance on two remote sensing scene classification applications,that is,Xinjiang residential scene recognition and Small habor recognition on Yangzi River,which reflects the great potential of our approaches to be utilized on remote sensing applications.
Keywords/Search Tags:remote sensing scene classification, local semantic representation, attention module, trainable pooling, deep multiple instance learning, densely connected Conv Net
PDF Full Text Request
Related items