| Remote sensing scene images exhibit high within-class diversity and high between-class similarity, because the objects in a scene vary in color, viewpoint, pose, and spatial resolution. In addition, each semantic category is associated with multiple sub-concepts. State-of-the-art deep learning methods struggle to represent the visual information of such images in detail, which makes remote sensing scene classification extremely challenging. To address these problems, this thesis introduces the multiple instance learning framework and studies remote sensing scene classification methods based on multiple instance networks from two aspects: image features and local regions. On the one hand, scene images are complex and the objects they contain have rich attributes, so the single global feature extracted by a convolutional neural network is limited. A remote sensing scene classification method based on multi-branch instance network fusion is therefore proposed to learn instance representations rich enough to distinguish different semantic categories. First, an object segmentation network and an attention network are proposed to collect instance features that capture global object information and key regions, respectively. Then, the instance networks of these two branches are fused with the original input branch to optimize the final classification results consistently and effectively. On the other hand, scene images contain confusing local regions unrelated to the semantic category, and the learning of multiple interrelated sub-concepts is ignored. This thesis therefore further proposes a remote sensing scene classification method based on weakly supervised multiple instance sub-concept learning. First, a weakly supervised localization network locates local instance regions that carry latent semantic information, and the features of these instance regions are extracted automatically. Second, a sub-concept layer is added to the multiple instance network; it mines the relationship between sub-concepts and instances to build the relationship between instances and the label. Finally, two loss functions are combined for joint end-to-end training of the scene classifier. The proposed methods are verified mainly on two public large-scale scene datasets, AID and NWPU-RESISC45. Compared with state-of-the-art methods, they effectively improve remote sensing scene classification performance. In addition, tests on the natural scene datasets CIFAR-10 and CIFAR-100 also achieve better performance. |
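
The sub-concept idea in the abstract can be illustrated with a small sketch. This is not the thesis's network: it assumes that each class owns a few linear sub-concept detectors, that an instance is scored by its best-matching sub-concept, and that the bag (image) score is the maximum over instances. The abstract does not specify the pooling rule, and the names `bag_scores`, `classify_bag`, `instances`, and `weights` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bag_scores(instances, weights):
    """Score a bag of instance features against every class.

    instances: list of feature vectors, one per local region (instance).
    weights[c][k]: weight vector of sub-concept k of class c (assumed linear).
    """
    class_scores = []
    for class_weights in weights:
        per_instance = []
        for x in instances:
            # Response of every sub-concept of this class to the instance.
            sub = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in class_weights]
            # Sub-concept layer (assumed): keep the best-matching sub-concept.
            per_instance.append(max(sub))
        # Multiple instance pooling (assumed): the best instance decides.
        class_scores.append(max(per_instance))
    return class_scores


def classify_bag(instances, weights):
    """Return (predicted class index, class probabilities) for one bag."""
    probs = softmax(bag_scores(instances, weights))
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


# Toy bag: three instance regions with 2-D features (made-up values).
instances = [[1.0, 0.0], [0.0, 1.0], [0.2, 0.1]]
# Two classes, each with two sub-concepts (made-up weights).
weights = [
    [[2.0, 0.0], [0.0, 0.5]],
    [[0.0, 1.0], [0.5, 0.5]],
]
label, probs = classify_bag(instances, weights)
```

In a trained network the sub-concept weights would be learned end to end together with the instance features; the fixed values here only show how an instance-level response becomes an image-level prediction.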