| Remote sensing images contain rich information about surface features, of which man-made structures such as buildings account for up to 80%. Making good use of such images is of tremendous help in urban planning, geographic mapping, disaster relief, and other fields. The research community has put considerable effort into building extraction from remote sensing images and has achieved some results. However, existing work mainly relies on handcrafted features such as color, texture, and shape, which may not meet practical needs. In residential areas especially, buildings differ greatly in age, length of occupancy, and maintenance. Solutions based on handcrafted features tend to fail there because of distinct surface coverings, messy layouts, and diverse structures. Automatic building extraction therefore still has broad research prospects and great application value. After surveying domestic and international research in this field, this paper proposes a deep learning-based automatic building extraction solution for remote sensing images characterized by complex features, high resolution, and large data volume. Images are preprocessed with a Gaussian filter and histogram equalization and then fed into the proposed convolutional neural network, ADS-Net, for training. ADS-Net consists of two branches: an encoder-decoder network for semantic segmentation and an auxiliary network for object detection. An attention-based feature map fusion is applied between the encoder and the decoder to enhance the interaction between them. The auxiliary network borrows the idea of multi-task learning, using a building detection task to improve building extraction performance. Finally, a penalty term is added to the loss function to smooth out jagged edges in the predictions; it forces the model to pay more attention to building edges. The experiments use three different data sources, covering both satellite and aerial remote sensing images. The results show that the proposed automatic building extraction solution is superior to the classical semantic segmentation model U-Net: it achieves a steady improvement of 2%-4% in average accuracy across different data sizes, which gives it high practical value. |
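
The preprocessing step named in the abstract (Gaussian filtering followed by histogram equalization) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. The function names, the single-channel 8-bit input, the default `sigma`, the reflect padding, and the order of the two operations are all assumptions made here for illustration.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_filter(image: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Smooth a 2-D grayscale image with a separable Gaussian filter."""
    radius = max(1, int(3 * sigma))  # assumed truncation at about 3 sigma
    kernel = gaussian_kernel_1d(sigma, radius)
    # Reflect-pad so that the output keeps the input shape.
    padded = np.pad(image.astype(np.float64), radius, mode="reflect")
    # Filter along each row, then along each column.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Equalize the histogram of a uint8 grayscale image."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # pixel count of the darkest level present
    total = image.size
    if total == cdf_min:  # constant image: nothing to stretch
        return image.copy()
    # Map each gray level through the rescaled cumulative distribution.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


def preprocess(image: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Assumed order: denoise first, then stretch the contrast."""
    smoothed = gaussian_filter(image, sigma)
    smoothed_u8 = np.clip(np.round(smoothed), 0, 255).astype(np.uint8)
    return equalize_histogram(smoothed_u8)


# Synthetic low-contrast tile standing in for a remote sensing image.
rng = np.random.default_rng(0)
tile = rng.integers(60, 120, size=(64, 64), dtype=np.uint8)
out = preprocess(tile)
print(out.shape, out.dtype)
```

For multi-band imagery, one option would be to apply `preprocess` to each band separately, although the abstract does not say how the bands are handled.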