Semantic segmentation is an important computer vision task that aims to predict a semantic class label for every pixel in an image. With the development of deep convolutional neural networks, many effective segmentation methods based on them have emerged, bringing significant progress to real-world applications such as autonomous driving, disease diagnosis, and image editing. However, these methods require manually annotated datasets for training, and collecting and annotating semantic segmentation datasets incurs high labor and time costs. To reduce this cost, a popular approach is to train on computer-synthesized images together with their automatically generated pixel-level labels. Because the data distribution of synthetic datasets differs from that of real-world datasets, however, a segmentation model trained on synthetic data does not generalize well to real data.

Unsupervised domain adaptation has recently received widespread attention as a way to address this problem. Domain adaptation studies how to use one or more datasets to train a model so that it performs well on a target dataset that has less data, incomplete label information, and a distribution different from that of the training data. The problem involves a source domain and a target domain: generally, a dataset with abundant data and complete annotations serves as the source domain, and a dataset with less data and incomplete labels serves as the target domain. Although unsupervised domain adaptation methods have significantly improved segmentation performance, most of them only bridge the gap between the source and target domains and do not consider distribution differences within the target domain. The previous intra-domain adaptation method does address this intra-domain gap, but it divides the target data into only two subdomains according to how difficult each sample is to segment, which does not sufficiently capture the distributions present in the target domain.

Based on the observation that target domain samples exhibit different styles, this thesis proposes an unsupervised domain adaptation method based on style clustering. First, this thesis adopts style-based inter-domain adaptation, which applies pixel-level adaptation and self-training simultaneously. Style-based inter-domain adaptation reduces the discrepancy in style distribution between the source and target domains and provides direct supervision on the target-domain style. Second, because the diverse styles among target samples give rise to the intra-domain gap, this thesis extracts style features from target domain samples and clusters them, iteratively dividing the target domain into subdomains so as to capture its multiple style distributions. Finally, because the true subdomain labels are unknown, this thesis uses multi-channel soft labels in adversarial training to align the distributions among the subdomains. Compared with general intra-domain adaptation methods, the proposed method captures the latent distributions within the target data more fully and therefore closes the intra-domain gap more effectively. Experimental results on two commonly used unsupervised domain adaptation segmentation tasks, GTA5→Cityscapes and SYNTHIA→Cityscapes, demonstrate the effectiveness of the method.
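
The style clustering step described above can be illustrated with a minimal sketch. The abstract does not specify which style features or clustering algorithm are used, so everything here is an assumption made for illustration: style is taken to be the channel-wise mean and standard deviation of each image, the clustering is plain k-means, and the soft assignment shown is only one possible form of soft subdomain label, not necessarily the multi-channel soft labels of the thesis. The function names (`style_features`, `kmeans`, `soft_labels`, `make_demo_images`) and the synthetic "dark" and "bright" images are hypothetical.

```python
import numpy as np

def style_features(images):
    """Assumed style descriptor: channel-wise mean and std per image.

    images has shape (N, C, H, W); the result has shape (N, 2 * C).
    """
    mean = images.mean(axis=(2, 3))
    std = images.std(axis=(2, 3))
    return np.concatenate([mean, std], axis=1)

def kmeans(x, k, iters=50, seed=0):
    """Plain k-means on the rows of x; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every sample to every center, shape (N, k).
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

def soft_labels(x, centers, temperature=1.0):
    """Hypothetical soft subdomain labels: softmax over negative distances."""
    dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    logits = -dists / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def make_demo_images(seed=0):
    """Synthetic stand-in for target images with two distinct styles."""
    rng = np.random.default_rng(seed)
    dark = rng.normal(0.2, 0.05, size=(8, 3, 16, 16))
    bright = rng.normal(0.8, 0.05, size=(8, 3, 16, 16))
    return np.concatenate([dark, bright], axis=0)

images = make_demo_images()
feats = style_features(images)         # (16, 6) style descriptors
labels, centers = kmeans(feats, k=2)   # hard subdomain assignment
probs = soft_labels(feats, centers)    # one soft label vector per image
```

In this sketch the hard labels partition the samples into style subdomains, while each row of `probs` gives a per-subdomain weight that a discriminator with one output channel per subdomain could be trained against. In an iterative scheme, the features and clusters would be recomputed as training proceeds.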