Research On Image Semantic Segmentation Based On Deep Learning

Posted on: 2022-10-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z S Jia
GTID: 1488306617498124
Subject: Automation Technology
Abstract/Summary:
Image semantic segmentation has long been a focus of research in computer vision, and deep learning is currently one of the fastest-developing areas of artificial intelligence. As a result, machine vision based on deep learning has become a research hotspot in both academia and industry. Many applications depend on precise and effective segmentation, including autonomous driving, medical image analysis, scene understanding, virtual reality, and augmented reality. As a high-performance approach to computer vision, deep learning has largely met the needs of tasks such as semantic segmentation and scene understanding in both academic research and commercial applications. This dissertation studies several applications of deep-learning-based image semantic segmentation in computer vision. The specific work is as follows.

To address the anatomical variation caused by spine pathology, the noise caused by implants, and the differences between receptive fields of different ranges, this dissertation proposes a vertebrae localization and segmentation method based on the fusion of multiple U-Nets. The method uses a coarse-to-fine strategy to solve the challenging problem of simultaneously segmenting and labeling vertebrae in the presence of highly repetitive structures. First, an improved U-Net locates the approximate position of the spine in the original image. Then, a spatial configuration network (SC-Net) localizes and identifies the individual vertebrae through heat map regression. Finally, a high-resolution U-Net performs binary segmentation for each identified vertebra, and the individual predictions are merged into the final multi-label vertebra segmentation. Experimental results on the MICCAI 2019 Large Scale Vertebrae Segmentation Challenge (VerSe 2019) show that the proposed method achieves better segmentation performance in most cases.

Because different datasets each have their own inherent distribution, this dissertation proposes a cross-dataset collaborative learning method that can learn from multiple datasets. Given multiple labeled datasets, the method improves the generalization and discriminative power of the feature representation on each dataset. First, a series of dataset-aware blocks serve as the basic computing unit of the network and capture the homogeneous representations and heterogeneous statistics of the different datasets. Then, a dataset-alternating training mechanism is designed to facilitate the optimization process. Finally, the method is evaluated on four public datasets (Cityscapes, BDD100K, CamVid, and COCO-Stuff) in both single-dataset and cross-dataset settings to verify its effectiveness. The experimental results show that, compared with previous single-dataset and cross-dataset training methods, the proposed method achieves significant improvement without introducing additional computational cost. In particular, under the same PSPNet (ResNet-18) architecture, the mIoU of the proposed method on the validation sets of Cityscapes, BDD100K, and CamVid is significantly better than that obtained by training on a single dataset.

To address adaptive semantic segmentation in the multi-source setting, this dissertation proposes a multi-source domain-adaptive semantic segmentation method based on collaborative learning. The method uses two collaborative learning strategies to explore the essential semantic context shared across different domains and the domain-invariant semantic context. First, a simple image translation method aligns the distribution of pixel values, which reduces the gap between the source domains and the target domain to a certain extent. Then, a domain-adaptive collaborative learning method is designed to make full use of the essential semantic information across the source domains without seeing any target domain data. Finally, online pseudo-labels generated by the ensemble model are used to constrain the outputs of the multiple adaptive models, so that the unlabeled target domain data further improves domain adaptation performance. Comparisons on public datasets and the results of ablation experiments show that the proposed method achieves a significant performance improvement over other state-of-the-art single-source and multi-source unsupervised domain adaptation methods.

To address the separate treatment of sub-tasks in existing panoptic segmentation frameworks and the particular characteristics of LiDAR point cloud data, this dissertation proposes a LiDAR point cloud panoptic segmentation method based on a polar-coordinate bird's-eye view (BEV). The method uses a polar BEV to learn semantic segmentation and class-agnostic instance clustering in a single inference network, which avoids the problem of occlusion between instances in urban street scenes. First, a fixed-size polar BEV encoding is created through projection and quantization operations so that point clouds of arbitrary size can be processed. Second, a U-Net with 4 encoding layers and 4 decoding layers serves as the backbone and semantically segments the encoded LiDAR point cloud data, following PolarNet. Then, on the basis of the semantic segmentation, a proposal-free LiDAR panoptic segmentation network is constructed to perform instance clustering effectively. Finally, an improved instance augmentation technique and an adversarial point cloud pruning method are designed to improve the learnability of the network. Experimental results on the SemanticKITTI and nuScenes datasets show that the proposed method outperforms the baseline method while maintaining real-time prediction speed.
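
The projection and quantization step behind the fixed-size polar BEV encoding can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `polar_bev_indices` and `group_by_cell`, and the grid parameters `r_max`, `n_r`, and `n_theta`, are hypothetical and chosen only for illustration. The sketch shows how a point cloud with any number of points maps onto a grid whose size does not depend on the input.

```python
import math
from collections import defaultdict


def polar_bev_indices(points, r_max=50.0, n_r=480, n_theta=360):
    """Map (x, y, z) points to (radial_bin, angular_bin) cells of a polar grid.

    r_max, n_r and n_theta are illustrative assumptions, not values
    taken from the dissertation.
    """
    cells = []
    for x, y, _z in points:
        # Project onto the ground plane; clamp far points to the outer ring.
        r = min(math.hypot(x, y), r_max)
        # Azimuth mapped into [0, 2*pi).
        theta = math.atan2(y, x) % (2 * math.pi)
        # Quantize; the min() guards the upper boundary of each axis.
        i = min(int(r / r_max * n_r), n_r - 1)
        j = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
        cells.append((i, j))
    return cells


def group_by_cell(points, **grid):
    """Collect the points that fall into each occupied grid cell."""
    groups = defaultdict(list)
    for point, cell in zip(points, polar_bev_indices(points, **grid)):
        groups[cell].append(point)
    return dict(groups)
```

However many points the scan contains, every point lands in one of `n_r * n_theta` cells, so a downstream 2D network always sees an input of the same shape. How per-cell features are computed from the grouped points is left out here, since the abstract does not specify it.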
Keywords/Search Tags: image semantic segmentation, multi U-Net fusion, cross-dataset collaborative learning, multi-source domain unsupervised adaptation, panoptic segmentation