In recent decades, artificial intelligence technology has developed rapidly and is often described as the fourth industrial revolution. Advances in technologies such as 5G, storage, and big data have brought about an explosion of data, and these abundant and diverse data resources provide a foundation for training artificial intelligence systems. Image semantic segmentation, the subject of this thesis, is an important branch of computer vision. Its task is to segment the different objects in an image at the pixel level by labeling and classifying every pixel of the original image. Simply put, it separates the target from the background. Semantic segmentation is widely used in fields that need to extract the specific contours of objects, such as autonomous driving and lesion segmentation. As the accuracy of deep learning has greatly surpassed that of traditional methods, more and more capable neural networks have appeared for segmentation tasks. However, owing to the inherent characteristics of neural networks, existing semantic segmentation methods based on convolutional neural networks still face challenges in some complex scenes. For example, in application scenarios such as industrial defect detection and medical imaging, if an image contains objects at multiple scales, existing convolutional segmentation models generally have difficulty segmenting them effectively. In addition, fully supervised semantic segmentation requires a label for every pixel of every image in the dataset, so manually annotating segmentation data is very time-consuming and far more laborious than labeling datasets for other tasks such as image classification and object detection. In summary, semantic segmentation still faces many difficulties when deployed in real-world scenarios, and solving the two difficulties above is of great significance. The research in this thesis is therefore divided into the following two parts.

(1) To address the loss of spatial information for small objects in medical images and defect detection, this thesis proposes an end-to-end trainable image segmentation algorithm that incorporates a Compressed Spatial Attention Module (CSAM). CSAM effectively addresses the large differences in the weights of objects at different scales, uses down-sampled features to supervise up-sampled features, and focuses attention on specific small objects while suppressing irrelevant regions of the input image, thereby improving segmentation precision. For the two tasks of medical CT lesion segmentation and industrial defect segmentation, experiments are performed on the Liver Tumor Segmentation Benchmark (LiTS) dataset and a breast cancer dataset from a laboratory project. To demonstrate the applicability of the algorithm, experiments were also carried out on multiple 3D datasets, and the results surpassed those of mainstream algorithms.

(2) To address the laborious and time-consuming annotation of image segmentation datasets, this thesis designs a weakly supervised algorithm, referred to here as Box CAM. Box CAM improves the quality of the generated pseudo masks by combining image-level class labels with 3D bounding boxes: the former produce initial seed regions through Grad-CAM, and the latter produce coarse masks through a 3D GrabCut algorithm. In addition, to further refine the boundary quality of the pseudo masks, a voxel similarity loss function corresponding to the boundary features is proposed. Experimental results show that each module contributes to improved performance.