Multi-Task Learning (MTL) is a strategy for processing and leveraging information shared across related tasks: multiple related tasks are trained jointly to obtain a model with better generalization performance. In the field of deep learning, MTL usually refers to designing a network that learns shared representations from the supervisory signals of multiple tasks. With continued research and exploration, multi-task learning has steadily improved results on related visual tasks such as Edge Detection, Semantic Segmentation, Depth Estimation, Object Detection, and Image Classification. However, under limited computational budgets there remains considerable room for research in the design of the network structure, the balancing of the loss functions, and the selection of the features shared across tasks.

This paper reviews deep-learning-based image Multi-Task Learning methods from three aspects: shared features, shared prediction results, and optimization of the loss function. Finally, the development of image Multi-Task Learning in Multi-Domain Learning, Transfer Learning, Medical Imaging, and Semi-Supervised Learning is introduced. On this basis, Multi-Task Learning and deep network structure design are studied for pixel-level prediction tasks in image scene analysis, such as Semantic Segmentation and Depth Estimation. The contributions and innovations of this paper are as follows:

(1) For the two tasks of Semantic Segmentation and Depth Estimation in scene analysis, two multi-task learning architectures are proposed from the perspectives of shared-feature learning and feature interaction and fusion. Jointly learning the two tasks effectively realizes cross-task feature interaction and further improves the performance of both Semantic Segmentation and Depth Estimation (a minimal shared-encoder sketch is given after this list).

(2) To improve the feature extraction capability of the network, a Multi-Task Learning network based on selective weights, MTL_ASP, is proposed. Selective weight modules perform hierarchical feature fusion between task-specific subnetworks built from Self-Attention, so that semantic features and depth features are optimized collaboratively, and an atrous spatial pyramid pooling module is used to obtain denser task features, yielding higher accuracy at lower computational complexity (a sketch of such a selective fusion gate follows the list).

(3) Aiming at the difficulty of recognizing small objects and the misidentification of large objects in current Semantic Segmentation, a multi-scale Self-Attention module combined with an atrous spatial pyramid pooling module is used to obtain features with global context, and Semantic Segmentation, Salient Object Detection, and Surface Normal Estimation are combined for Multi-Task Learning, which improves the generalization ability of the model (a generic ASPP sketch is also given below).
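To make the shared-feature idea behind contribution (1) concrete, the following is a minimal PyTorch sketch of hard parameter sharing: a single shared encoder feeds two task-specific heads, one for Semantic Segmentation and one for Depth Estimation, and the two task losses are summed with a fixed weight. All names and sizes (SharedEncoder, num_classes, the 0.5 loss weight) are illustrative assumptions, not the architectures actually proposed in this paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEncoder(nn.Module):
    """Shared backbone: both tasks read the same feature maps."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.layers(x)

class MultiTaskSegDepth(nn.Module):
    """Hard parameter sharing: one encoder, two task-specific heads."""
    def __init__(self, num_classes=19, feat_ch=64):
        super().__init__()
        self.encoder = SharedEncoder(feat_ch=feat_ch)
        self.seg_head = nn.Conv2d(feat_ch, num_classes, 1)   # semantic logits
        self.depth_head = nn.Conv2d(feat_ch, 1, 1)            # per-pixel depth

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = self.encoder(x)                                # shared representation
        seg = F.interpolate(self.seg_head(feats), (h, w), mode='bilinear', align_corners=False)
        depth = F.interpolate(self.depth_head(feats), (h, w), mode='bilinear', align_corners=False)
        return seg, depth

# Joint objective: weighted sum of the two task losses.
model = MultiTaskSegDepth()
img = torch.randn(2, 3, 128, 128)
seg_gt = torch.randint(0, 19, (2, 128, 128))
depth_gt = torch.rand(2, 1, 128, 128)
seg_pred, depth_pred = model(img)
loss = F.cross_entropy(seg_pred, seg_gt) + 0.5 * F.l1_loss(depth_pred, depth_gt)
loss.backward()
```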
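The "selective weights" in contribution (2) suggest a learnable gate that decides how much of each task's features to exchange at a given stage of the task-specific subnetworks. The sketch below is only a plausible reading of that idea, assuming a softmax over two learnable scalars per task branch; the actual MTL_ASP fusion module may differ.

```python
import torch
import torch.nn as nn

class SelectiveFusion(nn.Module):
    """Cross-task fusion with learnable selection weights: each task keeps a
    softmax-normalized mix of its own features and the other task's features."""
    def __init__(self):
        super().__init__()
        # Two logits per task: weight for "own" features vs. the other task's features.
        self.seg_logits = nn.Parameter(torch.zeros(2))
        self.depth_logits = nn.Parameter(torch.zeros(2))

    def forward(self, seg_feat, depth_feat):
        ws = torch.softmax(self.seg_logits, dim=0)
        wd = torch.softmax(self.depth_logits, dim=0)
        fused_seg = ws[0] * seg_feat + ws[1] * depth_feat
        fused_depth = wd[0] * depth_feat + wd[1] * seg_feat
        return fused_seg, fused_depth

# Example: exchange information between segmentation and depth branches.
fuse = SelectiveFusion()
s, d = fuse(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
```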
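Contributions (2) and (3) both rely on atrous spatial pyramid pooling (ASPP) to gather denser, multi-scale context. The following is a compact sketch of a generic ASPP block; the dilation rates and channel sizes are illustrative assumptions and do not reproduce the configuration used in MTL_ASP.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous spatial pyramid pooling: parallel dilated convolutions plus
    image-level pooling, concatenated and projected back to out_ch channels."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3 if r > 1 else 1,
                      padding=r if r > 1 else 0, dilation=r)
            for r in rates
        ])
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1),
        )
        self.project = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        pooled = F.interpolate(self.image_pool(x), (h, w), mode='bilinear',
                               align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))

# Example: enrich a shared feature map with multi-scale context.
aspp = ASPP(in_ch=64, out_ch=64)
context = aspp(torch.randn(2, 64, 32, 32))   # -> (2, 64, 32, 32)
```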