Multi-Task Learning (MTL) is a strategy for processing and leveraging information shared across related tasks: multiple related tasks are trained jointly to obtain a model with better generalization performance. In the field of deep learning, MTL usually refers to designing a network that learns shared representations from the supervisory signals of multiple tasks. With continued research and exploration, multi-task learning has steadily improved results on related visual tasks such as Edge Detection, Semantic Segmentation, Depth Estimation, Object Detection, and Image Classification. However, under limited computational budgets there remains considerable room for research in the design of the network structure, the balancing of the loss functions, and the selection of the features shared across tasks.

This paper reviews deep-learning-based image Multi-Task Learning methods from three aspects: shared features, shared prediction results, and optimization of the loss function. Finally, the development of image Multi-Task Learning in Multi-Domain Learning, Transfer Learning, Medical Imaging, and Semi-Supervised Learning is introduced. On this basis, Multi-Task Learning and deep network structure design are studied for pixel-level prediction tasks in image scene analysis, such as Semantic Segmentation and Depth Estimation. The contributions and innovations of this paper are as follows:

(1) For the two tasks of Semantic Segmentation and Depth Estimation in scene analysis, two multi-task learning architectures are proposed from the perspectives of shared-feature learning and feature interaction and fusion. Jointly learning the two tasks effectively realizes cross-task feature interaction and further improves the performance of both Semantic Segmentation and Depth Estimation (a minimal shared-encoder sketch is given after this list).

(2) To improve the feature extraction capability of the network, a Multi-Task Learning network based on selective weights, MTL_ASP, is proposed. Selective weight modules perform hierarchical feature fusion between task-specific subnetworks built from Self-Attention, so that semantic features and depth features are optimized collaboratively, and an atrous spatial pyramid pooling module is used to obtain denser task features, yielding higher accuracy at lower computational complexity (a sketch of such a selective fusion gate follows the list).

(3) Aiming at the difficulty of recognizing small objects and the misidentification of large objects in current Semantic Segmentation, a multi-scale Self-Attention module combined with an atrous spatial pyramid pooling module is used to obtain features with global context, and Semantic Segmentation, Salient Object Detection, and Surface Normal Estimation are combined for Multi-Task Learning, which improves the generalization ability of the model (a generic ASPP sketch is also given below).
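To make the shared-feature idea behind contribution (1) concrete, the following is a minimal PyTorch sketch of hard parameter sharing: a single shared encoder feeds two task-specific heads, one for Semantic Segmentation and one for Depth Estimation, and the two task losses are summed with a fixed weight. All names and sizes (SharedEncoder, num_classes, the 0.5 loss weight) are illustrative assumptions, not the architectures actually proposed in this paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEncoder(nn.Module):
    """Shared backbone: both tasks read the same feature maps."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.layers(x)

class MultiTaskSegDepth(nn.Module):
    """Hard parameter sharing: one encoder, two task-specific heads."""
    def __init__(self, num_classes=19, feat_ch=64):
        super().__init__()
        self.encoder = SharedEncoder(feat_ch=feat_ch)
        self.seg_head = nn.Conv2d(feat_ch, num_classes, 1)   # semantic logits
        self.depth_head = nn.Conv2d(feat_ch, 1, 1)            # per-pixel depth

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = self.encoder(x)                                # shared representation
        seg = F.interpolate(self.seg_head(feats), (h, w), mode='bilinear', align_corners=False)
        depth = F.interpolate(self.depth_head(feats), (h, w), mode='bilinear', align_corners=False)
        return seg, depth

# Joint objective: weighted sum of the two task losses.
model = MultiTaskSegDepth()
img = torch.randn(2, 3, 128, 128)
seg_gt = torch.randint(0, 19, (2, 128, 128))
depth_gt = torch.rand(2, 1, 128, 128)
seg_pred, depth_pred = model(img)
loss = F.cross_entropy(seg_pred, seg_gt) + 0.5 * F.l1_loss(depth_pred, depth_gt)
loss.backward()
```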
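The "selective weights" in contribution (2) suggest a learnable gate that decides how much of each task's features to exchange at a given stage of the task-specific subnetworks. The sketch below is only a plausible reading of that idea, assuming a softmax over two learnable scalars per task branch; the actual MTL_ASP fusion module may differ.

```python
import torch
import torch.nn as nn

class SelectiveFusion(nn.Module):
    """Cross-task fusion with learnable selection weights: each task keeps a
    softmax-normalized mix of its own features and the other task's features."""
    def __init__(self):
        super().__init__()
        # Two logits per task: weight for "own" features vs. the other task's features.
        self.seg_logits = nn.Parameter(torch.zeros(2))
        self.depth_logits = nn.Parameter(torch.zeros(2))

    def forward(self, seg_feat, depth_feat):
        ws = torch.softmax(self.seg_logits, dim=0)
        wd = torch.softmax(self.depth_logits, dim=0)
        fused_seg = ws[0] * seg_feat + ws[1] * depth_feat
        fused_depth = wd[0] * depth_feat + wd[1] * seg_feat
        return fused_seg, fused_depth

# Example: exchange information between segmentation and depth branches.
fuse = SelectiveFusion()
s, d = fuse(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
```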
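Contributions (2) and (3) both rely on atrous spatial pyramid pooling (ASPP) to gather denser, multi-scale context. The following is a compact sketch of a generic ASPP block; the dilation rates and channel sizes are illustrative assumptions and do not reproduce the configuration used in MTL_ASP.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous spatial pyramid pooling: parallel dilated convolutions plus
    image-level pooling, concatenated and projected back to out_ch channels."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3 if r > 1 else 1,
                      padding=r if r > 1 else 0, dilation=r)
            for r in rates
        ])
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1),
        )
        self.project = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        pooled = F.interpolate(self.image_pool(x), (h, w), mode='bilinear',
                               align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))

# Example: enrich a shared feature map with multi-scale context.
aspp = ASPP(in_ch=64, out_ch=64)
context = aspp(torch.randn(2, 64, 32, 32))   # -> (2, 64, 32, 32)
```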