Research On Image Content Understanding Based On Deep Multi-task Convolutional Neural Network

Posted on: 2022-10-08  Degree: Master  Type: Thesis
Country: China  Candidate: Z Shen  Full Text: PDF
GTID: 2518306314974319  Subject: Software engineering
Abstract/Summary:
Multi-task learning based on Convolutional Neural Networks (CNNs) has achieved remarkable success in various computer vision applications and is an active research direction. A multi-task convolutional neural network learns a representation shared by multiple tasks while leaving the structure of each single-task model unchanged. Because the shared representation must serve different but related training objectives, the multi-task model generalizes better. Each task branch then fits its own output, so that the tasks are predicted jointly and overall performance improves. Recent studies on effective multi-task CNNs aim to automatically learn the optimal combination of each task's features at a given network layer, thereby innovating on the multi-task learning structure. However, these methods do not consider the characteristics of the input features of each branch, and they usually learn a feature combination scheme whose parameters are fixed once training ends. Therefore, this thesis proposes a novel adaptive feature interaction layer for multi-task convolutional neural networks. In this layer, a dynamic interaction mechanism allows each task to adaptively determine how much knowledge to share with other tasks and how much to retain. Two types of feature interaction modules are introduced in the layer; they realize adaptive feature interaction by capturing the feature dependencies between tasks in the channel dimension and the spatial dimension, respectively. The adaptive feature interaction layer is a plug-and-play component with a low parameter count and low computational overhead. It extends a single-task learning structure to a multi-task one without changing the single-task structure, and improves performance in doing so.

The strategy for balancing gradients between tasks is also key to the study of multi-task learning. Different tasks have different complexity and convergence rates, and if the model is trained without any balance control, the gradient of the multi-task structure can easily become dominated by the gradient of one task, at the expense of the performance of the other tasks. Therefore, based on the differences in gradient magnitude among tasks in the multi-task learning structure, this thesis proposes a novel inter-task gradient balancing strategy (ReGrad), which ensures that multiple tasks learn in a balanced way under a unified learning framework and keeps each task's learning direction from deviating.

Multi-task learning explores the internal relationship between tasks in order to obtain a more general shared representation, make the model more generalizable, and ultimately improve the performance of all tasks. Inspired by this, this thesis uses a multi-task convolutional neural network to explore the potential correlation between image aesthetic assessment and emotion analysis. Aesthetic assessment and emotion analysis of images enable computers to recognize, respectively, the aesthetic and the emotional reactions that visual images evoke in humans. In recent years, research on image aesthetic assessment and emotion analysis has mostly used convolutional neural networks to automatically extract discriminative image features. However, current research ignores the internal relationship between these tasks and usually treats them as two independent tasks, handling the different levels of image perception tasks separately. Therefore, this thesis performs image aesthetic assessment and emotion analysis under a unified framework by adopting multi-task learning, and explores the internal correlation between the two tasks.

After the method was designed, a detailed ablation study was conducted to further understand its details and effects. Detailed and comprehensive performance comparisons with recent representative methods were also carried out, including experiments on pixel-level and image-level tasks under a range of appropriate evaluation criteria. The experimental results demonstrate the effectiveness of the proposed method.
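
The gradient-dominance problem described above can be illustrated with a small sketch. The abstract does not give the ReGrad algorithm, so the code below is not ReGrad; it is a generic norm-equalizing rule, assumed here purely for illustration, in which each task's gradient on the shared parameters is rescaled to the mean gradient norm before the gradients are summed. The function names and the example gradient values are hypothetical.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def balance_gradients(task_grads, eps=1e-12):
    """Rescale every task gradient to the mean norm across tasks, then sum.

    ASSUMPTION: a generic magnitude-equalizing rule for illustration only,
    not the ReGrad strategy proposed in the thesis.
    task_grads: list of equally long flat gradients, one per task, all taken
    with respect to the same shared parameters.
    """
    norms = [l2_norm(g) for g in task_grads]
    target = sum(norms) / len(norms)
    combined = [0.0] * len(task_grads[0])
    for grad, norm in zip(task_grads, norms):
        scale = target / (norm + eps)  # eps guards against a zero gradient
        for i, x in enumerate(grad):
            combined[i] += scale * x
    return combined


# Hypothetical gradients of two task losses on two shared parameters.
grad_aesthetic = [30.0, 40.0]  # norm 50: would dominate a plain sum
grad_emotion = [0.6, -0.8]     # norm 1

naive = [a + e for a, e in zip(grad_aesthetic, grad_emotion)]
balanced = balance_gradients([grad_aesthetic, grad_emotion])
print("naive sum:   ", naive)
print("balanced sum:", balanced)
```

In the plain sum, the update direction is almost entirely that of the larger gradient. After rescaling, both tasks contribute vectors of equal length, so neither one dictates the direction of the shared update.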
Keywords/Search Tags: Deep Multi-task Convolutional Neural Network, Adaptive Feature Interactive Network, Aesthetic Assessment, Emotion Analysis, Inter-task Gradient Balancing Strategy