A Network Model For Image Sentiment Classification With Joint Visual Saliency

Posted on: 2022-06-30  Degree: Master  Type: Thesis
Country: China  Candidate: M Y Zhao  Full Text: PDF
GTID: 2518306539952859  Subject: Control Science and Engineering
Abstract/Summary:
Psychological research has shown that image stimuli can evoke different emotional responses in humans. The task of image emotion classification is to use machine learning models to automatically predict an observer's emotional response when viewing an image. Automatic image emotion prediction models have important application value in social networks, interactive advertising, and other scenarios. Existing studies show that, compared with the entire image, some local areas of an image are more likely to evoke an emotional response, and that the attention mechanism can effectively learn the key areas of an image that are associated with the task. For this reason, this thesis proposes an image emotion classification network model with joint visual saliency. The specific work includes two aspects:

(1) Visual saliency prediction using a multi-scale attention gated network: At present, saliency prediction models based on deep learning often emphasize only high-level semantic features, but high-level semantic features lack fine spatial information. Ideally, a saliency prediction model should include both spatial and semantic features. This thesis proposes a deep network model with a multi-scale attention gating module for visual saliency prediction. The network uses the high-resolution network (HRNet) as the backbone to extract multi-scale semantic features, and the multi-scale attention gating module adaptively fuses these multi-scale features in a hierarchical manner. The module calculates a spatial attention map from the high-level semantic features and then merges it with the low-level spatial features through gating operations. Through hierarchical gated fusion, the final saliency prediction is produced at the finest scale. Extensive experimental analyses on three benchmark datasets demonstrate the superior performance of this method.

(2) Image sentiment classification network model combined with saliency regions: Existing studies have found that certain local areas of an image are more likely to evoke human emotional responses, and saliency prediction can provide effective local information. For this reason, this thesis proposes an image emotion classification network model that combines saliency regions. The network includes a saliency path and an emotion classification path. The saliency path predicts the salient areas of the image that can evoke the viewer's emotional response. The emotion classification path extracts global deep features of the image, and these features are enhanced with the output of the saliency path through residual attention fusion. The image emotion category is finally output by a fully connected layer, which realizes end-to-end image emotion classification. Experimental results verify the effectiveness of the proposed model: it improves the accuracy of image emotion prediction, and the predicted saliency areas also match the manually labeled emotion areas well.
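The two fusion steps named in the abstract can be illustrated with a small sketch. This is not the thesis implementation: it assumes single-channel feature maps stored as nested lists, a sigmoid to turn high-level features into a spatial attention map, nearest-neighbour upsampling, and the common residual attention form f * (1 + s). The function names `upsample_nearest`, `gated_fusion`, and `residual_attention` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(grid, factor):
    """Repeat each cell `factor` times along both axes."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def gated_fusion(high, low, factor=2):
    """Gate fine low-level features with attention from coarse high-level features.

    Assumption: the attention map is sigmoid(high), upsampled to the size of `low`,
    and the gate is an element-wise product.
    """
    attention = upsample_nearest([[sigmoid(v) for v in row] for row in high], factor)
    return [[a * f for a, f in zip(a_row, f_row)]
            for a_row, f_row in zip(attention, low)]


def residual_attention(features, saliency):
    """Re-weight features as f * (1 + s), so s = 0 leaves them unchanged (assumed form)."""
    return [[f * (1.0 + s) for f, s in zip(f_row, s_row)]
            for f_row, s_row in zip(features, saliency)]


high = [[0.0, 4.0]]                 # 1x2 coarse, high-level map
low = [[1.0, 1.0, 1.0, 1.0],
       [1.0, 1.0, 1.0, 1.0]]        # 2x4 fine, low-level map
fused = gated_fusion(high, low)     # 2x4 map, larger where `high` is larger
boosted = residual_attention(low, fused)
```

In this sketch the gate suppresses low-level responses where the high-level map is weak, while the residual form only amplifies features and never zeroes them out, which is one common motivation for adding the identity term.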
Keywords/Search Tags: saliency prediction, image sentiment classification, visual attention, deep learning