Research On Aesthetics-based Image Spatio-temporal Visual Attention

Posted on:2021-05-26

Degree:Master

Type:Thesis

Country:China

Candidate:J C Lv

GTID:2518306548981709

Subject:Electronics and Communications Engineering

Abstract/Summary:

With the rise of the mobile Internet and the resulting flood of images, how to extract value from images has attracted much attention. The visual attention mechanism helps address this problem. Previous research on visual attention has covered static and dynamic attention, for which spatial saliency map models and temporal scanpath models were proposed, respectively. Nevertheless, previous work ignored human sentiment, even though human aesthetic sentiment is closely related to visual attention. This thesis therefore proposes an aesthetics-based spatiotemporal visual attention prediction algorithm, comprising the following two methods for image saliency map prediction and scanpath prediction, respectively:

1. An aesthetics-based multi-task saliency map prediction method. This thesis introduces aesthetics context into visual attention prediction for the first time. To this end, an encoder-decoder model based on multi-task convolutional neural networks is designed. A color image is fed into the shared backbone encoder and then passed through either the aesthetics decoder or the saliency decoder, depending on its data source; the corresponding loss is computed and the model parameters are updated by backpropagation. Three popular backbones are used as encoders, with parameters shared between the aesthetics task and the saliency map task. Experimental results show that the aesthetics-based multi-task model generates accurate saliency maps and is superior to most existing saliency models, which confirms the positive effect of aesthetic sentiment on visual attention prediction.

2. A scanpath prediction method based on coarse-to-fine networks. The aesthetics-based saliency map prediction algorithm provides an effective solution for spatial attention. Therefore, aesthetics-based saliency features are extracted from the spatial attention model, and coarse-to-fine networks are designed to solve the temporal attention problem. The coarse-to-fine networks are composed of convolutional neural networks and LSTM networks. To predict scanpaths accurately, this thesis proposes an Inhibition of Return Attention (IRA) mechanism, inspired by the Inhibition of Return (IOR) mechanism, and inputs the IRA sequence to the fine LSTM networks at different time steps. A new scanpath loss is then proposed to train the entire model. Experimental results demonstrate the effectiveness of the IRA mechanism and the scanpath loss, and show that the method is superior to most existing scanpath prediction algorithms.
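
The abstract does not specify how the IRA mechanism is computed, so the following is only a minimal sketch of the classic Inhibition of Return idea that it is said to be inspired by: fixations are picked greedily from a saliency map, and the area around each fixation is suppressed so that attention moves on. The function name `greedy_scanpath`, the parameters `radius` and `decay`, and the toy 5x5 map are all hypothetical illustrations and are not taken from the thesis.

```python
def greedy_scanpath(saliency, num_fixations, radius=1, decay=0.0):
    """Pick fixations greedily from a 2D saliency grid with inhibition of return.

    Hypothetical helper for illustration only; not the thesis's IRA mechanism.
    """
    # Work on a copy so the caller's saliency map is left untouched.
    grid = [list(row) for row in saliency]
    height, width = len(grid), len(grid[0])
    path = []
    for _ in range(num_fixations):
        # Winner-take-all: the most salient remaining cell is the next fixation.
        r, c = max(
            ((i, j) for i in range(height) for j in range(width)),
            key=lambda rc: grid[rc[0]][rc[1]],
        )
        path.append((r, c))
        # Inhibition of return: damp saliency in a square window around the
        # fixation (decay=0.0 removes it entirely).
        for i in range(max(0, r - radius), min(height, r + radius + 1)):
            for j in range(max(0, c - radius), min(width, c + radius + 1)):
                grid[i][j] *= decay
    return path


# Toy saliency map with three distinct peaks (made-up values).
saliency_map = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.3, 0.0, 0.1],
    [0.1, 0.3, 0.1, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0, 0.2],
    [0.4, 0.1, 0.0, 0.0, 0.0],
]
scanpath = greedy_scanpath(saliency_map, num_fixations=3)
print(scanpath)  # [(1, 1), (2, 4), (4, 0)]
```

Without the suppression step, every iteration would return the same peak at (1, 1). With it, the sequence visits the three peaks in order of decreasing saliency, which is the behavior that IOR-style mechanisms are meant to encourage in scanpath models.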

Keywords/Search Tags:

Visual attention, Saliency map prediction, Scanpath, Convolutional neural networks, LSTM networks

Related items

1	Research On Human Scanpath Based On Convolutional Neural Networks
2	Action Recognition Based On Deep Neural Networks With Visual Attention Mechanism
3	Saliency Prediction Based On Lightweight Attention Mechanism
4	Video Saliency Detection Based On Improved Attention Network And Data Augmentation
5	Visual Saliency Computation In Panoramic Contents
6	Research On Visual Tracking Methods Based On Object Representation Enhancement
7	Research Of Image Group Scanpath Generation And Prediction In Natural Scene
8	Salient Object Detection With Visual Attention Based Convolutional Neural Networks In Dynamic Scene
9	Research On Key Technologies Of Image Saliency Detection Based On Deep Neural Networks
10	Research On The Visual Attention Mechanism For 3D Scenes