
Predicting The Appropriate Number Of Observers For Eye-Tracking Based Video Saliency Experiment

Posted on: 2020-05-15  Degree: Master  Type: Thesis
Country: China  Candidate: C C Li  Full Text: PDF
GTID: 2428330572967375  Subject: Computer Science and Technology
Abstract/Summary:
Accurately predicting video saliency is important for applications such as video quality assessment, summarization, compression, and retargeting. A recent study has shown that no existing computational model can compete with an eye-tracking based model, even one built from a single observer, so determining video saliency from human gaze data is a promising approach. Because individual observers differ, eye-tracking data from a certain number of observers are usually required to compute a visual attention map close to the ground truth. However, recruiting a large number of observers is time-consuming and expensive. To balance accuracy against cost, this thesis proposes a new algorithm that suggests the appropriate number of observers needed in an eye-tracking experiment for a given video.

By carefully analyzing various types of video clips and the corresponding eye-tracking data, we find that video content greatly influences the number of observers needed when computing video saliency. This thesis first designs a multi-level feature model that covers three aspects: texture and motion features computed from video frames, and saliency-based features computed from saliency maps. Two algorithms are then proposed for predicting the number of eye-tracking recordings required to compute the video region of interest at a given accuracy. The first algorithm is a classification prediction model based on the support vector machine (SVM). For a given accuracy threshold, the fixation consistency curves (multidimensional feature vectors) generated from the eye-tracking data are clustered, and a mapping is established between the cluster categories and the number of eye-tracking recordings needed. An SVM-based classification model is then constructed to describe the relationship between the proposed multi-level features and the video category. The second algorithm is a multiple linear regression prediction model that predicts the appropriate number of observers needed in an eye-tracking experiment. Given the accuracy threshold, the multi-level feature vector serves as the independent variable of the regression model, and the number of eye-tracking recordings required is the dependent variable.

To verify the effectiveness of the algorithm, the region of interest generated from the predicted number of eye-tracking recordings is used in video compression. Experimental results show that this approach yields better visual quality than direct compression at the same bitrate. The main contributions of this thesis are as follows: (1) A multi-level feature model describing the video is constructed, which includes texture-based statistical features, motion features across consecutive frames, and saliency features based on salient regions. (2) To fit the mapping between the multi-level video feature model and the number of eye-tracking recordings required at a given accuracy threshold, an SVM classification prediction model and a multiple linear regression prediction model are proposed. (3) The accuracy of the region of interest generated from the corresponding number of eye-tracking recordings is measured against user-specified accuracy thresholds. For any threshold between 0 and 1, the appropriate number of observers for predicting video saliency from eye-tracking data can be suggested quickly.
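The multiple linear regression step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the use of exactly three features (one per feature level), the rounding rule, and the toy numbers are all assumptions made for illustration. It fits the weights by ordinary least squares through the normal equations, then maps a feature vector to a whole number of observers.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_regression(features, observer_counts):
    """Least-squares fit via the normal equations (X^T X) w = X^T y.

    Returns [intercept, w_1, ..., w_k]. Assumes the feature vectors are
    not collinear, so that X^T X is invertible.
    """
    x = [[1.0] + [float(v) for v in f] for f in features]  # intercept column
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(x, observer_counts)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_observers(weights, feature):
    """Predict an observer count, rounded to a whole number of at least 1.

    The rounding rule is an assumption; the abstract does not state one.
    """
    raw = weights[0] + sum(w * v for w, v in zip(weights[1:], feature))
    return max(1, round(raw))


# Toy data, invented for illustration only: each tuple is a hypothetical
# (texture, motion, saliency) feature vector for one training video, and
# each target is the observer count assumed to meet the accuracy threshold.
train_features = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
train_counts = [5, 6, 3, 9, 10]

weights = fit_regression(train_features, train_counts)
print(predict_observers(weights, (1, 0, 1)))
```

In this sketch one model would be fitted per accuracy threshold, since the abstract treats the threshold as given before the regression is applied.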
Keywords/Search Tags: video processing, visual saliency, eye-tracking data, fixation consistency, support vector machine, multiple linear regression