Research On Visual Focus Of Attention Based On Feature Fusion

Posted on: 2018-06-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X M Wang
Full Text: PDF
GTID: 1318330518468936
Subject: Computer application technology
Abstract/Summary:
Visual focus of attention is an important research topic in computer vision. It refers to the use of pattern recognition and machine learning to predict the target or direction that a subject is attending to. A visual focus of attention algorithm based on feature fusion constructs a head feature matrix through feature extraction and fusion, computes the head pose or gaze direction, and finally determines the target or direction of visual attention. In recent years, such algorithms have been widely applied in public safety, natural meetings, driver assistance, and many other fields. Although much research has been done on visual attention algorithms based on facial features, many problems remain, mainly in three aspects.

(1) The imbalance between local and global features. Feature fusion methods generally follow one of two routes. Some select weighted features of the whole image and consider only the global validity of the fusion, which does not express local features sufficiently. Others consider only local features and extract them by various methods, which makes the global feature representation highly complex. The characteristics of different regions of the same image differ: computing features over the whole image easily leads to high computational complexity, whereas extracting only a small number of features leaves local information insufficiently expressed. To extract local features as fully as possible while reducing the computational complexity of the global feature, the effective expression of both local and global features must be considered together.

(2) The low computational efficiency and high representational complexity of head pose information. Head pose is the core component of visual attention technology. Accurate head pose estimation can efficiently drive target prediction and tracking in visual attention. The head pose
estimation methods fall into three categories: appearance-based models, geometry-based models, and feature-expression-based models. Methods based on feature expression are easily disturbed by external conditions, such as head accessories and changes in head position. Appearance-based models need to be trained on a large amount of head data, and the pose information in the training samples must be labeled accurately. Geometry-based methods can run in real time but are restricted by the camera calibration parameters and the image resolution. Moreover, a single camera cannot capture depth information; even when the accuracy reaches the pixel level, an angular error of about 5 degrees remains. To express the head pose accurately and compute it efficiently, a simple and concise head pose feature matrix and head pose calculation method are needed.

(3) The ambiguity between head pose and gaze direction in visual focus of attention. Gaze direction and head pose are the two core research topics in visual focus of attention; they complement each other, and both are indispensable. Head pose or gaze direction alone cannot accurately express a person's visual focus. A single head orientation admits several potential targets of attention, so the gaze must also be determined in order to lock onto the attention target or direction. In addition, the gaze can shift under a given head orientation, which means that the attention target is changing. Current studies of visual attention usually focus on either head pose analysis or gaze direction estimation alone and do not alleviate the ambiguity between head orientation and gaze. An algorithm that alleviates this ambiguity is therefore urgently needed.

Because of the imbalance between local and global features, the complexity of head pose expression and the low
timeliness of its calculation, and the ambiguity between head orientation and gaze direction, visual focus of attention based on feature fusion remains a difficult and challenging topic. In view of these problems, this dissertation makes the following three contributions.

(1) Fully expressing local information while reducing the complexity of global features. To improve the adequacy of local feature expression and reduce the complexity of global feature expression, a balance between local and global features is sought. A local feature extraction framework based on information entropy is proposed: the weighted entropy fusion of Gabor and Phase Congruency head feature matrices. First, information entropy is used to measure the importance of the local information in the image and to determine which kind of feature can fully express the original information of each region. Then, all the local features are assembled into a global feature matrix in a concise way. Finally, the proposed algorithm is validated on public face and head datasets using machine learning classifiers and regressors.

(2) Accurate expression and efficient calculation of the head pose. To improve both the accuracy of head pose expression and the timeliness of head pose calculation, this dissertation proposes a head pose estimation algorithm based on depth information reconstruction, together with an improved weighted version. First, the LBP features of the head are used to construct an Adaboost-LBP face classifier, and the image depth is then reconstructed from the camera parameters. Based on the geometric relation between the target and the camera, the depth information is reconstructed, and a head pose estimation algorithm based on this depth information is proposed. To improve the accuracy of the reconstructed depth information, an Active Shape Model (ASM) is used to extract the 68-point face contour
model, and a weighted depth information reconstruction algorithm is constructed. Finally, the depth information is combined with the head pose through the appearance model. Both the pose estimation algorithm based on depth information reconstruction and its weighted version perform better than commonly used head pose estimation methods in accuracy and computational performance.

(3) Alleviating the ambiguity between head orientation and gaze direction. Visual focus of attention involves both the head pose and the gaze direction. A single head pose can correspond to multiple gaze directions, and the same gaze direction can occur under different head poses. Describing the visual focus by head pose or gaze direction alone therefore produces ambiguity. To alleviate this ambiguity, this dissertation presents a visual attention algorithm based on a gaze-assisted hidden Markov model. First, the head data is processed by the deep convolutional neural network VGG, and the head pose and gaze direction are computed by classifiers. Then, the gaze direction and the head pose are combined by the hidden Markov model to predict the direction or target of visual attention. Finally, real-time video data is analyzed. The results show that the proposed method weakens the visual ambiguity and improves the accuracy of attention target prediction to a certain extent.

Through the analysis of homogeneous and heterogeneous data from public image and video datasets, the following conclusions are drawn.

(1) Fusing the Gabor and Phase Congruency features with the weighted information entropy framework yields a head pose feature matrix that both expresses the local features of the head sufficiently and reduces the complexity of the global feature. This balance between global and local features improves the classification performance of the head
pose estimation algorithm.

(2) The head pose estimation algorithm based on depth information reconstruction, and its improved weighted version, reconstruct the depth information accurately, which simplifies the head pose expression and improves the estimation precision.

(3) The proposed gaze-assisted visual focus of attention algorithm combines head pose with gaze direction through a hidden Markov model to predict the direction or target of visual focus. This alleviates the ambiguity between head orientation and gaze and reduces the error of visual attention estimation.
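
The entropy-weighted fusion named in contribution (1) can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything here is an assumption: each region's Gabor and Phase Congruency responses are taken to be lists of values in [0, 1], and each feature type is weighted by its share of the region's total Shannon entropy. The names `shannon_entropy` and `fuse_regions` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=8):
    """Shannon entropy in bits of responses in [0, 1], quantized into `bins` levels."""
    counts = Counter(min(int(v * bins), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def fuse_regions(gabor_regions, pc_regions, bins=8):
    """Fuse two per-region feature lists with entropy-proportional weights.

    Assumed rule (not taken from the dissertation): in each region, the
    feature type with the higher entropy receives the larger weight.
    """
    fused = []
    for gabor, pc in zip(gabor_regions, pc_regions):
        h_gabor = shannon_entropy(gabor, bins)
        h_pc = shannon_entropy(pc, bins)
        total = h_gabor + h_pc
        # Fall back to equal weights when both responses are constant.
        w_gabor = 0.5 if total == 0 else h_gabor / total
        fused.append([w_gabor * g + (1.0 - w_gabor) * p for g, p in zip(gabor, pc)])
    return fused


# One region: a flat Gabor response carries no information, so the
# fused region follows the Phase Congruency response.
fused = fuse_regions([[0.5, 0.5, 0.5, 0.5]], [[0.0, 0.3, 0.6, 0.9]])
```

Concatenating the fused regions would give one global feature vector whose length equals that of a single feature type, which is one way to read the abstract's claim of a concise global matrix.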
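
Contribution (2) reconstructs depth from the camera parameters and the geometric relation between the target and the camera. The abstract does not state the formulas, so the sketch below uses the standard pinhole camera relation as a stand-in: an object of known physical width that spans a measured number of pixels lies at depth Z = f * W / w. The function names and the example numbers are hypothetical.

```python
def depth_from_width(focal_px, real_width_m, pixel_width):
    """Pinhole-model depth in metres: Z = f * W / w.

    focal_px     -- focal length in pixels (from camera calibration)
    real_width_m -- assumed physical width of the face in metres
    pixel_width  -- width of the detected face box in pixels
    """
    if pixel_width <= 0:
        raise ValueError("pixel_width must be positive")
    return focal_px * real_width_m / pixel_width


def back_project(u, v, depth_m, focal_px, cx, cy):
    """Map pixel (u, v) at a known depth to camera coordinates (X, Y, Z)."""
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)


# Hypothetical numbers: 800 px focal length, 0.16 m face width, 128 px box.
z = depth_from_width(800.0, 0.16, 128.0)
point = back_project(400.0, 240.0, z, 800.0, 320.0, 240.0)
```

Applying `back_project` to facial landmarks would give 3D points from which a head orientation could be estimated; how the dissertation weights the 68 contour points is not described in the abstract and is not modelled here.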
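
The gaze-assisted hidden Markov model of contribution (3) can be sketched as forward filtering over attention targets. This is an illustration under stated assumptions, not the dissertation's model: the hidden state is the attention target, and the head pose and gaze observations are assumed to be conditionally independent given the target, so their likelihoods multiply. The function name `forward_filter` and all probabilities are hypothetical.

```python
def forward_filter(prior, transition, head_lik, gaze_lik):
    """Per-frame posterior over attention targets (HMM forward algorithm).

    prior[i]         -- P(target i) before the first frame
    transition[i][j] -- P(target j at t+1 | target i at t)
    head_lik[t][i]   -- likelihood of the head pose observation at frame t
    gaze_lik[t][i]   -- likelihood of the gaze observation at frame t
    Assumes at least one target has non-zero likelihood in every frame.
    """
    n = len(prior)
    belief = list(prior)
    posteriors = []
    for t, (head, gaze) in enumerate(zip(head_lik, gaze_lik)):
        if t > 0:
            # Predict: propagate the previous belief through the transitions.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Update: multiply in both observation likelihoods, then normalize.
        belief = [belief[j] * head[j] * gaze[j] for j in range(n)]
        norm = sum(belief)
        belief = [b / norm for b in belief]
        posteriors.append(belief)
    return posteriors


# Two targets. The head pose is ambiguous in both frames; the gaze
# observation is what separates the targets.
posteriors = forward_filter(
    prior=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.1, 0.9]],
    head_lik=[[0.5, 0.5], [0.5, 0.5]],
    gaze_lik=[[0.9, 0.1], [0.8, 0.2]],
)
```

With an uninformative head pose likelihood, the posterior in the first frame is driven entirely by gaze, which matches the abstract's point that gaze can resolve the ambiguity of a single head orientation.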
Keywords/Search Tags: visual focus of attention, feature fusion, information entropy, depth information reconstruction, gaze assistance