The Research On Robust Spatial-temporal Co-occurrence Feature Extraction Algorithm For Facial Action Unit Detection

Posted on: 2021-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Zhong
GTID: 2428330614972613
Subject: Electronic and communication engineering
Abstract/Summary:
Facial action unit (AU) detection aims to have a computer detect AUs automatically, in support of facial expression recognition and sentiment analysis. The research has both theoretical significance and broad application value. This thesis proposes and implements three robust AU detection algorithms based on the semantic co-occurrence of facial AUs in the spatial and temporal dimensions. The main work is summarized as follows:

(1) An Attention-Enhanced Network (AE-Net) is proposed. First, a multi-task cascaded network performs face alignment on the image to obtain robust landmarks. Next, AE-Net uses facial prior knowledge to predefine landmark-based attention maps. Because the human face is a structured image, the model uses an attention mechanism to enhance the region-of-interest (ROI) information and suppress information that is irrelevant to AU detection. Experimental results on the BP4D and DISFA datasets verify the robustness of the features extracted by AE-Net. Compared with the fine-tuned VGG, the F1 score of AE-Net increases by 8.1% and 4.4% on the BP4D and DISFA datasets, respectively.

(2) A Multi-Scale Region Learning Network (MRL-Net) is proposed. The standard region learning layer divides regions in only one way, so it has limited ability to process the texture information of AU-related regions of different sizes. The proposed network instead integrates the complete global information of the original feature map with the detailed information of the related regions extracted by the region learning layer, which preserves the completeness and validity of the features. An adaptive threshold is also set to make the model robust to differences in people's appearance, lighting conditions and other factors. Experimental results on the BP4D and DISFA datasets demonstrate the effectiveness of the proposed model. Compared with the DSIN algorithm, the F1 score of MRL-Net increases by 2.6% and 5.8% on the BP4D and DISFA datasets, respectively.

(3) A Spatial-Temporal Co-occurrence Representation Network (STCR-Net) is proposed. The model consists of two modules: a spatial adaptive attention dual-stream fusion network and a spatial-temporal co-occurrence feature extraction network. The spatial attention model combines global attention features with local adaptive attention features, so that the model both focuses on the key regions of the entire face and captures the details of highly correlated regions, fully mining the spatial semantic correlation between AU labels. The spatial-temporal co-occurrence feature extraction network feeds the spatial features into an LSTM to build a temporal model of the AUs. Compared with the ARL algorithm, the F1 score of STCR-Net increases by 13.2% and 5.6% on the BP4D and DISFA datasets, respectively.
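
The landmark-based attention maps mentioned in item (1) can be illustrated with a small sketch. The abstract does not specify how AE-Net builds or applies its maps, so everything here is an assumption made for illustration: the Gaussian shape around each landmark, the pixelwise maximum used to merge landmarks, the element-wise multiplication with the feature map, and the hypothetical names `landmark_attention_map`, `apply_attention` and `sigma`.

```python
import math


def landmark_attention_map(landmarks, height, width, sigma=2.0):
    """Return a height x width map with values in (0, 1].

    Each (row, col) landmark contributes a Gaussian bump; overlapping bumps
    are merged with a pixelwise maximum. Both choices are assumptions, not
    the thesis's definition.
    """
    attention = [[0.0] * width for _ in range(height)]
    for r0, c0 in landmarks:
        for r in range(height):
            for c in range(width):
                dist_sq = (r - r0) ** 2 + (c - c0) ** 2
                weight = math.exp(-dist_sq / (2.0 * sigma ** 2))
                if weight > attention[r][c]:
                    attention[r][c] = weight
    return attention


def apply_attention(feature_map, attention):
    """Weight features element-wise, damping positions far from any landmark."""
    return [
        [f * a for f, a in zip(feature_row, attention_row)]
        for feature_row, attention_row in zip(feature_map, attention)
    ]


# Toy example: two hypothetical landmarks on an 8 x 8 feature map of ones.
landmarks = [(2, 2), (5, 6)]
attention = landmark_attention_map(landmarks, 8, 8, sigma=1.5)
features = [[1.0] * 8 for _ in range(8)]
weighted = apply_attention(features, attention)
```

In this toy setting the weighted features keep their full value at the landmark positions and decay towards zero elsewhere, which is one simple way to read "enhance the ROI information and suppress irrelevant information". A real network would apply such a map to convolutional feature tensors rather than nested lists.
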
Keywords/Search Tags: Facial AU detection, Multi-scale, Region learning, Adaptive attention map, Spatial-Temporal Co-occurrence