Video Facial Expression Recognition Based On Deeply Compressed Spatiotemporal Model On Edge Devices

Posted on:2021-04-06

Degree:Master

Type:Thesis

Country:China

Candidate:B Liu

Full Text:PDF

GTID:2428330611999324

Subject:Integrated circuit engineering

Abstract/Summary:

Deep learning has recently achieved remarkable results in computer vision, speech processing, and natural language processing. Facial expression recognition (FER) is one of the most important applications of computer vision, with uses in fields such as human-computer interaction, polygraph testing, and fatigue detection. Video FER can be regarded as FER on time-series data. It improves recognition accuracy, but it requires complicated neural networks to capture the essential features of video data, and such models, with their redundant parameters, place high demands on computation and storage. Cloud computing can satisfy these requirements, but usually at high cost, which makes edge computing a viable alternative. However, edge devices usually have limited computation and storage resources, which seriously hinders the deployment of deep neural networks on them. As a result, optimization is crucial for deep learning-based applications on edge devices, especially for video-based FER.

In this dissertation, we propose a deeply compressed spatio-temporal model for FER with video input. With the optimization techniques developed here, it is more suitable for edge devices than other deep learning-based methods. The system consists of two basic parts: a convolutional neural network (CNN)-based feature extractor and a long short-term memory (LSTM)-based expression classifier. The feature extractor uses a general matrix multiplication (GEMM) library and the Winograd algorithm to accelerate inference, and the expression classifier employs tensorization and tensor-train decomposition for compression, which yields further acceleration.

In experiments on the Extended Cohn-Kanade (CK+) dataset, the Acted Facial Expressions in the Wild (AFEW) 7.0 dataset, and the MMI dataset, the proposed method achieves classification accuracies of 97.96%, 55.60%, and 97.33%, respectively, and the compression rate of the tensorized spatio-temporal model reaches nearly 8.4%. The model is also deployed on various mobile platforms with specific computing units, including an ARM CPU, a neural processing unit (NPU), and the Huawei Da-Vinci AI core. In our performance evaluations, the proposed framework attains acceleration ratios of up to 1.20, 2.70, and 7.92, respectively.
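To make the tensor-train decomposition named in the abstract more concrete, here is a minimal NumPy sketch of the standard TT-SVD procedure applied to a small synthetic tensor. It is an illustration only, not the thesis's implementation: the abstract does not give the tensor shapes, TT ranks, or how the LSTM weights are tensorized, so the names `tt_svd` and `tt_reconstruct`, the shape `(4, 8, 8, 4)`, and the rank cap of 3 are all assumptions chosen for the example.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split `tensor` into tensor-train cores using sequential truncated SVDs.

    Each core has shape (rank_in, mode_size, rank_out); ranks are capped at
    `max_rank` (a hypothetical hyperparameter for this sketch).
    """
    dims = tensor.shape
    cores = []
    rank_prev = 1
    remainder = tensor.reshape(rank_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(remainder, full_matrices=False)
        rank = min(max_rank, len(s))
        # Left singular vectors become the k-th core.
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry the scaled right factor forward and expose the next mode.
        remainder = (s[:rank, None] * vt[:rank]).reshape(rank * dims[k + 1], -1)
        rank_prev = rank
    cores.append(remainder.reshape(rank_prev, dims[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the cores back into a full tensor."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    # Drop the boundary ranks, which are both 1.
    return result.squeeze(axis=(0, -1))


# Synthetic tensor with a known low TT rank (assumed shape and rank).
rng = np.random.default_rng(0)
shape, true_rank = (4, 8, 8, 4), 3
ranks = [1, true_rank, true_rank, true_rank, 1]
full = tt_reconstruct(
    [rng.standard_normal((ranks[i], shape[i], ranks[i + 1])) for i in range(4)]
)

cores = tt_svd(full, max_rank=3)
tt_params = sum(core.size for core in cores)
print("full parameters:", full.size)
print("TT parameters:  ", tt_params)
print("max abs error:  ", np.abs(tt_reconstruct(cores) - full).max())
```

Because the synthetic tensor is built to have low TT rank, the cores reproduce it almost exactly while storing far fewer numbers. Real network weights are generally not exactly low-rank, so in practice the rank cap trades accuracy against compression.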

Keywords/Search Tags:

deep learning, edge computing, video facial expression recognition, tensor compression, spatio-temporal network model

Related items

1	Video Facial Emotion Recognition Based On Deep Learning
2	Video Facial Expression Recognition Based On Deep Learning
3	Research And Application Of Facial Expression Recognition Algorithm Based On Tensor Representation And Deep Learning
4	Research On The Cognitive-affective State Recognition Based On Facial Expression Spatio-temporal Features
5	Facial Expression Recognition Based On Deep Learning
6	Facial Expression Recognition Based On Spatial-temporal Representation And Deep Learning Model
7	Embedded Realization Of A Facial Expression Recognition System Based On Improved LeNet-5
8	Research On Facial Expression Recognition Method Based On Deep Learning
9	Tensor Representation And Decomposition Algorithm And Theory Research On 3D Facial Expression Recognition
10	Students’ Classroom Facial Expression Recognition And Intelligent Classroom Teaching Evaluation Based On Deep Learning