| In recent years, online classes have been sought after by learners for their controllable learning time, optional learning content, and flexible learning location. The outbreak of COVID-19 in early 2020 also brought unprecedented development opportunities to online education. However, as online classrooms have grown in popularity, their shortcomings have gradually become apparent. One of the biggest is that teachers cannot monitor learners’ online learning status in real time and therefore cannot gauge learners’ classroom concentration, which contributes to many online classes having a high registration rate but a low pass rate. To address the difficulty of recognizing concentration in online classrooms, this paper collects learners’ classroom data solely through the cameras built into the multimedia devices that learners already use for online classes, without the help of special wearable devices. This paper uses multiple features of learners to recognize their classroom concentration. First, online classroom concentration is recognized from learners’ facial expression features; then it is recognized from learners’ head pose features; finally, it is recognized by fusing the facial expression and head pose features. The main research contents are as follows: (1) An online classroom face image dataset is established. Because few public datasets of online classroom concentration exist and related recognition research is limited, this paper constructs an online classroom face image dataset based on multiple features in a real network environment, with a total of 3025 images. In addition, this paper labels the dataset for two tasks, online classroom concentration and online classroom emotion, so that each image has two kinds of labels. (2) An online classroom concentration recognition model based on facial expression features is proposed. Firstly, in terms of 
learners’ facial expression feature extraction, this paper uses two methods. The first is Gabor feature extraction of facial expressions based on automatic ROI segmentation. The second is facial expression feature extraction based on a multi-convolution neural network, with online classroom emotions classified by an SVM classifier. Secondly, using the hierarchical measurement relationship between online classroom emotion and online classroom concentration, learners’ online classroom concentration is recognized on the self-built online classroom face image dataset. Finally, experiments show that the facial expression feature extraction method based on the multi-convolution neural network achieves a high accuracy of 76.2% in online classroom concentration recognition. (3) An online classroom concentration recognition model based on head pose features is proposed. Firstly, the learner’s head pose features are extracted by two methods, one based on face key points and one that uses no face key points, and the head pose datasets D1 and D2 are established respectively. Secondly, the head pose features in the two datasets are expanded, turning the original three-dimensional feature vector into a six-dimensional feature vector and forming the new head pose datasets K1 and K2. A Bayesian neural network model is then used to recognize online classroom concentration on the K1 and K2 datasets. Finally, experiments show that the head pose feature extraction method that uses no face key points achieves a high accuracy of 72.0% in online classroom concentration recognition. (4) An online classroom concentration recognition model based on multi-dimensional feature fusion is proposed. The facial expression recognition network and the head pose recognition network are fused, and online classroom concentration is predicted by decision fusion. Two decision fusion methods are adopted: maximum fusion and weight fusion. The experimental results show that the 
recognition accuracy of online classroom concentration based on multi-dimensional feature fusion is higher than that based on a single feature, and weight fusion achieves the highest recognition accuracy, which is 5.0% higher than that based on the expression feature alone and 9.2% higher than that based on the head pose feature alone. Figures [39], tables [22], references [108]... |
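
The two decision fusion strategies named in item (4) can be illustrated with a minimal sketch. The abstract does not give the number of concentration classes, the fusion weights, or the form of each network's output, so everything below is an assumption made for illustration: three hypothetical classes (low, medium, high), per-class probability scores from each network, and a hypothetical weight `weight_a` for the expression network. It is not the paper's implementation.

```python
from typing import List, Sequence


def max_fusion(prob_a: Sequence[float], prob_b: Sequence[float]) -> int:
    """Maximum fusion: keep, per class, the larger of the two scores,
    then predict the class with the largest fused score."""
    fused: List[float] = [max(a, b) for a, b in zip(prob_a, prob_b)]
    return fused.index(max(fused))


def weight_fusion(prob_a: Sequence[float], prob_b: Sequence[float],
                  weight_a: float = 0.6) -> int:
    """Weight fusion: convex combination of the two score vectors.
    weight_a is a hypothetical weight for model A; model B gets 1 - weight_a."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    fused = [weight_a * a + (1.0 - weight_a) * b
             for a, b in zip(prob_a, prob_b)]
    return fused.index(max(fused))


# Hypothetical scores for the classes (low, medium, high) concentration.
expression_probs = [0.10, 0.60, 0.30]   # assumed expression-network output
head_pose_probs = [0.20, 0.15, 0.65]    # assumed head-pose-network output

print(max_fusion(expression_probs, head_pose_probs))
print(weight_fusion(expression_probs, head_pose_probs, weight_a=0.7))
```

With these made-up scores the two rules disagree: maximum fusion follows the single most confident score, while weight fusion with a larger weight on the expression network follows that network's preferred class. In practice the weight would be tuned on held-out data.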