With the spread of artificial intelligence in education, more and more schools are using information technology to deliver online teaching. This mode of teaching is not constrained by time, and teachers and students need not communicate face to face in a classroom. Classroom atmosphere is an important factor in teaching quality, and in a traditional classroom teachers often regulate the atmosphere by mobilizing their own emotions. In online teaching, however, teachers lose this emotional interaction with the class. They may find it difficult to consciously control their emotions, or a depressed mood may negatively affect the teaching process. Emotion recognition and analysis of online teaching videos therefore offers a new approach to the intelligent evaluation and improvement of online teaching.

Building on a study of the characteristics of teachers' teaching videos and on related work in multi-modal emotion recognition for video, this paper focuses on two problems: segmenting long videos by emotion and recognizing emotion in teaching videos. For the segmentation problem, we propose an efficient algorithm for locating emotion conversion points in long videos. First, because teachers spend much of a lesson in a neutral emotional state, we design a neutral-segment filtering algorithm based on facial features to remove long neutral segments from teaching videos. Then, for the remaining video, we design a two-stage emotion conversion point search algorithm based on speech features. In the first stage, a loudness threshold line screens a set of candidate conversion points. In the second stage, a sliding dual-window search based on the KL distance measure runs for up to two rounds to locate the true emotion conversion points quickly.

For the recognition problem, we propose a multi-modal emotion recognition model for teaching videos. First, we design a semi-supervised iterative feature normalization algorithm to preprocess the raw speech and facial-image features; it preserves the differences between emotions while eliminating the differences between teachers' individual characteristics. Then we design a deep-learning-based multi-modal emotion recognition model that uses an attention mechanism for feature-level modal fusion, automatically computing weights for the features and providing users with accurate emotion classification.

Finally, we implement an analysis system for the teacher's in-class state in online lessons based on the above algorithms. Experimental results show that the emotion conversion point search algorithm for long videos is feasible and efficient, covering 92.58% of the true emotion conversion points, and that our multi-modal emotion recognition model reaches an accuracy of 83.2%. The analysis system provides users with complete emotion recognition results and statistical analysis displays, and it offers a sound basic framework and auxiliary data for evaluating the classroom atmosphere of online lessons and for assessing and improving teachers' online teaching quality from the emotional perspective.
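The abstract does not give the details of the second-stage search, so the following is only a minimal sketch of the dual-window idea: around each candidate conversion point, fit a Gaussian to a speech feature (e.g. loudness) in the window just before and just after the point, and score the candidate by the symmetrized KL distance between the two Gaussians. The window size and the Gaussian assumption are illustrative choices, not the thesis's actual parameters.

```python
import numpy as np

def gaussian_kl(mu1, var1, mu2, var2):
    """KL divergence between two 1-D Gaussians N(mu1, var1) and N(mu2, var2)."""
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def score_candidates(feature, candidates, win=50):
    """For each candidate frame index, fit a Gaussian to the feature values
    in the windows just before and just after it, and return the symmetrized
    KL distance. A large distance means the feature distribution changed,
    i.e. a likely emotion conversion point."""
    scores = {}
    for t in candidates:
        left = feature[max(0, t - win):t]
        right = feature[t:t + win]
        if len(left) < 2 or len(right) < 2:
            continue
        mu1, var1 = left.mean(), left.var() + 1e-8
        mu2, var2 = right.mean(), right.var() + 1e-8
        # symmetrize so the score does not depend on window order
        scores[t] = gaussian_kl(mu1, var1, mu2, var2) + gaussian_kl(mu2, var2, mu1, var1)
    return scores
```

In a two-round scheme like the one described, the first round could score the loudness-screened candidates with a coarse window and the second round could re-score the survivors with a finer window around each.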
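The semi-supervised iterative normalization is described only at a high level, so the sketch below shows one plausible shape of such a scheme: repeatedly estimate per-teacher statistics from the frames currently judged neutral and renormalize against them, so that teacher-specific offsets shrink while deviations from each teacher's own neutral baseline (i.e. emotion) survive. The `predict_neutral` callback stands in for whatever neutral-frame classifier supplies the semi-supervision; it is a hypothetical interface, not the thesis's.

```python
import numpy as np

def iterative_normalize(feats, predict_neutral, n_iter=3):
    """Sketch of semi-supervised iterative feature normalization.
    feats: (n_frames, d) feature matrix for one teacher.
    predict_neutral: callable returning a boolean mask of neutral frames
    (a stand-in for the semi-supervised neutral classifier)."""
    normed = feats.copy()
    for _ in range(n_iter):
        mask = predict_neutral(normed)
        if not mask.any():
            break
        # z-score against the teacher's own neutral statistics, so that
        # personality offsets vanish but emotional deviations remain
        mu = normed[mask].mean(axis=0)
        sd = normed[mask].std(axis=0) + 1e-8
        normed = (normed - mu) / sd
    return normed
```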
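Attention-based feature-level fusion, as described, assigns each modality a learned weight before combining features. The abstract does not specify the architecture, so here is a minimal NumPy sketch assuming both modalities are already projected to a common dimension; the scoring vector `w` stands in for learned attention parameters.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_fuse(features, w):
    """Feature-level modal fusion: score each modality's feature vector with
    a (learned) vector w, convert the scores to attention weights with a
    softmax, and return the weighted sum plus the per-modality weights.
    features: dict of modality name -> 1-D feature vector of equal length."""
    names = list(features)
    mat = np.stack([features[n] for n in names])   # (n_modalities, d)
    scores = mat @ w                               # one scalar score per modality
    weights = softmax(scores)
    fused = weights @ mat                          # weighted sum, shape (d,)
    return fused, dict(zip(names, weights))
```

The fused vector would then feed the emotion classifier; because the weights are computed from the features themselves, an uninformative modality (e.g. a face occluded in that segment) is automatically down-weighted.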