| In recent years, safe driving has received increasing attention. Driving safety is affected by vehicles, roads, drivers, and other factors. Among these, the driver is a subjective factor whose technical operation and emotional state both affect driving safety; emotion can, in turn, affect technical operation. Research on driver emotion recognition is therefore of great significance. In studies of emotion recognition models, scholars have explored multiple modalities such as vision, speech, and text. Early emotion recognition relied mainly on a single modality, and classifier performance was limited by the restricted emotional information a single modality provides, so researchers proposed multimodal emotion recognition. Human emotions are expressed mainly through two modalities: visual and speech. On this basis, this paper proposes a visual and speech multimodal emotion recognition system that takes the driver as the detection object, providing real-time feedback and prevention to help ensure safe driving. The research approach is to first extract emotion features from the visual and speech modalities, analyze the correlation between features of different modalities, establish emotion recognition models using machine learning and deep learning methods, and design a multimodal emotion recognition system for car drivers. The main research contents are as follows: A multi-level decision fusion emotion recognition model based on fuzzy rules is designed to address the information redundancy and weight assignment problems that arise during multimodal feature fusion. First, a facial expression recognition model and a speech emotion recognition model are established separately. Feature selection based on mutual information, classification and regression tree (CART), and support vector machine (SVM) algorithms is conducted to identify key features. The selected features are input into multiple SVM classifiers to obtain the classification
confidence values, and a decision-level fusion model is designed. Then, based on the classification results of the visual expression and speech modalities, a fuzzy rule method is used to assign the modality weights and complete the decision fusion. To verify the performance of the model, experiments are conducted on the SAVEE dataset, and the results show that the model effectively improves the emotion recognition rate. The results also demonstrate that the fuzzy rules still yield good recognition performance even when expression recognition performance is insufficient. Although the above emotion recognition model has a certain robustness, the decision-level fusion approach ignores information interaction across multimodal channels. In response, this paper explores multimodal feature-level fusion based on an improved Transformer architecture. First, features are extracted from the speech and visual modalities separately, and the extracted features are concatenated. Then, the concatenated features are input to a convolution layer to obtain local features. Next, the features processed by the convolution layer are input into the Encoder module of the Transformer network for training to learn deeper features. Finally, the learned features are fed into a fully connected layer to complete the emotion classification. The performance of this model is also validated on the SAVEE dataset. Feature fusion experiments and comparative experiments show that the algorithm achieves good emotion recognition accuracy. Building on these emotion recognition models, this paper designs a multimodal emotion recognition system. Audiovisual clips of car drivers in a driving environment were recorded and used as the system input for driver emotion judgment. The system mainly consists of four modules: data file preprocessing, emotion recognition algorithm, display of classification results, and emotional state
warning. The system test results show that it can effectively detect and analyze the emotions of car drivers, thereby helping to improve driving safety. |
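
As a rough illustration of the decision-level fusion step described in the abstract, the Python sketch below combines per-class confidence values from a visual classifier and a speech classifier, using a small fuzzy rule base to set the modality weights. The abstract does not specify the membership functions, the rules, or the weight values, so everything here is an assumption made for illustration: the `high` and `low` membership functions, the four rules and their weights (0.8, 0.2, 0.5), the function names `visual_weight` and `fuse`, and the use of the top confidence value as the rule input. The label order is also arbitrary.

```python
from typing import List, Tuple

# Seven emotion classes; the order is an arbitrary choice for this sketch.
EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "neutral", "sadness", "surprise"]


def high(conf: float) -> float:
    """Assumed membership in 'high confidence': 0 below 0.3, 1 above 0.7."""
    return min(1.0, max(0.0, (conf - 0.3) / 0.4))


def low(conf: float) -> float:
    """Assumed membership in 'low confidence' (complement of `high`)."""
    return 1.0 - high(conf)


def visual_weight(conf_visual: float, conf_speech: float) -> float:
    """Weight given to the visual modality, from a hypothetical rule base.

    Each rule is (firing strength, visual weight it recommends). The
    firing strength uses min as the fuzzy AND; the final weight is the
    strength-weighted average of the recommended weights.
    """
    rules = [
        (min(high(conf_visual), low(conf_speech)), 0.8),   # trust vision
        (min(low(conf_visual), high(conf_speech)), 0.2),   # trust speech
        (min(high(conf_visual), high(conf_speech)), 0.5),  # both reliable
        (min(low(conf_visual), low(conf_speech)), 0.5),    # both unreliable
    ]
    total = sum(strength for strength, _ in rules)
    return sum(strength * weight for strength, weight in rules) / total


def fuse(p_visual: List[float],
         p_speech: List[float]) -> Tuple[str, List[float], float]:
    """Fuse two per-class confidence vectors into one decision."""
    w = visual_weight(max(p_visual), max(p_speech))
    fused = [w * pv + (1.0 - w) * ps for pv, ps in zip(p_visual, p_speech)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best], fused, w


if __name__ == "__main__":
    # Visual classifier is unsure; speech classifier is confident in "anger".
    p_visual = [0.2, 0.1, 0.1, 0.2, 0.2, 0.1, 0.1]
    p_speech = [0.8, 0.05, 0.05, 0.02, 0.03, 0.03, 0.02]
    label, fused, w = fuse(p_visual, p_speech)
    print(label, round(w, 2))
```

In this example the near-uniform visual confidences fire only the "trust speech" rule, so the speech modality dominates the fused decision. This mirrors the abstract's observation that fusion can still perform well when expression recognition is weak, although the actual rules used in the paper may differ.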