
Research On Emotion Classification Of Virtual Reality Scenes Based On Joint Classification Network

Posted on: 2022-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: C D Lu
GTID: 2518306569979169
Subject: Master of Engineering

Abstract/Summary:
Emotions are the psychological and physical states produced by a variety of human feelings, thoughts, and behaviors, and they play a very important role in human social activities. The induction and recognition of emotions is a hot topic in the field of emotion research, with important application and research value in game design, psychotherapy, health monitoring, and psychological research. Virtual reality scenes offer a strong sense of immersion and high dimensionality, and their application to emotion induction has attracted extensive research attention.

At present, research on emotion induction and recognition based on virtual reality scenes faces the following problems: (1) Existing emotion induction methods include visual stimulation, sound, smell, and multi-channel induction, but there are few emotion induction materials based on virtual reality technology. (2) The SAM scale is often used to label emotions in virtual reality scenes. It costs considerable paper, reviewing manpower, and time; the explanation of its three dimensions is complicated and time-consuming; and the scale can be difficult to understand for first-time users. In addition, the datasets currently used for emotion research suffer from imbalanced samples across categories. (3) One frame of a virtual reality scene is a spherical image, which cannot be used directly as the input of a two-dimensional convolutional neural network, so the spherical image must first be projected onto a plane. Existing projection methods have shortcomings: the image after equidistant cylindrical projection is distorted, and each view of a cube projection captures only part of the original spherical image. (4) In image reconstruction, transposed convolution is often used for up-sampling; if its parameters are not set properly, the reconstructed image exhibits a "checkerboard effect" that degrades reconstruction quality.

In view of the above problems, this paper carries out the following studies:

(1) A virtual reality scene emotion-induction dataset is established. The improved K-means algorithm proposed in this paper extracts theme colors from an image dataset to generate a mood color palette, which then guides the view design and production of the virtual reality scenes (a minimal version is sketched below). The original database consists of 25 panoramic videos screened from the Internet and 6 scenes built on the Unity3D platform. After labeling the scenes with the SAM scale, and according to the SAM evaluation results and the subjects' experience feedback, 19 scenes are finally retained: 3 positive, 9 negative, and 7 neutral.
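As a concrete illustration of the theme-color extraction in (1): the abstract does not detail how the thesis improves K-means, so the sketch below uses standard scikit-learn K-means; the function name and the image path are illustrative assumptions, not artifacts of the thesis.

```python
# Minimal sketch of theme-color extraction with standard K-means.
# The thesis uses an improved K-means variant whose modifications are not
# described in this abstract; plain scikit-learn K-means stands in here.
import numpy as np
from sklearn.cluster import KMeans
from PIL import Image

def extract_theme_colors(image_path: str, n_colors: int = 5) -> np.ndarray:
    """Return the n_colors dominant RGB colors of an image, most dominant first."""
    img = Image.open(image_path).convert("RGB").resize((128, 128))  # downsample for speed
    pixels = np.asarray(img, dtype=np.float32).reshape(-1, 3)       # (N, 3) RGB samples

    km = KMeans(n_clusters=n_colors, n_init=10, random_state=0).fit(pixels)
    # Order cluster centers by how many pixels each one covers.
    counts = np.bincount(km.labels_, minlength=n_colors)
    order = np.argsort(counts)[::-1]
    return km.cluster_centers_[order].astype(np.uint8)

# Example (hypothetical file): build a small mood palette from one reference image.
# palette = extract_theme_colors("reference_scene.jpg", n_colors=5)
```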
(2) A single-class/multi-class joint network model is proposed based on deep learning. The network adopts a hierarchical structure: the single-classification network combines the generative adversarial idea with an autoencoder to distinguish normal samples from abnormal samples, and the multi-classification network classifies the normal samples again. The training process does not require the participation of classes with few samples, which alleviates the difficulty of collecting small-sample data and the resulting poor prediction performance. Compared with the traditional SAM scale, the joint network makes scene labeling intelligent.

(3) For the characteristics of spherical image cube projection, channel fusion and an attention mechanism are introduced into the input module of the proposed joint network model. The front, right, rear, and left plane images obtained by cube projection of the spherical image are merged in the channel dimension and passed through an attention module, which increases the information available from the input image while filtering noise and interference. After introducing channel fusion and the attention module, the AUC of the single-classification network GANomaly reaches 0.850, and the accuracy of the multi-classification network ResNet50 reaches 97.5%.

(4) A dual-branch up-sampling method is proposed, which applies transposed convolution, sub-pixel convolution, and an attention mechanism in the up-sampling module of the proposed joint network model (see the sketch following this abstract). With dual-branch up-sampling, the average AUC of the GANomaly model on the four datasets reaches 0.765, higher than with transposed-convolution or sub-pixel-convolution up-sampling alone. After channel fusion, the attention module, and dual-branch up-sampling are all applied, the classification accuracy of the joint classification network model on the dataset reaches 82.52%.

Based on virtual reality technology, this paper establishes a new emotion-inducing material database, and based on deep learning it proposes a single-class/multi-class joint network classification algorithm, providing a new approach to the intelligent labeling of virtual reality scenes and to the classification of imbalanced datasets. The work of this paper offers a meaningful reference for applying information technology in the field of emotion research.
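To make the dual-branch up-sampling idea in (4) concrete, the following is a minimal PyTorch sketch combining a transposed-convolution branch and a sub-pixel (PixelShuffle) branch, fused by a simple SE-style channel-attention gate. The layer sizes, the gating design, and the module name are illustrative assumptions, not the thesis's exact architecture.

```python
# Minimal sketch of a dual-branch up-sampling block (assumed design, not the
# thesis's exact module): transposed convolution + sub-pixel convolution,
# fused with a learned channel-attention gate.
import torch
import torch.nn as nn

class DualBranchUpsample(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, scale: int = 2):
        super().__init__()
        # Branch A: transposed convolution (prone to checkerboard artifacts on its own).
        self.deconv = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=scale * 2,
                                         stride=scale, padding=scale // 2)
        # Branch B: sub-pixel convolution (conv followed by PixelShuffle).
        self.subpixel = nn.Sequential(
            nn.Conv2d(in_ch, out_ch * scale * scale, kernel_size=3, padding=1),
            nn.PixelShuffle(scale),
        )
        # SE-style channel attention over the concatenated branch outputs.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * out_ch, 2 * out_ch, kernel_size=1),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(2 * out_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.deconv(x)                       # (N, out_ch, H*scale, W*scale)
        b = self.subpixel(x)                     # (N, out_ch, H*scale, W*scale)
        cat = torch.cat([a, b], dim=1)           # (N, 2*out_ch, H*scale, W*scale)
        return self.fuse(cat * self.gate(cat))   # attention-weighted fusion

# Example: up-sample a 16x16 feature map to 32x32.
# y = DualBranchUpsample(64, 32)(torch.randn(1, 64, 16, 16))  # -> (1, 32, 32, 32)
```

The intent of fusing the two branches is that the sub-pixel path can compensate for the checkerboard artifacts the transposed-convolution path may introduce, which is the motivation stated in problem (4).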
Keywords/Search Tags: Emotion Induction, Virtual Reality, Channel Fusion, Attention Mechanism, Data Set Imbalance