
Research On Emotion Content Recognition In Music Video

Posted on: 2014-02-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Li
Full Text: PDF
GTID: 1228330401463065
Subject: Digital media technology and applications
Abstract/Summary:
In recent years, with the development of computer network and digital media processing technology, the volume of digital video, images, and audio has grown rapidly, and applications built on them have become increasingly popular. Organizing and retrieving media by semantic content has become an urgent problem. However, because of differences in cultural background, people judge audio-visual media by different criteria and respond to it with different feelings, especially where the emotional semantics of media are concerned. Research on emotion recognition is therefore important for improving the efficiency of digital media annotation and retrieval and the emotional interaction capability of digital entertainment products.

Emotion is a characteristic of video and images and a basic characteristic of music. Taking the individual's emotional cognition as its starting point, this work recognizes the personalized emotional content of music video with machine learning methods applied to the video's visual and auditory features, in order to bridge the semantic gap between low-level audio-visual features and high-level human emotion semantics. The work focuses on constructing an annotated music video training set, building an emotion model with emotion subspaces, extracting audio-visual and music-theory features, recognizing personal emotion in music video, and generating music video summaries. The main research work and innovations are as follows.

Firstly, a personal affective subspace for a user's response to music video is established. Music video is an audio-visual medium closely tied to personal emotional preferences. To characterize personal emotion, a new psychological model is set up to express an individual's discrete and continuous emotional values along three dimensions: Arousal, Valence, and Preference. The psychological response values are annotated on a Likert scale.
To improve the performance of the personalized emotion space, a finite mixture of Student's t factor analyzers with Kullback-Leibler fuzzy c-means (MSFA-KLFCM) is applied to partition it into emotion subspaces. A t-distribution mixture model (TMM) is used to estimate the degree of membership in each emotion subspace. The experimental results show that the partition into subspaces expresses an individual's personalized music video emotion effectively.

Secondly, the audio-visual features of music video are extracted. Emotion recognition in music video relies on its distinctive visual and auditory features. Music can express almost every kind of human emotion. Drawing on music theory and music psychology, a group of emotion-related audio-visual features is selected. Chords, as a concept from advanced music theory, are used to express the emotion of the music, and the chord histogram is introduced as a feature. A new chord recognition method is proposed that uses the resonator time-frequency image (RTFI) to analyze the spectral characteristics of the signal. A new salient pitch profile feature is also proposed. The expectation-maximization algorithm is used to select the chord template, and beat information is used in post-processing to improve recognition accuracy. The experimental results show that the algorithm has high recognition accuracy and strong robustness.

Thirdly, a localized multiple kernel learning regression algorithm is introduced for personalized emotion recognition. The audio data of a music video has temporal dynamics. A dynamic texture representation (Mel cepstrum chroma spectrum) is presented to capture both the appearance and the dynamics of the music. The whole piece of music is regarded as a linear dynamic system, and a bag-of-systems histogram is used as the dynamic texture feature in the music video emotion recognition system.
Different visual and auditory features play different roles in identifying the personalized emotional content of music video. The localized multiple kernel learning regression algorithm is proposed to predict the personalized emotion values of a music video. The experimental results show that a recognition system combining the bag-of-systems histogram with the chord histogram identifies the personalized emotional content of music video more effectively.

Finally, a music video summarization algorithm based on image visual complexity is proposed. Keyframes are extracted from the music video to generate a static summary based on visual image complexity. A new shot boundary detection algorithm is proposed to divide the music video sequence into shots. Image visual complexity is used as a similarity measure to extract keyframe candidates. Because these candidates contain redundant information, hierarchical fuzzy c-means clustering is applied to them, and the final keyframes are extracted from the clusters to form the video summary. Objective evaluation criteria are used to assess the resulting summaries. The experimental results show that the summaries produced by the proposed methods achieve good compression rate, fidelity, and shot reconstruction degree.

The research is driven by the emotional cognitive needs of users. The mapping between visual-auditory characteristics and users' emotional values is studied, which can help users find videos that interest them and match their emotional state in large audio-visual databases. At the same time, the results of this research on affective cognition in music video can provide new ideas for applications of emotional cognition research in digital media.
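
As a rough illustration of the clustering step named in the summarization work above, the following sketch runs plain (non-hierarchical) fuzzy c-means on one visual-complexity score per keyframe candidate and keeps the candidate with the highest membership in each cluster. It is not the author's algorithm: the hierarchical variant, the complexity measure, and the selection rule are not specified in this abstract, and the function names, the fuzzifier m = 2, the evenly spaced initial centres, and the scores are all assumptions made for the example.

```python
def memberships_for(points, centres, m):
    """Standard fuzzy c-means membership update for 1-D points."""
    rows = []
    for x in points:
        dists = [abs(x - c) for c in centres]
        if min(dists) == 0.0:
            # Point sits exactly on a centre: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Return (centres, memberships) after n_iter alternating updates."""
    lo, hi = min(points), max(points)
    # Assumed deterministic initialisation: centres spread over the data range.
    centres = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(n_iter):
        u = memberships_for(points, centres, m)
        updated = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                updated.append(centres[k])  # empty cluster: keep the old centre
            else:
                updated.append(sum(w * x for w, x in zip(weights, points)) / total)
        centres = updated
    return centres, memberships_for(points, centres, m)


def pick_keyframes(scores, n_clusters):
    """Indices of the candidates with the highest membership in each cluster."""
    _, u = fuzzy_c_means(scores, n_clusters)
    picks = {max(range(len(scores)), key=lambda i: u[i][k]) for k in range(n_clusters)}
    return sorted(picks)


if __name__ == "__main__":
    # Hypothetical visual-complexity scores for six keyframe candidates.
    scores = [0.10, 0.12, 0.11, 0.80, 0.82, 0.79]
    print(pick_keyframes(scores, 2))
```

Near-duplicate candidates end up in the same cluster, so keeping one representative per cluster removes the redundancy the abstract describes. A real system would cluster richer frame features than a single scalar.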
Keywords/Search Tags: Music Video Retrieval, Emotional Computation, Emotion Recognition, Emotion Subspace, Chords, Dynamic Textures, Linear Dynamic System, Video Summarization, Visual Complexity