Humans express their emotions through music. Music is an important medium because it allows creators to express themselves and to engage listeners on an emotional level, and as technology evolves it enables ever deeper emotional engagement among listeners. With the rapid development of Music Emotion Recognition (MER), investigating the correlation between music and emotional perception has become a popular topic that has inspired many researchers. On this basis, many researchers have argued that dynamic MER must be able to extract emotional data from music at each point in time, so that the dynamic regularities linking emotion and musical elements can be analyzed in finer detail. Currently, the analysis of musical elements in dynamic MER research usually relies on algorithmic feature extraction or on quantitative calculations grounded in music theory, both of which lack subjective perceptual information for different time periods. As for data annotation, the choice of time sampling rate for existing datasets lacks theoretical support, and manual dynamic annotation of musical elements has rarely been attempted because of its difficulty. This paper constructs a Western classical music dataset with dynamic, manual annotations of both emotion and the subjective perception of musical elements, in order to analyze the dynamic association between emotion perception and musical elements across periods of change, to explore suitable time resampling rates for dynamic MER annotation, and to examine the complex process by which people perceive music. The main points and conclusions of this paper are as follows:

(1) Dynamic perceived emotions and eight musical elements, covering expressive and structural classes, were manually annotated in Western classical music. The experimental results showed that manual dynamic annotation of emotion and musical elements is feasible in dynamic MER studies, as the collected annotations show good inter-rater consistency.

(2) The time series formed by the emotion and musical-element annotations were analyzed with a top-down segmentation method. The results showed that the optimal time sampling rate for valence and arousal lies in the range of 2~3 s, while the time sampling rate of the musical elements increases with the cognitive difficulty of the elements themselves, with optimal values between 2~6 s (a minimal illustration of such a segmentation is sketched below).
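The abstract does not specify the exact segmentation criterion, so the following is only a minimal sketch, assuming per-second annotation curves stored as NumPy arrays and a variance-reduction splitting rule; the function name, `min_len`, and `threshold` are hypothetical choices, not the paper's parameters.

```python
import numpy as np

def top_down_segment(series, min_len=4, threshold=0.05):
    """Recursively split a 1-D annotation series at the point that most
    reduces the within-segment residual sum of squares (RSS).
    Returns sorted boundary indices. Illustrative only."""
    boundaries = []

    def rss(seg):
        return float(np.sum((seg - seg.mean()) ** 2)) if len(seg) else 0.0

    def split(lo, hi):
        seg = series[lo:hi]
        if len(seg) < 2 * min_len:
            return
        total = rss(seg)
        best_gain, best_k = 0.0, None
        for k in range(lo + min_len, hi - min_len + 1):
            gain = total - rss(series[lo:k]) - rss(series[k:hi])
            if gain > best_gain:
                best_gain, best_k = gain, k
        # Keep the split only if it explains enough of the variance.
        if best_k is not None and total > 0 and best_gain / total > threshold:
            boundaries.append(best_k)
            split(lo, best_k)
            split(best_k, hi)

    split(0, len(series))
    return sorted(boundaries)

# Toy example: a piecewise-constant arousal curve sampled once per second.
rng = np.random.default_rng(0)
arousal = np.concatenate([np.full(20, 0.2), np.full(30, 0.7), np.full(25, 0.4)])
arousal += 0.01 * rng.standard_normal(arousal.size)
bounds = top_down_segment(arousal)
segment_lengths = np.diff([0] + bounds + [len(arousal)])
print(bounds, np.median(segment_lengths))
```

The median segment length (in samples, here seconds) gives one rough candidate for a time sampling rate: segments much shorter than an element's perceptual resolution would over-sample it, while much longer segments would miss changes.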
(3) By comparing real-time and integrated temporal calculations, it was found that time has an additive effect on music perception. On this basis, and combined with the optimal time resampling rates, this paper proposes a hierarchical model of music perception: compared with structural elements, expressive elements are perceived more quickly when the music changes, and musical elements whose perception changes more slowly are affected, through the additive effect of time, by elements whose perception changes more quickly.

(4) Partial least squares regression was used to establish static and dynamic associations between emotion perception and musical elements, and to identify the musical elements most important for emotion perception (a toy sketch of this analysis follows). The results showed that the important musical elements affecting valence under dynamic annotation differ from those under static annotation, whereas for arousal the effects of the musical elements were similar in both settings, indicating that the important musical elements for valence perception are more susceptible to temporal change. Comparing the important musical elements for dynamic emotion perception across pieces shows that musical style differs from one musical period to another, and so the important musical elements that influence emotional perception differ as well.
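As an illustration of the regression step in (4), here is a minimal sketch using scikit-learn's PLSRegression on synthetic data; the eight element names are placeholders rather than the paper's actual taxonomy, and coefficient magnitude is used as a rough importance proxy (the paper may rank elements differently, e.g. with VIP scores).

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)

# Hypothetical data: per-window subjective ratings of eight musical
# elements (predictors) and a valence rating (response).
elements = ["tempo", "dynamics", "pitch", "timbre",
            "mode", "rhythm", "harmony", "structure"]  # placeholder names
X = rng.standard_normal((200, len(elements)))
valence = 0.8 * X[:, [0]] - 0.5 * X[:, [1]] + 0.1 * rng.standard_normal((200, 1))

pls = PLSRegression(n_components=2)
pls.fit(X, valence)

# Rank elements by absolute regression coefficient as a crude
# importance measure for emotion perception.
importance = np.abs(pls.coef_).ravel()
for name, w in sorted(zip(elements, importance), key=lambda t: -t[1]):
    print(f"{name:9s} {w:.3f}")
```

Running the same fit separately on static (whole-piece) and dynamic (per-window) annotations and comparing the resulting rankings mirrors the static-versus-dynamic comparison described above.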