| As an important carrier of multimedia information, an audio signal contains many information elements, but the attention resources of the human brain are limited. Research on auditory physiology shows that the brain selectively focuses on the audio signals it receives, filtering them and allocating more attention resources to high-attention regions. Human auditory attention involves two kinds of perceptual process: random (stimulus-driven) and non-random (knowledge-driven). Non-random perception draws on the listener's prior knowledge to focus on specific audio signals in a specific scene, whereas random perception is generally triggered directly by the environment, and the events that trigger it tend to be common to all listeners. Understanding the mechanism of human auditory perception and simulating how the auditory system attends to audio signals is of practical significance for current audio attention research and for many audio engineering applications. In particular, simulating random auditory perception can greatly reduce the complexity of subsequent audio signal processing, which is valuable for audio monitoring, video summarization, artificial intelligence, and other areas of audio and video processing. Audio attention, a measure of how much attention the human ear pays to an audio signal, can be computed bottom-up or top-down. Top-down methods rely on human prior knowledge, whereas bottom-up modeling is a fast detection approach that is better suited to engineering applications. At present, most mainstream bottom-up methods apply image saliency algorithms to the spectrogram of the audio signal to obtain the final attention value. However, the attention mechanisms for audio and for images differ. Image attention is defined over spatial regions: by comparing image features within a certain range to simulate the attention mechanism of the human visual 
system, one obtains the attention region under a given feature, an approach that often ignores the temporal characteristics of audio attention events. Audio attention typically exhibits persistence and attenuation in the time dimension: persistence means that the ear's attention is usually a continuous process, and attenuation means that this attention tends to decay over time. To address these problems, the main work and contributions of this thesis on audio attention are as follows: (1) First, the physiological characteristics of the human ear and its attention mechanism are described, and the pre-processing that specific structures of the ear apply to the audio signal is analyzed. Following this pre-processing, the thesis uses a Gammatone filter and the Meddis mathematical model to simulate the peripheral processing performed by the relevant organs of the ear. (2) After auditory peripheral processing, attention values are computed separately through an image channel and an audio channel, yielding local entropy values that reflect audio attention. The two-channel attention model takes the characteristics of both the image and the audio signal into account to a certain extent, which can further improve the accuracy of the algorithm. (3) Finally, a time-trend correlation algorithm is used to fuse the results along the time dimension, so that the persistence and attenuation of attention over time are reflected. The experimental results of the proposed audio attention calculation method, based on two-channel local entropy and time trend, are analyzed comprehensively. They show that, compared with the classical attention calculation model, the proposed model on the one hand achieves better accuracy in detecting attention events and captures the persistence and attenuation of the audio attention process. On the other hand, in terms of the complexity of feature 
calculation, the local-entropy-based modeling method of this thesis has some advantages over multi-feature attention calculation methods. |
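
The abstract does not spell out how the local entropy values are computed. As a rough illustration only, the following sketch computes a generic local Shannon entropy map over a time-frequency representation such as a spectrogram. The function name `local_entropy`, the window size `win`, the number of quantization levels `bins`, and the edge padding are all assumptions made for this example, not details taken from the thesis.

```python
import numpy as np

def local_entropy(spec, win=5, bins=16):
    """Local Shannon entropy (in bits) of a 2-D time-frequency map.

    Hypothetical helper for illustration: `win` (odd window side) and
    `bins` (quantization levels) are assumed values, not thesis settings.
    """
    spec = np.asarray(spec, dtype=float)
    lo, hi = spec.min(), spec.max()
    if hi == lo:
        # A flat input has a single level everywhere, so entropy is zero.
        return np.zeros_like(spec)
    # Quantize values to integer levels 0 .. bins-1.
    levels = np.minimum(((spec - lo) / (hi - lo) * bins).astype(int), bins - 1)
    pad = win // 2
    padded = np.pad(levels, pad, mode="edge")
    out = np.zeros(spec.shape)
    for i in range(spec.shape[0]):
        for j in range(spec.shape[1]):
            # win x win neighborhood centered on (i, j).
            patch = padded[i:i + win, j:j + win]
            counts = np.bincount(patch.ravel(), minlength=bins)
            p = counts[counts > 0] / patch.size
            out[i, j] = -np.sum(p * np.log2(p))
    return out
```

In this sketch, regions where the quantized values vary strongly within the window receive high entropy, while uniform regions receive zero, which is the sense in which a local entropy map could highlight candidate attention events. How the thesis combines the image-channel and audio-channel values, and how it applies the time-trend fusion, is not specified in the abstract and is not modeled here.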