
Research On Multiple Audio Object Coding Technology Based On Aliasing Distortion

Posted on: 2020-05-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Z Wu
Full Text: PDF
GTID: 1368330620452207
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of three-dimensional (3D) video technology, 3D audio technology is also attracting more and more attention. A 3D audio system can provide an immersive listening experience by reconstructing the spatial sound image with three degrees of freedom: horizontal, vertical, and distance. Traditional 3D audio systems are mainly channel-based, such as NHK 22.2, a 3D multi-channel audio reference system recommended by the international standards organization MPEG. That system reconstructs the spatial audio scene by playing multi-channel signals through 24 speakers at specified spatial locations.

However, channel-based 3D audio systems have some limitations. They impose fixed requirements on the number and spatial location of speakers, so it is difficult to convert audio resources between playback environments (different numbers of speakers or different speaker locations). In addition, channel signals always contain multiple audio object (sound source) signals, which makes it difficult to control an individual audio object independently. As a result, the conflict between the limitations of channel-based 3D audio systems and the demand for personalized multimedia content services is becoming more and more prominent.

To make up for the shortcomings of the traditional 3D audio system, the object-based 3D audio system was proposed. In this system, each object signal is rendered separately, so changing the attributes of one object signal does not interfere with the other object signals. Moreover, the rendering of an object signal is based on the number and spatial location of the speakers, which allows perceptually consistent reconstruction of the audio scene in different playback environments. Object-based 3D audio can better meet the growing demand for personalization, but it also faces new challenges. An audio scene contains many audio objects, so object-based 3D audio resources involve a huge amount of data. Many audio object coding techniques have been proposed to transmit multiple audio object signals efficiently.

Existing object coding technology mixes multiple audio object signals to realize joint coding. Although this achieves high compression efficiency, the reconstructed audio object signal contains perceptible components of other object signals, which is called frequency aliasing distortion. Frequency aliasing between object signals reduces the independence of the signals, so it is hard for users to completely remove an audio object or to play an audio object alone. Aliasing distortion also prevents the system from accurately reconstructing the spatial location information of an audio object. To cope with the aliasing distortion in existing object coding methods, this dissertation focuses on the following three aspects to improve the quality of audio object coding.

(1) Conditions and main influencing factors of aliasing distortion in object coding

The cause of frequency aliasing distortion has not been clear so far, and existing methods mostly rely on post-processing to reduce it, with unsatisfactory results. We analyze the typical model of existing audio object coding methods, derive the processing flow of the key modules, and express it in formulas to establish an estimation model of aliasing distortion. From this model we determine the conditions under which aliasing distortion is generated and its main influencing factors. Experimental results on general data sets show that changing the main influencing factors effectively reduces aliasing distortion and improves the coding quality of the object signals. This research therefore provides a new way to cope with frequency aliasing distortion.

(2) Research on perception characteristics of aliasing distortion and an efficient audio object coding method

To reduce the parameter coding bitrate, many existing coding methods extract the object parameters per sub-band, which leads to inaccurate reconstruction of the energy of individual frequency components and causes serious aliasing distortion. In this dissertation, we subdivide each of the original ERB sub-bands into several sub-bands to obtain different frequency-domain resolutions for the parameters, and encode the audio objects at each resolution. We then carry out experiments to estimate how aliasing distortion varies with resolution and to determine the minimum frequency resolution at which no perceptible aliasing distortion is generated. Experimental results show that the frequency resolution corresponding to 318 sub-bands is the minimum resolution at which no aliasing distortion is perceived. An object coding framework based on matrix decomposition is also proposed. Compared with existing coding methods, the proposed method achieves better coding quality at a lower parameter coding bitrate.

(3) Research on an audio object coding method based on multi-frequency-domain resolution fusion

Dimensionality-reduction compression methods, such as non-negative matrix decomposition, can reduce the parameter coding bitrate. However, these algorithms must process the complete signal spectrum or parameter matrix, which makes them unsuitable for real-time application scenarios. To solve this problem, this dissertation proposes a sub-band partition method based on the differences in the human ear's perception of frequency. We also propose an audio object coding method based on multi-frequency resolution fusion; the framework achieves audio object coding without perceptible aliasing distortion at a relatively low coding rate. Experimental results on a general audio database show that the proposed method achieves better coding quality and improves subjective sound quality by more than 10% without significantly increasing the coding bitrate. This method provides a new solution for high-quality, high-efficiency coding of multiple audio object signals in streaming media applications.
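The link between parameter resolution and aliasing distortion described in aspect (2) can be illustrated with a toy sketch. This is not the dissertation's codec: it assumes two synthetic objects with interleaved spectra, uniform sub-bands instead of subdivided ERB sub-bands, and a simple per-band energy-ratio parameter. The names `band_edges` and `leakage_db` are hypothetical.

```python
import numpy as np

def band_edges(n_bins, n_bands):
    # Assumption: uniform sub-bands; the dissertation subdivides ERB sub-bands instead.
    return np.round(np.linspace(0, n_bins, n_bands + 1)).astype(int)

def leakage_db(n_bands, n_bins=1024, seed=0):
    """Energy of object B that leaks into the reconstruction of object A, in dB
    relative to A's own energy, when one energy-ratio parameter is sent per band."""
    rng = np.random.default_rng(seed)
    # Two synthetic objects: A occupies even bins only, B occupies odd bins only.
    a = np.zeros(n_bins)
    b = np.zeros(n_bins)
    a[0::2] = rng.uniform(0.5, 1.0, n_bins // 2)
    b[1::2] = rng.uniform(0.5, 1.0, n_bins // 2)
    mix_power = a**2 + b**2  # power spectrum of the downmix (objects do not overlap)

    edges = band_edges(n_bins, n_bands)
    est_a_power = np.zeros(n_bins)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # One side-information parameter per band: A's share of the band energy.
        ratio = np.sum(a[lo:hi] ** 2) / np.sum(mix_power[lo:hi])
        # Decoder: scale every bin of the downmix in this band by the same ratio.
        est_a_power[lo:hi] = ratio * mix_power[lo:hi]

    # A is silent on odd bins, so any reconstructed power there came from B.
    leaked = np.sum(est_a_power[1::2])
    return 10 * np.log10(leaked / np.sum(a**2) + 1e-12)

if __name__ == "__main__":
    for n in (16, 1024):
        print(n, "bands:", round(leakage_db(n), 1), "dB leakage")
```

With coarse bands, every bin in a band shares one ratio, so part of B's energy is assigned to A; with one parameter per bin the leakage disappears in this toy setup. Real signals overlap in frequency, so the resolution at which leakage becomes imperceptible has to be found by listening experiments, as the abstract describes.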
Keywords/Search Tags:3D audio, audio object coding, frequency aliasing distortion, dimensionality reduction, frequency resolution fusion