Optimization Of Sound Fields Feature Reproduction Based On Data Driven Method

Posted on: 2022-07-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L K Zhang
Full Text: PDF
GTID: 1488306737461834
Subject: Communication and Information System
Abstract/Summary:
In recent years, with the growing popularity of applications such as on-demand movies over the Internet, it has become increasingly common for people to watch movies on TVs, cell phones, computers, and other digital terminals. At the same time, traditional stereo and surround audio technologies struggle to meet the rising demand for audiovisual experiences. Three-dimensional (spatial) audio technology can create an immersive spatial listening experience and is widely used in applications such as entertainment and virtual reality. With the rapid development of three-dimensional (spatial) audio technology, research institutions and standardization organizations at home and abroad have invested considerable effort, and how to create a three-dimensional spatial listening experience conveniently (with few hardware devices and loose placement requirements) has become a research hotspot in the field.

Sound field reproduction places a loudspeaker array at the target location to reproduce the original acoustic environment and thereby deliver a three-dimensional listening experience. At present, sound field reproduction on home theaters, cell phones, tablets, computers, and other digital terminals still has shortcomings that restrict the three-dimensional audio experience. First, the number of loudspeakers must reach a certain scale to deliver any degree of three-dimensional listening experience, and the reproduction result degrades as the number of loudspeakers decreases. Second, the loudspeaker array must be arranged according to specific layout requirements. The state-of-the-art sound field reproduction technique, higher-order Ambisonics (HOA), requires loudspeakers to be arranged in a regular layout (such as a sphere or a ring), whereas the loudspeaker layouts found in homes, cell phones, and tablets do not match the theory. Third, environmental reverberation has a complex effect on the reproduced sound field and remains an open research problem.

To address these needs and challenges, this thesis first studies data-driven sound field representation. Building on that representation, it then studies data-driven optimized sound field reproduction for cases with few loudspeakers or irregular arrangements. Finally, it improves the listener's 3D audio experience from another angle by studying data-driven reproduction optimized for the listening experience.

(1) Research on data-driven sound field representation

Before a three-dimensional sound field can be reproduced or its data analyzed, the sound field data must first be represented. The corresponding research topic is sound field recording. Traditional sound field recording techniques are mainly based on the spherical harmonic representation of the sound field, which is limited by the dimensionality theorem and places strong constraints on the data dimensionality required to represent a continuous sound field. All currently public sound field data contain only lower-order spherical harmonic coefficients, which can accurately represent the sound field only at low frequencies or within a small region. Recording sound field information over a large region (or at higher frequencies) requires arrays with a large number of microphones (a demanding hardware condition), which makes it difficult to obtain rich sound field data for data analysis. Based on the theory of spherical harmonic representation of the sound field, this thesis analyzes the causes of the errors in the collected higher-order spherical harmonic coefficients when the number of microphones is small, and establishes a data-driven sound field estimation model. Finally, a data-driven sound field data enhancement method is designed by applying a neural network algorithm to the sound field estimation model. Achieving data-driven sound field representation through neural networks confirms its effectiveness and offers a new idea, beyond the traditional physical-model representation, for improving the performance of sound field representation and reducing equipment requirements.

(2) Data-driven sound field optimization method

The data-driven sound field representation can obtain relevant information about the sound field from the data set, which reduces the dimensionality of the representation and can be used to improve performance in sound field reconstruction. Traditional 3D sound field reconstruction in 3D (spatial) audio applications requires large-scale, regularly arranged loudspeaker arrays, and such hardware requirements are difficult to meet in a home environment. Moreover, when the loudspeakers are few or irregularly arranged, existing techniques offer no targeted solution, which shrinks the region over which the sound field can be effectively reconstructed and increases the reconstruction error. This thesis studies the mechanism of sound field expression based on the loudspeaker configuration, learns from sound field data generated by the current loudspeaker array the sound field characteristics that the array can express accurately (i.e., the expression capability of the loudspeaker array), establishes an optimized sound field reconstruction model, and designs a data-driven optimized sound field reconstruction method that exploits this expression capability. In experimental verification, the average reconstruction error of a 4-loudspeaker array optimized by this method is 30% within a region of 16 cm radius (at a frequency of 1000 Hz), whereas the average reconstruction error of unoptimized reconstruction with the classical HOA technique is 62%, a clear improvement.

(3) Data-driven sound field experience optimization method

Traditional sound field reconstruction focuses mainly on reconstructing the sound pressure field. When the loudspeakers are few (or irregularly arranged), it is difficult to reduce the reconstruction error to a sufficiently low level, so a three-dimensional audio experience cannot be delivered effectively. To address this, the thesis incorporates a physical quantity related to subjective listening quality (particle velocity) into the optimized reconstruction process to weight the reconstruction results, safeguarding the subjective listening effect from another perspective. Building on the optimized sound field reconstruction model, the thesis extends the single-variable optimization model to a multivariable co-optimization model through a data-driven approach and, using the extracted features, designs a data-driven reconstruction method that optimizes the listening effect of the sound field, which improves the theoretical framework of data-driven sound field reconstruction optimization.
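
As a rough illustration of the dimensionality limit mentioned in part (1), the sketch below applies a rule of thumb common in the HOA literature: a sound field inside a sphere of radius r at wavenumber k is represented well up to spherical harmonic order N ≈ ⌈kr⌉, which needs at least (N + 1)^2 loudspeakers for full 3D reproduction. This is not taken from the thesis and may differ from its formulation. The function names and the speed of sound (343 m/s) are assumptions made for this example; the frequency and radius are the ones quoted in part (2).

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def truncation_order(freq_hz: float, radius_m: float, c: float = SPEED_OF_SOUND) -> int:
    """Rule-of-thumb spherical harmonic order N = ceil(k * r), with k = 2*pi*f / c."""
    wavenumber = 2.0 * math.pi * freq_hz / c
    return math.ceil(wavenumber * radius_m)


def min_loudspeakers_3d(order: int) -> int:
    """An order-N 3D expansion has (N + 1)^2 coefficients, so at least that many loudspeakers."""
    return (order + 1) ** 2


# Setting quoted in part (2): 1000 Hz, region of 16 cm radius.
order = truncation_order(1000.0, 0.16)
print(order, min_loudspeakers_3d(order))
```

Under these assumptions, the 1000 Hz, 16 cm setting calls for order 3 and at least 16 loudspeakers. A 4-loudspeaker array falls well short of that, which is consistent with the large error reported for unoptimized HOA reconstruction.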
Keywords/Search Tags: Three-dimensional audio, Spatial audio, Sound field reproduction, Loudspeaker array, Data-driven method