
Research on Non-Linear Mapping Model Based Audio Bandwidth Extension Coding

Posted on: 2018-06-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Jiang
Full Text: PDF
GTID: 1368330545999884
Subject: Computer application technology
Abstract/Summary:
Audio bandwidth extension (BWE) is a standard technique in contemporary audio codecs for coding the high-frequency signal efficiently at low bitrates. In most cases, the high-frequency signal is generated from a copy of the low band plus a few high-band parameters, relying on the correlation between the high band and the low band. When this correlation is strong, the perceptual quality of the coded signal is good; when it is weak, the perceptual quality degrades. Existing methods mostly address this by transmitting more parameters, which improves perceptual quality but also raises the coding bitrate. The replication approach rests on a linear mapping between the bands. In fact, the relation ought to be non-linear, because audio signals are complex and variable. In this dissertation, we comprehensively investigated the correlation between the high band and the low band using a data-driven method. Based on the findings, we designed a non-linear mapping model that uses deep neural networks to generate the high-frequency fine structure from the low-frequency signal. We also presented a multimode coding scheme built on the proposed non-linear mapping model. The main contributions of this dissertation are as follows:

(1) The relationship between correlation and reconstruction quality

In existing audio bandwidth extension methods, the high-frequency signal is generated from a copy of the low-frequency signal and a few high-frequency parameters. Perceptual quality therefore degrades when the correlation between the high and low bands is weak. The replication method also uses only the correlation within the current frame. We investigated the characteristics of the correlation and found that it exists not only within the current frame but also across successive frames. To reveal the mechanism behind the correlation, we proposed a quantitative measure based on the mutual information (MI) between the high and low bands. The log-spectral distortion (LSD) is computed to evaluate the perceptual quality of the coded signal, and the trend of LSD against MI is used to analyze how the correlation between the bands affects that quality. From this investigation we drew the following conclusions. First, the relationship between correlation and reconstruction quality is exponential, and the trend is concave and decreasing; if the correlation is weak (e.g. MI < 0.1), the perceptual quality declines rapidly. Second, the correlation exists not only in the current frame but also in the context frames, and the contextual correlation is strong among neighboring frames (e.g. 3 frames). Third, the correlation in frequency-domain model BWE is stronger than in source-filter model BWE.

(2) The non-linear mapping model between the high and low bands

In the existing replication method, the mapping model is defined as linear or quasi-linear. In fact, the mapping ought to be non-linear, because audio signals are complex and heterogeneous. Modeling the non-linear relation is therefore necessary to generate sufficiently accurate high-band signals. We also investigated common non-linear functions and found that their modeling capacity is limited. In this dissertation, we proposed a non-linear mapping model that generates the high-frequency fine structure using deep neural networks. We used RNNs and GANs to model the contextual correlation and the cross correlation, respectively. Combining the advantages of both, we proposed a new RNNs-GANs model that captures the whole non-linear relation between the high and low bands. The experimental results showed that the model performs very well against the baseline systems. Compared with the source-filter and frequency-domain methods, the perceptual quality of the coded signal improved by 12.05% and 17.60%, and the objective quality improved by 15.15% and 16.68%, respectively.

(3) The multimode bandwidth extension coding scheme based on the non-linear mapping model

The research above showed that the correlation differs between the time domain and the frequency domain for different typical signals, and that the perceptual quality also differs across typical signals and coding methods. In this dissertation, we proposed a new multimode bandwidth extension coding scheme. The source-filter coding method is used for speech signals and the frequency-domain coding method for music signals. To generate the high-frequency fine structure, we trained two RNNs-GANs networks to model the non-linear mapping from the low band to the high band. To resolve the distortion caused by very weak correlation, we proposed a compensation and restraint mechanism. In particular, we extract a high-frequency perceptual parameter to restore the harmonics, spline-interpolate the sub-band energy to restore the energy distribution, and smooth the energy in the time domain to remove "burr" artifacts. The experimental results showed that the scheme performs very well. Compared with the classical SBR, the subjective and objective quality improved by 5.79% and 13.27%, respectively, and the bitrate decreased by 54.5%. Compared with AMR-WB+ BWE, the subjective and objective quality improved by 20.65% and 26.04%, respectively. Compared with AVS P10 BWE, the subjective and objective quality improved by 17.03% and 24.45%, respectively. Compared with the state-of-the-art MPEG USAC eSBR and 3GPP EVS BWE, the subjective and objective quality is comparable, and the bitrate decreased by 71.4% and 47.4%, respectively. Our method is therefore competitive with state-of-the-art BWE methods while significantly reducing the bitrate.
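
Contribution (1) relates the mutual information between the bands to the log-spectral distortion of the reconstructed signal. The abstract does not say which estimator or which exact distortion formula the dissertation uses, so the following is only a minimal sketch under stated assumptions: a histogram-based MI estimate in nats, and the common per-frame RMS form of log-spectral distortion in dB. The function names, the `bins` parameter, and the feature layout are hypothetical, and the MI < 0.1 threshold quoted above may be defined on a different scale.

```python
import numpy as np


def log_spectral_distortion(ref_power, est_power, eps=1e-12):
    """Mean log-spectral distortion in dB (assumed common definition).

    Both inputs are power spectrograms of shape (frames, bins). For each
    frame we take the RMS difference of the log power spectra across bins,
    then average the per-frame values.
    """
    ref_db = 10.0 * np.log10(np.asarray(ref_power, dtype=float) + eps)
    est_db = 10.0 * np.log10(np.asarray(est_power, dtype=float) + eps)
    per_frame = np.sqrt(np.mean((ref_db - est_db) ** 2, axis=1))
    return float(np.mean(per_frame))


def mutual_information(x, y, bins=16):
    """Histogram estimate of mutual information in nats (assumed estimator).

    x and y are 1-D feature sequences of equal length, for example one
    low-band and one high-band feature per frame (hypothetical layout).
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint probability table
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nonzero = pxy > 0                      # skip empty cells to avoid log(0)
    ratio = pxy[nonzero] / (px @ py)[nonzero]
    return float(np.sum(pxy[nonzero] * np.log(ratio)))
```

Computing both quantities per signal segment and plotting distortion against MI would give a curve of the kind the abstract describes. A histogram estimator is biased upward for small sample counts, so the number of bins should be kept small relative to the number of frames.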
Keywords/Search Tags: audio coding, bandwidth extension, correlation, non-linear mapping, deep neural network