
Nonlinear Reconstitution Of Singing Voice

Posted on: 2015-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: M Gao
GTID: 2268330431455417
Subject: Signal and information processing

Abstract/Summary:
Current research on the singing voice usually borrows from speech signal analysis, using linear models and linear analysis methods. Because the human vocal system is a complex, nonlinear, time-varying system, a linear approach is not the best choice. This thesis studies the nonlinear characteristics of singing voice signals in depth, drawing on high-order statistics (HOS) and chaos theory. Singing voice signal reconstitution is approached from two directions: reconstruction from HOS and prediction under a nonlinear model.

First, singing voice signal reconstruction is investigated through HOS. HOS are immune to Gaussian noise and relate nonlinearly to the frequency spectrum, so they carry much more nonlinear and non-Gaussian information than traditional second-order statistics such as the correlation function and the power spectrum. The reconstruction methods are classified according to their theoretical bases. Nonparametric methods include the edge information method, the BMU method, the Lii method, the least squares method, the recursive method, and the DFT reconstruction method. Parametric methods include the harmonic reconstruction method and the bispectrum cepstrum reconstruction method. According to the simulation results, the singing voice signal reconstructed by the least squares method has the best auditory quality. Because singing voice signals do not fully satisfy the models assumed by the parametric methods, those methods cannot produce reconstructed signals of ideal auditory quality.

Second, the nonlinear characteristics of singing voice signals are studied from the perspective of chaos theory. Based on phase space reconstruction, the features of singing voice signals are found to be similar to those of typical chaotic sequences, which indicates that singing voice signals do have chaotic characteristics. The features examined include the phase space trajectory, the Lyapunov exponent, the principal component spectrum, and the power spectrum, among others.
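Phase space reconstruction is commonly carried out by time-delay embedding, in which each point of the reconstructed trajectory is a vector of samples spaced a fixed delay apart. The following is a minimal sketch of that general idea, not the thesis's own implementation. The function name `delay_embed`, the parameter names `dim` and `tau`, and the toy integer input are assumptions made for illustration.

```python
from typing import List, Sequence


def delay_embed(x: Sequence[float], dim: int, tau: int) -> List[List[float]]:
    """Return delay vectors [x[n], x[n + tau], ..., x[n + (dim - 1) * tau]].

    dim is the embedding dimension and tau is the delay in samples
    (both names are illustrative, not taken from the thesis).
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(x) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal is too short for this dim and tau")
    return [[x[n + k * tau] for k in range(dim)] for n in range(n_vectors)]


# Toy input: the integers 0..9 stand in for singing voice samples.
vectors = delay_embed(list(range(10)), dim=3, tau=2)
print(vectors[0], vectors[-1], len(vectors))
```

Plotting the first two or three coordinates of each vector gives the kind of phase space trajectory referred to above. The choice of `dim` and `tau` matters in practice, which is why the thesis compares different ways of estimating them.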
A nonlinear prediction model combined with a neural network is also built to predict unknown samples of singing voice signals. The prediction experiments consider different phase space parameter estimates and different sample rates. The results show that jointly estimating the optimal delay and the embedding dimension gives better prediction. When the signal is predicted in a rolling fashion, the result is acceptable in the short term, while long-term prediction is degraded by cumulative error.

Finally, the Volterra series expansion is adopted for nonlinear fitting of singing voice signals. Signals are predicted under both a time-domain second-order Volterra model and a phase-space second-order Volterra model. Different methods of obtaining the kernel coefficients, different prediction ranges, and different sample rates are considered. The experimental results show that using singular value decomposition to determine the kernel coefficients, a shorter prediction range, and a higher sample rate all benefit singing voice signal prediction, and that the phase-space second-order Volterra model outperforms the time-domain second-order Volterra model when predicting low-sample-rate signals.
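A second-order Volterra predictor forms its output from a constant term, a linear combination of recent samples, and a combination of their pairwise products. The sketch below only evaluates such a model for coefficients that are assumed to be already known. It does not show how the thesis determines them, and the function names, the feature ordering, and the numeric values are hypothetical.

```python
from typing import List, Sequence


def volterra_features(window: Sequence[float]) -> List[float]:
    """Build [1, x_i ..., x_i * x_j for i <= j] from a window of samples.

    The ordering of the terms is an illustrative convention only.
    """
    feats = [1.0]
    feats.extend(float(v) for v in window)
    m = len(window)
    for i in range(m):
        for j in range(i, m):
            feats.append(float(window[i]) * float(window[j]))
    return feats


def volterra_predict(window: Sequence[float], coeffs: Sequence[float]) -> float:
    """Evaluate a second-order Volterra model with given kernel coefficients."""
    feats = volterra_features(window)
    if len(feats) != len(coeffs):
        raise ValueError("coefficient count does not match window length")
    return sum(c * f for c, f in zip(coeffs, feats))


# Hypothetical window of two samples and made-up coefficients.
window = [2.0, 3.0]
coeffs = [1.0, 2.0, -1.0, 0.5, 0.0, 1.0]
prediction = volterra_predict(window, coeffs)
print(prediction)
```

Because the model is linear in its coefficients, stacking the feature vectors of many windows into a matrix turns coefficient estimation into a linear least squares problem, which is where a singular value decomposition can be applied. For a phase-space variant, the window would be a delay vector rather than consecutive samples.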
Keywords/Search Tags: singing voice signal, high-order statistics, chaos, reconstruction, prediction