
Singing Speech Recognition Based On Convolution Neural Network

Posted on: 2019-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Wu
Full Text: PDF
GTID: 2348330545999949
Subject: Information and Communication Engineering
Abstract/Summary:
The quality and state of the singing voice matter greatly for the scientific selection, teaching, and training of talented singers, and for the diagnosis of voice disorders, so an effective method is needed to evaluate the voice scientifically and objectively. Many problems remain unsolved, however: acoustic parameters are studied in isolation, information fusion is inefficient, algorithms are not robust, recognition accuracy is low at low signal-to-noise ratio (SNR), and the available information is underused in the evaluation stage. To address these problems, this thesis takes the convolutional neural network (CNN) as its base network. Through preprocessing, parameter optimization, and structural adjustment, the traditional two-dimensional CNN (2DCNN) is adapted into a one-dimensional CNN (1DCNN) that is better suited to one-dimensional sound signals, and a quality evaluation method for the singing voice based on the 1DCNN is proposed.

First, the voice signal is preprocessed. Speech analysis techniques are used to extract singing acoustic parameters, including the first formant, the third formant, the fundamental frequency, the pitch, the fundamental frequency perturbation, the first formant perturbation, the third formant perturbation, and the average energy, and these basic features are fused and reorganized to form the input signal. In addition, the low-frequency coefficient signal is reconstructed by wavelet decomposition, and the reconstructed noisy voice signal is detected and analyzed using the idea of higher-order cumulants, which yields a pitch detection method based on an improved wavelet transform. Experiments show that the proposed method improves the accuracy of pitch detection at low SNR, requires little computation, and leaves the spectrum and information content of the voice signal intact.

Second, an improved 1DCNN model for one-dimensional sound signals is proposed. The model is obtained by adjusting the structure of the traditional 2DCNN and describes the time-varying characteristics of a one-dimensional voice signal better. To address the long training time of CNNs, the idea of the fractional-order neural network is incorporated: the Sigmoid function is used as a fractional-order processing node, giving an improved 0.5-order fractional 1DCNN model. Experiments show that the proposed model converges faster, that the CNN training time is shortened, and that the accuracy of the voice quality evaluation system is 85.7%, which is 5.4% higher than that of the traditional 2DCNN.

Finally, a singing voice quality evaluation method based on the 1DCNN model is proposed. The simulation experiments were carried out on the Matlab R2016a platform, and error statistics were obtained by comparing the predicted evaluations with the subjective evaluations of professionals. A comparative experiment was also conducted among the error back-propagation (BP) neural network, the wavelet neural network, the traditional 2DCNN, and the proposed method. The results show that the average error of the proposed method is 0.23, which is 0.50 lower than that of the BP neural network and 0.33 lower than that of the wavelet neural network. The CNN-based singing voice quality evaluation method proposed in this thesis addresses the problems of feature information fusion and availability, pitch period detection at low SNR, and the construction and training efficiency of a one-dimensional convolutional neural network. It evaluates the quality level of the singing voice objectively and effectively, with good robustness and portability. It can be used not only for singing voice evaluation but also for applications such as voice disorder diagnosis, and therefore has considerable application value.
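
The pitch detection idea described in the abstract (keep the low-frequency wavelet coefficients, then estimate the pitch period from them) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: it assumes a single-level Haar wavelet, a clean synthetic tone, and plain autocorrelation in place of the higher-order cumulant analysis. The function names `haar_approximation` and `autocorr_pitch`, the 8000 Hz sample rate, and the 80 to 500 Hz search range are all hypothetical choices made for illustration.

```python
import math


def haar_approximation(signal):
    """One level of Haar decomposition: keep only the low-frequency
    (approximation) coefficients, which halves the sample rate."""
    return [
        (signal[i] + signal[i + 1]) / math.sqrt(2.0)
        for i in range(0, len(signal) - 1, 2)
    ]


def autocorr_pitch(signal, sample_rate, f_min=80.0, f_max=500.0):
    """Estimate pitch in Hz as sample_rate / (lag of the autocorrelation peak).

    Assumption: autocorrelation stands in for the cumulant-based
    detection step mentioned in the abstract.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    n = len(signal)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(signal[i] * signal[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic test tone: 0.1 s of a 200 Hz sine sampled at 8000 Hz.
sample_rate = 8000
true_pitch = 200.0
tone = [math.sin(2 * math.pi * true_pitch * t / sample_rate) for t in range(800)]

low_band = haar_approximation(tone)                    # 400 samples at 4000 Hz
estimate = autocorr_pitch(low_band, sample_rate / 2)   # pitch estimate in Hz
```

Working on the approximation coefficients halves the number of samples the autocorrelation has to scan, which is one plausible reading of the abstract's claim that the computation is small.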
Keywords/Search Tags:convolution neural network, acoustic parameters, artistic voice, pitch cycle, quality evaluation
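
To make the difference between a 2DCNN and a 1DCNN concrete, the following Python sketch shows one one-dimensional convolution block: a kernel slides along a single axis of the feature sequence, followed by a Sigmoid activation and max pooling. It is an illustrative sketch only. The kernel values, the pooling size, and the names `conv1d`, `max_pool1d`, and `conv_block` are assumptions, and the activation is the ordinary Sigmoid, not the 0.5-order fractional variant proposed in the thesis, whose exact form the abstract does not give.

```python
import math


def sigmoid(x):
    """Ordinary logistic Sigmoid (not the thesis's fractional-order node)."""
    return 1.0 / (1.0 + math.exp(-x))


def conv1d(signal, kernel, bias=0.0):
    """Valid-mode 1D cross-correlation, the operation CNN layers call convolution."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


def conv_block(signal, kernel, bias=0.0, pool=2):
    """Convolution, then Sigmoid activation, then max pooling."""
    activated = [sigmoid(v) for v in conv1d(signal, kernel, bias)]
    return max_pool1d(activated, pool)


# Hypothetical fused feature sequence and a simple difference kernel.
features = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0]
edge_kernel = [-1.0, 0.0, 1.0]

feature_map = conv1d(features, edge_kernel)   # 6 values for 8 inputs, kernel 3
pooled = conv_block(features, edge_kernel)    # 3 values after pooling by 2
```

A 2D layer would slide a two-dimensional kernel over a two-dimensional input such as an image, whereas here the kernel has a single axis that follows the order of the input sequence, which matches the abstract's motivation for moving to a 1DCNN.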