Research On Singing Voice Conversion

Posted on: 2017-12-09 | Degree: Master | Type: Thesis
Country: China | Candidate: P Fang
GTID: 2348330485951794 | Subject: Control Science and Engineering
Abstract/Summary:
A speech signal carries both linguistic information and acoustic information about the speaker. Voice conversion is a technique that modifies the acoustic information of speech from a source speaker so that it is perceived as spoken by a specific target speaker, while leaving the linguistic information unaltered. Generally, voice conversion is realized by changing the timbre and pitch of the source speaker. A wide variety of work on voice conversion has been done, and the technology is mature to some extent. Although singing voice conversion is similar to voice conversion, it has not been widely researched because it is more specialized and more difficult than voice conversion. Against this background, this thesis studies singing voice conversion in more detail. We propose several algorithms for converting the singing voice and build a complete singing voice conversion system. The main contributions of this dissertation are as follows:

1. To enable singing voice conversion, we recorded a singing voice database of a professional singer (the source singer). We also recorded a database of the target singer, used to extract his voice features and to evaluate the singing voice conversion algorithms. Because pitch inconsistency between the source and target singers may lead to conversion errors, the singers were required to sing at the pitch given by the score. In total we recorded 132 minutes of singing voice in Chinese, which provides a reliable database for singing voice conversion.

2. In conventional voice conversion, the quality of the converted singing voice suffers from fundamental frequency (F0) extraction errors and from errors in generating the excitation signal. To improve the quality of the converted singing voice, we apply the mel log spectrum approximation (MLSA) filter and synthesize the converted singing voice by filtering the source singing waveform. Experiments show that the new method produces better converted singing voice.

3. Although the Gaussian mixture model (GMM) based conversion method performs well, it overfits easily when the dataset is small. Because it is difficult to obtain enough suitable singing voice data in singing voice conversion applications, we propose a singing voice conversion method based on kernel fuzzy clustering and partial least squares (PLS) regression, which gives better conversion results than the GMM-based method when the training data is inadequate.
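The abstract does not give the exact formulation of the kernel fuzzy clustering step in the third contribution. As a rough illustration only, the sketch below computes soft cluster memberships for one feature frame using the common kernel fuzzy c-means update with a Gaussian kernel, where the kernel-induced squared distance is 2(1 - K(x, v)). The function names, the fuzzifier `m`, the kernel width `sigma`, and the toy cluster centers are all assumptions made for this example and are not taken from the thesis.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_fuzzy_memberships(x, centers, m=2.0, sigma=1.0, eps=1e-12):
    """Soft membership of frame x in each cluster (kernel fuzzy c-means rule).

    Hypothetical helper: m is the fuzzifier (m > 1), sigma the kernel width.
    """
    # Kernel-induced squared distance to each center: d^2 = 2 * (1 - K(x, v)).
    dists = [2.0 * (1.0 - gaussian_kernel(x, v, sigma)) for v in centers]

    # A frame that coincides with one or more centers is assigned to them fully.
    at_center = [d < eps for d in dists]
    if any(at_center):
        n = sum(at_center)
        return [1.0 / n if hit else 0.0 for hit in at_center]

    # Standard fuzzy c-means membership: inverse distance raised to 1/(m-1),
    # normalized so the memberships sum to 1.
    weights = [d ** (-1.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example with two made-up cluster centers in a 2-D feature space.
centers = [[0.0, 0.0], [1.0, 1.0]]
u = kernel_fuzzy_memberships([0.2, 0.1], centers)
```

The memberships sum to 1, and the nearer center receives the larger weight. One plausible use, again an assumption, is to weight each source frame by these memberships before fitting a PLS regression from source to target features, which keeps the number of free parameters small when training data is scarce.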
Keywords/Search Tags: singing voice conversion, singing voice, kernel fuzzy clustering, partial least squares regression, mel log spectrum approximation filter, spectrum