Research On Singing Voice Conversion

Posted on:2017-12-09

Degree:Master

Type:Thesis

Country:China

Candidate:P Fang

Full Text:PDF

GTID:2348330485951794

Subject:Control Science and Engineering

Abstract/Summary:

A speech signal carries both linguistic information and acoustic information about the speaker. Voice conversion is a technique that modifies the acoustic information of an utterance spoken by a source speaker so that it is perceived as spoken by a specific target speaker, while leaving the linguistic information unaltered. Generally, voice conversion is realized by changing the timbre and pitch of the source speaker. A wide variety of work on voice conversion has been done, and the technology is mature to some extent. Although singing voice conversion is similar to voice conversion, it has not been widely researched, because it is more specialized and more difficult than voice conversion. Against this background, this thesis studies singing voice conversion in more detail. We propose several algorithms for converting the singing voice and build a complete singing voice conversion system. The main contributions of this dissertation are as follows:

1. To enable singing voice conversion, we recorded a singing voice database of a professional singer (the source singer). We also recorded a database of the target singer, which is used to extract his voice features and to evaluate the singing voice conversion algorithms. Because pitch inconsistency between the source and target singers may lead to conversion errors, the singers were required to sing according to the pitch of the score. In total, we recorded 132 minutes of singing voice in Chinese, which provides a reliable database for singing voice conversion.

2. In conventional voice conversion, the quality of the converted singing voice suffers from fundamental frequency extraction errors and excitation signal generation errors. To improve the quality of the converted singing voice, we apply the mel log spectrum approximation (MLSA) filter and synthesize the converted singing voice by filtering the source singing waveform. In our experiments, the new method produces better-quality converted singing voice.

3. Although the Gaussian mixture model (GMM) based conversion method performs well, it is prone to overfitting when the dataset is small. Because it is difficult to obtain enough suitable singing voice data in practical applications of singing voice conversion, we propose a conversion method based on kernel fuzzy clustering and partial least squares (PLS) regression, which achieves better conversion results than the GMM-based method when the training data is insufficient.
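
The kernel fuzzy clustering step named in the third contribution can be illustrated with a small sketch. The abstract does not give the thesis's actual formulation, so everything here is an assumption: it uses one common variant of kernel fuzzy c-means with a Gaussian kernel and cluster centers kept in the input space, and the function names, the kernel width `sigma` and the fuzzifier `m` are hypothetical. The PLS regression stage that would map source features to target features is not shown.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def memberships(points, centers, sigma, m=2.0, eps=1e-12):
    """Soft membership of every point in every cluster; each row sums to 1."""
    rows = []
    for x in points:
        # For a Gaussian kernel, the kernel-induced squared distance is
        # proportional to 1 - K(x, v); eps avoids division by zero when a
        # point coincides with a center.
        w = [
            (1.0 - gaussian_kernel(x, v, sigma) + eps) ** (-1.0 / (m - 1.0))
            for v in centers
        ]
        total = sum(w)
        rows.append([wi / total for wi in w])
    return rows


def update_centers(points, centers, u, sigma, m=2.0):
    """Re-estimate each center as a kernel- and membership-weighted mean."""
    new_centers = []
    for i, v in enumerate(centers):
        weights = [
            (u[k][i] ** m) * gaussian_kernel(x, v, sigma)
            for k, x in enumerate(points)
        ]
        total = sum(weights)
        new_centers.append(
            [
                sum(w * x[d] for w, x in zip(weights, points)) / total
                for d in range(len(v))
            ]
        )
    return new_centers


def kernel_fuzzy_cmeans(points, init_centers, sigma=1.0, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = [list(c) for c in init_centers]
    for _ in range(iters):
        u = memberships(points, centers, sigma, m)
        centers = update_centers(points, centers, u, sigma, m)
    return centers, memberships(points, centers, sigma, m)
```

Calling `kernel_fuzzy_cmeans(frames, initial_centers)` on a list of spectral feature vectors returns the final centers and one row of soft memberships per frame. In a clustering-plus-regression scheme, such memberships would typically serve as the weights or inputs of the regression stage, which is one reason soft assignments can be attractive when training data is scarce.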

Keywords/Search Tags:

singing voice conversion, singing voice, kernel fuzzy clustering, partial least squares regression, mel log spectrum approximation filter, spectrum

Related items

1	Statistical Model Based Mandarin Chinese Singing Voice Synthesis
2	A Study On Pitch Based Beautification System Of Singing Voice
3	Research On Resonance State Of Singing Voice Signal
4	Research On Synthesis Methods Of Singing Oriented To Timbre Conversion
5	Research On Mandarin Singing Synthesis Based On Wavenet Architecture
6	Melody Extraction From Singing Voice Of Polyphonic Music
7	Nonlinear Reconstitution Of Singing Voice
8	Research On Mandarin Singing Synthesis Based On Deep Learning
9	Research On Singing Voice Separation Of Mono Music
10	Monaural Singing Voice And Accompaniment Separation Research Based On U-Net Architecture