
Hardware Implementation Of Statistical Parametric Speech Synthesis

Posted on: 2018-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: S Zhang
GTID: 2428330515995577
Subject: Circuits and Systems
Abstract/Summary:
The development of artificial intelligence has led to the wide application of speech synthesis technology in embedded devices, such as voice-oriented smart home appliances, that make life more convenient. Because embedded devices have limited memory and processor speed, real-time processing efficiency is significantly constrained when running complex speech synthesis algorithms. Moreover, most embedded speech synthesis devices can only synthesize a single voice message. This thesis applies statistical parametric speech synthesis to embedded devices to realize hidden Markov model (HMM)-based speech synthesis on an embedded system. Because HMM-based speech synthesis is a parametric method, the back end of the synthesizer has little dependence on the corpus: once the acoustic models have been trained on a PC with a training corpus, the models and the synthesis algorithms can easily be ported to the embedded system. The trained HMM models are also small, which makes them suitable for embedded systems. The thesis uses the Feiling OK6410 development board, built around the ARM11-based S3C6410 processor, as the hardware platform.

The main work and contributions of the thesis are as follows.

First, the thesis trains the HMM-based acoustic models on a PC. We set up an HMM-based speech synthesis framework on a PC server and then train the context-dependent acoustic models, which include fundamental frequency models, spectrum models, duration models, and decision trees, on a large training corpus. These models are loaded into the flash memory of the development board for back-end speech synthesis.

Second, the thesis ports the algorithms and acoustic models to the embedded system for back-end speech synthesis. The Linux operating system is first set up on the development board. The thesis then ports the text analysis module, the parameter generation module, and the mel log spectrum approximation (MLSA) filter module. The text analysis module, which is compiled as a separate library so that other modules can call it, generates context-dependent labels from the input text. The parameter generation module generates the speech parameters, namely the F0 excitation parameters and the mel-generalized cepstral (MGC) spectral parameters, from a sentence-level acoustic model. This sentence model is composed by concatenating the context-dependent synthesis unit models that the decision trees select according to the context-dependent labels. The MLSA filter then generates the synthesized speech from the F0-based excitation and the MGC filter coefficients.

Finally, the thesis conducts objective and subjective evaluations to test the performance of the resulting embedded speech synthesis system. The test results show that the system can synthesize high-quality speech on an embedded system.
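The back-end data flow described above (labels, decision-tree model selection, parameter generation, excitation) can be illustrated with a toy sketch. This is not the thesis's code: every name, question, and number below is hypothetical, the "text analysis" is a one-letter-per-phone stand-in, parameter generation simply repeats each model's mean for its duration (it ignores dynamic features), and the MLSA filter stage is omitted entirely.

```python
import random
from dataclasses import dataclass


@dataclass
class StateModel:
    """Toy stand-in for a trained context-dependent unit model (values invented)."""
    duration: int   # number of frames this unit lasts
    f0: float       # fundamental frequency in Hz; 0.0 marks an unvoiced unit
    mgc: list       # mean spectral coefficients (toy, two values only)


# Hypothetical decision-tree questions about a context-dependent label.
def is_vowel(label):
    return label["phone"] in "aeiou"


def is_stressed(label):
    return label["stressed"]


# Internal nodes are (question, yes_branch, no_branch); leaves are StateModel.
TREE = (
    is_vowel,
    (is_stressed, StateModel(8, 140.0, [0.9, 0.2]), StateModel(5, 120.0, [0.7, 0.1])),
    StateModel(3, 0.0, [0.1, -0.3]),
)


def text_to_labels(text):
    """Toy text analysis: one label per letter, uppercase meaning 'stressed'."""
    return [{"phone": ch.lower(), "stressed": ch.isupper()} for ch in text if ch.isalpha()]


def select_model(label, node=TREE):
    """Walk the decision tree until a leaf model is reached."""
    while not isinstance(node, StateModel):
        question, yes_branch, no_branch = node
        node = yes_branch if question(label) else no_branch
    return node


def generate_parameters(labels):
    """Concatenate the selected models into frame-level F0 and MGC tracks."""
    f0_track, mgc_track = [], []
    for label in labels:
        model = select_model(label)
        f0_track += [model.f0] * model.duration
        mgc_track += [model.mgc] * model.duration
    return f0_track, mgc_track


def make_excitation(f0_track, sample_rate=16000, frame_shift=80, seed=0):
    """Pulse train for voiced frames, low-level noise for unvoiced frames."""
    rng = random.Random(seed)
    samples = []
    next_pulse = 0.0  # sample index at which the next pulse is due
    for i, f0 in enumerate(f0_track):
        for n in range(i * frame_shift, (i + 1) * frame_shift):
            if f0 > 0:
                if n >= next_pulse:
                    samples.append(1.0)
                    next_pulse = n + sample_rate / f0
                else:
                    samples.append(0.0)
            else:
                samples.append(rng.gauss(0.0, 0.1))
                next_pulse = n + 1  # first voiced sample starts with a pulse
    return samples


labels = text_to_labels("bA")
f0_track, mgc_track = generate_parameters(labels)
excitation = make_excitation(f0_track)
```

In a real system the excitation and the MGC track would be passed to the MLSA filter to produce the waveform; that stage is left out here because a faithful filter implementation does not fit a short sketch.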
Keywords/Search Tags: Speech Synthesis, HMM Statistical Parameter, Embedded Device, Linux operating system, ARM