
Research On Methods Of Text Analysis For Tibetan Statistical Parametric Speech Synthesis

Posted on: 2018-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: X J Kong
Full Text: PDF
GTID: 2428330515995575
Subject: Physical Electronics
Abstract/Summary:
In speech synthesis research, HMM-based statistical parametric speech synthesis has in recent years been applied not only to Chinese and other well-studied languages but also, increasingly, to Tibetan and other minority languages. Text analysis has a great influence on the naturalness and intelligibility of synthetic speech. However, front-end text analysis is often overlooked in current Tibetan speech synthesis work, so research on Tibetan text analysis remains at an early stage.

Both the training stage and the synthesis stage of the method require text analysis. In the training stage, text analysis of the input Tibetan text yields the mono-phone labeling and the context-dependent labeling information needed by the back-end synthesis stage. In the synthesis stage, text analysis of the input Tibetan text again yields the context-dependent labeling information. Mono-phone labeling refers mainly to the initials and finals of the input Tibetan text. Context-dependent labeling refers mainly to the detailed labels and position information at the initials/finals layer, the syllable layer, the word layer, the phrase layer, and the sentence layer. The purpose of this paper is to obtain the mono-phone labeling and context-dependent labeling information of the input Tibetan text through text analysis, and to replace manual labeling with automatic labeling. The main work and innovations of this paper are as follows.

First, the design of a Tibetan Lhasa dialect corpus. Tibetan belongs to the Sino-Tibetan language family and is a widely used minority language. This study focuses on the Lhasa dialect as a typical representative of Tibetan. By selecting from a large amount of Tibetan text, we built a small corpus of 800 sentences and a large corpus of 2000 sentences. We also invited two male and two female Tibetan students from the nationalities university of our school to record in a professional recording studio, to ensure high-quality speech for the back-end synthesis stage.

Second, the design of the text analysis methods. By screening the Tibetan corpus, we designed a Tibetan dictionary library containing initials/finals, syllables, words, and phrases, for use with the forward maximum matching algorithm. Because a special Tibetan symbol appears between every two syllables and between every two sentences, we designed an algorithm that uses this feature to obtain the syllable and sentence information of the input Tibetan text, and we obtain the initials/finals information by decomposing Tibetan characters. We designed a dictionary-based maximum matching algorithm to obtain the word and phrase information of the input Tibetan text. For the context-dependent labeling, we designed a five-layer labeling format together with an algorithm that derives the context-dependent labeling information from the input Tibetan text. Using HTS, we then synthesized speech from Tibetan text with the information obtained through text analysis.

Finally, to test the quality of the synthetic speech, we carried out MOS and DMOS subjective evaluations of the synthetic Tibetan speech. With the 800-sentence corpus, the average MOS and DMOS scores of the synthetic speech were 3.0 and 3.3 points respectively, which indicates that the text analysis methods used in this paper are feasible with respect to the naturalness and similarity of the synthetic speech.
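The syllable, sentence, and word segmentation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that the "special Tibetan symbol" between syllables is the tsheg (U+0F0B) and that the one between sentences is the shad (U+0F0D), and that the dictionary library can be represented as a set of tsheg-joined words. The function names, the `max_len` limit, and the two-entry toy lexicon are hypothetical and chosen only for illustration.

```python
TSHEG = "\u0f0b"  # assumed syllable delimiter (Tibetan mark intersyllabic tsheg)
SHAD = "\u0f0d"   # assumed sentence/clause delimiter (Tibetan mark shad)


def split_sentences(text):
    """Split raw text into sentences at every shad, dropping empty pieces."""
    return [part.strip() for part in text.split(SHAD) if part.strip()]


def split_syllables(sentence):
    """Split one sentence into syllables at every tsheg, dropping empty pieces."""
    return [part.strip() for part in sentence.split(TSHEG) if part.strip()]


def forward_maximum_match(syllables, lexicon, max_len=4):
    """Greedy left-to-right segmentation of a syllable list into words.

    At each position the longest run of up to max_len syllables that is
    present in the lexicon is taken as a word. A single syllable is always
    accepted as a fallback, so the loop is guaranteed to advance.
    """
    words = []
    i = 0
    while i < len(syllables):
        longest = min(max_len, len(syllables) - i)
        for n in range(longest, 0, -1):
            candidate = TSHEG.join(syllables[i:i + n])
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


if __name__ == "__main__":
    # Toy input: a four-syllable greeting followed by a shad.
    text = "བཀྲ་ཤིས་བདེ་ལེགས།"
    # Hypothetical two-entry lexicon; a real dictionary library would be far larger.
    lexicon = {"བཀྲ་ཤིས", "བདེ་ལེགས"}
    for sentence in split_sentences(text):
        syllables = split_syllables(sentence)
        print(syllables)
        print(forward_maximum_match(syllables, lexicon))
```

Matching over syllables rather than raw characters keeps word boundaries aligned with the tsheg, and the single-syllable fallback means out-of-dictionary syllables are passed through instead of stopping the segmentation. Real Tibetan text contains further punctuation and spacing conventions that this sketch does not handle.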
Keywords/Search Tags: text analysis, HMM, speech synthesis, mono-phone labeling, context-dependent labeling