Research On Predicting Chinese Prosodic Boundary Based On Syntactic Features

Posted on: 2014-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhu
Full Text: PDF
GTID: 2268330422959716
Subject: Circuits and Systems
Abstract/Summary:
With the rapid development of science and technology, highly natural speech synthesis has become an important research topic in artificial intelligence, speech signal processing, and human-computer interaction. At present, research on speech synthesis focuses on Chinese text-to-speech (CTTS) conversion systems, in which the input text is automatically converted into an acoustic output signal according to speech processing rules. To predict the prosodic boundaries of the input text more accurately, and thereby improve the naturalness of the output speech, this thesis builds a text corpus, statistically analyzes the relationship among grammatical features, syntactic features, and prosodic structure, compares the impact of part of speech, word length, and contiguity on prosodic boundary prediction, and finally uses the TBL algorithm to predict Chinese prosodic boundaries. The results have theoretical significance and practical value for revealing the relationship between text and speech and for improving the naturalness of synthesized speech. The main achievements and original contributions are as follows:

Firstly, we designed and built a large Chinese text corpus with syntactic information, using the Language Technology Platform (LTP), a web-oriented, XML-based Chinese information processing platform. The corpus contains approximately 10,000 syntactically standard Chinese sentences with an average length of 52 words. Using LTP, we first parsed the syntactic structure of each sentence; grammatical, prosodic, and syntactic information was then labeled manually under the guidance of linguistics experts.
Random spot checks by experts confirmed that the annotations meet the requirements of scientific research, so the corpus can be used for prosodic boundary prediction.

Secondly, we statistically analyzed the grammatical features, syntactic features, and prosodic structure, and proposed contiguity as a new feature for predicting Chinese prosodic boundaries. The statistical analysis showed that the grammatical and syntactic information and the prosodic structure of the sentences in the corpus are not independent but closely related. This thesis proposes a new concept, contiguity (adjacent degree, AD), to describe the syntactic relationship between grammatical words at the sentence level. As a new parameter for prosodic structure prediction, it reflects the relationship between syntactic structure and prosodic structure. We therefore added contiguity to the annotated Chinese corpus to represent syntactic features.

Thirdly, part of speech, word length, and contiguity were proposed as parameters for prosody prediction, and we compared their relative significance for prosodic boundary prediction. Analysis of the feature-annotated Chinese corpus showed that correct prediction of a prosodic boundary depends not only on the adjacent grammatical words but is also closely related to their syntactic features. The thesis therefore selected part of speech, word length, and contiguity as the key parameters for predicting prosodic boundaries.

Finally, a novel statistical learning method based on the TBL algorithm was proposed, which can effectively predict prosodic boundaries. TBL is a transformation-based, error-driven learning algorithm.
This machine learning algorithm can automatically learn new rules when the hand-written rule templates are not adequate, and then add those new rules to the existing rule templates. Experimental results show that prosodic word prediction accuracy on the test set reached 98.4% and prosodic phrase prediction accuracy reached 82.7%; both are better than existing comparable research results.
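
The transformation-based, error-driven learning loop described above can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not give the rule templates, label set, or feature encoding, so the feature names (`pos_left`, `pos_right`, `len_left`), the labels `"B"` (boundary) and `"N"` (no boundary), the templates, and the toy data are all hypothetical. A contiguity value could be added as one more feature key in the same way.

```python
# Minimal sketch of transformation-based (error-driven) learning.
# All feature names, labels, templates, and data are illustrative assumptions.

def candidate_rules(samples, current, templates):
    """Propose rules only from positions whose current label is wrong."""
    rules = set()
    for (feats, gold), cur in zip(samples, current):
        if cur != gold:
            for tmpl in templates:
                cond = tuple((name, feats[name]) for name in tmpl)
                rules.add((cond, cur, gold))  # (condition, from_label, to_label)
    return rules

def matches(rule, feats, label):
    """A rule fires when the label equals from_label and all conditions hold."""
    cond, from_label, _ = rule
    return label == from_label and all(feats[k] == v for k, v in cond)

def net_gain(rule, samples, current):
    """Errors fixed minus correct labels broken if the rule were applied."""
    gain = 0
    for (feats, gold), cur in zip(samples, current):
        if matches(rule, feats, cur):
            if rule[2] == gold:
                gain += 1
            elif cur == gold:
                gain -= 1
    return gain

def apply_rule(rule, feature_rows, labels):
    return [rule[2] if matches(rule, f, c) else c
            for f, c in zip(feature_rows, labels)]

def learn_tbl(samples, templates, initial="N", min_gain=1):
    """Greedily pick the rule with the highest net gain until none helps."""
    feature_rows = [f for f, _ in samples]
    current = [initial] * len(samples)
    learned = []
    while True:
        rules = candidate_rules(samples, current, templates)
        if not rules:
            break
        # sorted(..., key=repr) only makes tie-breaking deterministic
        best = max(sorted(rules, key=repr),
                   key=lambda r: net_gain(r, samples, current))
        if net_gain(best, samples, current) < min_gain:
            break
        current = apply_rule(best, feature_rows, current)
        learned.append(best)
    return learned

def apply_tbl(rules, feature_rows, initial="N"):
    """Label new data: start from the initial label, apply rules in order."""
    labels = [initial] * len(feature_rows)
    for rule in rules:
        labels = apply_rule(rule, feature_rows, labels)
    return labels

# Toy data: one entry per juncture between two adjacent words (hypothetical).
samples = [
    ({"pos_left": "n", "pos_right": "v", "len_left": "2"}, "B"),
    ({"pos_left": "v", "pos_right": "n", "len_left": "1"}, "N"),
    ({"pos_left": "n", "pos_right": "v", "len_left": "2"}, "B"),
    ({"pos_left": "a", "pos_right": "n", "len_left": "1"}, "N"),
    ({"pos_left": "n", "pos_right": "v", "len_left": "1"}, "N"),
]
templates = [("pos_left",), ("pos_left", "pos_right"), ("len_left",)]

learned = learn_tbl(samples, templates)
predicted = apply_tbl(learned, [f for f, _ in samples])
```

Each learned rule is stored in order, and labeling new text replays the rules in that same order. Scoring candidates by net gain (errors fixed minus correct labels broken) is what makes the procedure error-driven.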
Keywords/Search Tags: prosodic boundary, grammatical structure, syntactic structure, prosodic words, prosodic phrase, contiguity, TBL algorithm, natural language processing technology platform LTP