The Research Of Dai Language Segmentation Method Based On CRF

Posted on:2016-05-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhang

GTID:2298330470454939

Subject:Electronics and Communications Engineering

Abstract/Summary:

Word segmentation is an important stage of front-end text analysis in a speech synthesis system. Segmenting the corpus text gives the synthesized speech better naturalness, since it determines whether the speech follows human pronunciation rules and sounds fluent. Most Indo-European languages have a natural delimiter between words, which makes word boundaries easy to identify. Dai text has no such delimiter, so this thesis addresses how to determine word boundaries in long passages of Dai text. Many segmentation approaches exist, but they fall into two main families: machine (dictionary-based) segmentation and statistics-based segmentation. The former has relatively low accuracy, and its speed depends on the size of the dictionary, so its results are unsatisfactory. Segmenting Dai text with statistical methods is therefore a question worth studying.

This thesis applies a conditional random field (CRF) model to Dai word segmentation. The work completed is as follows:

1. It states the role of segmentation in speech synthesis and introduces the two segmentation approaches above with reference to Chinese and English segmentation methods.
2. By comparing three common statistical models, HMM, MEMM and CRF, it shows the advantages of CRF for Dai tagging and segmentation.
3. It takes initials and finals as feature items, summarizes the Dai characters, constructs a Dai dictionary, and uses a C++ program to make preliminary annotations of feature items and position information.
4. On the CRF platform, it carries out training and predictive segmentation, then ports the segmentation algorithm to the Visual Studio 2010 platform through a dynamic link library and reports the results.

The experimental results show that applying the CRF model to Dai text segmentation achieves higher accuracy: precision (P) was 91.05%, recall 93.2%, and F-score 92.34%. This meets the basic requirements of Dai word segmentation and gives the synthesized speech better naturalness.
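The position marking and the precision, recall and F figures mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's C++ program: the B/M/E/S tag scheme, the function names, and the Latin placeholder strings are assumptions made for illustration, and each Python character stands in for one labeling unit, whereas the thesis uses initials and finals as feature items.

```python
from typing import List, Set, Tuple


def words_to_tags(words: List[str]) -> List[Tuple[str, str]]:
    """Label each unit with its position in the word.

    Assumed scheme: B = begin, M = middle, E = end, S = single-unit word.
    """
    tagged = []
    for word in words:
        if len(word) == 1:
            tagged.append((word, "S"))
            continue
        tagged.append((word[0], "B"))
        tagged.extend((ch, "M") for ch in word[1:-1])
        tagged.append((word[-1], "E"))
    return tagged


def tags_to_words(tagged: List[Tuple[str, str]]) -> List[str]:
    """Rebuild words from (unit, tag) pairs, closing a word at E or S."""
    words, current = [], ""
    for ch, tag in tagged:
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # keep a trailing fragment if the tag sequence ends mid-word
        words.append(current)
    return words


def word_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Turn a segmentation into a set of (start, end) offsets."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def precision_recall_f(gold: List[str], predicted: List[str]) -> Tuple[float, float, float]:
    """Score a predicted segmentation against a gold one.

    A predicted word counts as correct only if both of its boundaries
    match a gold word.
    """
    gold_spans, pred_spans = word_spans(gold), word_spans(predicted)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    gold = ["ab", "c", "def"]          # placeholder "words", not Dai text
    predicted = ["ab", "c", "de", "f"]
    print(words_to_tags(gold))
    print(precision_recall_f(gold, predicted))
```

In a CRF workflow of this kind, the tagged units would form the training data, the model would predict one tag per unit for new text, and a function like `tags_to_words` would turn the predicted tags back into a segmentation for scoring.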

Keywords/Search Tags:

Speech synthesis, Segmentation, CRF model, Feature items selection

Related items

1	Research On Statistical Acoustic Model Based Speech Synthesis
2	Research On Statistical Acoustic Model Based Unit Selection Speech Synthesis Method
3	On The Selection Of Scale Items Method Based On Feature Selection
4	Research On Unit Selection Concatenation Speech Synthesis Method Based On Deep Learning
5	Research On Speech Synthesis Method Integrating Subjective Evaluation And Feedback
6	A Research On Speech Synthesis Based On Statistical Modeling And Pronunciation Error Detection
7	A Study On Speech Synthesis And Visual Speech Synthesis Based On Neural Networks
8	Deep Neural Network-based Acoustic Signal Synthesis And Separation Research
9	Research On Method Of Unit Selection Speech Synthesis Based On Hidden Markov Model
10	Research On Automatic Segmentation Technology And Automatic Segmentation Of Speech In Dai Language Speech Synthesis System