
CNN-Based Articulatory Feature Recognition For Kunqu-Singing Pronunciation Evaluation

Posted on: 2024-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: M T Huang
GTID: 2555307076991379
Subject: Engineering

Abstract/Summary:
Kunqu originated in the Ming and Qing eras and has been hailed as the "ancestor of a hundred operas". Over time, Kunqu has gradually come to public attention, and many enthusiasts have developed a deep interest in Kunqu-singing. Kunqu has the "Three Excellencies" (三绝): clear pronunciation (字清), pure tune (腔纯), and correct rhythm (板正), of which clear pronunciation is the foundation of Kunqu-singing. Because Kunqu-singing preserves ancient Chinese pronunciation rules, it differs from Mandarin in many ways, such as the "Jiantuan" (尖团) distinction, the emphasis on the integrity of the "Jieyin" (介音), and the pronunciation of the "Yuntou" (韵头), "Yunfu" (韵腹), and "Yunwei" (韵尾). The "Yinyang" (阴阳), "Qingzhuo" (清浊), "Gong Shang Jue Zhi Yu" (宫商角徵羽), and the rhymes used in the Qing-era book "Yun Xue Li Zhu" (韵学骊珠) are difficult to pronounce clearly, especially for beginners, and the system is not user-friendly. Pronunciation accuracy is a major challenge for both amateur and professional Kunqu singers. Currently, learning Kunqu-singing relies mainly on "oral transmission" (口传身教), which to some extent hinders the promotion of Kunqu.

To transmit and promote the charm of the Kunqu "Shui Mo Diao" (水磨调) more widely, this thesis uses methods from modern phonetics to systematically represent the initials and rhymes of Kunqu-singing and to clarify their articulatory features. On this basis, using algorithms and methods from automatic speech recognition and music information retrieval, it establishes a Kunqu-singing pronunciation recognition model based on articulatory features to evaluate pronunciation similarity automatically. The aim is to help ordinary Kunqu singers correct their pronunciation.

First, drawing on previous research, this thesis categorizes the pronunciation of Kunqu-singing according to the classification system of "Yun Xue Li Zhu". Using methods from modern phonetics, it establishes an IPA-based phonotactic system for Kunqu-singing at the phoneme level. The system comprises 50 initials and 29 rhyme classes, the latter consisting of 21 "Shu rhymes" (舒声韵) and 8 "Ru rhymes" (入声韵), each with its IPA representation and phonetic features. This system helps Kunqu performers master pronunciation techniques and lays the foundation for the automatic recognition of Kunqu-singing.

Second, this thesis establishes an overall framework for recognizing the articulatory features of Kunqu-singing, divided into two models: a recognition model for "Jiantuan" and a recognition model for the single vowels of rhymes. A self-built Kunqu corpus is created using music information retrieval technology, with coarse-grained segmentation of the lyrics based on the "Gong-Che Notation" (工尺谱), followed by manual annotation and verification of phonemes.

Third, to address "Jiantuan" pronunciation errors in Kunqu initials, a CNN-based model for recognizing "Jiantuan" in initials is established, then trained and optimized on the self-built Kunqu corpus. Experimental results show that the recognition accuracy of articulatory features is higher than that of phoneme recognition for "Jiantuan" and "Juanshe" (卷舌) by 35.6%, 38.3%, and 30.4%, respectively. After the structure of the CNN is adjusted, the average recognition accuracy of the articulatory feature model reaches 88.6%.

Fourth, in rhyme detection for Kunqu-singing, special attention is paid to detecting the height of single vowels, as in the "Ji Wei rhyme" (机微韵), "Zhi Shi rhyme" (支时韵), and "Ju Yu rhyme" (居鱼韵), and to detecting oral vowels that are not sung properly, such as the "Jieyin". To address the insufficient sample size, a Jingju corpus is introduced for transfer learning, and frame-level recognition of the articulatory features of single vowels in Kunqu rhymes is carried out without forced alignment. Experimental results show that transfer learning improves the average recognition accuracy of the model by 8.8%, to 81.4%. The thesis also compares single-layer and double-layer CNN models. Although the average recognition rate of the double-layer CNN is only 0.7% higher than that of the single-layer CNN, its number of convolutional kernels is reduced by 40%, which reduces model complexity to some extent.

Finally, audio recordings of learners' Kunqu-singing were collected, and the trained models were used to predict the articulatory features of the initials and the single vowels of Kunqu rhymes. The predictions were compared with the performances of professional Kunqu performers, and the learners' pronunciation errors and the models' recognition performance were analyzed through examples.
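
As a rough illustration of how frame-level articulatory feature predictions might be turned into a segment-level pronunciation check, the following minimal Python sketch summarizes per-frame labels by majority vote and compares the result with an expected feature. It is not the thesis's implementation: the function names, the vowel-height labels ("high", "mid", "low"), and the majority-vote rule are all assumptions made for this example, and the per-frame labels stand in for the output of a trained model.

```python
from collections import Counter


def majority_label(frame_labels):
    """Return the most frequent label among the frames of one segment.

    Ties are resolved in favour of the label that appears first.
    """
    if not frame_labels:
        raise ValueError("frame_labels must not be empty")
    return Counter(frame_labels).most_common(1)[0][0]


def label_share(frame_labels, target):
    """Return the fraction of frames whose label equals `target`."""
    if not frame_labels:
        raise ValueError("frame_labels must not be empty")
    return sum(label == target for label in frame_labels) / len(frame_labels)


def evaluate_segment(frame_labels, expected):
    """Summarize one sung segment against the expected articulatory feature.

    `frame_labels` are hypothetical per-frame predictions (for example,
    vowel height); `expected` is the feature taken from a reference such as
    a professional performance. Both are assumptions of this sketch.
    """
    predicted = majority_label(frame_labels)
    return {
        "predicted": predicted,
        "share_expected": label_share(frame_labels, expected),
        "matches_expected": predicted == expected,
    }


# Hypothetical per-frame vowel-height predictions for one learner segment.
learner_frames = ["high", "high", "mid", "high"]
print(evaluate_segment(learner_frames, expected="high"))
```

Working on per-frame labels means the sketch needs no phoneme boundaries, which is in the spirit of the frame-level recognition without forced alignment described above; a real system would replace the hand-written labels with model predictions.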
Keywords: articulatory feature, transfer learning, convolutional neural network, pronunciation of Kunqu, pronunciation error detection