Dialog Act Classification In Chinese Spoken Language And Its Application Under The Internet

Posted on: 2015-03-25
Degree: Master
Type: Thesis
Country: China
Candidate: P Liu
GTID: 2298330452959570
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, a large amount of spoken-style text appears on the internet every day (blogs, microblogs, chat logs, etc.). How to use computers to process this information automatically and analyze its semantics and intent is a serious problem. Although traditional natural language processing technology has made great achievements in syntactic analysis, it has made few breakthroughs in semantic and intent analysis. In this study, we first propose three methods for DA tagging in Chinese spoken language, then construct a Chinese spoken language corpus based on Sina Weibo, and finally conduct DA tagging experiments on this corpus using the proposed methods.

The Dialog Act (DA) is an important pragmatic feature for understanding speakers' intentions [1,2]. In this work, we propose three methods for the Chinese DA tagging problem: the n-gram method, the HMM and extended HMM method, and the KNN+n-gram method. The results show that the proposed methods are well suited to the task.

Chinese spoken language has its own characteristics on the internet. Can the methods proposed above be used for DA tagging on the internet? To answer this question, we first construct a Chinese spoken language corpus using the Sina Weibo API. The corpus focuses on house price information and contains about 500 sentences.

Finally, we carry out DA tagging experiments on this corpus. The results show that the proposed methods are domain-independent and well suited to the DA tagging problem on the internet.
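The abstract names an HMM method for DA tagging but gives no details. The following is only a minimal sketch of how an HMM can tag a sequence of utterances with dialog acts using Viterbi decoding. The two DA labels, the single cue feature per utterance, and all probabilities are hypothetical values chosen for illustration; they are not taken from the thesis.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state (dialog act) sequence."""
    # best[t][s] = (log-probability of the best path ending in s, previous state)
    best = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path

# Hypothetical toy model: two dialog acts, one binary cue per utterance
# ("particle" = utterance ends with a question particle, "plain" = it does not).
STATES = ["question", "answer"]
START_P = {"question": 0.6, "answer": 0.4}
TRANS_P = {
    "question": {"question": 0.2, "answer": 0.8},
    "answer": {"question": 0.5, "answer": 0.5},
}
EMIT_P = {
    "question": {"particle": 0.7, "plain": 0.3},
    "answer": {"particle": 0.1, "plain": 0.9},
}

tags = viterbi(["particle", "plain", "particle"], STATES, START_P, TRANS_P, EMIT_P)
print(tags)
```

In a real tagger the emission term would come from a model of the utterance text (for example, a per-DA n-gram language model) and the probabilities would be estimated from an annotated corpus rather than set by hand.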
Keywords/Search Tags: Internet, weibo, spoken language, Dialog Act, n-gram, HMM, KNN