
Research on Chinese Ellipsis Identification

Posted on: 2012-12-12
Degree: Master
Type: Thesis
Country: China
Candidate: G Q Yang
Full Text: PDF
GTID: 2218330368992448
Subject: Computer application technology
Abstract/Summary:
Ellipsis is widespread in Chinese. Ellipsis identification is an important research topic in Natural Language Processing and can be applied to many related fields, such as Machine Translation, Text Categorization, and Information Extraction. Currently, there is little research on Chinese ellipsis identification, and the existing work concentrates on theory without applying it to identification methods or systems. This thesis analyzes and summarizes the existing Chinese ellipsis identification techniques and is organized as follows.
Firstly, because no corpus for Chinese ellipsis identification was available, we manually labeled the CTB corpus to construct one. We then analyze the syntactic structure of Chinese ellipsis and obtain six syntactic structures. Finally, we analyze the issues in Chinese ellipsis identification and their solutions.
Secondly, we discuss a rule-based approach to identifying ellipsis in Chinese. We analyze different verb forms and the positions where ellipsis occurs, and then propose a verb-driven method to extract rules. Experimental results show that the method is feasible.
Thirdly, because Chinese ellipsis identification relies on context information, we employ an SVM-based convolution tree kernel to capture structural information, focusing on how to obtain the syntactic structure tree. Experimental results show that the tree kernel-based approach to Chinese ellipsis identification performs better than the rule-based approach.
Keywords/Search Tags:Ellipsis, Ellipsis identification, Rules, Verbs, SVM, Convolution Tree Kernel
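
The abstract names a convolution tree kernel but does not give its definition. As an illustration only, the following is a minimal sketch of the standard subset-tree convolution kernel (in the style of Collins and Duffy), which counts the tree fragments two parse trees share. It is not necessarily the exact kernel used in the thesis. The nested-tuple tree format, the function names, the decay parameter lam, and the toy trees are all assumptions made for this example.

```python
from functools import lru_cache
from math import sqrt


def internal_nodes(tree):
    """Yield every non-leaf node; a node is a tuple (label, child, ...)."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, tuple):
            yield from internal_nodes(child)


def production(node):
    """Grammar production at a node. Words stay str; non-terminal children
    become 1-tuples so a word is never confused with a category label."""
    return (node[0],) + tuple(
        c if isinstance(c, str) else (c[0],) for c in node[1:]
    )


def tree_kernel(t1, t2, lam=1.0):
    """Weighted count of tree fragments shared by t1 and t2 (lam = decay)."""

    @lru_cache(maxsize=None)
    def common(n1, n2):
        # Fragments rooted at both n1 and n2; zero if the productions differ.
        if production(n1) != production(n2):
            return 0.0
        score = lam
        for c1, c2 in zip(n1[1:], n2[1:]):
            if isinstance(c1, tuple):  # only non-terminal children recurse
                score *= 1.0 + common(c1, c2)
        return score

    return sum(
        common(a, b) for a in internal_nodes(t1) for b in internal_nodes(t2)
    )


def normalized_tree_kernel(t1, t2, lam=1.0):
    """Scale the kernel so that a tree compared with itself scores 1."""
    return tree_kernel(t1, t2, lam) / sqrt(
        tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam)
    )


# Hypothetical toy parse trees, not taken from the thesis corpus.
tree_a = ("NP", ("D", "the"), ("N", "dog"))
tree_b = ("NP", ("D", "a"), ("N", "dog"))
```

To train a classifier with such a kernel, one option is to compute the full matrix of pairwise kernel values over the training trees and pass it to an SVM implementation that accepts precomputed kernels, for example scikit-learn's SVC with kernel="precomputed". Whether the thesis used this setup is not stated in the abstract.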