Chinese Function Blocks Analysis and Applied Research

Posted on: 2010-04-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C X Yuan
Full Text: PDF
GTID: 1118360278465459
Subject: Signal and Information Processing
Abstract/Summary:
As researchers improve results on various problems in "pure" natural language processing (e.g. part-of-speech tagging, parsing), those who work in the more "applied" NLP fields (e.g. question answering, information extraction) are seeking more powerful kinds of linguistic annotation as input for their own systems. Function tags are a context-sensitive annotation applied to words and phrases of natural language text, marking their syntactic or semantic role within a larger utterance.

In this thesis we develop a sequential prediction model for Chinese function tag labeling. We show that this method provides state-of-the-art accuracy, yielding an F1 score of 93.76, is extensible through its feature set, and can be implemented efficiently. Furthermore, we describe the specific properties of Chinese function tags by comparing them with English function tags, and we show their practical applicability through integration into an opinion holder recognition system.

In the first part of the thesis, we present the problem of function tag labeling: why it is an interesting problem, who else has worked on similar problems, and what exactly we intend to do. We then briefly review the dataset we work with, the Penn Chinese Treebank, and explain the specific metrics by which we evaluate our work.

In the second part of the thesis, we present a sequential prediction model. This leads to the heart of the thesis: automatic function tag labeling. Here we formulate function tag labeling as a sequence learning problem within structural spaces, yielding state-of-the-art accuracy and high robustness. We then analyze which features prove most helpful for Chinese function tag assignment and why we think they are useful for this task, and we introduce two quite different function labeling systems: one assigns function tags to unparsed text using simple lexical features (word, part-of-speech tag, etc.), and the other assigns function tags to parsed text using features collected from the full parse trees (phrase type, tree path, etc.). We discuss the advantages and disadvantages of each system in various situations. We also compare our function tagger to other state-of-the-art systems.

Finally, in the third part of the thesis, we present how this work improves applications in text opinion mining. We introduce our preliminary work on opinion holder recognition, which uses function tags as clues, to show the applicability of function tags to a real-world problem. Lastly, we present a comparison with other systems performing related tasks and speculate on some interesting future work.

The proposed work defines Chinese function tags from a computational perspective and yields an automatic Chinese function tag labeler. The results offer guidance and a reference point for related work. In addition, the experiments suggest that function tags have promising applications.
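
To make the lexical features mentioned above (word, part-of-speech tag) more concrete, here is a minimal sketch of how such features might be collected for each token of unparsed text before they are passed to a sequence labeler. It is an illustration under stated assumptions, not the thesis's implementation: the function name `lexical_features`, the window size, the `<PAD>` placeholder and the three-token example sentence are all hypothetical.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


def lexical_features(sentence: List[Token], i: int, window: int = 1) -> Dict[str, str]:
    """Collect word and POS features for position i and its neighbours.

    Hypothetical helper for illustration; the feature templates used in
    the thesis are not given in the abstract.
    """
    feats: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(sentence):
            word, pos = sentence[j]
        else:
            # Positions outside the sentence get a padding symbol (an assumption).
            word, pos = "<PAD>", "<PAD>"
        feats[f"word[{offset:+d}]"] = word
        feats[f"pos[{offset:+d}]"] = pos
    return feats


# Illustrative sentence with Penn Chinese Treebank style POS tags.
sentence = [("他", "PN"), ("昨天", "NT"), ("来", "VV")]
features = [lexical_features(sentence, i) for i in range(len(sentence))]
```

Each token yields one feature dictionary, so the sentence becomes a sequence of feature dictionaries that a sequence learner could map to one function tag per token. A tree-based system would add features such as phrase type and tree path, which require a parse and are not sketched here.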
Keywords/Search Tags: natural language processing, comprehensive information, function tag labeling, machine learning, support vector machines, sequential prediction model, text opinion mining