Prepositional Phrase Attachment Disambiguation Of Natural Language Processing

Posted on: 2011-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: B S Liao
GTID: 2208360308966259
Subject: Computer system architecture
Abstract/Summary:
Ambiguity is a linguistic phenomenon that natural language processing must handle, and it arises frequently when parsing a sentence. When a computer parses a sentence automatically, several parse trees may be possible because a phrase or clause can attach to two or more attachment sites in the sentence. The problem occurs in English parsing as well as in Chinese parsing. This thesis focuses on prepositional phrase (PP) attachment disambiguation, a binary classification task in which, given a 4-tuple, the goal is to assign the label N or V according to whether the prepositional phrase attaches to the noun or to the verb.

First, we review the state and progress of research on natural language processing in China and abroad, together with the background and theoretical basis of our research subject, and then give a general overview of PP-attachment disambiguation. The second chapter briefly presents the n-gram model and several data smoothing methods. The third chapter describes word sense disambiguation in detail, with particular attention to the PP-attachment disambiguation task. The fourth chapter examines the main approaches to PP-attachment disambiguation, including Bayesian, maximum entropy (ME), SVM, and back-off models, among others. We then focus on a bi-threshold model for PP-attachment disambiguation that backs off directly to 2-tuples. Tested on the IBM data set for PP attachment, the model achieved 85.02% accuracy and 100% recall. The experiment indicates that the model has a solid theoretical basis and is computationally inexpensive to implement, and that 2-tuples carry enough decision information to resolve PP attachment.

Finally, we compare the advantages and disadvantages of the main disambiguation models with those of our model, and we present a potential improvement that may raise performance further.
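To make the task concrete, the following is a minimal Python sketch of a classifier that backs off directly to 2-tuples and decides with two thresholds. It is an illustration only, not the thesis's actual model: the abstract does not define the tuple layout, the 2-tuples used, or how the two thresholds work. The sketch assumes the 4-tuple is (verb, noun1, preposition, noun2), that the 2-tuples are the three pairs containing the preposition, that scores at or above `upper` yield N and scores below `lower` yield V, and that everything else falls back to a default label so that every instance receives a decision. The names `TRAIN`, `pairs`, `train`, and `classify`, the threshold values, and the toy data are all hypothetical.

```python
from collections import Counter

# Toy training data (invented for illustration):
# (verb, noun1, preposition, noun2, label), label is "V" or "N".
TRAIN = [
    ("ate", "pizza", "with", "fork", "V"),
    ("ate", "pizza", "with", "anchovies", "N"),
    ("saw", "man", "with", "telescope", "V"),
    ("bought", "shares", "of", "company", "N"),
    ("hit", "ball", "with", "bat", "V"),
]


def pairs(v, n1, p, n2):
    """Return the three 2-tuples that contain the preposition (assumed choice)."""
    return [("v", v, p), ("n1", n1, p), ("n2", p, n2)]


def train(examples):
    """Count how often each 2-tuple occurs, and how often with label N."""
    total = Counter()
    noun = Counter()
    for v, n1, p, n2, label in examples:
        for key in pairs(v, n1, p, n2):
            total[key] += 1
            if label == "N":
                noun[key] += 1
    return total, noun


def classify(quad, total, noun, upper=0.6, lower=0.4, default="N"):
    """Label a 4-tuple as "N" or "V" using pooled 2-tuple counts.

    The estimate of P(N | quad) is the pooled fraction of noun attachments
    over all matching 2-tuples. Scores between the two thresholds, and
    4-tuples with no matching 2-tuple, receive the default label.
    """
    keys = pairs(*quad)
    seen = sum(total[k] for k in keys)
    if seen == 0:
        return default
    p_noun = sum(noun[k] for k in keys) / seen
    if p_noun >= upper:
        return "N"
    if p_noun < lower:
        return "V"
    return default
```

With these toy counts, calling `train(TRAIN)` and then `classify(("ate", "salad", "with", "fork"), total, noun)` pools one noun attachment out of three matching 2-tuple observations, which is below `lower`, so the sketch returns "V". Because a default label is always available, the sketch never abstains, which is one way a classifier can reach 100% recall.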
Keywords/Search Tags: word sense disambiguation, prepositional phrase attachment disambiguation, back-off model, bi-threshold, binary classification