Towards efficient statistical parsing using lexicalized grammatical information

Posted on:2003-10-21

Degree:Ph.D

Type:Dissertation

University:University of Delaware

Candidate:Chen, John

GTID:1468390011980270

Subject:Computer Science

Abstract/Summary:

Many natural language understanding systems require efficient and accurate parsing disambiguation to be effective. State-of-the-art parsers owe their high performance in large part to statistical modeling of lexical features. Although lexicalized tree adjoining grammar (TAG) is a lexicalized grammatical formalism for natural language, its use in statistical parsing has remained relatively unexplored. In this work, I aim to develop statistical models for TAG parsing that are both efficient and accurate. First, I explore the issue of linear-time TAG parsing disambiguation (supertagging). Previously, only local structural information was found to be effective for supertag disambiguation. I show that long-distance information as well as lexical information can also be useful for accurate supertagging. Furthermore, I develop frameworks that use these features to significantly increase the accuracy of supertagging. Second, in order to provide a robust resource for statistical processing models of TAG, I develop and evaluate procedures to extract TAGs from widely available treebanks. I then develop other procedures to organize these extracted TAGs as well as to link them to other TAGs. Third, I explore smoothing approaches for TAG, which are essential because of the inherent data sparseness problem for statistical processing models of TAG. One main approach uses the idea of distributional similarity in smoothing, while another approach uses the large-scale organization of TAG for smoothing. Both show promise for smoothing statistical processing models of TAG.

Keywords/Search Tags:

Statistical, TAG, Parsing, Efficient, Information, Lexicalized, Smoothing

Related items

1	Research On Chinese Syntactic Parsing Based On Lexicalized Statistical Model
2	Semantic parsing using lexicalized well-founded grammars
3	The Study On Data Augmentation In Chinese Parsing
4	The Smoothing Technique Based On Mutual Information For Statistical Language Model
5	Combining labeled and unlabeled data in statistical natural language parsing
6	Efficient combinator parsing for natural-language
7	Research Of Chinese Sentence Skeleton Parsing Base On Statistical Model
8	A Study On Mongolian Statistical Parsing
9	Statistical LTAG parsing
10	Well-foundedness and reliability in statistical natural language parsing