A computational study of lexicalized noun phrases in English

Posted on: 2003-02-04
Degree: Ph.D
Type: Dissertation
University: The Ohio State University
Candidate: Godby, Carol Jean
Full Text: PDF
GTID: 1465390011983655
Subject: Language
Abstract/Summary:
Lexicalized noun phrases are noun phrases that function as words. In English, lexicalized noun phrases are usually realized as noun-noun compounds such as theater ticket and garbage man, or as adjective-noun phrases such as black market and high school. In specialized or technical subject domains, phrases such as urban planning, air traffic control, highway engineering and combinatorial mathematics represent conventional names for concepts that are just as important to the domain as single-word terms such as adsorbents, hydrology, or aerodynamics. Yet although lexicalized noun phrases are frequent enough to be cited in dictionaries and book indexes, the traditional linguistic literature has failed to identify consistent and categorical formal criteria for identifying them.

This study develops and evaluates a linguistically natural computational method for recognizing lexicalized noun phrases in a large corpus of English-language engineering text by synthesizing the insights of studies in traditional linguistics and computational linguistics. From the scholarship in theoretical linguistics, the analysis adopts the perspective that lexicalized noun phrases represent the names of concepts that are important to a community of speakers and have survived beyond a single context of use. Theoretical linguists have also proposed diagnostic tests for identifying lexicalized noun phrases, many of which can be formalized in a computational study. From the scholarship in computational linguistics, the analysis incorporates the view that a linguistic investigation can be extended and verified by processing relevant evidence from a corpus of text, which can be evaluated using mathematical models that do not require categorical input.

In an engineering text, a small set of linguistic contexts, including professor of, department of or studies in, yields phrases such as state machines, complex systems, computer graphics, and mathematical morphology.
The study reported here identifies lexical and syntactic contexts that harbor lexicalized noun phrases and submits them to a machine-learning algorithm that classifies the lexical status of noun phrases extracted from the text. Results from several evaluations show that this evidence is relevant to the classification, and informal evidence from many other subject domains implies that the results can be generalized.
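
The idea of harvesting candidate phrases from a few cue contexts can be sketched as follows. This is a minimal illustration and not the dissertation's implementation: the function name `candidate_phrases`, the stop-word list, the three-word length limit, and the assumption that every cue is exactly two words long are all hypothetical choices made for the example. Without part-of-speech tagging the sketch will overgenerate on real text, and it stops at counting candidates rather than classifying them.

```python
import re
from collections import Counter

# Cue contexts taken from the examples in the abstract.
CUES = ("professor of", "department of", "studies in")

# Hypothetical stop-word list: a candidate phrase ends at the first of these.
STOP = {"a", "an", "the", "and", "or", "at", "in", "of", "on", "for", "with", "to"}

# Hypothetical upper bound on candidate length, in words.
MAX_LEN = 3


def candidate_phrases(text):
    """Count multiword candidates that directly follow a cue context."""
    counts = Counter()
    # Split at punctuation so a candidate never crosses a clause boundary.
    for chunk in re.split(r"[.,;:!?()]", text.lower()):
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", chunk)
        for i in range(len(tokens) - 1):
            # Assumption: every cue is exactly two tokens long.
            if " ".join(tokens[i:i + 2]) not in CUES:
                continue
            phrase = []
            for tok in tokens[i + 2:i + 2 + MAX_LEN]:
                if tok in STOP:
                    break
                phrase.append(tok)
            # Keep only multiword candidates; single words are not phrases.
            if len(phrase) >= 2:
                counts[" ".join(phrase)] += 1
    return counts
```

Counts of this kind could then serve as one feature among several for a classifier of lexical status, which is the role the abstract assigns to contextual evidence.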
Keywords/Search Tags: Lexicalized noun phrases, Computational