A rule-based system for identifying pleonastic 'It' |
Posted on:2011-09-17 | Degree:M.S. | Type:Thesis |
University:University of Maryland, Baltimore County | Candidate:Johnson, Benjamin | Full Text:PDF |
GTID:2448390002964105 | Subject:Computer Science |
Abstract/Summary: | |
This work improves upon existing rule-based methods for identifying pleonastic (nonreferential) 'it' in natural language texts. Rule-based methods attempt to match each instance of 'it' against a set of patterns whose components are formulated in terms of part-of-speech markers and word lists. We created a system that incorporates previously discovered patterns, new patterns, and extensions of the word lists used in previous work. We evaluated the system on its ability to identify pleonastic 'it' in two corpora: the modified subset of the British National Corpus (BNC) used in Boyd et al. (2005), and an automatically generated, manually annotated web corpus. |
Keywords/Search Tags: | Rule-based, Pleonastic, 'it', System |
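
To illustrate the general idea described in the abstract (matching 'it' against patterns built from part-of-speech markers and word lists), here is a minimal sketch. It is not the thesis's rule set: the word lists, the three patterns, the Penn Treebank-style tags, and the function name `is_pleonastic` are all assumptions made for illustration, and the input is assumed to be already tokenized and tagged.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, POS tag); Penn Treebank-style tags are assumed

# Hypothetical word lists, chosen for illustration only.
MODAL_ADJECTIVES = {"likely", "possible", "important", "necessary", "clear"}
COGNITIVE_VERBS = {"seems", "appears", "seemed", "appeared"}
WEATHER_VERBS = {"rains", "snows", "rained", "snowed", "raining", "snowing"}
BE_FORMS = {"is", "was", "'s", "be", "been"}
COMPLEMENTIZERS = {"that", "to", "for"}
SKIPPABLE_TAGS = {"RB", "MD"}  # adverbs and modals are ignored when matching


def is_pleonastic(tokens: List[Token], i: int) -> bool:
    """Return True if the 'it' at index i matches one of the toy patterns."""
    if tokens[i][0].lower() != "it":
        return False

    # Words after 'it', lowercased, with adverbs and modals dropped.
    rest = [(w.lower(), t) for w, t in tokens[i + 1:] if t not in SKIPPABLE_TAGS]
    if not rest:
        return False
    first_word = rest[0][0]

    # Pattern 1: it + weather verb ("it rains")
    if first_word in WEATHER_VERBS:
        return True

    # Pattern 2: it + seems/appears + that/to ("it seems that ...")
    if first_word in COGNITIVE_VERBS and len(rest) > 1:
        return rest[1][0] in {"that", "to"}

    # Pattern 3: it + be + weather verb, or
    #            it + be + modal adjective + complementizer
    if first_word in BE_FORMS and len(rest) > 1:
        second_word, second_tag = rest[1]
        if second_word in WEATHER_VERBS:
            return True
        if second_tag == "JJ" and second_word in MODAL_ADJECTIVES:
            return len(rest) > 2 and rest[2][0] in COMPLEMENTIZERS

    return False
```

With these toy rules, the tagged sentence "It is very likely that he left" would be flagged as pleonastic, while "I bought it yesterday" would not. A realistic system would need far larger word lists and many more patterns, which is the direction the abstract describes.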