
Pyramidal digest: An efficient model for abstracting text databases (Machine learning, Database index)

Posted on: 2002-04-01
Degree: Ph.D
Type: Dissertation
University: University of California, Los Angeles
Candidate: Chuang, Wesley T
Full Text: PDF
GTID: 1468390011997572
Subject: Computer Science
Abstract/Summary:
We present a novel model of automated composite text digest, the Pyramidal Digest. The model integrates traditional text summarization and text classification: the digest not only serves as a “summary” but can also classify text segments of any given size and answer queries relative to a context.

“Pyramidal” refers to the fact that the digest is created in at least three dimensions: scope, granularity, and scale. The Pyramidal Digest is composed of features that are obtained gradually (from specific to general, and from large to small text segment size) through a combination of shallow parsing and machine learning algorithms. Three threads of learning take place: learning of characteristic relations, rhetorical relations, and lexical relations.

Our model provides a principle for efficiently digesting large quantities of features. This approach scales, with complexity bounded by O(n log n), where n is the size of the text. It offers a standard and systematic way of collecting as many semantic features as possible that are reachable by shallow parsing. It enables readers to query beyond keyword matches. Finally, it shares the insight of Gestalt philosophy that the roles of discrete syntax and semantics diminish in importance as features are progressively learned, which can serve to “understand” (or digest) the whole text.
Keywords/Search Tags:Text, Digest, Model, Pyramidal