Font Size: a A A

Parallel algorithms for a highly unstructured problem: Natural language parsing using tree adjoining grammar

Posted on:1998-07-24Degree:Ph.DType:Dissertation
University:University of MinnesotaCandidate:Nurkkala, Thomas BenneyFull Text:PDF
GTID:1468390014478485Subject:Computer Science
Abstract/Summary:
Unstructured problems exhibit highly unpredictable computation and communication patterns when executed on a parallel multicomputer. As a result, such problems are very difficult to parallelize. In this dissertation, we explore a paradigmatic example of such an unstructured problem: parsing natural language using Tree Adjoining Grammar (TAG).; Parsing is foundational to any computational natural language processing system. Tree Adjoining Grammar provides a powerful grammatical formalism that allows many linguistic constructs to be expressed elegantly and succinctly. However, algorithms for TAG parsing incur considerably greater computational complexity than algorithms for parsing other natural language formalisms (such as context free grammar). Thus, TAG parsing is not only an example of a highly unstructured problem but also a problem whose effective parallelization promises to advance the state of the art in natural language processing by reducing the run time of TAG parsing to a practical duration.; In this dissertation, we explore a variety of new parallel algorithms for TAG parsing and their implementations on distributed-memory, shared-memory, and shared-address-space parallel multicomputers. In each case, our implementations exhibit the highest performance of any TAG parsers known. The P scARALLEL CT scAG parser is the fastest, full-function TAG parser, faster than the best parser offering similar functionality by a factor of over 200. We present each algorithm and its implementation iteratively, demonstrating the efficacy of various enhancements in the implementations that overcome sources of parallel overhead and improve performance.; As a paradigmatic instance of an unstructured problem, the general techniques and principles exposed in the process of parallelizing TAG parsing should also be applicable to the parallelization of other unstructured problems. We thus present not only an extensive examination of the experimental performance of various techniques for enhancing the performance of parallel algorithms for an unstructured problem, but we generalize these techniques and capture them in succinct, widely applicable design heuristics that can be applied to the parallelization of other such unstructured problems.
Keywords/Search Tags:Unstructured problem, Parallel, Natural language, Parsing, Tree adjoining, Highly, Grammar
Related items