Font Size: a A A

Enumeration of general t-ary trees and universal types

Posted on:2011-02-17Degree:Ph.DType:Dissertation
University:University of Illinois at ChicagoCandidate:Zhang, ZhilongFull Text:PDF
GTID:1440390002964194Subject:Mathematics
Abstract/Summary:
Trees are the most important and fundamental data structure used in computer science. The number of t-ary trees with n nodes is known as the generalized Catalan number. Various questions about the statistics of binary trees were investigated in the past, such as height and path length distributions for a randomly selected tree. Seroussi first posed a problem motivated by information theory in the study of universal types of sequences: what is the number of trees of a given path length (the path length is the sum of the depths)? Two sequences of the same length p are said to be of the same universal type if and only if they produce the same set of phrases in the incremental parsing of the Lempel-Ziv'78 scheme, which is a commonly used data compression scheme. Seroussi observed that the number of possible parsings of sequences of length p corresponds to the cardinality of Tp which is defined as the collection of t-ary trees that have path length p. Knessl and Szpankowski studied the enumeration of binary trees (t = 2) and universal types. Seroussi first conjectured and later proved that for t-ary trees, as p → infinity, Tp =expht-1 tlntplnp 1+s1 where h(x) = - x ln x - (1- x) ln (1 - x).;In this dissertation, we denote the number of t-ary trees with n nodes and path length p by g(n, p). We shall study the limit n → infinity and various ranges of p such as p = ( n2 ) - O(1), p = O( n2), p = O( n3/2), p = O( n4/3) and p = n logt n + O( n). Our goal is to obtain a thorough understanding of the double sequence g(n, p) when n and p are large. The distribution of the trees by path length alone, or number of nodes alone, will follow as special cases. We derive various limiting distributions for the number of nodes when a tree is selected randomly from Tp . As a special case we compute the o(1) error term in Seroussi's formula, which involves the Airy function. We use methods of analytic algorithmics such as generating functions, recurrence relations and complex asymptotics. We also use methods of applied mathematics such as the WKB method and asymptotic matching.
Keywords/Search Tags:T-ary trees, Path length, Universal
Related items