
Extremal Entropy: Information Geometry, Numerical Entropy Mapping, and Machine Learning Application of Associated Conditional Independences

Posted on: 2017-05-11
Degree: Ph.D.    Type: Thesis
University: Drexel University    Candidate: Liu, Yunshu
Subject: Electrical engineering
Abstract/Summary:
Entropy and conditional mutual information are the key quantities information theory provides to measure the uncertainty of, and the independence relations between, random variables. While these measures are central to diverse areas such as physics, communication, signal processing, and machine learning, surprisingly much about them remains unknown. This thesis explores some of this unknown territory, ranging from fundamental questions about the interdependence between entropies of different subsets of random variables, via the characterization of the region of entropic vectors, to applied questions about how conditional independences can be harnessed to improve the efficiency of supervised learning on discrete-valued datasets.

The region of entropic vectors is a convex cone that has been shown to be at the core of many fundamental limits for problems in multiterminal data compression, network coding, and multimedia transmission. This cone is known to be non-polyhedral for four or more random variables; however, its boundary remains unknown for four or more discrete random variables. We prove that only one form of nonlinear non-Shannon inequality is necessary to fully characterize the region for four random variables. We identify this inequality in terms of a function that is the solution to an optimization problem, and we give some symmetry and convexity properties of this function that rely on the structure of the region of entropic vectors and on the Ingleton inequalities. Methods for specifying probability distributions that lie in faces and on the boundary of the convex cone are derived, then used to map optimized inner bounds to the unknown part of the entropy region. The first method uses tools and algorithms from abstract algebra to efficiently determine those supports for the joint probability mass functions of four or more random variables that can, for some appropriate set of non-zero probabilities, yield entropic vectors in the gap between the best known inner and outer bounds. These supports are combined with numerical optimization over the non-zero probabilities to provide inner bounds to the unknown part of the entropy region. Next, information geometry is used to parameterize and study the structure of probability distributions on these supports that yield entropic vectors in the faces of the entropy region and in its unknown part.

In the final section of the thesis, we propose score functions based on entropy and conditional mutual information as components in partition strategies for supervised learning on datasets with discrete-valued features. Partitioning the data reduces the complexity of training and testing on large datasets. We demonstrate that such partition strategies can also be efficient in the sense that, when the training and testing datasets are split according to them and the blocks in the partition are processed separately, the classification performance is comparable to, or better than, the performance obtained when the data are not partitioned at all.
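As a concrete illustration of the central objects above, the following Python sketch (not code from the thesis; the function names and the example distribution are purely illustrative) computes the entropic vector of four discrete random variables from their joint probability mass function and evaluates the Ingleton expression for the pair (X1, X2). Distributions that drive this expression negative produce entropic vectors outside the Ingleton (linear) inner bound, i.e., in the kind of gap between inner and outer bounds that the thesis maps numerically.

# Minimal, self-contained sketch (illustrative only, not the thesis implementation):
# compute the entropic vector of four discrete random variables from a joint pmf
# and evaluate the Ingleton expression for the pair (X1, X2).
import itertools
import numpy as np

def joint_entropy(pmf, subset):
    """Shannon entropy (bits) of the marginal over the variables in `subset`.
    `pmf` is an n-dimensional array; axis k indexes the outcomes of variable k."""
    other_axes = tuple(ax for ax in range(pmf.ndim) if ax not in subset)
    marginal = pmf.sum(axis=other_axes).ravel()
    marginal = marginal[marginal > 0]            # convention: 0 * log(0) = 0
    return float(-np.sum(marginal * np.log2(marginal)))

def entropic_vector(pmf):
    """Entropic vector: joint entropy of every nonempty subset of the n variables,
    returned as a dict keyed by sorted index tuples (a (2^n - 1)-dimensional vector)."""
    n = pmf.ndim
    subsets = [s for r in range(1, n + 1)
               for s in itertools.combinations(range(n), r)]
    return {s: joint_entropy(pmf, s) for s in subsets}

def ingleton_score(h):
    """Ingleton expression for variables (0,1;2,3); it is nonnegative for ranks of
    subspaces (the linear inner bound) but can be negative for entropic vectors."""
    return (h[(0, 1)] + h[(0, 2)] + h[(0, 3)] + h[(1, 2)] + h[(1, 3)]
            - h[(0,)] - h[(1,)] - h[(2, 3)] - h[(0, 1, 2)] - h[(0, 1, 3)])

if __name__ == "__main__":
    # Example: a randomly drawn pmf on four binary random variables (full support).
    rng = np.random.default_rng(0)
    p = rng.random((2, 2, 2, 2))
    p /= p.sum()
    h = entropic_vector(p)
    print("H(X1,X2,X3,X4) =", round(h[(0, 1, 2, 3)], 4), "bits")
    print("Ingleton score =", round(ingleton_score(h), 4))

A generic randomly drawn distribution, as in this example, typically satisfies the Ingleton expression; locating supports and probabilities for which the score becomes negative is what motivates the structured support enumeration and numerical optimization described in the abstract.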
Keywords/Search Tags: Entropy, Information, Conditional, Random variables, Entropic vectors