
New results in probabilistic modeling

Posted on: 2002-08-09  Degree: Ph.D  Type: Thesis
University: The Chinese University of Hong Kong (People's Republic of China)  Candidate: Chan, Ho Leung  Full Text: PDF
GTID: 2460390011495053  Subject: Statistics
Abstract/Summary:
The class of factorization models is a class of common and useful models for positive functions P(x1,..., xn). It can be used to characterize the conditional independence relations of random variables X1,..., Xn with probability distribution P. It can also be used to characterize various graphical models of P and be used to construct an efficient method to store the function values of P. The first part of this thesis is regarding factorization of positive functions. There are several problems we will tackle in this part of the thesis. We are given a positive function/probability distribution P(x1,...,xn). First, we want to determine which factorization models P belongs to. Second, given a factorization model LogMS , we want to find a "good" approximation of P such that the approximation is in the given factorization model LogMS . Third, we want to devise an automatic factorization model selection algorithm that find the "simplest" factorization model of P.; For the first problem, we introduce a transformation {lcub} FyaP :a⊆N {rcub} of P. By identifying which component function FyaP of the transformation is equal to the all-one function, we determine which factorization models P belongs to. For the second problem, we propose a computationally efficient algorithm to find a sub-optimal approximation of P. For the third problem, we devise an automatic model selection algorithm that can return an approximation of P in the simplest factorization model without introducing an error larger than what we specify.; In the second part of the thesis, we are given a set of tables T1,...,Tk and we want to find a maximum likelihood estimate of the underlying probability distribution of random variables X1,...,X n given the k tables. There are two questions in which we are particularly interested. The first question is how to find a maximum likelihood estimate for the underlying probability distribution P. 
The second question, a generalization of the first, is how to find a maximum likelihood estimate of P that lies in a given factorization model. To solve these two questions, we present two new classes of divergence minimization problems. In the first problem, given linear subsets E_1, ..., E_k of non-negative functions, we are required to find a non-negative function Q that is "closest" to E_1, ..., E_k. In the second problem, we want to find an approximation Q in a given factorization model that is "closest" to the linear subsets E_1, ..., E_k. We then propose the Iterative Minimization algorithm (IM algorithm) and the generalized IM algorithm to solve these two minimization problems. Finally, we propose the Divergence Minimization approach, which transforms a maximum likelihood estimation problem into a divergence minimization problem; the IM algorithm and the generalized IM algorithm can then be used to solve the maximum likelihood estimation problem.
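The transformation {F^P_{y_a} : a ⊆ N} is defined in the body of the thesis and is not reproduced here. As a rough illustration of the first problem, the following Python sketch (function name mine) computes the classical analogue for binary variables: the log-linear interaction coefficients θ_a of log P, obtained by Möbius inversion on the subset lattice. P lies in the factorization model generated by a family S exactly when θ_a vanishes for every subset a not contained in a member of S, so checking which coefficients are zero identifies the model.

```python
import math

def interaction_coefficients(P, n):
    """Log-linear interaction coefficients theta_a of a positive function
    P over n binary variables, via Mobius inversion on the subset lattice:
        log P(1_a) = sum_{b subseteq a} theta_b,
    where 1_a is the binary point with ones exactly on the subset a.
    P factorizes over a family S iff theta_a == 0 for every subset a
    not contained in some member of S."""
    theta = {}
    for a in range(1 << n):            # bitmask a encodes a subset of {0,...,n-1}
        total, b = 0.0, a
        while True:                    # enumerate all sub-bitmasks b of a
            x = tuple((b >> i) & 1 for i in range(n))
            sign = -1 if bin(a ^ b).count("1") % 2 else 1
            total += sign * math.log(P(x))
            if b == 0:
                break
            b = (b - 1) & a
        theta[a] = total
    return theta
```

For example, when P is a product of two independent Bernoulli factors, the pairwise coefficient theta for the subset {0, 1} comes out as zero, reflecting that P factorizes over the singletons.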
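The IM algorithm itself is developed in the thesis; the best-known member of this family of divergence minimization schemes is iterative proportional fitting (IPF), which computes the maximum likelihood estimate of a joint distribution from a collection of observed marginal tables. A minimal NumPy sketch of IPF (function name and argument layout are my own):

```python
import numpy as np

def ipf(marginals, shape, iters=200):
    """Iterative proportional fitting: a classical special case of
    iterative divergence minimization for maximum likelihood estimation.
    `marginals` is a list of (axes, table) pairs, where `axes` is a sorted
    tuple of variable indices and `table` the observed marginal over them.
    When the marginals are consistent, the result matches every one of
    them and is the MLE within the log-linear model they generate."""
    n = len(shape)
    Q = np.full(shape, 1.0 / np.prod(shape))   # start from the uniform joint
    for _ in range(iters):
        for axes, table in marginals:
            target = np.asarray(table, dtype=float)
            other = tuple(i for i in range(n) if i not in axes)
            current = Q.sum(axis=other)        # current marginal of Q
            ratio = np.divide(target, current,
                              out=np.zeros_like(target), where=current > 0)
            # broadcast the correction factor back over the full joint
            reshape = [shape[i] if i in axes else 1 for i in range(n)]
            Q = Q * ratio.reshape(reshape)
    return Q
```

With overlapping marginals the cycles no longer terminate after a single pass, but the iterates still converge to the divergence-minimizing point; the thesis's IM and generalized IM algorithms address the broader setting of arbitrary linear subsets E_1, ..., E_k and of estimates constrained to a factorization model.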
Keywords/Search Tags: Model, IM algorithm, Maximum likelihood estimate, Problem, Underlying probability distribution, Divergence minimization, Function