Model checking for incomplete high-dimensional categorical data (Incomplete data)

Posted on:2000-07-04

Degree:Ph.D

Type:Dissertation

University:University of California, Los Angeles

Candidate:Hu, Ming-Yi

GTID:1468390014460658

Subject:Statistics

Abstract/Summary:

Categorical data are often arranged in a contingency table and summarized by a loglinear model. A standard approach for comparing two competing models is to calculate twice the discrepancy between maximized loglikelihoods, which follows a χ² distribution asymptotically. But when data are sparse, the χ² approximation may be questionable.

As an alternative to a large-sample approximation to the reference distribution, we implement the framework introduced by Rubin (1984) for finding the posterior predictive check (PPC) distribution. The PPC distribution represents the conditional probability of a future value of a test statistic based on the information given by observed data along with model specifications, which can serve as the reference distribution for the relevant likelihood-ratio statistics.

However, it can be computationally demanding to construct a PPC distribution based on a large number of replicates. This is especially the case when the original data are incomplete, since generation of each PPC replicate requires an involved statistical computing approach (we use a data-augmentation strategy). In practice, we propose to approximate the PPC distribution by a gamma distribution whose parameters are estimated by a combination of training data and a modest-sized sample of PPC replicates. Some simulated examples suggest that this procedure, which can reduce the computation needed to approximate the PPC distribution by a factor of 20, has satisfactory statistical properties.
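
The gamma approximation described in the abstract can be illustrated with a small sketch. It is not the dissertation's implementation. It assumes complete data (so there is no data-augmentation step), a two-way table tested against row/column independence, Dirichlet priors with a hypothetical `prior` weight of 0.5, and a plain method-of-moments gamma fit to the replicates alone rather than the combined training-data estimator the abstract mentions. All function names are illustrative.

```python
import math
import random


def g2_independence(table):
    """Likelihood-ratio statistic G^2 = 2 * sum(O * log(O / E)) for independence."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    g2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # empty cells contribute nothing to the sum
                expected = row_tot[i] * col_tot[j] / n
                g2 += 2.0 * obs * math.log(obs / expected)
    return g2


def dirichlet(alphas, rng):
    """Draw one Dirichlet vector by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def ppc_replicate(table, rng, prior=0.5):
    """Simulate one replicated table under independence and return its G^2.

    Assumption: independent Dirichlet posteriors for the row and column
    margins, each with prior weight `prior` per category.
    """
    n = sum(sum(row) for row in table)
    p = dirichlet([sum(row) + prior for row in table], rng)
    q = dirichlet([sum(col) + prior for col in zip(*table)], rng)
    n_cols = len(q)
    weights = [pi * qj for pi in p for qj in q]  # cell probabilities, row-major
    cells = rng.choices(range(len(weights)), weights=weights, k=n)
    rep = [[0] * n_cols for _ in table]
    for c in cells:
        rep[c // n_cols][c % n_cols] += 1
    return g2_independence(rep)


def fit_gamma_moments(values):
    """Method-of-moments gamma fit: returns (shape, scale)."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m * m / v, v / m


def approximate_ppc(table, n_reps=50, seed=0):
    """Fit a gamma to a modest number of replicated G^2 values.

    Returns the observed statistic plus the fitted shape and scale.
    """
    rng = random.Random(seed)
    reps = [ppc_replicate(table, rng) for _ in range(n_reps)]
    shape, scale = fit_gamma_moments(reps)
    return g2_independence(table), shape, scale


if __name__ == "__main__":
    sparse_table = [[3, 1], [0, 4]]  # made-up counts, for illustration only
    observed, shape, scale = approximate_ppc(sparse_table)
    print(observed, shape, scale)
```

With the fitted shape and scale, the tail probability of the observed statistic could be read from a gamma survival function, for example `scipy.stats.gamma.sf(observed, shape, scale=scale)`, rather than from a large set of replicates.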

Keywords/Search Tags:

Data, PPC distribution, Model, Incomplete

Related items

1	Iterative Tomographic Algorithms Of Gas Diffusion Distribution Reconstruction Based On Incomplete Projection Data
2	Attribute Weighted Three-way Clustering Model For Incomplete Data
3	Estimation of the distribution of time to first event in a composite endpoint from interval censored observations with incomplete non-fatal event status
4	Attribute Correlation Modeling And Missing Value Imputation Of Incomplete Data Based On Fuzzy Partition
5	Imbalanced-type Incomplete Data And Missing Value Imputations Based On TS Modeling
6	Research On Incomplete Data Streams In Internet Of Things
7	Processing Methods For Incomplete Information Systems Based On Rough Sets
8	Research And Application Of Incomplete Data Imputation Algorithm
9	Dual Energy CT Reconstruction From Incomplete Data
10	Research On Approaches For Dynamic Knowledge Acquisition From Incomplete Data