
Statistical properties of statistical matching

Posted on: 2002-04-26
Degree: Ph.D
Type: Dissertation
University: The George Washington University
Candidate: Moriarity, Christopher L
Full Text: PDF
GTID: 1468390014451726
Subject: Statistics
Abstract/Summary:
Statistical matching is a procedure that merges microdata from sample surveys into a single synthetic microdata file. The goal of this procedure is to create a file that allows multivariate analyses to be done on the merged set of variables, even though they were not collected together.

A typical scenario for statistical matching is that data on a vector of variables (X, Y) are collected in Survey A, and data on a vector of variables (X, Z) are collected in Survey B. Statistical matching develops a synthetic microdata file from Survey A and Survey B, usually matching on some function of the common vector of variables X, to produce a file with values of X, Y, and Z on each record.

In general, it is not possible to accurately construct the (X, Y, Z) distribution using the distribution of (X, Y) from one source and the distribution of (X, Z) from another source; what is lacking is information about the distribution of (Y, Z). Typically, little or no auxiliary information about the (Y, Z) distribution is available a priori.

One possible approach is to allow a variety of assumptions to be made about the distribution of (Y, Z), carry out statistical matching to create a dataset corresponding to each assumption, and then assess the variation in estimates made from the group of datasets created by this procedure. This approach would exhibit the amount of uncertainty in estimates due to the statistical matching procedure.

Kadane (1978) and Rubin (1986) both discussed using such an approach, and outlined procedures to do so.

The focus of this dissertation is to evaluate and extend Kadane's and Rubin's methodologies. In carrying out this task, we provide important details of Kadane's and Rubin's procedures that were not provided in their descriptions and we provide corrections for the mistakes we discovered. We also derive simplifications of several formulas in the existing descriptions of the procedures.

Perhaps most importantly, we show that the procedures described by Kadane and Rubin are not feasible, as originally stated. We develop innovations of both procedures that achieve the desirable results promised initially. These innovations are implemented in SAS software.
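
As a rough illustration of the setup described in the abstract, the Python sketch below shows two things for scalar X, Y, and Z. First, it computes the range of corr(Y, Z) values that are compatible with given values of corr(X, Y) and corr(X, Z); this range follows from requiring the 3x3 correlation matrix to be positive semidefinite, and its midpoint corresponds to zero partial correlation between Y and Z given X. Second, it attaches a Z value to each Survey A record by picking the Survey B record with the closest X. This is not Kadane's or Rubin's procedure, nor the dissertation's SAS implementation: the function names `yz_correlation_bounds` and `nearest_neighbor_match`, the nearest-neighbor rule on X, and the toy records are all assumptions made for illustration.

```python
import math


def yz_correlation_bounds(r_xy, r_xz):
    """Return (low, high) for corr(Y, Z) given corr(X, Y) and corr(X, Z).

    Assumes scalar X, Y, Z. The bounds come from requiring the 3x3
    correlation matrix to be positive semidefinite.
    """
    center = r_xy * r_xz  # value implied by zero partial correlation given X
    half_width = math.sqrt((1 - r_xy ** 2) * (1 - r_xz ** 2))
    return center - half_width, center + half_width


def nearest_neighbor_match(file_a, file_b):
    """Build synthetic (x, y, z) records (hypothetical matching rule).

    file_a holds (x, y) tuples from Survey A; file_b holds (x, z) tuples
    from Survey B. Each A record receives the z of the B record whose x
    is closest; B records may be reused.
    """
    matched = []
    for x_a, y in file_a:
        _, z = min(file_b, key=lambda rec: abs(rec[0] - x_a))
        matched.append((x_a, y, z))
    return matched


# Toy data, invented for illustration only.
survey_a = [(1.0, 10.0), (2.0, 20.0)]
survey_b = [(1.1, 5.0), (2.2, 7.0), (0.2, 3.0)]

print(yz_correlation_bounds(0.6, 0.8))
print(nearest_neighbor_match(survey_a, survey_b))
```

The width of the interval returned by `yz_correlation_bounds` gives a sense of how much the data leave undetermined: a single matched file reflects only one point in that range, which is why the abstract describes repeating the match under several assumptions about (Y, Z).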
Keywords/Search Tags:Statistical, Procedure, File, Survey