Regression based variable clustering for data reduction

Posted on: 2001-02-17
Degree: Ph.D
Type: Dissertation
University: University of Washington
Candidate: McClelland, Robyn Leagh
Full Text: PDF
GTID: 1468390014454663
Subject: Statistics
Abstract/Summary:
This research was motivated by a problem from the Cardiovascular Health Study (CHS). Specifically, 3647 CHS participants had MRI scans, from which the locations of MRI-detected strokes were recorded in terms of a 23-region atlas of the brain. The goal of the study was to assess the magnitude of association between strokes in these locations and various responses such as measures of cognitive function, depression, or incident events. A difficulty with these data is that there are a large number of regions, many of which have sparse data. To simplify presentation and improve estimation, we decided to combine the regions to form a reduced atlas of the brain.

To address this problem I have developed a clustering algorithm tailored to the features of the MRI data. Regions are combined based on their association with the response, rather than with each other. The algorithm can adjust for potential confounding variables during the clustering process. The statistical properties and performance of the algorithm were evaluated via simulation studies. We considered whether the algorithm successfully captured the true underlying structure, whether good parameter estimates were obtained, and whether there was a benefit to clustering over simply using all the regions individually. Various noise levels (standard deviation of the error distribution) and sample sizes were studied. Although designed with the MRI data in mind, the algorithm can be used in a much wider range of problems. Applications that consider a range of possible scenarios are also presented.
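The abstract does not state the merging rule, so the following is only a minimal sketch of the general idea (grouping predictors by their association with the response while adjusting for confounders), not the dissertation's algorithm. It assumes a linear model, binary region indicators, and a greedy rule that repeatedly merges the two groups whose adjusted regression coefficients are closest. The names `fit_coefs` and `cluster_regions`, the simulated data, and the choice of "stroke in any member region" as the combined indicator are all hypothetical.

```python
import numpy as np

def fit_coefs(G, Z, y):
    """Least-squares fit of y on [intercept, confounders Z, group indicators G].
    Returns only the coefficients of the group indicators."""
    D = np.column_stack([np.ones(len(y)), Z, G])
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    return beta[1 + Z.shape[1]:]

def cluster_regions(X, y, Z, n_clusters):
    """Greedy, response-based merging (an assumed rule, for illustration):
    refit after every merge and combine the two groups whose
    confounder-adjusted coefficients are closest."""
    groups = [[j] for j in range(X.shape[1])]
    while len(groups) > n_clusters:
        # Combined indicator: 1 if any region in the group is affected.
        G = np.column_stack([X[:, g].max(axis=1) for g in groups])
        b = fit_coefs(G, Z, y)
        diff = np.abs(b[:, None] - b[None, :])
        np.fill_diagonal(diff, np.inf)
        i, j = np.unravel_index(np.argmin(diff), diff.shape)
        i, j = min(i, j), max(i, j)
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 2000
X = (rng.random((n, 6)) < 0.1).astype(float)    # sparse region indicators
Z = rng.normal(size=(n, 1))                     # one confounder
true_effects = np.array([3.0, 3.0, 3.0, -3.0, -3.0, -3.0])
y = X @ true_effects + 1.0 * Z[:, 0] + rng.normal(scale=0.5, size=n)

groups = cluster_regions(X, y, Z, n_clusters=2)
print(sorted(sorted(g) for g in groups))
```

Refitting after each merge keeps the comparison on the adjusted scale, since the confounders stay in the design matrix throughout. Other merging criteria, such as a test of coefficient equality, would fit the same description equally well.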
Keywords/Search Tags:Data, Clustering, MRI