Clustering for chemical process monitoring and air quality analysis

Posted on: 2007-11-10
Degree: Ph.D.
Type: Dissertation
University: University of California, Davis
Candidate: Beaver, Scott Michael
Full Text: PDF
GTID: 1441390005479151
Subject: Engineering
Abstract/Summary:
Cluster analysis is a class of unsupervised multivariate statistics in which groups of related objects are categorized using a sequence of statistical decisions. Objects placed into the same cluster are relatively similar, while the clusters themselves represent distinct patterns in the data set. Clustering algorithms are difficult to implement, suffering from low accuracy or poor interpretability. Also, clustering algorithms are intended for independent observations, limiting their practicality for real-world applications. This dissertation develops theory to enhance the performance and interpretability of existing clustering algorithms, and extends cluster analysis to the time series domain.

Clustering algorithms are dichotomized as hierarchical or nonhierarchical, and both classes of algorithms have inherent strengths and limitations. A hybrid clustering scheme is proposed which retains the advantages of both classes of clustering algorithms while overcoming several key limitations. Such hierarchical aggregation of nonhierarchical cluster solutions enhances the performance and interpretability of cluster analysis.

A nonhierarchical time series clustering algorithm is proposed for use in conjunction with the hierarchical aggregation scheme. The algorithm uses Principal Components Analysis (PCA) models as the cluster prototypes, and it groups observations having similar PCA representations. The method can identify events at multiple time scales. Clusterings at different time scales can be logically combined to provide a global cluster solution describing all events represented in the historical data. A method is presented to eliminate the effects of a confounding cycle.

The methods are applied to both air quality and chemical process data. Cluster analysis of pollutant composition measurements identifies similar ozone episodes in the San Francisco Bay Area, CA. These clusters are related to meteorological conditions, and several ozone buildup mechanisms can be inferred by comparing the clusters. Wind field data from the same region are clustered to identify groups of days sharing similar diurnal cycles for mesoscale flow. These patterns relate to local air quality and the synoptic meteorological state of the atmosphere. A final study applies the methods to chemical process data obtained from a pilot plant. All known operating regimes are isolated from the historical data, and previously unknown events are detected as well.
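
The idea of hierarchically aggregating a nonhierarchical cluster solution, as summarized in the abstract, can be illustrated with a minimal sketch. This is not the dissertation's algorithm: it assumes plain k-means as the nonhierarchical step and average-linkage agglomeration of the resulting centroids as the hierarchical step, and the names `kmeans`, `aggregate`, `hybrid_cluster`, `k_fine`, and `n_groups` are hypothetical, chosen only for illustration.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm (assumed stand-in for the nonhierarchical step).

    Returns (centroids, labels), where labels[i] is the index of the
    centroid assigned to points[i].
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: mean of members; an empty cluster keeps its centroid.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


def aggregate(centroids, n_groups):
    """Average-linkage agglomeration of centroids (assumed hierarchical step).

    Returns a list mapping each fine cluster index to a merged group id.
    """
    groups = [[j] for j in range(len(centroids))]

    def linkage(a, b):
        # Mean pairwise distance between the centroids of two groups.
        total = sum(math.dist(centroids[i], centroids[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(groups) > n_groups:
        pairs = ((x, y) for x in range(len(groups))
                 for y in range(x + 1, len(groups)))
        a, b = min(pairs, key=lambda xy: linkage(groups[xy[0]], groups[xy[1]]))
        merged = groups.pop(b)  # b > a, so index a is unaffected by the pop
        groups[a].extend(merged)

    mapping = [0] * len(centroids)
    for gid, members in enumerate(groups):
        for j in members:
            mapping[j] = gid
    return mapping


def hybrid_cluster(points, k_fine, n_groups, seed=0):
    """Over-cluster with k-means, then merge the fine clusters hierarchically."""
    centroids, labels = kmeans(points, k_fine, seed=seed)
    mapping = aggregate(centroids, n_groups)
    return [mapping[lab] for lab in labels]
```

Calling `hybrid_cluster(points, k_fine=4, n_groups=2)` on a list of coordinate tuples returns one merged group id per point. Choosing `k_fine` larger than the expected number of patterns lets the agglomeration step decide which fine clusters belong together, which is one plausible reading of how the two classes of algorithm complement each other.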
Keywords/Search Tags:Cluster, Air quality, Chemical process, Data