
Analysis of process control baseline data using data mining

Posted on: 2008-10-24
Degree: Ph.D.
Type: Dissertation
University: Rutgers The State University of New Jersey - New Brunswick
Candidate: Zhang, Hang
Full Text: PDF
GTID: 1448390005952220
Subject: Engineering
Abstract/Summary:
There are two phases in multivariate statistical process control (MSPC). In phase I, we model baseline data off-line to characterize the process. Baseline data is a collection of observations describing successful manufacturing. In phase II, we compare on-line observations to these models to determine whether the process is in control. This dissertation addresses four questions to improve phase I analysis: (1) How many operational modes are in the baseline data? (2) In a large historical dataset collected over a long time period, which periods are the baseline? (3) In profile baseline data, are there outlier profiles? (4) When should the phase I model be updated?

Each operational mode appears as a cluster in the baseline data. To address the first question, we propose a new method to determine the number of clusters that has all of the following critical features: it determines whether there is only one cluster, which is the most common case; it identifies both convex and non-convex clusters; and it is insensitive to user-specified parameters. No existing method has all of these features. Simulations show that the proposed method works well.

We propose a new method to address the second question, where historical data may be collected during both baseline and unsuccessful periods. The identified baseline periods are reasonably long and have the best product quality with a stable distribution. On simulated and real datasets, the proposed method is robust to various distributions, in contrast to the existing change point identification method, which is very sensitive to the distribution.

We address the third question in the context of complex profiles. We treat complex profiles as high-dimensional vectors and apply the χ² control chart to identify outliers. Applied to simulated and real datasets, this approach performs better on complex profiles than the existing nonlinear regression method.

We address the fourth question by testing whether the correlation matrix, which describes the relationships among variables, has changed from the baseline. We propose a new method to diagnose the responsible variables when a change is indicated.

We also discuss the future work of applying MSPC and data mining technologies to data from a brain neural system.

Key words: Statistical Process Control, Data Mining, Phase I, Operational Mode, Profile, Outlier Detection, Number of Clusters, Correlation Matrix.
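
As a rough illustration of the χ² control chart mentioned for the third question, the sketch below flags profiles whose statistic (x - μ)ᵀ Σ⁻¹ (x - μ) exceeds an upper χ² limit. It is not the dissertation's implementation: the function names, the significance level `alpha`, the assumption that the in-control mean and covariance are already known, and the Wilson-Hilferty approximation used for the χ² quantile are all assumptions made here for illustration.

```python
import math
from statistics import NormalDist


def chi2_quantile(prob, df):
    """Approximate chi-square quantile (Wilson-Hilferty; an assumption of this sketch)."""
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3


def cholesky(cov):
    """Lower-triangular L with cov = L L^T (cov must be symmetric positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def chi2_statistic(x, mean, low):
    """Compute (x - mean)^T cov^{-1} (x - mean) by forward substitution with L."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    y = []
    for i in range(len(d)):
        s = sum(low[i][k] * y[k] for k in range(i))
        y.append((d[i] - s) / low[i][i])
    return sum(v * v for v in y)


def flag_outliers(profiles, mean, cov, alpha=0.01):
    """Return indices of profiles whose statistic exceeds the upper control limit."""
    low = cholesky(cov)
    limit = chi2_quantile(1.0 - alpha, len(mean))
    return [i for i, x in enumerate(profiles)
            if chi2_statistic(x, mean, low) > limit]


# Hypothetical three-point "profiles"; the second lies far from the assumed mean.
mean = [0.0, 0.0, 0.0]
cov = [[1.0, 0.5, 0.0],
       [0.5, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
profiles = [[0.1, -0.2, 0.3], [5.0, 5.0, 5.0]]
outlier_indices = flag_outliers(profiles, mean, cov)
```

In practice the mean and covariance would have to be estimated from the baseline profiles themselves, and for long profiles the covariance estimate may be singular or ill-conditioned; how the dissertation handles this is not stated in the abstract.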
Keywords/Search Tags:Data, Process control, Phase, Correlation matrix, Method