
Research Of Data Quality Control Methods Based On Statistical Theory

Posted on: 2014-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: S F Wang
Full Text: PDF
GTID: 2248330398465575
Subject: Computer application technology
Abstract/Summary:
Data quality control methods based on statistical theory are mainly embodied in the combination of statistical theory and fault diagnosis technology. In general, fault diagnosis methods fall into three types: those based on mathematical models, those based on knowledge, and those that are data-driven. Fault diagnosis based on statistical theory is a significant branch of the data-driven approach, and it has developed from univariate statistical analysis to primarily multivariate statistical analysis. As the most typical algorithm of multivariate statistical analysis, Principal Component Analysis (PCA) relies on process data, does not require a mechanism model, and is highly versatile, so it is widely applied in fault diagnosis. In actual industrial processes, however, the system variables tend to be nonlinear, time-varying, unstable, and so on. Because PCA is a linear, static method, it is insufficient for practical requirements. This thesis presents a systematic study of fault diagnosis technology based on PCA. To handle the various characteristics of industrial process data, corresponding improvements to the PCA algorithm are made and some new fault diagnosis methods are proposed. The main research achievements include:

(i) A systematic study of the PCA algorithm and its application in fault diagnosis. We introduce the basic principles of PCA and the fundamentals of PCA-based fault detection and diagnosis, providing the theoretical basis for further extending PCA's application to fault diagnosis of industrial processes.

(ii) To address the large sample sizes and nonlinearity of industrial process data, a new Kernel PCA (KPCA) algorithm based on k-means clustering is proposed. With an improved selection strategy for the initial centers, the initial centers cover the spatial distribution of the original data as much as possible. The clustered data are then mapped into the feature space. On the one hand, the proposed method keeps the KPCA algorithm's advantage in solving nonlinear problems. On the other hand, it cuts down the computational load by reducing the dimension of the kernel matrix.

(iii) To address the mixed dynamic and nonlinear characteristics of industrial process data, a new dynamic KPCA algorithm is proposed. It uses the basic idea of dynamic PCA and extracts the time-series information among the data variables by introducing an autoregressive model. The correlations between the variables are analyzed to determine the appropriate time-lag length; a time-lag matrix is then constructed and a KPCA model is built on it.
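As a rough illustration of the time-lag matrix mentioned in (iii), the sketch below augments each sample with its preceding samples, in the way dynamic PCA is commonly described. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function name `time_lag_matrix` and the toy data are hypothetical, and the lag length is simply passed in rather than derived from an autoregressive model or a correlation analysis as the abstract describes.

```python
from typing import List, Sequence


def time_lag_matrix(X: Sequence[Sequence[float]], lag: int) -> List[List[float]]:
    """Augment each sample x(t) with its `lag` predecessors.

    The row for time t is [x(t), x(t-1), ..., x(t-lag)], so an n-by-m
    input becomes an (n - lag)-by-(m * (lag + 1)) matrix.
    Hypothetical helper: the lag length is assumed to be known already.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = len(X)
    if n <= lag:
        raise ValueError("need more samples than the lag length")

    rows: List[List[float]] = []
    for t in range(lag, n):
        row: List[float] = []
        # Concatenate the current sample and the `lag` samples before it.
        for k in range(lag + 1):
            row.extend(X[t - k])
        rows.append(row)
    return rows


# Toy data (made up for illustration): 5 samples of 2 process variables.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
Z = time_lag_matrix(X, lag=2)
```

With `lag=2`, the five two-variable samples become three rows of six columns each. A PCA or KPCA model would then be fitted to `Z` instead of `X`, so that correlations across time steps appear as correlations between columns.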
Keywords/Search Tags: Fault diagnosis, principal component analysis, k-means clustering, kernel PCA, autoregressive model