Application Study On Kernel Machine Learning For Modeling And Classifying

Posted on: 2007-12-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y G Fan
GTID: 1118360182990576
Subject: Control Science and Engineering
Abstract/Summary:
Interest in kernel-based methods has grown in recent years, driven by the success of support vector machines (SVM). The idea behind kernel-based methods is that any standard linear technique, applied in a high-dimensional feature space obtained from a nonlinear map of the original input space, can be turned into a nonlinear technique in the input space by means of the kernel trick, provided that the algorithm accesses the data only through dot products. This dissertation proposes several kernel-based algorithms for classification, prediction and fault detection. The algorithms extract rules from data and are applied to predicting the concentration of 4-carboxybenzaldehyde (4-CBA) in an industrial purified terephthalic acid (PTA) oxidation process, to fault detection in the Tennessee Eastman (TE) process, and to face recognition. The main contributions are as follows:

(1) A novel clustering algorithm, the relative grid clustering algorithm, is proposed. The algorithm divides each dimension of the data space into a number of intervals to form a grid structure, then shifts the grids to form a new grid structure. Division and shifting are repeated to produce many grids, each of which is a hypercube. The samples in the same grid form a transaction set, and the Apriori algorithm is then used to find frequent items. Frequent items that share the same data form one cluster. The proposed algorithm can recover inherent relations that are broken up by fixed-scale grid clustering algorithms. Experiments show that the algorithm performs well on datasets with noise, arbitrarily shaped clusters and overlapping cluster boundaries.

(2) A new method for constructing and fine-tuning a fuzzy model is proposed. The method controls the number of fuzzy rules by removing base vectors or selecting new ones in the feature space F. First, the input data space is divided into subspaces by the fuzzy c-means (FCM) clustering method, and the cluster centers are taken as the initial base vector set. Redundant base vectors are identified and removed from the set. Next, each training sample is approximated by an expression in the base vectors and the approximation error is computed; the sample with the largest error is added to the base vector set. This selection step is repeated until every approximation error is below a constant specified by the user beforehand. Finally, the parameters of the corresponding rule premises are determined from the base vectors, and the fuzzy model is constructed by the least squares method. Experiments on Iris classification and Mackey-Glass chaotic time series prediction demonstrate the effectiveness of the proposed method.

(3) A two-step scheme for kernel Fisher discriminant analysis (KFDA) is proposed. First, an approximate explicit expression for the nonlinear mapping, which is not available originally, is obtained via the Nyström method. Second, the data are mapped from the input space into a linear feature subspace, and Fisher linear discriminant analysis (FLDA) is computed on the mapped data there.

(4) Kernel principal component analysis (KPCA) based on a sample subset is proposed for fault monitoring. KPCA is computed from a kernel matrix K whose dimension equals the number of training samples, which is problematic for large data sets. To address this, a sample subset whose distribution is similar to that of the original training set is selected by maximizing the quadratic Rényi entropy. KPCA based on the subset is used to monitor the Tennessee Eastman process. The experiments show that the results of the proposed method are almost the same as those of KPCA based on all samples, although the selected samples are only a small part of the training set. The computational problem is thus addressed by reducing the dimension of the matrix K.

(5) A novel fault detection method based on the kernel principal angle (KPA) is proposed. Numerous statistical process monitoring methods based on principal component analysis (PCA) have been developed and applied to various chemical processes for fault detection and identification. Moving principal component analysis (MPCA) is one such improved PCA-based monitoring method: it detects faults by monitoring changes in the subspace spanned by selected principal components. However, PCA-based monitoring methods are linear techniques and have proved inefficient and problematic for nonlinear systems. The KPA-based fault detection method is efficient for nonlinear systems. Its two main parts are constructing the feature subspace and computing the kernel principal angle, so its basic idea is similar to that of MPCA. The performance of the proposed method is compared with conventional multivariate statistical process control (cMSPC) and MPCA on simulated data obtained from the Tennessee Eastman (TE) process. The results clearly show that the KPA-based fault detection method performs considerably better than the other two.

(6) A time series forecasting method based on the dynamic weighted least squares support vector machine (LS-SVM) is proposed. Dynamic weighted LS-SVM is suitable for system identification and time series prediction because it can track the dynamics of nonlinear time-varying systems. The weights are determined by a robust method in order to reduce the effect of noisy data. Dynamic weighted LS-SVM is applied to predicting the concentration of 4-carboxybenzaldehyde (4-CBA) in the purified terephthalic acid (PTA) oxidation process. Results indicate that the proposed method reduces the effect of outliers and yields high accuracy.
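
The two-step KFDA scheme in contribution (3) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the Gaussian kernel, its width `gamma`, the landmark choice `X[::10]`, the ridge term `reg`, the toy ring data, and all function names are assumptions made for illustration only. Step 1 builds an explicit approximate feature map from a few landmark points (Nyström method); step 2 runs an ordinary two-class Fisher discriminant on the mapped data.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_feature_map(landmarks, gamma=0.5, rel_tol=1e-8):
    """Step 1: explicit approximate feature map built from m landmark points."""
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    vals, vecs = np.linalg.eigh(K_mm)
    keep = vals > rel_tol * vals.max()          # drop numerically null directions
    proj = vecs[:, keep] / np.sqrt(vals[keep])  # eigenvectors scaled by 1/sqrt(eigenvalue)
    return lambda Z: rbf_kernel(Z, landmarks, gamma) @ proj

def fisher_lda(Phi, y, reg=1e-3):
    """Step 2: two-class Fisher linear discriminant on the mapped data."""
    P0, P1 = Phi[y == 0], Phi[y == 1]
    m0, m1 = P0.mean(0), P1.mean(0)
    Sw = (P0 - m0).T @ (P0 - m0) + (P1 - m1).T @ (P1 - m1)  # within-class scatter
    d = Sw.shape[0]
    Sw += reg * np.trace(Sw) / d * np.eye(d)                # small ridge for stability
    w = np.linalg.solve(Sw, m1 - m0)
    threshold = 0.5 * (m0 + m1) @ w                         # midpoint of projected means
    return w, threshold

# Toy data (assumed): two concentric rings, not linearly separable in input space.
rng = np.random.default_rng(0)
n = 100
angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radii = np.concatenate([np.full(n, 1.0), np.full(n, 3.0)])
radii = radii + 0.1 * rng.standard_normal(2 * n)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

landmarks = X[::10]                  # 20 landmarks, 10 from each ring
phi = nystrom_feature_map(landmarks)
Phi = phi(X)                         # explicit features, one row per sample
w, threshold = fisher_lda(Phi, y)
pred = (Phi @ w > threshold).astype(int)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.3f}")
```

Because the map is explicit, the discriminant step works on a matrix with at most as many columns as there are landmarks, instead of the full kernel matrix over all training samples.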
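
The subset selection in contribution (4) can be sketched as follows. The abstract does not say how the quadratic Rényi entropy is estimated or maximized, so this sketch assumes a common choice: the entropy of a subset is estimated as the negative log of the mean Gaussian kernel value over all pairs in the subset, and it is increased by random swaps that are kept only when the estimate goes up. The function names, `gamma`, the subset size `m`, and the toy data are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def renyi_entropy(S, gamma=1.0):
    """Quadratic Renyi entropy estimate: -log of the mean pairwise kernel value."""
    return -np.log(rbf_kernel(S, S, gamma).mean())

def select_subset(X, m, n_trials=500, gamma=1.0, seed=0):
    """Greedy random-swap search for an m-point subset with high entropy."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    in_subset = np.zeros(len(X), dtype=bool)
    in_subset[idx] = True
    best = renyi_entropy(X[idx], gamma)
    history = [best]
    for _ in range(n_trials):
        pos = rng.integers(m)                            # slot in the subset to replace
        cand = rng.choice(np.flatnonzero(~in_subset))    # sample currently outside
        trial = idx.copy()
        trial[pos] = cand
        h = renyi_entropy(X[trial], gamma)
        if h > best:                                     # keep the swap only if entropy rises
            in_subset[idx[pos]] = False
            in_subset[cand] = True
            idx, best = trial, h
        history.append(best)
    return idx, history

# Toy data (assumed): 500 two-dimensional samples, reduced to a 30-point subset.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
idx, history = select_subset(X, m=30)
K_sub = rbf_kernel(X[idx], X[idx])   # 30 x 30 kernel matrix instead of 500 x 500
print(f"entropy: {history[0]:.3f} -> {history[-1]:.3f}, K_sub shape: {K_sub.shape}")
```

KPCA would then be computed from `K_sub` rather than from the kernel matrix over all training samples, which is the dimension reduction the abstract describes.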
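
The weighting idea in contribution (6) can be sketched for a static regression problem. The abstract does not specify the robust method or the dynamic update, so this sketch makes two loud assumptions: the weights come from a commonly used rule based on a robust scale estimate (1.483 times the median absolute deviation of the residuals, with cut-offs 2.5 and 3), and the time-varying, dynamic part is left out entirely. The kernel width, the regularization constant `C`, the toy data, and all function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def lssvm_fit(K, y, C, v=None):
    """Solve the (weighted) LS-SVM regression system for bias b and coefficients alpha."""
    n = len(y)
    v = np.ones(n) if v is None else v
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                            # constraint: sum(alpha) = 0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.diag(1.0 / (C * v))    # small weight -> large per-sample ridge
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]

def robust_weights(residuals, c1=2.5, c2=3.0):
    """Assumed rule: weight 1 for small residuals, near 0 for large ones."""
    s = 1.483 * np.median(np.abs(residuals - np.median(residuals)))  # robust scale (MAD)
    r = np.abs(residuals / s)
    return np.clip((c2 - r) / (c2 - c1), 1e-4, 1.0)

# Toy data (assumed): noisy sine with two gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 100)[:, None]
y_true = np.sin(x[:, 0])
y = y_true + 0.05 * rng.standard_normal(100)
outliers = np.array([30, 70])
y[outliers] += 5.0

K = rbf_kernel(x, x, gamma=1.0)
C = 10.0
b0, a0 = lssvm_fit(K, y, C)            # unweighted fit
fit0 = K @ a0 + b0
v = robust_weights(y - fit0)           # down-weight samples with large residuals
b1, a1 = lssvm_fit(K, y, C, v)         # weighted refit
fit1 = K @ a1 + b1

def rmse(f):
    return np.sqrt(np.mean((f - y_true) ** 2))

print(f"RMSE unweighted: {rmse(fit0):.3f}, weighted: {rmse(fit1):.3f}")
```

The refit treats each weight as a per-sample scaling of the error penalty, so samples flagged as outliers contribute almost nothing to the second solution.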
Keywords/Search Tags: Least Squares Support Vector Machine, Kernel Principal Component Analysis, Kernel Fisher Discriminant Analysis, Fault Detection, Fuzzy Model, Clustering