A Study On Process Industrial Data Mining Based On Support Vector Machines

Posted on:2006-12-16

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Zhang

Full Text:PDF

GTID:1118360152996424

Subject:Control theory and control engineering

Abstract/Summary:

This dissertation examines several problems in data mining based on support vector machines (SVM) and proposes solutions to them. Several SVM-based data mining algorithms are proposed and then applied to an industrial para-xylene (PX) process. The main contributions are as follows.

(1) A new incremental SVM learning algorithm (FS-SVM) is proposed. When incremental samples are added to the current working set, the training samples and the incremental samples influence each other. FS-SVM selects as many support vectors as possible into the current working set to increase prediction accuracy. Simulation results on the UCI Adult data set indicate that the proposed algorithm improves both accuracy and speed.

(2) To overcome the model failure problem, a soft sensor modeling method based on incremental SVM (ISVM) is presented. In ISVM, an incremental sample representing a new operating condition is introduced into the model, while an old sample is discarded from the model to control the size of the working set. The method is applied to predict PX purity in a PX adsorption fractionation process. Simulation results indicate that the proposed soft sensor model adapts better to varying operating conditions and resolves the model failure caused by changes in operating conditions or load.

(3) To overcome the overfitting caused by a fixed penalty factor, fuzzy support vector regression (FSVR) and fuzzy least squares support vector machines (FLS-SVM) are proposed. Strategies based on k nearest neighbors (kNN) and support vector data description (SVDD) are adopted to set the fuzzy membership values of the data points. The proposed FSVR and FLS-SVM algorithms based on kNN and SVDD are applied to predict the concentration of 4-carboxy-benzaldehyde (4-CBA) in an industrial purified terephthalic acid (PTA) oxidation process. Simulation results indicate that the proposed method reduces the effect of outliers and yields higher accuracy.

(4) SVM is applied in many research fields because of its good generalization ability and solid theoretical foundation. However, because the model generated by an SVM is like a black box, it is difficult for users to interpret and understand how the model makes its decisions. A hyperrectangle rule extraction (HRE) algorithm is proposed to extract rules from a trained SVM. The support vector clustering (SVC) algorithm is used to find the prototypes of each class; hyperrectangles are then constructed from the prototypes and the support vectors under some heuristic conditions. Projecting the hyperrectangles onto the coordinate axes yields if-then rules. Experimental results indicate that the HRE algorithm extracts rules efficiently from a trained SVM and that the number and support of the obtained rules can be easily controlled through a user-defined minimal support threshold.

(5) A novel data mining method is introduced to solve multi-objective optimization problems in the process industry. A hyperrectangle association rule mining (HARM) algorithm based on support vector machines is proposed. Hyperrectangle rules are constructed on the basis of prototypes and support vectors under some heuristic constraints. The proposed algorithm is applied to a simulated moving bed (SMB) para-xylene adsorption process, and the relationships between the key process variables and objective variables such as PX purity and recovery rate are obtained. Most of the obtained association rules can be explained using existing domain knowledge about the PX adsorption process.

(6) To simplify the data mining process, a "5P" data mining model for the process industry is presented and a data mining software system, ESP-PIDMS, is developed. Using ESP-PIDMS, several data mining models are built to solve real industrial problems.
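
The projection step mentioned in contribution (4), where a hyperrectangle is projected onto the coordinate axes to give an if-then rule, can be illustrated with a minimal Python sketch. It is not the dissertation's HRE implementation: the abstract does not say how the hyperrectangles are built from prototypes and support vectors, so the box bounds are simply assumed as input here. The names `Hyperrectangle`, `to_rule`, `support`, and `filter_by_support`, the feature names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Hyperrectangle:
    """Axis-aligned box with one [lower, upper] interval per feature (assumed given)."""
    lower: List[float]
    upper: List[float]
    label: str  # class assigned to points that fall inside the box

    def contains(self, x: Sequence[float]) -> bool:
        # A point is covered when every coordinate lies inside its interval.
        return all(lo <= v <= hi for lo, v, hi in zip(self.lower, x, self.upper))


def to_rule(box: Hyperrectangle, names: Sequence[str]) -> str:
    """Project the box onto each axis and join the intervals into one if-then rule."""
    conditions = [
        f"{lo:g} <= {name} <= {hi:g}"
        for name, lo, hi in zip(names, box.lower, box.upper)
    ]
    return "IF " + " AND ".join(conditions) + f" THEN class = {box.label}"


def support(box: Hyperrectangle, samples: Sequence[Sequence[float]]) -> float:
    """Fraction of the samples covered by the box."""
    if not samples:
        return 0.0
    return sum(box.contains(x) for x in samples) / len(samples)


def filter_by_support(boxes, samples, min_support: float):
    """Keep only boxes whose support reaches the user-defined threshold."""
    return [b for b in boxes if support(b, samples) >= min_support]


if __name__ == "__main__":
    # Hypothetical two-feature example; values are illustrative only.
    box = Hyperrectangle(lower=[1.0, 0.5], upper=[2.0, 1.5], label="high_purity")
    data = [[1.5, 1.0], [3.0, 1.0], [1.2, 0.6], [0.0, 0.0]]
    print(to_rule(box, ["x1", "x2"]))
    print(support(box, data))
```

Raising `min_support` in `filter_by_support` discards boxes that cover few samples, which is one simple way a support threshold could limit the number of rules, as the abstract describes.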

Keywords/Search Tags:

process industry, data mining, support vector machines, para-xylene, purified terephthalic acid

Related items

1	A Study On Process Data Mining Technique Based On Support Vector Machines
2	Dynamic Simulation And Control Of PTA Hydrogenation Reaction
3	Research On Model And Information Transform Mechanism Of Process Support Vector Machines
4	Application Of Support Vector Machines In Data Mining
5	Research Of Data Mining Techniques Based On Support Vector Machines
6	Research On Improvement Of Online Support Vector Machine And Its Application
7	An Application Of Data Mining Technology In Polymerization Process
8	Cost Sensitive Data Mining Based On Support Vector Machines: Theories And Applications
9	Studies And Application Of Fuzzy And Double Regular Support Vector Machines
10	CRM System Based Freight Forwarding Industry Research And Application Of Decision Support