
Kernel Based Learning Algorithm And Application

Posted on: 2013-01-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Jian
Full Text: PDF
GTID: 1118330371996656
Subject: Operational Research and Cybernetics
Abstract/Summary:
The kernel trick is a powerful tool for solving nonlinear problems, and kernel-based learning theory and algorithms are a research focus in machine learning. This thesis focuses on the design of kernel-based learning algorithms and their application to the blast furnace ironmaking process and to protein identification.
The main contributions to algorithm design are as follows. First, a novel binary coding SVM algorithm is proposed, which treats an N-class classification task as multiple binary classification problems and requires only ⌈log2 N⌉ binary classifiers, far fewer than the conventional one-against-one method, O(N^2), and one-against-all method, O(N). Second, multiple kernel learning (MKL) for LS-SVM is formulated as a semidefinite program to obtain the globally optimal solution; the regularization parameter is furthermore optimized together with the kernel coefficients in a unified framework, which makes model selection automatic. The computational complexity of LS-SVM MKL is much lower than that of SVM MKL while its precision is comparable, which makes LS-SVM MKL suitable for large-scale data sets. Extensive validation experiments are performed.
As one application, this thesis studies prediction and trend classification models of the temperature in the blast furnace (BF) ironmaking process.
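To make the classifier counts quoted for the binary coding scheme concrete, the following Python sketch compares the three multi-class strategies and shows one possible way to assign binary codewords to classes. It assumes that the bracketed count is a ceiling and that class k simply receives the binary representation of k. The function names are illustrative, and the thesis may use a different code assignment.

```python
def classifier_counts(n_classes):
    """Number of binary classifiers each multi-class strategy needs."""
    return {
        # ceil(log2 N), computed exactly with integer arithmetic
        "binary_coding": (n_classes - 1).bit_length(),
        # one classifier per unordered pair of classes
        "one_against_one": n_classes * (n_classes - 1) // 2,
        # one classifier per class
        "one_against_all": n_classes,
    }


def encode_labels(n_classes):
    """Assumed coding: class k gets the bits of k, least significant first."""
    n_bits = (n_classes - 1).bit_length()
    return {k: [(k >> b) & 1 for b in range(n_bits)] for k in range(n_classes)}


def decode(bits):
    """Recover the class index from the outputs of the binary classifiers."""
    return sum(bit << b for b, bit in enumerate(bits))


if __name__ == "__main__":
    print(classifier_counts(4))
    print(encode_labels(4))
```

Each bit position corresponds to one binary classifier, so a four-class task needs two classifiers under this coding, against six for one-against-one and four for one-against-all.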
Focusing on the silicon content in hot metal ([Si]), a chief indicator of furnace temperature, this thesis exploits the nonlinear approximation ability of SVM to construct data-based models for [Si] prediction. First, the sliding window scheme is incorporated into smooth support vector regression to construct the sliding windows smooth support vector regression (SW-SSVR) model, which updates the learning samples and tracks state changes of the studied system in time. Applied to the [Si] prediction problem, the SW-SSVR model achieves a high percentage of successful trend predictions, competitive computational speed, and timely online service. Second, with the proposed binary coding SVM algorithm, a four-class problem, i.e., sharp descent, slight descent, sharp ascent, and slight ascent of [Si], is reduced to two binary classification problems; the resulting four-class classification can guide blast furnace operators in determining the control span together with the control direction in advance. Third, for predicting the [Si] change trend, MKL is employed to integrate heterogeneous data, which improves the prediction accuracy; MKL is furthermore used for feature reduction, which helps explain which variables are important in black-box modeling.
Peptide identification by tandem mass spectrometry (MS/MS) is the other application studied in this thesis. Proteomics has become a hot subject in the post-genomic era, and peptide identification by MS/MS is widely used for high-throughput identification of proteins in complex biological samples. A flexible algorithm based on MKL SVM, named De-Noise, is proposed to transform the peptide identification problem into a special binary classification problem.
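The sliding window idea used for [Si] prediction can be illustrated with a short Python sketch: only the most recent samples are kept as learning samples, and the model is refitted before each new prediction. An ordinary least-squares line stands in for the smooth support vector regression here, purely as a placeholder; the window length and all function names are assumptions rather than details from the thesis.

```python
from collections import deque


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b (a placeholder for the SSVR model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, mean_y - a * mean_x


def sliding_window_predict(series, window=5):
    """Predict each value from a model fitted on the previous `window` samples."""
    samples = deque(maxlen=window)  # the oldest sample drops out automatically
    predictions = []
    for t, value in enumerate(series):
        if len(samples) == window:
            ts = [s[0] for s in samples]
            vs = [s[1] for s in samples]
            a, b = fit_line(ts, vs)
            predictions.append((t, a * t + b))
        samples.append((t, value))  # update the learning samples
    return predictions


if __name__ == "__main__":
    for t, predicted in sliding_window_predict([0.5, 0.52, 0.55, 0.53, 0.56, 0.6], window=3):
        print(t, round(predicted, 3))
```

Because the window has a fixed length, the cost of each refit stays bounded, which is one plausible reason such a scheme suits online use.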
The De-Noise algorithm starts with a pre-processing step in which some noisy target peptide-spectrum matches (PSMs) are eliminated from the target PSM data set to provide a more reliable training set. The noisy PSMs are determined by computing their distance to the centroid of the decoy PSMs. Once the noisy target PSMs have been discarded, two rounds of refinement are carried out to distinguish correct PSMs from incorrect ones. Finally, proteolytic information is integrated to validate the PSMs.
We test the De-Noise algorithm on three data sets from multiple mass spectrometry platforms, Yeast (LCQ), UPS1 (LTQ), and Ta108 (Orbit), and compare it with PeptideProphet and Percolator. De-Noise is shown to be superior in sensitivity and specificity on all data sets searched. Thus, the De-Noise algorithm can validate database search results effectively.
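The pre-processing step can be sketched in Python as follows. The abstract says only that noisy target PSMs are found from their distance to the decoy centroid, so this sketch assumes that a target PSM is noisy when it lies within a chosen Euclidean distance of that centroid. The feature vectors, the threshold, and the function names are all hypothetical.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def drop_noisy_targets(targets, decoys, threshold):
    """Keep target PSMs farther than `threshold` from the decoy centroid.

    Assumption: targets close to the decoy centroid resemble decoys and are
    treated as noisy; the thesis may define the criterion differently.
    """
    decoy_center = centroid(decoys)
    return [t for t in targets if dist(t, decoy_center) > threshold]


if __name__ == "__main__":
    decoys = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
    targets = [[1.0, 1.5], [5.0, 5.0]]
    print(drop_noisy_targets(targets, decoys, threshold=1.0))
```

The retained targets, together with the decoys, would then form the training set for the later refinement rounds.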
Keywords/Search Tags: Kernel Learning Algorithm, Multiple Kernel Learning, Blast Furnace Ironmaking Process, Tandem Mass Spectrometry, Peptide Identification