Studies Of Variable Selection And Transformation Method

Posted on:2009-06-06

Degree:Doctor

Type:Dissertation

Country:China

Candidate:K L Tang

Full Text:PDF

GTID:1101360242978387

Subject:Inorganic Chemistry

Abstract/Summary:

In the 21st century, the rapid growth of chemical and biological data, together with fast-developing instruments and analytical technologies, helps us obtain more information about structures and functions. Extracting valuable knowledge from these data is a major challenge for life science research, and meeting it requires improving existing algorithms or proposing new ones. The curse of dimensionality is one of the most difficult problems in large-scale data analysis, and new methods and solutions have been proposed for it. Variable selection and variable transformation are used to address this problem. The main subject of this dissertation is the study of new variable selection and variable transformation methods.

First, the research background, concepts and achievements are briefly introduced. The principle, implementation process and research status of QSAR are briefly described. Dimension reduction methods for large data sets are introduced, including variable selection and variable transformation. The kernel method is described in detail.

Then, methods of variable selection and variable transformation are proposed, including the kernel method, the statistical moment transformation method and the pattern variable method. Kernel functions have been used successfully in machine learning and related fields. In previous studies, different variable selection methods gave different results. To avoid this, kernel partial least squares (KPLS) is used in this study: the relationships among the original variables are replaced by the relationships among the samples. Satisfactory results are obtained. Statistical moments are used to transform variables: the data are divided into several intervals, and the statistical moments of each interval are used as new variables. The number of variables is reduced and the classification results are improved. These two methods use the full and the local information of the data, but they do not consider the contributions of individual variables. The pattern variable method is therefore proposed. In this method, continuous variables are transformed into pattern variables, which further reduces the number of variables. The specific patterns of cancer samples and normal samples are extracted separately.

These methods are applied to several real cases. Good results are obtained in the diagnosis of ovarian cancer and leukemia. The retention times of peptides are predicted from three variables (the sum of the retention times of the amino acids, the van der Waals volume and the n-octanol-water partition coefficient), and the KPLS results are superior to those of the linear method. KPLS is also used to predict the retention times of dioxins. Two molecular modeling methods are used to predict the behavior of dioxins, and KPLS is superior to PLS in both modeling and prediction. QSAR models based on the results of molecular docking are constructed, with the distances between the inhibitor and the active sites of NA applied as variables in the QSAR.
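
The two transformations described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the equal-width contiguous intervals, the choice of four moments (mean, variance, skewness, kurtosis), the Gaussian kernel and the `gamma` parameter are all assumptions made for illustration. The first function replaces a long signal by a few moments per interval; the second builds the sample-by-sample kernel matrix on which a kernel method such as KPLS operates, so its size depends on the number of samples and not on the number of original variables.

```python
import math


def interval_moments(signal, n_intervals):
    """Split a 1-D signal into contiguous intervals and return the
    mean, variance, skewness and kurtosis of each interval, concatenated.

    Hypothetical helper: interval layout and moment choice are assumptions.
    """
    n = len(signal)
    if not 1 <= n_intervals <= n:
        raise ValueError("n_intervals must be between 1 and len(signal)")
    features = []
    for k in range(n_intervals):
        # Integer boundaries give intervals of nearly equal width.
        start = k * n // n_intervals
        stop = (k + 1) * n // n_intervals
        chunk = signal[start:stop]
        m = len(chunk)
        mean = sum(chunk) / m
        var = sum((x - mean) ** 2 for x in chunk) / m  # population variance
        if var > 0:
            std = math.sqrt(var)
            skew = sum(((x - mean) / std) ** 3 for x in chunk) / m
            kurt = sum(((x - mean) / std) ** 4 for x in chunk) / m
        else:
            # A constant interval has no defined shape moments; use 0.0.
            skew = kurt = 0.0
        features.extend([mean, var, skew, kurt])
    return features


def rbf_kernel_matrix(samples, gamma):
    """Return the n x n Gaussian kernel matrix exp(-gamma * ||xi - xj||^2).

    Hypothetical helper: the dissertation's kernel choice is not stated here.
    """
    n = len(samples)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dist2 = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * dist2)
    return kernel


# Usage: 8 original variables become 2 intervals x 4 moments = 8 features here;
# with a long spectrum and few intervals the reduction is substantial.
spectrum = [1, 2, 3, 4, 10, 10, 10, 10]
moment_features = interval_moments(spectrum, 2)

# Three samples with two variables each give a 3 x 3 kernel matrix.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
kernel = rbf_kernel_matrix(samples, gamma=0.5)
```

In a setting like the one the abstract describes, where variables far outnumber samples, the kernel matrix stays small because it has one row and one column per sample.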

Keywords/Search Tags:

variable transformation, kernel, QSAR, statistical moments, pattern vectors

Related items

1	Multivariate Statistical Regression Methods Based On Kernel Function
2	The New Applications Of Tchebichef Image Moments In Analysis
3	The Application And Research Of Kernel Transformation Used In Food Test By Electronic Nose
4	Molecular Fragments Variable Connectivity Index And Its Application In QSAR Study
5	Statistical analysis of granular gases, pattern formation, and crumpling through real space imaging
6	Study On Reductive Transformation Of Chloronitrobenzenes By Zero-valent Iron And QSAR
7	A Novel Variable Selection Method And The Application In QSAR Studies Of The Environmental Endocrine Disrupting Effect
8	Application Of PLS And GA On QSAR Of Selected Organic Pollutants
9	Enhancement And Identification Of Mongolian Furniture Pattern Based On AGC-Quantile
10	Process Monitoring Method Of Reheating Furnace Based On Nonlinear Multivariate Statistical Theory