Boosting And Its Application In Discriminant Analysis Of Microarray Data

Posted on:2007-05-10

Degree:Master

Type:Thesis

Country:China

Candidate:C F Fu

Full Text:PDF

GTID:2144360185979185

Subject:Epidemiology and Health Statistics

Abstract/Summary:

With the rapid development of microarray technologies based on high-throughput screening, large volumes of data have emerged that challenge statisticians because of their "small sample, high dimensionality" character. The Boosting algorithm, one of the ensemble methods, has attracted many researchers with its nearly "perfect" classification capacity.

In this research, we first introduced the idea of Boosting and described two fundamental procedures, AdaBoost and LogitBoost. Based on these two procedures, we constructed discriminant models for simulated data and traditional data. We also compared the predictive performance of Boosting, Bagging, Random Forest, Fisher's Linear Discrimination, Fisher's Quadratic Discrimination and Logistic Discrimination.

Taking the specific characteristics of microarray data into account, we analyzed two public data sets: leukaemia and breast cancer data. The approach is as follows: (1) Use the FDR procedure to correct the P-values and screen the gene variables with a criterion of P ≤ 0.05 or P ≤ 0.01, so that the dimensionality is less than the sample size. Then construct the discriminant model and compare Boosting with the other two ensemble methods and the three traditional methods. (2) Construct different discriminant models with different sets of gene predictor variables, chosen by the order of their P-values, and identify the advantages of Boosting (including precision and sensitivity). (3) Identify the advantages of Boosting by comparing it with principal component discriminant analysis. The predictive performance of all the above methods should be confirmed by cross-validation to ensure the stability of the results.
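The gene-screening step in (1) can be illustrated with a minimal sketch. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, as are the function names `bh_adjust` and `screen_genes` and the example P-values, which are hypothetical and not taken from the thesis.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order.

    Assumption: BH is used here as one common FDR procedure; the thesis
    abstract does not name the exact method.
    """
    m = len(p_values)
    # Indices of the genes, ordered from smallest to largest raw P-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen_genes(p_values, alpha=0.05):
    """Indices of genes whose adjusted P-value is at most alpha."""
    adjusted = bh_adjust(p_values)
    return [i for i, q in enumerate(adjusted) if q <= alpha]


# Hypothetical per-gene raw P-values (for example from two-sample tests).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = screen_genes(raw_p, alpha=0.05)
print(selected)  # indices of the genes kept for the discriminant model
```

In the workflow described above, the retained gene indices would then define the predictor set for the discriminant models, provided the number of retained genes is smaller than the sample size.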

Keywords/Search Tags:

Boosting, AdaBoost, LogitBoost, Discriminant Analysis, Microarray data, Prediction, Cross-validation

Related items

1	The Associated Prediction Of MicroRNA-Disease Based On Complicated Correlation Network
2	Application Study Of Gene Expression Data On Diagnosis Of Tumor And Prediction Of Gene Function
3	Based On Discriminant Analysis Of Medical Data Processing
4	A Study On Disease Prediction Model Based On Small Sample Medical Data And Its Privacy Preserving Technologies
5	Optimal Quantile Level Selection Method And Its Application To ECG Data For Disease Classification
6	Application Research Of Discriminant And Cluster Method In Clinical Acupuncture Data Analysis
7	Modeling And Analysis Of Auxiliary Diagnosis Of Intestinal Diseases In Children Based On Clinical Data
8	Research On Blood Pressure Detection Method Based On Big Data Analysis
9	Prediction Of Multiple Drug Target Interactions Based On Microarray Data Analysis
10	The Establishment And Validation Of The Discriminant Analysis Model For Bone Metastases In Newly Diagnosed Prostate Cancer Patients