The Research Of Random Forest Based On Combination Strategy

Posted on: 2014-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: X D Liu
Full Text: PDF
GTID: 2248330398950010
Subject: Computer application technology
Abstract/Summary:
Selecting informative variables and building good classification models are important research topics in machine learning and are central to improving classification performance. With the development of computer technology, prediction accuracy, alongside a model's space and time complexity, has become the main criterion for assessing a classification model. Two factors affect classification accuracy: the feature set and the classifier. In this paper, we focus on the influence of feature combinations and combination strategies on data analysis.

When processing gene, protein, and metabolism data, combinations of variables often affect the results. Such combinations not only yield better performance and interpretability but have also attracted growing attention. This paper presents a Random Forest that uses new variables, each constructed from two original variables, and retains only those new variables whose performance is better than that of the two original variables they were built from. Experiments on high-dimensional and low-dimensional data sets indicate that, overall, this approach outperforms the original random forest.

Next, this paper studies the classification of tobacco samples from different regions, analyzed with chromatographic techniques. Under a hierarchical classification strategy, the five-class problem is transformed into a two-class problem and a three-class problem. We use SVM and Random Forest methods to select variables that show essential differences among the five kinds of samples. The effectiveness of the method is validated by the performance of the classification models and by supporting analysis with PCA and PLS-DA.
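
The variable-construction step described in the abstract (build a new variable from two original variables, keep it only if it performs better than both) can be illustrated with a minimal sketch. The abstract does not say how two variables are combined or how a variable's performance is measured, so both choices here are assumptions made for illustration: the combination is a simple difference, and performance is the training accuracy of a single-threshold rule on a binary problem. The function names `stump_accuracy` and `build_combined_variables` are hypothetical and do not come from the thesis.

```python
from itertools import combinations


def stump_accuracy(values, labels):
    """Best training accuracy of a one-threshold rule on a single variable.

    Assumption: binary labels coded 0/1. Both orientations of the rule
    (predict 1 above or below the threshold) are considered.
    """
    n = len(values)
    best = 0.0
    for t in sorted(set(values)):
        # Count samples classified correctly by "predict 1 when value > t".
        correct = sum((v > t) == bool(y) for v, y in zip(values, labels))
        best = max(best, correct / n, (n - correct) / n)
    return best


def build_combined_variables(X, y, combine=lambda a, b: a - b):
    """Return (i, j, new_column) for every pair whose combination
    scores strictly higher than both original variables.

    Assumption: `combine` defaults to a difference; the thesis may use
    a different operator.
    """
    n_features = len(X[0])
    columns = [[row[k] for row in X] for k in range(n_features)]
    scores = [stump_accuracy(col, y) for col in columns]
    kept = []
    for i, j in combinations(range(n_features), 2):
        new_col = [combine(a, b) for a, b in zip(columns[i], columns[j])]
        if stump_accuracy(new_col, y) > max(scores[i], scores[j]):
            kept.append((i, j, new_col))
    return kept


# Toy data: the class depends on the sign of x0 - x1, so neither
# variable separates the classes alone, but their difference does.
X = [[1, 0], [2, 1], [3, 2], [4, 3], [0, 1], [1, 2], [2, 3], [3, 4]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
kept = build_combined_variables(X, y)
```

In a full pipeline the retained columns would presumably be appended to the original feature set before a random forest is trained; that step is omitted here.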
Keywords/Search Tags:Random Forest, New variables, Hierarchical Policy, Multi-class problem