
Decision Trees And Deep Forests With Eliminating Random Consistency

Posted on: 2022-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: C H Wang
Full Text: PDF
GTID: 2518306509970259
Subject: Computer Science and Technology
Abstract/Summary:
Information entropy and mutual information, as important criteria for feature evaluation in classification, play a key guiding role in the construction of classic classifiers such as decision trees and in deep learning. It is well known that information gain suffers from multi-value bias in feature selection: it tends to favor attributes with many distinct values. The information gain ratio was proposed to remedy this defect, but although it mitigates multi-value bias, it can in turn favor features with fewer values. Moreover, during learning, random guesses made for lack of detailed knowledge may happen to agree with the ground truth; such random consistency can reduce the generalization ability of a learning system and compromise the fairness and objectivity of its decisions. Research on eliminating random consistency from the learning process has therefore become an important topic. To address these problems, this thesis studies classification methods that eliminate random consistency. The main contributions are twofold:

(1) To address the multi-value bias that arises when information entropy is used as the feature-selection criterion, adjusted mutual information is studied and the AID3 (Adjusted Iterative Dichotomiser 3) algorithm is proposed. Using adjusted mutual information as the decision criterion eliminates random consistency from the decision process, resolves the multi-value bias of information gain, and yields better feature selection. The resulting AID3 algorithm effectively alleviates random consistency in classification and improves classification accuracy. Experimental results show that, compared with traditional information entropy, mutual information with random consistency eliminated effectively resolves the multi-value bias of information gain and achieves higher average classification accuracy.

(2) To address the imbalance problem that deep forests face on small-scale data sets, a deep forest that eliminates random consistency is studied and the AgcForest (Adjusted gcForest) model is proposed. By adjusting the random forests in the cascade layers of the deep forest, the model's random consistency in classification is effectively alleviated, and the transmission of information through the model during learning is improved. Experimental results show that, compared with the standard deep forest, the proposed AgcForest model eliminates random consistency in classification on small-scale data sets, avoids the influence of the number of class values on classification, and improves both the generalization of the model and the fairness of its classification results.

In summary, this thesis studies methods for eliminating random consistency in classification tasks and verifies their feasibility through experiments on real data sets. It examines machine learning classification from a new perspective, which has research value for improving the generalization ability of machine learning models and broadening the application of machine learning in artificial intelligence.
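The chance-correction idea behind AID3 can be illustrated with a small sketch. The snippet below is an illustrative assumption, not the thesis's exact formulation: it estimates the expected information gain under randomly permuted labels and subtracts it (a Monte-Carlo stand-in for the closed-form hypergeometric expectation used by adjusted mutual information proper). An ID-like attribute with a distinct value per sample maximizes raw information gain, yet scores near zero once chance agreement is subtracted, while a genuinely informative attribute keeps most of its gain.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(attr, labels):
    """Information gain of splitting `labels` on the values of `attr`."""
    n = len(labels)
    groups = {}
    for a, y in zip(attr, labels):
        groups.setdefault(a, []).append(y)
    cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - cond

def adjusted_gain(attr, labels, n_perm=200, seed=0):
    """Chance-corrected gain: raw gain minus the mean gain obtained when
    the labels are randomly permuted, i.e. the gain attributable purely
    to random consistency."""
    rng = random.Random(seed)
    perm = list(labels)
    expected = 0.0
    for _ in range(n_perm):
        rng.shuffle(perm)
        expected += info_gain(attr, perm)
    return info_gain(attr, labels) - expected / n_perm

# Toy data: 20 samples, class 0 for the first half, class 1 for the second.
y = [0] * 10 + [1] * 10
id_attr = list(range(20))        # ID-like attribute: unique value per sample
good_attr = [1 - v if i in (3, 17) else v for i, v in enumerate(y)]  # y with 2 flips

print(info_gain(id_attr, y), info_gain(good_attr, y))          # raw gain prefers the ID
print(adjusted_gain(id_attr, y), adjusted_gain(good_attr, y))  # the correction reverses it
```

With singleton groups, every permutation of the labels yields the same (maximal) gain, so the correction cancels the ID attribute's score entirely; for the informative attribute the expected chance gain is small, so most of its gain survives.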
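For the deep-forest part, the cascade mechanism that AgcForest builds on can be sketched briefly. The sketch below is a toy illustration under stated assumptions: `stump_forest` is a hypothetical stand-in for a real random forest, only gcForest's layer-by-layer probability augmentation is reproduced (the real gcForest generates the augmented features by cross-validation, and the thesis's chance-correction adjustment of the cascade forests is not shown here).

```python
import random
from collections import Counter

def stump_forest(X, y, n_stumps=20, seed=0):
    """Toy stand-in for a random forest: an ensemble of one-feature
    threshold stumps; returns a predict_proba-style closure (binary labels)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_stumps):
        f = rng.randrange(len(X[0]))                 # random feature
        t = X[rng.randrange(len(X))][f]              # random threshold from the data
        left = [y[i] for i in range(len(X)) if X[i][f] <= t]
        right = [y[i] for i in range(len(X)) if X[i][f] > t]
        majority = lambda g: Counter(g).most_common(1)[0][0] if g else 0
        stumps.append((f, t, majority(left), majority(right)))
    def predict_proba(Xq):
        probs = []
        for x in Xq:
            votes = [l if x[f] <= t else r for f, t, l, r in stumps]
            p1 = sum(votes) / len(votes)
            probs.append([1 - p1, p1])
        return probs
    return predict_proba

def cascade_predict(Xtr, ytr, Xte, n_layers=3, forests_per_layer=2):
    """gcForest-style cascade: each layer trains several forests and appends
    their class-probability vectors to the original features before the next
    layer; the last layer's probabilities are averaged for prediction."""
    aug_tr = [list(x) for x in Xtr]
    aug_te = [list(x) for x in Xte]
    for layer in range(n_layers):
        probs_tr = [[] for _ in Xtr]
        probs_te = [[] for _ in Xte]
        for k in range(forests_per_layer):
            forest = stump_forest(aug_tr, ytr, seed=layer * 10 + k)
            for row, p in zip(probs_tr, forest(aug_tr)):
                row.extend(p)
            for row, p in zip(probs_te, forest(aug_te)):
                row.extend(p)
        # augmented features for the next layer: original features + probabilities
        aug_tr = [list(x) + p for x, p in zip(Xtr, probs_tr)]
        aug_te = [list(x) + p for x, p in zip(Xte, probs_te)]
    preds = []
    for p in probs_te:
        p1 = sum(p[1::2]) / (len(p) // 2)            # average P(class 1) over forests
        preds.append(1 if p1 >= 0.5 else 0)
    return preds

# Toy 1-D data: class 0 below 10, class 1 from 10 upward.
Xtr = [[i] for i in range(20)]
ytr = [0] * 10 + [1] * 10
preds = cascade_predict(Xtr, ytr, Xtr)
```

The adjustment the thesis proposes would replace each forest's raw probability estimates with chance-corrected ones before they are passed to the next cascade layer.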
Keywords/Search Tags: Random consistency, Information entropy, Decision tree, Classification, Deep forest