Evolutionary Interpretable Regression Method Based On Genetic Programming

Posted on: 2022-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: H Z Zhang
Full Text: PDF
GTID: 2518306776992929
Subject: Automation Technology
Abstract/Summary:
In machine learning, obtaining stable and reliable interpretable models has long been a goal of researchers. Decision tree algorithms occupy an important position among such models because of their good interpretability, and they are widely used for many types of machine learning tasks. Current mainstream decision tree algorithms mainly use univariate or linear models as decision surface models and leaf node models, but in real-world tasks univariate and linear models may not be sufficient to describe the relationships between variables. It is therefore a pressing question whether non-linear decision surface models and non-linear leaf node models can be constructed to enhance the modelling capability of decision trees while preserving their interpretability. This thesis investigates these issues. The main work includes:

1. To address the difficulty of fitting non-linear local models in current decision trees, this thesis first proposes a piecewise symbolic regression tree algorithm. The algorithm uses genetic programming to construct non-linear features for all linear local models of a piecewise linear regression tree, and employs a dynamic partitioning mechanism to alleviate the loss of prediction performance caused by an incorrect initial partitioning. The piecewise symbolic regression tree algorithm is compared with 21 other symbolic regression and machine learning algorithms, and the experimental results show that it has the best overall prediction performance.

2. To address the difficulty of constructing non-linear decision surfaces for the decision trees in the current mainstream random forest algorithm, this thesis proposes an evolutionary forest algorithm, which extends genetic programming-based higher-order feature engineering techniques to the random forest domain so that each base model in the forest has a stronger non-linear modelling capability. Exploiting the population-based nature of genetic programming, the evolutionary forest algorithm dynamically maintains a model archive during the evolutionary search, and the archived models form an ensemble. The evolutionary forest algorithm is compared with 15 other decision tree-based algorithms on 117 datasets, and the experimental results show that it achieves better results.

3. To address the problem that the constant leaf nodes in each decision tree of an evolutionary forest cannot effectively capture non-linear local variation, this thesis further integrates the evolutionary forest algorithm with the piecewise symbolic regression tree algorithm and proposes the piecewise evolutionary forest algorithm. Specifically, the piecewise evolutionary forest algorithm keeps the overall framework of the evolutionary forest algorithm, i.e. dynamically maintaining an archive during the search to form the final ensemble model, and borrows the idea of the piecewise symbolic regression tree by using a residual-based piecewise linear regression tree instead of a traditional regression tree as the base model. The experimental results show that, on 122 datasets, the piecewise evolutionary forest algorithm has significantly better overall performance than 23 competing algorithms, including the evolutionary forest algorithm.
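The two recurring ideas in the abstract, leaf models that are linear in a constructed non-linear feature and an archive of good models that is averaged into an ensemble, can be illustrated with a toy sketch. This is not the thesis implementation and it makes several simplifying assumptions: a fixed list of hand-written candidate features stands in for the genetic programming search, the tree has a single split fixed at the median input value, training error is used as the fitness, and dynamic partitioning and residual-based fitting are omitted. All function names are hypothetical.

```python
from statistics import mean

def fit_line(zs, ys):
    """Least-squares fit of y ~ a*z + b on a single constructed feature z."""
    mz, my = mean(zs), mean(ys)
    var = sum((z - mz) ** 2 for z in zs)
    a = sum((z - mz) * (y - my) for z, y in zip(zs, ys)) / var if var else 0.0
    return a, my - a * mz

def fit_piecewise(xs, ys, feature, threshold):
    """One split on x; each leaf holds a linear model on feature(x).
    Assumes both sides of the split contain at least one sample."""
    leaves = {}
    for side in (True, False):
        part = [(feature(x), y) for x, y in zip(xs, ys) if (x <= threshold) == side]
        zs, vals = zip(*part)
        leaves[side] = fit_line(zs, vals)

    def predict(x):
        a, b = leaves[x <= threshold]
        return a * feature(x) + b

    return predict

def mse(model, xs, ys):
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def build_archive(xs, ys, candidates, archive_size=2):
    """Evaluate candidate features one by one and keep the best models seen so far.
    Assumption: the candidate list replaces a real evolutionary search."""
    threshold = sorted(xs)[len(xs) // 2]  # fixed median split, a simplification
    archive = []
    for name, feature in candidates:
        model = fit_piecewise(xs, ys, feature, threshold)
        archive.append((mse(model, xs, ys), name, model))
        archive.sort(key=lambda entry: entry[0])  # lowest error first
        del archive[archive_size:]                # drop models beyond the archive size
    return archive

def ensemble_predict(archive, x):
    """The ensemble prediction is the mean of the archived models' predictions."""
    return mean(model(x) for _, _, model in archive)

# Toy data: y = x^2 for x <= 0 and y = 2*x^2 + 1 for x > 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x if x <= 0 else 2 * x * x + 1 for x in xs]
candidates = [("x", lambda x: x), ("x^2", lambda x: x * x), ("|x|", abs)]
archive = build_archive(xs, ys, candidates)
best_error, best_name, best_model = archive[0]
```

On this toy data the squared feature lets each leaf fit its region with a plain linear model, which is the motivation the abstract gives for constructing non-linear features instead of relying on univariate or linear leaves.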
Keywords/Search Tags: decision tree, random forest, symbolic regression, genetic programming, evolutionary learning