
Research On Symbolic Regression-based Approach For Training Explicit Classifier

Posted on: 2024-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Shi
GTID: 2568307097456844
Subject: Control Science and Engineering

Abstract/Summary:
The core task of pattern recognition is classification, and classifiers such as support vector machines, neural networks, and decision trees have been widely used. Each has limitations. Support vector machines require kernel functions to solve nonlinear problems, yet there are no clear rules for selecting the kernel function and its parameters. Neural networks are known as universal approximators and are widely used in pattern recognition, but they are black boxes with poor interpretability, which limits their use in applications that demand interpretability. Decision trees, as an instance-based inductive learning algorithm, are prone to overfitting during classification. Designing a classification method that avoids kernel functions and remains interpretable has therefore become an active research topic.

Symbolic regression is a modeling method based on mathematical expressions: its goal is to find, from the given data, a symbolic expression that approximates the relationship between input and output within a given margin of error. On this basis, this thesis designs a classifier training algorithm based on symbolic regression. A syntax tree represents the topology of a candidate classifier, gene expression coding serves as the storage form of the syntax tree, and genetic operators such as crossover and mutation evolve the syntax tree to optimize the final classifier. The common accuracy-based evaluation index is analyzed and compared with the label-fitting evaluation index designed in this thesis; the classifier trained with the label-fitting method conforms better to the distribution characteristics of the samples.

With noisy samples, focusing only on classification accuracy while ignoring the generalization performance of the model makes the model prone to overfitting, which usually shows up as increased model complexity and reduced predictive ability on unseen samples. This thesis therefore introduces complexity as an additional evaluation index for classifiers. After analyzing several existing methods for calculating model complexity, it adopts the Rademacher complexity measure, uses it together with the classifier's fitting error as two objectives to co-evolve the classifier, and then fuses the resulting Pareto solution set with ensemble learning to obtain a strong classifier.

Experiments on two-dimensional and higher-dimensional nonlinear classification examples compare the proposed algorithm with classifier training methods such as support vector machines and neural networks, verifying that the proposed algorithm produces interpretable classifiers without kernel functions. In addition, the single-objective training method and the multi-objective optimization method designed in this thesis are compared experimentally on noisy sample sets; the generalization performance of the trained classifier improves after Rademacher complexity is introduced as an evaluation index.
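
To make the idea of an explicit classifier stored as a gene and decoded into a syntax tree more concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a breadth-first (level-by-level) decoding in the style commonly used with gene expression programming, and a sign-based decision rule with labels +1 and -1. The function set, the symbol names (`x`, `y`, `c`), the example gene, and the helper names `decode`, `evaluate`, `to_infix`, and `classify` are all hypothetical and chosen only for illustration.

```python
import math
import operator

# Hypothetical function set: symbol -> (arity, implementation).
# Any symbol not listed here is treated as a terminal (variable or constant).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "Q": (1, lambda a: math.sqrt(abs(a))),  # protected square root
}


def decode(gene):
    """Decode a linear gene into a tree of [symbol, children] nodes.

    Assumption: symbols are read left to right and fill the tree
    breadth-first. Trailing symbols that are never reached are ignored.
    """
    root = [gene[0], []]
    queue = [root]
    pos = 1
    while queue:
        node = queue.pop(0)
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [gene[pos], []]
            pos += 1
            node[1].append(child)
            queue.append(child)
    return root


def evaluate(node, env):
    """Evaluate the tree for one sample; env maps terminal names to values."""
    symbol, children = node
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, env) for child in children))
    return env[symbol]


def to_infix(node):
    """Render the tree as a readable formula (the 'explicit' classifier)."""
    symbol, children = node
    if symbol not in FUNCTIONS:
        return symbol
    args = [to_infix(child) for child in children]
    if len(args) == 1:
        return f"{symbol}({args[0]})"
    return f"({args[0]} {symbol} {args[1]})"


def classify(node, env):
    """Assumed decision rule: label +1 if f(x) >= 0, otherwise -1."""
    return 1 if evaluate(node, env) >= 0 else -1


# Example gene: the last two symbols are never reached during decoding.
GENE = ["-", "+", "c", "*", "*", "x", "x", "y", "y", "x", "c"]
TREE = decode(GENE)

print(to_infix(TREE))
print(classify(TREE, {"x": 0.5, "y": 0.5, "c": 1.0}))
print(classify(TREE, {"x": 2.0, "y": 0.0, "c": 1.0}))
```

In this sketch the decoded formula describes a circular decision boundary of radius sqrt(c), which needs no kernel function and can be read directly. Crossover and mutation would act on the linear gene, while fitness would be computed on the decoded tree.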
Keywords/Search Tags: Explicit Classifier, Symbolic Regression, Supervised Learning, Interpretability, Syntax Tree, Multi-objective Optimization