Research On Symbolic Regression Generalization Performance Based On Multi-objective Optimization

Posted on:2022-08-26

Degree:Master

Type:Thesis

Country:China

Candidate:W T Sheng

Full Text:PDF

GTID:2518306512471844

Subject:Systems Engineering

Abstract/Summary:

Symbolic regression, an important method for data-driven modeling, can simultaneously discover the structure of an explicit mathematical expression and estimate its parameters with high fitting accuracy. In data-driven modeling, however, the pursuit of high fitting accuracy often leads to overfitting, which usually manifests as increased model complexity and reduced predictive ability on unseen samples, i.e., poor generalization performance. Balancing model accuracy against generalization ability is therefore of great interest. According to Occam's razor, the simpler a model is, the more likely it is to approximate the law underlying the data, and the stronger its generalization ability.

Based on this principle, and to address the problem that existing symbolic regression methods focus only on fitting ability while ignoring generalization ability, this thesis proposes an improved gene expression programming algorithm. We analyze various existing measures of model complexity and select Rademacher complexity, a measure from machine learning that characterizes the complexity of a hypothesis space, introducing it into symbolic regression as the complexity measure of the algorithm. The training error and the model complexity are optimized as two objectives, and the models in the resulting Pareto solution set are combined by an ensemble learning method to obtain the final output model.

To maintain population diversity in the intelligent algorithm used to solve the symbolic regression problem, an adaptive variation (mutation) operator based on information entropy is designed to replace the traditional variation operator. The operator acts on each position (bit) of the individual code strings in the population: it calculates the information entropy of each position from the probabilities with which different symbols occur there, determines the variation rate of that position accordingly, and then applies the variation operation to the symbols of all individuals position by position, so as to maintain population diversity and enhance the local search ability of the algorithm. Finally, training functions from the classical literature are selected, and the proposed algorithm is compared experimentally with several other methods to verify its effectiveness.
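
The entropy-based variation operator described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not state how entropy is mapped to a variation rate, so the linear mapping used here (low entropy at a position gives a higher rate), the bounds p_min and p_max, and the function names are all assumptions made for illustration. The sketch also ignores the head/tail constraints of gene expression programming and treats every position as free to take any symbol.

```python
import math
import random
from collections import Counter


def position_entropies(population):
    """Shannon entropy (in bits) of the symbol distribution at each position."""
    n = len(population)
    length = len(population[0])
    entropies = []
    for pos in range(length):
        counts = Counter(ind[pos] for ind in population)
        h = -sum((c / n) * math.log2(c / n) for c in counts.values())
        entropies.append(h)
    return entropies


def adaptive_mutation_rates(population, alphabet_size, p_min=0.01, p_max=0.2):
    """Assumed mapping: the less diverse a position is, the higher its rate.

    Entropy is normalized by its maximum, log2(alphabet_size), and mapped
    linearly onto [p_min, p_max]. This mapping is hypothetical.
    """
    h_max = math.log2(alphabet_size)
    rates = []
    for h in position_entropies(population):
        diversity = h / h_max if h_max > 0 else 0.0
        rates.append(p_max - (p_max - p_min) * diversity)
    return rates


def mutate(population, alphabet, rates, rng):
    """Resample each symbol with the rate assigned to its position."""
    mutated = []
    for ind in population:
        genes = [
            rng.choice(alphabet) if rng.random() < rate else gene
            for gene, rate in zip(ind, rates)
        ]
        mutated.append("".join(genes))
    return mutated


# Toy population of fixed-length code strings over the alphabet "+*ab".
population = ["+ab", "+ba", "+aa", "+bb"]
rates = adaptive_mutation_rates(population, alphabet_size=4)
offspring = mutate(population, "+*ab", rates, random.Random(0))
```

In this toy population every individual has "+" at the first position, so that position has zero entropy and receives the highest rate, while the two more diverse positions receive lower rates.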

Keywords/Search Tags:

Symbolic regression, Structural risk minimization, Model complexity, Multi-objective optimization, Adaptive variational operator

Related items

1	Researching On Regression Test Selection Technique For Object-Oriented Software-Test Suite Minimization
2	Structural Risk Optimization Methods Based On Posterior Preference
3	Extensions to fuzzy ARTMAP based on structural risk minimization
4	Structural Risk Minimization Principle On Rough Space
5	Control of smart structural systems using multi-objective optimization techniques
6	The Study On The Sparse Structured LSTSVR Algorithms
7	A Study On The Predicting Model Of Gray Neutral Network And Support Vector Machine
8	Image Compression Sensing Reconstruction Based On Multiscale Variational Algorithms And Depth Convolution Neural Network
9	Fuzzy Support Vector Machine
10	Medical B-Ultrasound Image Processing Based On Variational PDEs