Research On Gene Expression Programming Ensembles Algorithm

Posted on: 2012-04-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X J Dong
Full Text: PDF
GTID: 1228330467468350
Subject: Computer software and theory

Abstract/Summary:
Gene Expression Programming (GEP), proposed by Candida Ferreira, is a new evolutionary algorithm based on the separation of genotype and phenotype. GEP combines the simple encoding of the traditional Genetic Algorithm (GA) with the ability of Genetic Programming (GP) to solve complex problems. It represents each individual as a fixed-length linear code, so chromosomes are simple, linear, and compact, which makes GEP quite efficient: in her experiments, Candida Ferreira found GEP to be 2 to 4 orders of magnitude more efficient than GP. Because of this performance, GEP has attracted researchers in China and abroad since it was proposed, has become a powerful tool for automatic programming, and has been applied to function regression, time series prediction, data mining, resource allocation, circuit optimization, and other domains. Although GEP has been applied successfully in many fields, its learning ability is not strong, and it is prone to overfitting on massive, noisy data sets. So far, little research on this problem has been published in China or abroad. To address it, this dissertation draws on neural network ensembles and proposes a theory of GEP evolution modeling. The main contents and innovations of this dissertation are summarized as follows:

(1) A GEP Ensembles algorithm for modeling is proposed, drawing on neural network ensembles. The core idea is to use GEP to train a limited number of GEP models on the same data set and then to determine the output of the GEP Ensembles from the outputs of its member models.
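The core idea of combining several independently trained models can be illustrated with a minimal sketch. The abstract does not state how member outputs are combined, so a weighted average is assumed here. The function name `ensemble_predict` and the three stand-in models are hypothetical; they are simple closures standing in for evolved GEP expressions, not GEP itself.

```python
from typing import Callable, Optional, Sequence

# A "model" here is any callable mapping an input to a prediction.
# In the dissertation each member would be an expression evolved by GEP;
# plain functions are used as hypothetical stand-ins.
Model = Callable[[float], float]


def ensemble_predict(
    models: Sequence[Model],
    x: float,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine member predictions by a (normalized) weighted average.

    Assumption: the ensemble output is a weighted average of member
    outputs; equal weights are used when none are given.
    """
    if not models:
        raise ValueError("the ensemble needs at least one model")
    if weights is None:
        weights = [1.0] * len(models)
    if len(weights) != len(models):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * m(x) for w, m in zip(weights, models)) / total


# Hypothetical stand-ins: three imperfect approximations of y = x**2,
# as if produced by three separate training runs on the same data set.
stand_in_models = [
    lambda x: x * x + 0.3,
    lambda x: x * x - 0.1,
    lambda x: x * x - 0.2,
]

if __name__ == "__main__":
    print(ensemble_predict(stand_in_models, 2.0))
```

In this toy setup the individual offsets happen to cancel, so the averaged prediction lands closer to the target than any single member. Real GEP runs give no such guarantee; the sketch only shows the combination step.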
Experiments demonstrated that the GEP Ensembles algorithm (EGEP) mitigates the overfitting of traditional GEP to some extent and has lower error than traditional GEP, especially on function regression problems with noisy data sets.

(2) A generalization error formula for GEP Ensembles is proposed. Starting from a mathematical model of GEP Ensembles, the formula gives a quantitative measure of the ensemble's generalization error. Numerical experiments also investigated how strongly the number of GEP models affects the performance of GEP Ensembles; they indicated that, in general, the more GEP models are trained, the better the ensemble performs.

(3) It is proved that the generalization error of GEP Ensembles is less than the weighted average error of traditional GEP when the generalization difference of each GEP model in the ensemble is larger than 0. Numerical experiments then analysed how strongly the weight of each GEP model affects the performance of GEP Ensembles; they showed that, in general, GEP Ensembles with average weighting of the models are more stable than GEP Ensembles with random weighting.

(4) A parallel GEP Ensembles algorithm based on a thermodynamical migration strategy is proposed. The GEP models are trained quickly by a parallel GEP algorithm based on the thermodynamical migration strategy and are then integrated into an ensemble. Under this strategy, individuals migrate dynamically and conditionally according to the current state of evolution, which avoids the blind, fixed migration of the traditional parallel GEP algorithm. The strategy also reconciles, to some degree, the conflict between the Receiving Pressure and the diversity of the subpopulations, and therefore improves migration efficiency and convergence speed.
Experiments showed that the parallel GEP Ensembles algorithm based on the thermodynamical migration strategy achieves higher speed-up than the traditional parallel GEP Ensembles algorithm.

(5) Parallel GEP Ensembles based on the Graphics Processing Unit (GPU) is proposed. General-purpose computing on GPUs has been a research focus in recent years and has been applied successfully, with high speed, in many fields. Following this trend, the GEP Ensembles algorithm is implemented on the GPU, which improves the speed of GEP Ensembles and realizes its potential more effectively. Experiments indicated that the GPU-based parallel GEP Ensembles algorithm is faster than the traditional parallel GEP Ensembles algorithm on the CPU.
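The claim in contribution (3) resembles a well-known result for neural network ensembles, the error-ambiguity decomposition. The abstract does not reproduce the dissertation's own formula, so what follows is only the standard form for a weighted-average ensemble under squared error. All symbols ($f_i$, $w_i$, $\bar{f}$, $\bar{E}$, $\bar{A}$) are chosen here for illustration, and reading the "generalization difference" as the ambiguity term $\bar{A}$ is an assumption.

```latex
% Ensemble of N models f_i with weights w_i \ge 0, \sum_i w_i = 1
\bar{f}(x) = \sum_{i=1}^{N} w_i f_i(x)

% For a target y, expand each member's squared error around \bar{f}(x):
\sum_{i=1}^{N} w_i \bigl(f_i(x) - y\bigr)^2
  = \bigl(\bar{f}(x) - y\bigr)^2
  + \sum_{i=1}^{N} w_i \bigl(f_i(x) - \bar{f}(x)\bigr)^2
% (the cross term vanishes because \sum_i w_i (f_i(x) - \bar{f}(x)) = 0)

% Taking expectations over the data distribution:
E = \bar{E} - \bar{A}, \qquad
\bar{E} = \sum_{i=1}^{N} w_i \, \mathbb{E}\bigl[(f_i(x) - y)^2\bigr], \qquad
\bar{A} = \sum_{i=1}^{N} w_i \, \mathbb{E}\bigl[(f_i(x) - \bar{f}(x))^2\bigr]
```

Since $\bar{A} \ge 0$, the ensemble error $E$ never exceeds the weighted average member error $\bar{E}$, and it is strictly smaller whenever the members disagree ($\bar{A} > 0$).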
Keywords/Search Tags: Gene Expression Programming, Gene Expression Programming Ensembles, Overfitting, Thermodynamical Migration Strategy, Parallel Algorithm, Graphics Processing Unit (GPU)