Research Of Chinese Word Segmentation Based On The Optimization Of Genetic Algorithm

Posted on: 2013-01-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J He
Full Text: PDF
GTID: 1118330374486912
Subject: Computer application technology
Abstract/Summary:
Genetic algorithm (GA) is an intelligent optimization algorithm. It has the advantages of good versatility, fast search speed and strong global search ability. It also has weaknesses, such as poor local search ability and "premature" convergence, in which it mistakes a local optimal solution for the global optimal solution. Because a single algorithm exposes various problems when applied to practical tasks, many researchers have proposed hybrid algorithms that combine two or more algorithms to avoid those weaknesses.

This thesis mainly studies the genetic algorithm. We derive new algorithms by combining it with the Particle Swarm Optimization (PSO) algorithm and with the biological immune mechanism, respectively. The new algorithms are then applied to Chinese word segmentation.

The main contributions of this thesis are as follows:

1. To address the tendency of the Particle Swarm Optimization algorithm to become trapped in local extrema, an improved PSO algorithm that incorporates ideas from GA is proposed. We call it the Genetic Algorithm-Particle Swarm Optimization (GA-PSO) algorithm. While PSO is running, the algorithm judges whether a particle is trapped in a local optimum; if so, GA-PSO applies the crossover and mutation operations of GA to escape the local optimum and approach the global optimum. GA-PSO is then used as the learning algorithm of a BP neural network. The experimental results show that, compared with the classical PSO algorithm when training a neural network on the XOR classification problem, GA-PSO achieves higher accuracy, shorter running time and fewer iterations.

Also, compared with the classical BP algorithm, the improved BP algorithm based on GA-PSO avoids becoming trapped in local optima, long learning times and slow convergence. Compared with the BP algorithm based on PSO, GA-PSO converges faster, needs less search time and is less likely to fall into a local optimum. We then applied the new algorithm to ambiguity processing in Chinese word segmentation to verify its effectiveness.

2. In studying the classical PSO algorithm, we found some weaknesses. For example, increasing the number of particles increases the computational complexity, while weakening the particles' pursuit of the global best makes it difficult for the algorithm to converge. To address this, a particle swarm optimization algorithm based on Fuzzy C-means clustering (FPSO) is proposed.

In the FPSO algorithm, the current particle swarm is first divided into multiple sub-populations by FCM. The swarm is then updated using the personal best particle and the global best particles of the sub-populations.

The algorithm works by fuzzy clustering of the particle swarm. Information is exchanged between the particles, and because more particles contribute information to each iteration of the optimization process, the algorithm achieves better global convergence.

The experiment and the application to ambiguity processing in Chinese word segmentation demonstrate that the proposed algorithm is superior to the BP algorithm: it reduces the number of iterations and improves the precision of convergence, and its generalization performance is better than that of the traditional BP algorithm.

3. We study immune mechanisms on the basis of GA; the main research object is the Immune Genetic Algorithm (IGA) based on vaccination. The core of vaccination-based IGA is extracting vaccines reasonably and vaccinating at the right time. This thesis proposes a new vaccine extraction algorithm that uses fuzzy clustering to extract similar genes of the best antibodies as vaccines.

An immune selection mechanism is used in the vaccination process to guarantee that the antibody population is not degraded by the vaccine. Then, a cluster selection method is used when updating the antibody population to keep the best antibodies and to make the antibodies as different from one another as possible, which avoids premature convergence.

A new strategy of cultivating elite antibodies is designed for BP network training. The strategy defines elite antibodies as those with high fitness and significant differences from one another. Using this strategy, elites are extracted in the later stage of each iteration of the improved Vaccination-based Immune Genetic Algorithm (VIGA) and trained by the steepest descent method to move them one step closer to the extreme point.

We analyze the advantages of the evolutionary neural network and the neural network segmentation method, and apply the improved VIGA to a neural-network-based model of ambiguity processing in Chinese word segmentation.
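
The GA-PSO idea in the first contribution can be illustrated with a minimal sketch; it is not the thesis's implementation. It assumes that a particle counts as trapped when its personal best has not improved for `stall_limit` iterations, and that the GA step is an arithmetic crossover with another particle's personal best followed by Gaussian mutation. The stagnation test, both operators, the `sphere` objective and every parameter value are assumptions made for illustration, since the abstract does not specify them.

```python
import random


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def ga_pso(objective, dim=2, n_particles=20, iters=200, bounds=(-5.0, 5.0),
           w=0.7, c1=1.5, c2=1.5, stall_limit=10, mutation_rate=0.1, seed=0):
    """Minimize `objective` with PSO plus GA-style crossover/mutation on stalled particles."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    stall = [0] * n_particles  # iterations since each particle last improved

    for _ in range(iters):
        for i in range(n_particles):
            # Standard PSO velocity and position update, clamped to the bounds.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

            # Assumed stagnation test: no personal-best improvement for stall_limit steps.
            if stall[i] >= stall_limit:
                mate = pbest[rng.randrange(n_particles)]
                alpha = rng.random()
                # Arithmetic crossover with another particle's personal best.
                pos[i] = [alpha * a + (1 - alpha) * b for a, b in zip(pos[i], mate)]
                # Gaussian mutation applied gene by gene.
                for d in range(dim):
                    if rng.random() < mutation_rate:
                        step = rng.gauss(0.0, 0.1 * (hi - lo))
                        pos[i][d] = min(hi, max(lo, pos[i][d] + step))
                stall[i] = 0

            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                stall[i] = 0
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
            else:
                stall[i] += 1
    return gbest, gbest_val


if __name__ == "__main__":
    best, best_val = ga_pso(sphere)
    print(best, best_val)
```

To train a BP network in the way the abstract describes, `objective` would be replaced by a function that maps a flat weight vector to the network's training error; that wiring is omitted here.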
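
The sub-population step of FPSO in the second contribution might look roughly like the following sketch. It runs a plain fuzzy C-means pass over particle positions and then picks the best particle inside each resulting sub-population. Assigning each particle to its highest-membership cluster, the fuzzifier `m = 2`, and the helper names `fcm` and `sub_swarm_bests` are assumptions; the abstract does not say how memberships are turned into sub-populations.

```python
import random


def fcm(points, n_clusters=2, m=2.0, iters=50, seed=0):
    """Plain fuzzy C-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([r / total for r in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            wts = [u[i][k] ** m for i in range(len(points))]
            total = sum(wts)
            centers.append([sum(wt * p[d] for wt, p in zip(wts, points)) / total
                            for d in range(dim)])
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)).
        for i, p in enumerate(points):
            dist = [max(1e-12, sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5)
                    for c in centers]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum((dist[k] / dj) ** (2.0 / (m - 1.0)) for dj in dist)
    return centers, u


def sub_swarm_bests(fitness, memberships):
    """Map each cluster to the index of its best particle (lower fitness is better).

    Each particle is assigned to its highest-membership cluster (an assumption).
    """
    best = {}
    for i, row in enumerate(memberships):
        k = max(range(len(row)), key=lambda j: row[j])
        if k not in best or fitness[i] < fitness[best[k]]:
            best[k] = i
    return best
```

In a full FPSO loop, the velocity update of each particle would then use the best particle of its own sub-population in place of a single swarm-wide best.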
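
Vaccination with immune selection, as described in the third contribution, can be sketched as follows. This is a simplified stand-in: the thesis extracts vaccines with fuzzy clustering, whereas this sketch simply keeps the gene positions on which all of the best antibodies agree. The function names, the list-of-genes encoding and the "higher fitness is better" convention are assumptions.

```python
def extract_vaccine(best_antibodies):
    """Keep gene positions where every best antibody agrees; None means no vaccine there.

    Simplified stand-in for the fuzzy-clustering extraction described in the abstract.
    """
    return [genes[0] if all(g == genes[0] for g in genes) else None
            for genes in zip(*best_antibodies)]


def vaccinate(antibody, vaccine):
    """Overwrite the antibody's genes wherever the vaccine specifies a value."""
    return [v if v is not None else g for g, v in zip(antibody, vaccine)]


def immune_select(antibody, vaccine, fitness):
    """Accept the vaccinated antibody only if its fitness does not get worse."""
    candidate = vaccinate(antibody, vaccine)
    return candidate if fitness(candidate) >= fitness(antibody) else antibody


if __name__ == "__main__":
    vaccine = extract_vaccine([[1, 1, 0, 1], [1, 1, 1, 1]])
    print(immune_select([0, 0, 0, 0], vaccine, fitness=sum))
```

The check in `immune_select` is what keeps the antibody population from degrading: a vaccinated antibody that scores worse than the original is discarded.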
Keywords/Search Tags: Genetic Algorithm (GA), Particle Swarm Optimization (PSO) Algorithm, Immune Genetic Algorithm (IGA), Neural Networks, Chinese word segmentation