
Optimizations Of Support Vector Machine And Its Application

Posted on: 2020-06-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Z Li
GTID: 1368330602455046
Subject: Statistics
Abstract/Summary:
With the rapid development of Internet technology and the continuous progress of society in recent years, machine learning, a branch of artificial intelligence, plays an increasingly important role in social production, scientific research and daily life. As a classical machine learning algorithm, the Support Vector Machine (SVM) has developed rapidly thanks to its particular advantages in small-sample, non-linear and high-dimensional pattern recognition. Scholars at home and abroad have studied SVM extensively, and it has been applied successfully in many fields, including bioinformatics, text recognition and weather prediction. However, no model performs best in all cases. A single SVM still has limitations, such as poor performance with missing values, a lack of specific criteria for determining parameters, and poor performance on complex data, all of which reduce the effectiveness of the model.

Against this background, this paper improves the SVM and proposes several SVM-based optimization models, which fall into three categories: data structure optimization, parameter optimization and combination optimization. For data structure optimization, this paper uses a decomposition and ensemble strategy to select suitable data for training the SVM, improving its poor performance on complex data. For parameter optimization, this paper proposes a model based on an optimization algorithm to deal with the problem of SVM parameter selection. Combinatorial optimization is divided into method combination optimization and model combination optimization. Method combination optimization combines the results of several statistical methods, including SVM, to address the fact that no single method performs best in all situations. For model combination optimization, this paper combines SVM with the cure model to address the problem that the traditional cure
model has poor estimation performance under non-linear conditions.

To test the effectiveness of the optimization methods, the optimization models are applied to real data. The choice of optimization depends on the specific application, and each optimization model is constructed according to the characteristics of the data. In this paper, the optimization models are applied to air pollution control, cultural industry management, energy economy and survival analysis. For air pollution control, the time series of pollutants contains sub-series of different periods, such as seasonal fluctuations and short-term weather changes, which makes the data structure quite complex. Data structure optimization is therefore considered: the sub-series of different periods are first separated and then predicted separately. For cultural industry management, a movie is affected by many factors, such as production cost, movie type and star influence, so each movie has its own characteristics. Parameter optimization is therefore used to select an SVM with better parameters. In energy economy, the sample size is large and the data characteristics differ across regions. No single model performs best in all regions, so the method combination optimization model is used. For survival analysis, the cure-rate part is traditionally modeled by logistic regression, but as research has developed, the relationship between many covariates and the cure probability has been found not to follow the logistic form, and other more complex relationships exist. Therefore, model combination optimization is used to construct a new cure model.

The full text is divided into six chapters, the main contents of which are as follows.

Chapter 1 introduces the basis of the topic, its significance, and the main contents, main innovations and shortcomings of this paper.

Chapter 2 discusses the background
of SVM and the current state of research on SVM optimization.

Chapter 3 analyzes the data structure optimization of SVM and its application. The principle and characteristics of the Kolmogorov-Zurbenko (KZ) filter and of the improved methods are given, and the optimization model combined with SVM is introduced. In the experiment, the research background of air pollution control is analyzed first. The improved KZ filter is then used to analyze the pollution data of Dalian, and the prediction performance of the optimization model is evaluated on pollution data from four cities in China. The results show that the decomposition and ensemble strategy optimizes the data structure very well. The long-term component of pollutants reaches its peak in winter and remains relatively low in summer. The seasonal and short-term components fluctuate greatly in winter. The variance contribution rates show that the seasonal component contributes most to the original series, followed by the short-term and long-term components. The prediction results show that the data structure optimization model has good prediction performance and fitting accuracy, and still performs well in the presence of noise.

Chapter 4 discusses the parameter optimization of SVM and its application. First, the Imperialist Competitive Algorithm and the SVM optimized by it are introduced. Then the optimization model is tested on box office data. In the experiment, the most suitable training set size is selected first, and the optimization model is then applied to predicting opening-week box office and compared with commonly used models. The results show that when the training set size is 20 and the prediction model is the proposed parameter optimization model, the prediction performance is better than that of the comparison models, with a MAPE of about 15%. Listing the box office predictions and real values of 22 test films shows that in most cases the predictions are
very close to the real values. The model comparison also confirms the effectiveness of the optimization model.

Chapter 5 describes the combinatorial optimization of SVM and its application. First, the principles and characteristics of combination forecasting and of cure models are discussed. Then the two combinatorial optimization models are demonstrated in turn: method combination optimization in energy economy, and model combination optimization in survival analysis. The results of method combination optimization show that prediction performance is best and most stable when the training set is one month of data and the test set is one week of data. Among the common models compared, SVM reaches the same level of prediction accuracy as ARIMA and BPNN, so these three models are used to construct the combinatorial optimization model. The prediction results show that the combined optimization model outperforms each of its component methods and some of the prediction models proposed by scholars in recent years. The numerical simulation results of model combination optimization show that the proposed semi-parametric model performs better than the existing cure model in estimating the uncured probability given the covariates. When the underlying incidence structure cannot be approximated by a logistic model, the mean square error and misclassification rate of the proposed cure model are lower than those of the existing models, which shows that the proposed model has better calibration and discrimination in the incidence part. The real data results show that the latency estimates of the two models are similar, and that the cure-rate estimates of the optimized model provide more information than those of the traditional model.

Chapter 6 discusses the applicability of the optimization methods, summarizes the whole paper and looks ahead to future research directions.

The main
innovations of this paper are as follows:

(1) In data structure optimization, the traditional KZ filter loses part of the data at the beginning and end of the series after each pass because it is a moving average, and the lost data are important for building a prediction model. This paper therefore improves the KZ filter, proposes two new filtering methods, and optimizes the data structure using a decomposition and ensemble strategy together with SVM.

(2) In parameter optimization, this paper is the first attempt to combine the Baidu Index with SVM to construct a composite prediction model. Because the Baidu Index varies greatly between movies, this paper also uses a parameter optimization method to tune the parameters of the SVM.

(3) In method combination optimization, because the data structure of wind speed varies greatly across time and regions, no model performs well in all cases. This paper therefore does not look for a single well-performing model, but uses combination forecasting to compare commonly used statistical forecasting models, including SVM. The methods with better accuracy are chosen to build the method combination optimization model.

(4) In model combination optimization, this paper combines SVM with the cure model for the first time to obtain a new cure model. Because of the advantages of SVM in small-sample and non-linear pattern recognition, the new model retains good recognition performance when the cure rate does not follow a logistic function.

The optimization models proposed in this paper have both theoretical and practical significance. In theory, this paper uses data structure optimization, parameter optimization and combination optimization to overcome the shortcomings of a single SVM, simplify the structure of the training data and improve model fit. In addition, the proposed optimization models can in theory make up for the shortcomings of existing models and have strong generalization ability. At
the same time, these optimization models are also of great practical significance. With the more flexible machine learning models, the future trend of the object under study can be determined in advance from the results. With the traditional statistical models, the effects of covariates can be identified, and the outcome for an individual with a given set of covariates can be predicted. These results can provide a basis for decisions by management departments and policy makers. In this paper, the optimization models are applied to air pollution prediction, movie box office prediction, wind speed prediction and survival analysis to demonstrate their effectiveness in each field.

The shortcomings of this paper are as follows:

(1) For the prediction models, the selected data cover only a limited range, so there may be sampling bias. The overall performance of the proposed models could be tested further on larger data sets and compared with other recent prediction models.

(2) For the cure model, this study did not improve the latency part. In theory, SVM could be used to fit the survival function of patients in the latency part, and might achieve better results than the proportional hazards model or the accelerated failure time model. In addition, the performance of the new cure model in high-dimensional settings can be tested in future research.
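The endpoint loss of the traditional KZ filter mentioned in innovation (1) can be illustrated with a minimal sketch. It assumes the plain KZ filter, that is, k passes of a centred moving average with odd window length m, and a three-way split into long-term, seasonal and short-term components. The function names `kz_filter` and `decompose` and all window lengths are hypothetical choices for illustration. The abstract does not specify the two improved filters, so they are not reproduced here.

```python
import numpy as np


def kz_filter(x, m, k):
    """Plain KZ filter: k passes of a centred moving average of odd length m.

    Positions without a full window are returned as NaN, which makes the
    endpoint loss of the unmodified filter visible.
    """
    x = np.asarray(x, dtype=float)
    if m % 2 != 1:
        raise ValueError("window length m must be odd")
    lost = k * (m - 1) // 2  # points lost at each end after k passes
    if x.size <= 2 * lost:
        raise ValueError("series too short for this m and k")
    kernel = np.ones(m) / m
    y = x
    for _ in range(k):
        # "valid" keeps only windows that lie fully inside the series
        y = np.convolve(y, kernel, mode="valid")
    out = np.full(x.size, np.nan)
    out[lost:x.size - lost] = y
    return out


def decompose(x, m_short, m_long, k):
    """Split x into long-term, seasonal and short-term components.

    The window lengths are illustrative assumptions, not values from the source.
    """
    x = np.asarray(x, dtype=float)
    baseline = kz_filter(x, m_short, k)   # long-term + seasonal
    long_term = kz_filter(x, m_long, k)
    seasonal = baseline - long_term
    short_term = x - baseline
    return long_term, seasonal, short_term


if __name__ == "__main__":
    series = np.ones(50)
    smoothed = kz_filter(series, m=5, k=3)
    # 3 passes * 2 points per end per pass = 6 points lost at each end
    print(int(np.isnan(smoothed).sum()))  # 12
```

Wherever all three components are defined, they sum back to the original series, so each can be predicted separately (for example by its own SVM regressor) and the forecasts added together, in the spirit of the decomposition and ensemble strategy described above.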
Keywords/Search Tags:Support Vector Machine, Data Structure Optimization, Parameter Optimization, Combinatorial Optimization