
Research And Application On Model Blending Algorithm

Posted on: 2017-11-24  Degree: Master  Type: Thesis
Country: China  Candidate: Q Li  Full Text: PDF
GTID: 2348330485981331  Subject: Systems analysis and integration
Abstract/Summary:
With the rapid development of the Internet, data is accumulating at an ever-increasing rate, and the era of big data analysis has arrived. How to exploit the diversity and features of data to extract more valuable information, and thereby improve the precision of data analysis and provide better services across industries, has become a hot research topic. Machine learning is an effective way to analyze big data, and ensemble learning, one of its branches, is an effective strategy for improving the precision of data analysis. The main idea of ensemble learning is to fuse several base classifiers by an appropriate method so as to enhance the generalization ability and robustness of the model; the most common model blending method is voting, which is grounded in statistical theory.

Ensemble learning research focuses mainly on two aspects: selective ensemble learning and model blending. Selective ensemble learning studies how to choose and use a subset of the base classifiers to achieve better performance than using all of them. Model blending studies how to fuse base classifiers to obtain better robustness and generalization ability than a single base classifier. This thesis focuses on model blending and on how to improve the precision and robustness of the model. The major contents of the research work are as follows.

(1) The background and significance of ensemble learning within machine learning are described, and the current state of research and research hotspots at home and abroad are introduced. Common ensemble learning algorithms and their advantages are summarized, and the key ideas of ensemble learning and logistic regression are presented.

(2) Based on model blending theory, this thesis proposes a two-level model blending algorithm based on logistic regression, abbreviated TMBLR. In TMBLR, logistic regression is used to fit the data together with the prediction results of the first level, which effectively improves the precision and stability of the prediction.

(3) Experiments and analyses are carried out on the TMBLR algorithm. On the one hand, to verify the precision and stability of the algorithm, several comparison experiments are run against a single classifier and against a voting-based model blending algorithm. On the other hand, to check the efficiency of logistic regression, which is used for training at the second level of TMBLR, several other algorithms are used in its place for comparison. The experimental results show that the model blending algorithm has higher robustness and precision. The F1-score of TMBLR is 2.05 percent higher than that of the single base classifier. Compared with the voting-based model blending algorithm, TMBLR has an average F1-score that is 1.1 percent higher and a mean square deviation that is 7.2‰ lower, indicating higher stability. Among the algorithms tried at the second level of TMBLR, logistic regression achieves the best performance in precision, F1-score, and time efficiency.
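The two-level idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the synthetic data, the threshold "stump" base classifiers, the three-way holdout split, the learning rate and epoch count, and every function name below are assumptions chosen only for illustration. It also assumes that "the data including the prediction results in the first layer" means the original features concatenated with the first-level predictions. Voting is computed from the same base classifiers for comparison.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_logreg(model, X):
    w, b = model
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

def train_stump(X, y, feature):
    """Hypothetical weak base classifier: best threshold on a single feature."""
    best = None
    for t in sorted({xi[feature] for xi in X}):
        correct = sum((xi[feature] >= t) == bool(yi) for xi, yi in zip(X, y))
        if best is None or correct > best[0]:
            best = (correct, t)
    return feature, best[1]

def predict_stump(stump, X):
    feature, t = stump
    return [1 if xi[feature] >= t else 0 for xi in X]

def majority_vote(rows):
    """rows[i] holds the base predictions for sample i; ties go to class 0."""
    return [1 if 2 * sum(r) > len(r) else 0 for r in rows]

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def blend_features(stumps, X):
    """Return per-sample base predictions and features + predictions."""
    preds = [predict_stump(s, X) for s in stumps]
    rows = [list(p) for p in zip(*preds)]
    return rows, [xi + r for xi, r in zip(X, rows)]

def make_data(n, rng):
    # Synthetic binary task (assumption): noisy linear boundary on two features.
    X, y = [], []
    for _ in range(n):
        xi = [rng.uniform(-1, 1) for _ in range(3)]
        X.append(xi)
        y.append(1 if xi[0] + xi[1] + rng.gauss(0, 0.2) > 0 else 0)
    return X, y

rng = random.Random(0)
X, y = make_data(300, rng)
X_base, y_base = X[:100], y[:100]          # trains the first level
X_blend, y_blend = X[100:200], y[100:200]  # trains the second level
X_test, y_test = X[200:], y[200:]

# First level: one stump per feature, fitted on the base split only.
stumps = [train_stump(X_base, y_base, f) for f in range(3)]

# Second level: logistic regression on held-out features + first-level predictions.
_, Z_blend = blend_features(stumps, X_blend)
meta = train_logreg(Z_blend, y_blend)

rows_test, Z_test = blend_features(stumps, X_test)
vote_pred = majority_vote(rows_test)
blend_pred = predict_logreg(meta, Z_test)

print("voting F1:   ", round(f1_score(y_test, vote_pred), 3))
print("two-level F1:", round(f1_score(y_test, blend_pred), 3))
```

Fitting the second level on a split that the base classifiers never saw keeps their training-set optimism out of the second-level inputs. Scores on this toy data say nothing about the figures reported in the abstract.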
Keywords/Search Tags: Big data, Machine learning, Ensemble learning, Model blending, Logistic regression (LR)