
Application Research Of Used-car Recommendation Based On Classification Method On Imbalanced Data Sets

Posted on: 2017-08-25  Degree: Master  Type: Thesis
Country: China  Candidate: H B Qiu  Full Text: PDF
GTID: 2348330536468164  Subject: Engineering
Abstract/Summary:
Used-car recommendation can be framed as a class-imbalance problem, a topic that has attracted much attention in machine learning. Researchers have recently made significant progress on class-imbalance problems and have applied it in areas such as text mining, credit scoring, and spam filtering. Building on the SVM and the Bayesian network, this thesis investigates used-car recommendation with imbalanced data, focusing on the classification of imbalanced data sets and, in particular, on reconstructing the training data set and optimizing the classification algorithm.
(1) After analyzing the characteristics and deficiencies of the SMOTE over-sampling method, we propose the Synthetic Minority Over-sampling Technique Filter, or SmoteFilter for short. It balances the positive and negative samples during preprocessing and then uses the SVM and the Bayesian network to build the prediction model. Experiments show that our method predicts the minority class more accurately than SMOTE, improving the positive-class prediction accuracy of vehicle recommendation.
(2) Based on SmoteFilter, we improve sample generation in each iteration of AdaBoost and propose the SmoteFilterBoost ensemble approach. With a small modification, SmoteFilter is embedded into AdaBoost so that it improves the training samples of each base classifier. The emphasis is on improving the minority-class classification accuracy of the base classifier generated in each iteration and on avoiding possible overfitting of the minority class by the AdaBoost ensemble method, which in turn enhances overall classification performance. In this way, we address the classification of imbalanced data in vehicle recommendation from the perspective of classifier algorithm optimization. With the SVM as the base classifier, our experiments demonstrate that the SmoteFilterBoost ensemble approach effectively solves the classification bias problem in used-car recommendation with imbalanced data and improves both the fit to minority data and the prediction accuracy of used-car recommendation. Finally, a case study corroborates the effectiveness of our ensemble method for used-car recommendation.
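The over-sample-then-filter idea behind SmoteFilter can be illustrated with a minimal sketch. The abstract does not state the actual filter rule, so the nearest-neighbour vote used here is an assumption for illustration, not the thesis's method. The function names (`smote`, `filter_synthetic`), the parameter `k`, and the toy data are all hypothetical. The interpolation step follows the standard SMOTE recipe: pick a minority sample, pick one of its minority neighbours, and place a new point on the segment between them.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to point (Euclidean), excluding point itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]


def smote(minority, n_new, k=3, rng=None):
    """Standard SMOTE interpolation; assumes at least two minority samples."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbor = rng.choice(k_nearest(base, minority, k))
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def filter_synthetic(synthetic, minority, majority, k=3):
    """Assumed filter rule (not from the thesis): keep a synthetic point only if
    most of its k nearest original samples belong to the minority class."""
    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    kept = []
    for s in synthetic:
        nearest = sorted(labelled, key=lambda pl: math.dist(s, pl[0]))[:k]
        if 2 * sum(label for _, label in nearest) > k:
            kept.append(s)
    return kept


# Toy data: 4 minority (positive) samples and 6 majority (negative) samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
majority = [(3.0, 3.0), (3.2, 2.8), (2.9, 3.1), (3.1, 3.3), (2.8, 2.9), (3.3, 3.0)]

# Generate two synthetic minority points to reach a 6:6 balance, then filter them.
synthetic = smote(minority, n_new=2, k=2)
kept = filter_synthetic(synthetic, minority, majority)
balanced_minority = minority + kept
```

The balanced set (`balanced_minority` plus `majority`) would then be passed to the classifier of choice, for example the SVM or Bayesian network that the abstract mentions. In a boosting variant, the same generate-and-filter step would run once per AdaBoost iteration before the base classifier is trained.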
Keywords/Search Tags: Used Car Recommendation, Classification, Imbalanced Dataset, Oversampling, Ensemble Learning, Support Vector Machine, Bayes Classifiers, WEKA