Improvement and Application of Random Forest Algorithm in Recommender Systems

Posted on: 2017-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Wang
Full Text: PDF
GTID: 2308330482981775
Subject: Computer technology
Abstract/Summary:
Recommender systems have become a popular technique closely related to computer science and data mining. The random forest algorithm has several advantages over linear classifiers: better prediction accuracy, smaller generalization error, and more efficient processing of high-dimensional data. It can also be parallelized. These factors make research on random forest optimization valuable.

However, random forest uses a purely random strategy when selecting features, which can reduce the strength of the model even as it weakens correlation. In addition, under imbalanced classification, where the number of samples in one or more classes is far smaller than in the other classes, the generalization error increases.

This thesis uses recommender system data to study the classic random forest and several data processing methods. Building on random forest RC, it uses the chi-square statistic to compute the correlation of features, sorts the features and divides them into two intervals, and then samples randomly, using linear combinations of features, to accomplish feature selection. For imbalanced classification, it combines the balanced RF and weighted RF methods into a balanced weighted RF, which improves the algorithm through resampling and cost-sensitive learning. Experiments on the improved feature selection and on the imbalanced classification problem use the F1 measure to evaluate the results and to demonstrate the optimization of random forest.
Keywords/Search Tags: recommender systems, random forest, feature selection, imbalanced data classification
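
The feature selection step described in the abstract (chi-square scoring, sorting into two intervals, then random sampling with linear combinations) might look roughly like the sketch below. The abstract does not give the details, so the following are assumptions made only for illustration: features are categorical, the two intervals are split at a fixed fraction of the ranking (`high_fraction`), a fixed number of features is drawn from each interval (`n_high`, `n_low`), and the combination coefficients are drawn uniformly from [-1, 1], as in Breiman's original Forest-RC. All function and parameter names are hypothetical.

```python
import random


def chi_square_score(feature, labels):
    """Chi-square statistic of independence between one categorical
    feature column and the class labels."""
    n = len(labels)
    f_vals = sorted(set(feature))
    c_vals = sorted(set(labels))
    # Observed contingency table, keyed by (feature value, class).
    observed = {(f, c): 0 for f in f_vals for c in c_vals}
    for f, c in zip(feature, labels):
        observed[(f, c)] += 1
    f_tot = {f: sum(observed[(f, c)] for c in c_vals) for f in f_vals}
    c_tot = {c: sum(observed[(f, c)] for f in f_vals) for c in c_vals}
    score = 0.0
    for f in f_vals:
        for c in c_vals:
            expected = f_tot[f] * c_tot[c] / n
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


def split_into_intervals(columns, labels, high_fraction=0.5):
    """Rank feature indices by chi-square score (descending) and split the
    ranking into a high-score and a low-score interval.
    ASSUMPTION: the cut point is a fixed fraction of the ranking."""
    scores = [chi_square_score(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    cut = max(1, int(len(ranked) * high_fraction))
    return ranked[:cut], ranked[cut:]


def sample_combination(high, low, n_high, n_low, rng):
    """Draw feature indices from both intervals and attach random
    coefficients, giving one candidate linear combination for a node split.
    ASSUMPTION: coefficients are uniform on [-1, 1], as in Forest-RC."""
    chosen = rng.sample(high, min(n_high, len(high)))
    chosen += rng.sample(low, min(n_low, len(low)))
    coeffs = [rng.uniform(-1.0, 1.0) for _ in chosen]
    return chosen, coeffs


# Toy data: four binary feature columns and binary labels.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
columns = [
    [0, 1, 0, 1, 0, 1, 0, 1],  # weakly related to the labels
    [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the labels
    [0, 1, 0, 1, 0, 1, 1, 0],  # independent of the labels
    [0, 0, 1, 1, 0, 1, 0, 0],  # agrees with the labels on 7 of 8 rows
]

rng = random.Random(0)
high, low = split_into_intervals(columns, labels)
features, coeffs = sample_combination(high, low, n_high=2, n_low=1, rng=rng)
```

Drawing more features from the high-score interval than from the low-score one biases each split toward informative features while keeping some randomness, which is one plausible reading of how the approach trades off strength against correlation.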