The Algorithm To Improve Feature Selection In Personalized Recommender Systems For Items In The Tail Of The Long Tail Theory

Posted on: 2017-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z D Ding
Full Text: PDF
GTID: 2308330482479361
Subject: Communication and Information System
Abstract/Summary:
With the rapid expansion of e-commerce, shopping websites now carry vast amounts of commodity information, which makes it increasingly difficult for consumers to find the commodity they want, especially long-tail commodities. A personalized recommender system is a powerful way to address this problem, and a better recommender system can bring substantial economic benefit. Features describe users' preferences and characteristics, and they are generated or extracted from the data set. Feature selection has an important influence on the performance of recommender systems. Most current research focuses on improving the algorithms and models used in recommender systems; few papers address feature selection. This thesis introduces several algorithms used in recommender systems and in classifier ensembles, proposes a fusion algorithm that improves feature selection, and introduces a new classifier ensemble scheme. The main work is as follows:

(1) Analysis of three algorithms used in recommender systems: Logistic Regression (LR), Gradient Boosting Regression Tree (GBRT), and FunkSVD, a matrix factorization algorithm. The thesis presents the theory, implementation, and application scenarios of each algorithm, together with each model's complexity, efficiency, advantages, and disadvantages. It then analyzes the mechanism by which LR filters features and the strong classification performance of GBRT, and examines whether LR and GBRT can be combined and whether the combination can improve recommender system performance.

(2) A fusion algorithm based on the LR and GBRT models. When selecting features through LR, the algorithm distinguishes features along two dimensions, which yields two feature sets: one obtained from the positive data set and the other from the negative data set. During feature selection, the fusion algorithm applies LR models with different parameters to divide features at different levels of granularity. When training the GBRT model, the algorithm selects training features from the two feature sets in the same proportion. The fusion algorithm also defines the order of the training features.

(3) A new classifier ensemble scheme. The scheme fuses the results of the proposed fusion algorithm with those of FunkSVD. After z-score normalization, the results are reordered to produce a new list.

(4) Experiments on the MovieLens and Tmall data sets, with the F1 metric as the measure. Compared with LR and GBRT, the fusion algorithm achieves a higher F1 value, indicating better performance. The new classifier ensemble scheme also performs better.
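
The ensemble step in (3) and the F1 measure in (4) can be illustrated with a minimal sketch. The abstract does not say how the two normalized score lists are combined, so the sketch assumes a plain sum of z-scores, and it assumes that an item scored by only one model contributes a z-score of 0 (that model's mean) for the other. The function names `z_scores`, `fuse_rankings`, and `f1_score`, the item ids, and the sample scores are all hypothetical and are not taken from the thesis.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Convert one model's raw scores {item: score} to z-scores."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())  # population standard deviation
    if sigma == 0:
        # All scores are equal, so no item stands out from the mean.
        return {item: 0.0 for item in scores}
    return {item: (s - mu) / sigma for item, s in scores.items()}


def fuse_rankings(scores_a, scores_b):
    """Rank items by the sum of their z-scores from two models.

    Assumption: summing is the combination rule; an item missing
    from one model is treated as having z-score 0 for that model.
    """
    za, zb = z_scores(scores_a), z_scores(scores_b)
    items = set(za) | set(zb)
    combined = {i: za.get(i, 0.0) + zb.get(i, 0.0) for i in items}
    # Highest combined score first; ties are broken by item id.
    return sorted(combined, key=lambda i: (-combined[i], i))


def f1_score(recommended, relevant):
    """F1 = harmonic mean of precision and recall over item sets."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    if hits == 0:
        return 0.0
    precision = hits / len(rec)
    recall = hits / len(rel)
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores: probabilities from the LR+GBRT fusion model
# and predicted ratings from FunkSVD, which live on different scales.
fusion_scores = {"m1": 0.1, "m2": 0.2, "m3": 0.3}
funksvd_scores = {"m1": 1.0, "m2": 4.0, "m3": 2.0}

ranking = fuse_rankings(fusion_scores, funksvd_scores)
print(ranking)  # ['m2', 'm3', 'm1']
print(f1_score(ranking[:2], {"m3"}))
```

Z-score normalization matters here because the two models produce scores on different scales; without it, the model with the larger numeric range would dominate the reordered list.
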
Keywords: personalized recommender system, Logistic Regression, GBRT, classifier ensemble