| In recent years, as e-commerce has developed, more and more people have begun shopping online. Users find what they want by searching, while e-commerce systems infer users’ preferences by analyzing their access logs and then make recommendations for them. Because recommendation can uncover users’ potential interests and substantially improve sales, e-commerce companies pay close attention to improving their recommender systems. In 2014, Alibaba, the world’s largest e-commerce company, launched a big data competition based on real access data from Tmall users. Its aim was to infer users’ preferences by analyzing their access logs from the previous four months and to make recommendations for them. This work is based on that competition. We study how to make personalized recommendations from users’ access logs when nothing is known about the properties of the users or the brands. We first try collaborative filtering, the most popular technique in recommender systems. We combine each user’s multiple types of access records into a rating table, use collaborative filtering to compute the similarity between brands, and make recommendations according to that similarity. Because Tmall has a very large number of brands and most users have accessed only a few of them, the rating table is very sparse. A sparse rating table reduces the accuracy of the brand similarities computed by collaborative filtering, and therefore the accuracy of the recommendations. To avoid this problem, we propose two regression-based methods for building a recommender system: logistic regression and gradient boosted regression trees. We describe in detail how to build a recommender system with each method, including data preprocessing, noise elimination, feature extraction, feature selection, smoothing, and normalization. Logistic regression is essentially a linear model, while the gradient boosted regression tree is a tree-based regression. 
A tree-based regression is often better than a linear regression at partitioning the feature space. Finally, we introduce two model ensemble methods to combine the three models. Model ensembling improves recommendation quality by mixing multiple recommender models. Our experiments show that logistic regression and gradient boosted regression trees perform much better than collaborative filtering when a large amount of data is available, and that model ensembling yields a further improvement. |
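
The collaborative filtering baseline summarized above (mix the access records into a rating table, compute brand similarity, recommend by similarity) can be illustrated with a minimal sketch. The action names, the weights in `ACTION_WEIGHTS`, the use of cosine similarity, and the function names are all assumptions made for illustration; the abstract does not specify how the ratings or similarities are actually computed.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical weights for mixing action types into one rating
# (assumed for illustration; not taken from the paper).
ACTION_WEIGHTS = {"click": 1.0, "collect": 2.0, "cart": 3.0, "buy": 4.0}

def build_ratings(log):
    """Turn (user, brand, action) records into {brand: {user: rating}}."""
    ratings = defaultdict(lambda: defaultdict(float))
    for user, brand, action in log:
        ratings[brand][user] += ACTION_WEIGHTS[action]
    return ratings

def cosine(a, b):
    """Cosine similarity between two sparse {user: rating} vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(ratings, user, top_n=3):
    """Score unseen brands by similarity to the brands the user accessed."""
    seen = {brand for brand, users in ratings.items() if user in users}
    scores = defaultdict(float)
    for brand in seen:
        for candidate, users in ratings.items():
            if candidate in seen:
                continue
            scores[candidate] += cosine(ratings[brand], users) * ratings[brand][user]
    # Sort by descending score, then by brand name for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [brand for brand, score in ranked[:top_n] if score > 0]
```

With many brands and few accesses per user, most pairs of brand vectors share no users, so `cosine` returns 0 for them; this is the sparsity problem the abstract describes.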