Research On Web Log Mining And Collaborative Filtering Algorithm

Posted on: 2012-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: N B Li
Full Text: PDF
GTID: 2218330338471616
Subject: Computer Science and Technology
Abstract/Summary:
Although the Internet emerged only a few decades ago, its rapid growth and the development of its underlying technologies have had an enormous impact on everyday life. Compared with traditional websites, which offer users only text and multimedia, Web 2.0, which developed during the late 20th and early 21st centuries, not only brings new technology but also aims to improve the user experience on the website. Against this background, research on improving user experience, optimizing website structure, and personalizing users' visits has become increasingly important.
This thesis serves as a first step toward optimizing the structure of the Beijing Language and Culture University website, building a site suited to its users, and personalizing users' information services by mining the web server logs. After thoroughly considering the current state of the university's web logs, it makes corresponding optimizations and improvements to the web mining process. It also describes every step of web log pre-processing in detail and analyzes the different pre-processing algorithms involved. Finally, it implements a complete web mining tool as a solid foundation for future web log mining.
In addition, this thesis analyzes two traditional collaborative filtering algorithms: user-based collaborative filtering and item-based collaborative filtering. By comparing the advantages and disadvantages of the two algorithms under different similarity measures, it introduces a new method, called the mixed-based collaborative filtering algorithm, and describes its advantages over the two traditional ones.
After being implemented and tested on the MovieLens data, the new mixed-based collaborative filtering algorithm is shown to retain the advantages of the two traditional methods while overcoming some of their disadvantages. The results also show that the new method improves on measures such as MAE and recall.
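To make the terms above concrete, the following is a minimal Python sketch of user-based and item-based rating prediction, a blended predictor, and the MAE measure. The abstract does not say how the mixed-based algorithm combines the two approaches, so the linear blend with weight `alpha` is purely an assumption for illustration, as are the cosine similarity choice, the function names, and the toy ratings (which are not MovieLens data).

```python
from math import sqrt

# Toy ratings (user -> {item: rating}); made-up values, not MovieLens data.
RATINGS = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 4, "B": 2, "C": 5, "D": 4},
    "u3": {"A": 1, "B": 5, "D": 2},
}


def cosine(x, y):
    """Cosine similarity of two sparse rating vectors over their shared keys."""
    shared = set(x) & set(y)
    if not shared:
        return 0.0
    dot = sum(x[k] * y[k] for k in shared)
    norm_x = sqrt(sum(x[k] ** 2 for k in shared))
    norm_y = sqrt(sum(y[k] ** 2 for k in shared))
    return dot / (norm_x * norm_y)


def transpose(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[user] = rating
    return by_item


def weighted_average(pairs):
    """Similarity-weighted mean of (similarity, rating) pairs, or None."""
    total = sum(sim for sim, _ in pairs)
    if total == 0:
        return None
    return sum(sim * rating for sim, rating in pairs) / total


def predict_user_based(ratings, user, item):
    """Average other users' ratings of `item`, weighted by user similarity."""
    pairs = [
        (cosine(ratings[user], other), other[item])
        for name, other in ratings.items()
        if name != user and item in other
    ]
    return weighted_average(pairs)


def predict_item_based(ratings, user, item):
    """Average the user's own ratings, weighted by item similarity to `item`."""
    by_item = transpose(ratings)
    if item not in by_item:
        return None
    pairs = [
        (cosine(by_item[item], by_item[other]), rating)
        for other, rating in ratings[user].items()
        if other != item
    ]
    return weighted_average(pairs)


def predict_mixed(ratings, user, item, alpha=0.5):
    """Hypothetical blend: alpha * user-based + (1 - alpha) * item-based."""
    ub = predict_user_based(ratings, user, item)
    ib = predict_item_based(ratings, user, item)
    if ub is None:
        return ib
    if ib is None:
        return ub
    return alpha * ub + (1 - alpha) * ib


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Calling `predict_mixed(RATINGS, "u1", "D")` estimates how user `u1` would rate the unseen item `D`. In an evaluation of the kind the abstract describes, such predictions would be compared against held-out ratings with `mae`.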
Keywords/Search Tags: Web log mining, pre-processing, collaborative filtering, recommendation system