Enhanced Singular Collaborative Filtering Based Recommender System On Apache Spark

Posted on: 2019-10-19   Degree: Master   Type: Thesis
Country: China   Candidate: ADEM SEID AHMED   Full Text: PDF
GTID: 2428330545952169   Subject: COMPUTER TECHNOLOGY
Abstract/Summary:
In recent years, the volume of data has been growing at unprecedented rates, driven by the increasing number of Internet users and the advancement of Internet technologies. This leads to information overload: the difficulty a person has in understanding an issue and making decisions when too much information is present. Recommender Systems are tools and techniques that suggest interesting items to their users. These suggestions are aimed at helping users in various decision-making processes, such as what items to buy, what news article to read, what music to listen to, what video or movie to watch, what book or research paper to read, or even which people to recommend to others. Recommender Systems have become essential tools deployed by most large companies. They can be found in most of the Internet-based applications we visit frequently, helping us find the information we may be interested in while sparing us from information overload.

Collaborative filtering (CF) is the recommendation approach most widely used by large real-world e-commerce organizations, including Google, Netflix, and Amazon. CF techniques use a database of ratings (the user-item matrix). Memory-based CF techniques are intuitive, relatively simple to implement, and little affected by the constant addition of users, items, and ratings that is typical of large commercial applications. Memory-based CF methods compute the similarity between two users from the pairs of ratings on items that both users have rated; we call these comparative ratings dual ratings. However, these methods ignore ratings that come from only one of the two users; we call these non-comparative ratings singular ratings.

The major goal of this thesis is to provide the ESCF (Enhanced Singular CF) recommendation approach to tackle the major challenges of Recommender Systems. In particular, four major problems were studied: data sparsity, scalability, accuracy, and cold start. To this end, we made four research contributions: 1. ESCF (Enhanced Singular Collaborative Filtering), a novel method that utilizes the singular ratings and combines the ratings provided by similar users and by dissimilar users (dissimilar neighbors) to maximize the utilization of available information; 2. improved performance under data sparsity and cold-start conditions; 3. improved accuracy; and 4. a distributed implementation, which addresses the problem of scalability and reduces computational complexity.

To show the effectiveness of the proposed method, two public real-world benchmark datasets, MovieLens 100k and MovieTweetings 150k, were used in the experiments. The proposed method, a user-based ESCF algorithm, was implemented on the distributed parallel computing framework Apache Spark in the Scala programming language. The experiments showed that the proposed method outperforms baselines such as traditional memory-based CF approaches and the Singular CF method. The proposed approach also improved on the performance of existing CF methods when data is sparse and under cold-start conditions.
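
The distinction between dual and singular ratings described above can be illustrated with a minimal sketch. This is not the ESCF algorithm or the thesis's Scala/Spark implementation, whose formulas are not given in this abstract. It only shows, in plain Python, how two users' ratings split into dual ratings (items both users rated) and singular ratings (items only one of them rated), and how a traditional memory-based similarity, here cosine similarity chosen as an assumption, uses the dual ratings alone. The function names `split_ratings` and `cosine_on_dual` and the dictionary-based rating format are hypothetical.

```python
from math import sqrt


def split_ratings(ratings_u, ratings_v):
    """Split two users' ratings (dicts of item -> rating) into dual and singular parts.

    Dual ratings: items rated by both users, stored as (rating_u, rating_v) pairs.
    Singular ratings: items rated by only one of the two users.
    """
    common = ratings_u.keys() & ratings_v.keys()
    dual = {item: (ratings_u[item], ratings_v[item]) for item in common}
    singular_u = {i: r for i, r in ratings_u.items() if i not in common}
    singular_v = {i: r for i, r in ratings_v.items() if i not in common}
    return dual, singular_u, singular_v


def cosine_on_dual(dual):
    """Cosine similarity computed from dual ratings only.

    This mirrors what a traditional memory-based CF method does:
    the singular ratings never enter the computation.
    """
    if not dual:
        return 0.0  # no co-rated items, so no evidence of similarity
    dot = sum(a * b for a, b in dual.values())
    norm_u = sqrt(sum(a * a for a, _ in dual.values()))
    norm_v = sqrt(sum(b * b for _, b in dual.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

For example, with `{"m1": 5, "m2": 3, "m3": 4}` and `{"m2": 3, "m3": 4, "m4": 1}`, the dual ratings cover `m2` and `m3`, while `m1` and `m4` are singular ratings that the similarity above discards. That discarded information is what the thesis proposes to exploit.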
Keywords/Search Tags: Recommender Systems, Memory-based Collaborative Filtering, Apache Spark, Distributed Framework, MovieLens, MovieTweetings