
Key Technologies For Recommender Systems Based On Rating And Text Data Mining

Posted on: 2020-04-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Xu
Full Text: PDF
GTID: 1368330575981200
Subject: Computer system architecture
Abstract/Summary:
Recommender systems are a useful data-mining technology for processing data and creating business value. Traditional recommender systems exploit known information, such as ratings, reviews, and the attributes of users and items, to recommend valuable new items to potential users and generate profit for e-commerce companies. Their methodology is typically based on either matrix factorization or collaborative filtering models, combined with side information to achieve good performance. However, with the development of new technology such as deep learning, the drawbacks of traditional recommender systems (dated methodologies and simplistic evaluation metrics) hinder their combination with deep learning. Moreover, with advances in hardware and mobile networks, collecting data is no longer the bottleneck of recommender systems; instead, how to filter the data, improve its quality, and discard fraudulent data is the key to maximizing their power, and is currently a hot research topic. Along this line, it is meaningful and significant research to apply data-mining methodology, together with deep learning, to benefit recommender systems.

Given these new technology trends and new data sources, recommender systems face several challenges:

1. Combination with deep learning: deep learning has benefited traditional methodologies such as text mining and image processing. To take advantage of deep learning, a meaningful embedding model is a necessary foundation. An embedding model maps the original data into a latent space in which relationships can be measured; deep learning models then take the embedding results as input and make predictions. A proper embedding model usually leads to an accurate and efficient deep learning model. However, no embedding models have been designed specifically for traditional recommender systems, and the sparsity of recommender-system datasets further increases the difficulty of embedding. It is therefore a challenge to harness the strong data-processing abilities of deep learning models to improve the efficiency of recommender systems.

2. Fraudulent data and unbalanced datasets: recommender systems can now exploit multi-source, multi-view data, which has broken a bottleneck of recent decades. Many existing recommender systems use multi-source, multi-view data and achieve satisfying results, but they often ignore data quality and the problem of fraudulent data. In the real world, especially on e-commerce websites, fraudulent ratings and reviews confuse the recommender system and lead to inaccurate, biased recommendations. How to use data-mining technology to process the data, filter out fraudulent records, and drop fake data is thus a key research question for improving the accuracy of recommender systems.

3. Evaluation of recommender systems: evaluation is usually based on metrics such as precision, recall, hit ratio, and novelty. These metrics have been used for decades and cannot comprehensively evaluate a recommender system's performance in current settings. For example, a recommender system with extremely high accuracy may recommend the same items repeatedly and lose its ability to create new value (sell new things). The evaluation metrics should be extended to handle the complicated, domain-specific scenarios of the era of information explosion; a useful new metric should correlate positively with the existing ones while evaluating the recommender system from a fresh, valuable perspective.

To this end, this dissertation proposes the following solutions:

1. To give recommender systems a better description of the relationships between users and items, this dissertation proposes a metric-learning embedding model, Latent Dual Metric Embedding (LDME). Existing recommender systems use matrix factorization with inner products to represent the relationships between users and items, then make top-k recommendations through collaborative filtering models; we treat the result of matrix factorization as a particular kind of matrix embedding. Noting the limitations of inner products in matrix embedding models, LDME uses Euclidean distances instead of inner products in the objective function, and takes dual triplets <user, user, item> and <item, item, user> instead of <user, item> pairs as inputs to balance the data. LDME also learns the latent relationships between users and items hidden in the user-item matrix, which increases the accuracy of the embedding. Finally, LDME employs a loss function with Pull and Push sub-functions to reduce overfitting. Extensive experiments validate the proposed model and demonstrate that LDME can improve the efficiency of recommender systems.

2. To filter fraudulent data and handle unbalanced data, this dissertation proposes a CNN-based fraud-detection and user true-opinion analysis model, the Neural-network-based Opinion mining model (NeuO). NeuO uses a modified CNN to analyze review data and a Text-to-Score module to compute users' opinion scores; a Combination Function then reconciles the opinion scores with users' actual ratings. In this way NeuO recovers users' opinion bias and filters fraudulent reviews and users, including professional fraud users, which substantially improves the quality of the data. We evaluate NeuO on several e-commerce datasets, and the results show that NeuO improves both the accuracy and the efficiency of recommender systems.

3. To improve the evaluation metrics, this dissertation proposes a novel metric for recommender systems, Serendipity, and a serendipity-based collaborative filtering model, Neural Serendipity Recommendation (NSR). Traditional metrics focus on the accuracy of recommendations, which depends on users' explicit preferences; serendipitous items, in contrast, reflect users' potential preferences: they have low explicit attraction but high potential satisfaction. NSR extracts explicit interest and satisfaction from the user-item matrix and then filters the serendipitous candidate items to make appropriate recommendations. Experiments demonstrate that serendipity correlates positively with existing metrics, while NSR can greatly improve the quality of recommendations.
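As a minimal sketch of the Pull/Push idea behind LDME's loss (not the dissertation's actual formulation: the function name, margin parameter, and hinge form here are assumptions), the loss over one triplet can be written with Euclidean distances as a Pull term that shrinks the anchor-positive distance and a Push term that drives the negative at least a margin farther away:

```python
import numpy as np

def pull_push_loss(anchor, positive, negative, margin=1.0):
    """Illustrative triplet loss with Pull and Push sub-terms.

    Pull draws the anchor toward the positive embedding;
    Push (hinge form) penalizes negatives closer than d_pos + margin.
    """
    d_pos = np.linalg.norm(anchor - positive)   # distance to shrink (Pull)
    d_neg = np.linalg.norm(anchor - negative)   # distance to grow (Push)
    pull = d_pos ** 2
    push = max(0.0, margin + d_pos - d_neg)     # zero once the negative is far enough
    return pull + push

# Toy triplet: a user embedding, an interacted item, and an unobserved item.
u = np.array([0.0, 0.0])
i_pos = np.array([0.1, 0.0])
i_neg = np.array([3.0, 0.0])
loss = pull_push_loss(u, i_pos, i_neg)
# Here d_neg exceeds d_pos + margin, so the Push term vanishes and
# only the small Pull term remains.
```

In a full model the triplets would be the dual <user, user, item> and <item, item, user> forms the abstract describes, summed over the training set and minimized by gradient descent over the embeddings.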
Keywords/Search Tags: Data Mining, Deep Learning, Recommender System Models