Imbalanced Rating Prediction In Recommender System Via Gumbel Distribution And Dual Optimizer

Posted on: 2022-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Wu
GTID: 2518306527998429
Subject: Computer Science and Technology
Abstract/Summary:
Rating prediction is a core problem in recommender systems: it quantifies users' preferences for different items. Because the rating distribution in the training data is imbalanced, existing recommendation models usually produce biased predictions, and their performance on long-tail samples is generally unsatisfactory. To address this problem, this thesis proposes two methods.

The first method is TADO (Time-varying Attention with Dual-Optimizer Model), described in detail in Chapter 3. TADO addresses four problems in review-based recommendation models. First, to reduce the prediction loss of traditional models on rare ratings, TADO uses a flexible dual-optimizer model that obtains better performance by drawing on both regression optimization and classification optimization. Second, traditional review-based models encode text with word vectors, which cannot capture the deeper semantics of review sentences: the same word carries different meanings in different sentences, yet a word vector assigns one fixed representation to each word. To supply the context information that word embeddings lack, TADO introduces BERT (Bidirectional Encoder Representations from Transformers) into the review-based method to improve semantic representation. Third, existing methods ignore users' time-varying preferences, so TADO adds a time-varying feature extraction module built from a bidirectional long short-term memory network and a multi-scale convolutional neural network. Finally, traditional models use text reviews only to model user preferences and item characteristics, ignoring the interaction between the two, so the user and item vectors interact only in the last layer. To solve this, TADO introduces an additional interaction layer and residual connections; the interaction layer retains the characteristics of users and items as well as the features of their interaction.

The second method is inspired by the ability of extreme value distributions (EVD) to model rare data. It is a novel Gumbel-based rating prediction model, GRP (A Gumbel-based Rating Prediction), a flexible framework that can accurately predict rare and frequent ratings at the same time. In GRP, a separate Gumbel distribution is first defined for each rating level, derived from the historical rating statistics of users and items. Second, GRP's multi-scale convolutional fusion layer combines the Gumbel-based representations of users and items with the original representations learned from the rating matrix and/or reviews, enriching the user and item representations. Third, GRP uses a data-driven rating prediction module that predicts users' ratings of items from highly compressed feature vectors and the original ratio features.

TADO was evaluated in extensive experiments on 23 benchmark datasets from Amazon Product Reviews. Compared with several recent methods, TADO outperforms ALFM, MPCN, and ANR by 20.98%, 9.84%, and 15.46% on average, respectively. Further ablation experiments demonstrate the contribution of each TADO component to the performance of the final model. GRP was evaluated on 8 datasets from Amazon Product Reviews against seven baseline models (PMF, NeuMF, DeepCoNN, ANR, NARRE, NRPA, and TDAR) using common recommender system metrics: MAE, F1, HR, and NDCG. The experimental results show that: 1) GRP achieves the highest overall performance on all eight datasets; 2) GRP makes substantial progress in predicting rare ratings, which shows its effectiveness in reducing biased prediction on imbalanced datasets.
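
To make the idea of one Gumbel distribution per rating level more concrete, the following is a minimal Python sketch. The abstract does not say how GRP derives its distribution parameters, so everything here is an assumption for illustration: the helper names (`gumbel_pdf`, `gumbel_cdf`, `fit_gumbel_moments`, `fit_per_rating_level`), the use of each user's share of ratings at a given level as the statistic, and the method-of-moments fit. Only the Gumbel density and CDF formulas themselves are standard.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_pdf(x, mu, beta):
    """Density of the Gumbel (maximum) distribution with location mu and scale beta."""
    z = (x - mu) / beta
    return math.exp(-(z + math.exp(-z))) / beta


def gumbel_cdf(x, mu, beta):
    """Cumulative distribution function of the same Gumbel distribution."""
    return math.exp(-math.exp(-(x - mu) / beta))


def fit_gumbel_moments(samples):
    """Method-of-moments fit: match the sample mean and variance.

    For a Gumbel distribution, mean = mu + gamma * beta and
    variance = (pi ** 2 / 6) * beta ** 2.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    # Floor the scale so a constant sample does not yield beta == 0.
    beta = max(math.sqrt(6.0 * var) / math.pi, 1e-6)
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def fit_per_rating_level(histories, levels=(1, 2, 3, 4, 5)):
    """Hypothetical helper: fit one Gumbel per rating level.

    `histories` is a list of rating lists, one per user (or item). The
    statistic used, the share of each history at a given level, is an
    assumption and not taken from the thesis.
    """
    params = {}
    for level in levels:
        shares = [sum(1 for r in h if r == level) / len(h) for h in histories if h]
        params[level] = fit_gumbel_moments(shares)
    return params


if __name__ == "__main__":
    histories = [[5, 5, 4], [1, 5, 3], [4, 4, 2, 5]]
    for level, (mu, beta) in fit_per_rating_level(histories).items():
        print(level, round(mu, 4), round(beta, 4))
```

The method-of-moments fit is chosen here only because it has a closed form; a maximum-likelihood fit or learned parameters would be equally plausible readings of the abstract.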
Keywords/Search Tags:Recommender Systems, Natural Language Processing, imbalanced datasets, Gumbel distribution, Collaborative Filtering