Research On The Improvement Of Weighted Slope One Algorithm For Sparse Data

Posted on: 2021-12-23
Degree: Master
Type: Thesis
Country: China
Candidate: J M Liu
Full Text: PDF
GTID: 2518306119970739
Subject: Computer technology
Abstract/Summary:
As information technology develops, ever more data is generated, and finding information of interest quickly and accurately becomes increasingly difficult. Recommendation systems can effectively address this problem, but they face problems of their own, such as data sparsity and cold start. This thesis builds on the weighted Slope One algorithm, a collaborative filtering algorithm that predicts ratings linearly from the differences between item ratings. The method is computationally simple but performs poorly when the data is sparse. To address the shortcomings of Slope One and the data sparsity problem, this thesis makes the following contributions:

(1) The weighted Slope One algorithm considers only the number of users who rated the items and ignores the intrinsic relationships among users and among items. This thesis therefore takes user similarity into account. Traditional similarity measures, such as the Pearson correlation coefficient and cosine similarity, rely mainly on co-rated items, so they give poor results when two users have few or no ratings in common. This thesis therefore proposes a weighted Slope One algorithm that incorporates the Bhattacharyya coefficient, with two improvements. First, the Bhattacharyya coefficient is used to analyze the correlation between users and compute a global similarity, which is fused by weighting with a traditional similarity measure to obtain the final user similarity. Second, item similarity is computed with the Bhattacharyya coefficient and used as a weighting factor in the rating prediction formula.

(2) The common remedy for data sparsity in weighted Slope One algorithms is matrix filling. At present, most filling uses the mean, median, or mode of the data. These methods alleviate sparsity, but they ignore the characteristics of users and items, and they do not account for the fact that ratings are affected by subjectivity and by external factors such as environment, which makes users' ratings of items inaccurate. Because item attributes are fixed, a user's preference for an item can be inferred indirectly from the user's preferences for the item's attributes. This thesis therefore proposes a new rating matrix filling method: the user's preference value for each item attribute is calculated, then combined with the user's average rating, and the result is used to fill the rating matrix. Because user interests change over time, a time factor is then introduced on top of the filled matrix, yielding a weighted Slope One algorithm that combines users' attribute preferences with a time factor, presented in Chapter 4.

Both proposed algorithms, BCWSOA and FTWSOA, are based on the weighted Slope One algorithm. Experiments on the MovieLens data sets show that both achieve better MAE and RMSE than the other algorithms compared.
Keywords/Search Tags: weighted slope one algorithm, sparsity, Bhattacharyya coefficient, matrix filling, time factor
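
The abstract gives no formulas, so the following is only a minimal Python sketch of the two standard building blocks it names: the baseline weighted Slope One predictor and the Bhattacharyya coefficient between two discrete rating distributions. It is not the thesis's BCWSOA or FTWSOA. The function names, the layout of the `ratings` dictionary, and the 1 to 5 rating scale are assumptions made for illustration.

```python
from math import sqrt


def weighted_slope_one_predict(ratings, user, target):
    """Predict `user`'s rating of `target` with baseline weighted Slope One.

    `ratings` maps user -> {item: rating} (an assumed layout). Returns None
    when no item rated by `user` has been co-rated with `target` by anyone.
    """
    numerator = 0.0
    denominator = 0
    for item, user_rating in ratings[user].items():
        if item == target:
            continue
        # Rating differences (target - item) over users who rated both items.
        diffs = [r[target] - r[item]
                 for r in ratings.values()
                 if target in r and item in r]
        if not diffs:
            continue
        count = len(diffs)
        avg_dev = sum(diffs) / count
        # Each item's estimate is weighted by its number of co-rating users.
        numerator += (avg_dev + user_rating) * count
        denominator += count
    return numerator / denominator if denominator else None


def bhattacharyya_coefficient(ratings_a, ratings_b, scale=(1, 2, 3, 4, 5)):
    """Bhattacharyya coefficient between two discrete rating distributions.

    `ratings_a` and `ratings_b` are lists of rating values (for example, all
    ratings given by two users). The 1-5 `scale` is an assumption.
    """
    if not ratings_a or not ratings_b:
        return 0.0
    total = 0.0
    for value in scale:
        p = ratings_a.count(value) / len(ratings_a)  # share of `value` in a
        q = ratings_b.count(value) / len(ratings_b)  # share of `value` in b
        total += sqrt(p * q)
    return total
```

For example, with `ratings = {"John": {"A": 5, "B": 3, "C": 2}, "Mark": {"A": 3, "B": 4}, "Lucy": {"B": 2, "C": 5}}`, the predictor estimates Lucy's rating of item A as 13/3 (about 4.33). The Bhattacharyya coefficient compares whole rating distributions, so it stays defined even when two users share no co-rated items, which is the property the abstract relies on when co-ratings are scarce.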