Research On High-dimensional Sparse Data In Collaborative Filtering Algorithm

Posted on:2020-10-02

Degree:Master

Type:Thesis

Country:China

Candidate:H M Lu

Full Text:PDF

GTID:2518306104496134

Subject:Software engineering

Abstract/Summary:

Recommendation systems arose to meet users' individual needs. Collaborative filtering, the most widely used family of recommendation algorithms, has great research value. However, the high-dimensional, sparse data that collaborative filtering works on distorts both similarity calculation and rating prediction and makes neighbor selection inefficient, which seriously degrades the quality of the algorithm.

Existing research on this problem has three shortcomings. First, work on improving similarity calculation does not consider rating differences, does not measure how strongly a user likes or dislikes item attributes, and does not mine implicit interests. Second, it ignores the high dimensionality of the data and searches for nearest neighbors across the entire data set, which lengthens the running time. Third, work on improving rating prediction does not consider that the similarity between two users differs from item to item.

Starting from the principles of collaborative filtering and the related theory, this thesis addresses the problem in three ways:

(1) Similarity calculation no longer relies on the rating matrix alone but is enriched with additional information. Information entropy is introduced to measure the amount of information carried by rating differences and is combined with those differences to obtain a rating-difference similarity. Fuzzy sets are used to fuzzify each single rating and to measure the user's like and dislike of item attributes, which yields an explicit-interest similarity. Matrix factorization is added to mine users' implicit interests, which yields an implicit-interest similarity. These three similarities are combined with the original modified (adjusted) cosine similarity into a comprehensive similarity that alleviates the data sparsity problem.

(2) Users are clustered with an improved K-Means algorithm that optimizes the selection of the initial centroids, and nearest neighbors are selected only within the target user's cluster. This improves running efficiency and alleviates the high-dimensionality problem.

(3) Because the similarity between users differs from item to item, an item-specific trust degree is fused with the comprehensive similarity to obtain an item-specific similarity, which is used for rating prediction and further alleviates the data sparsity problem.

The algorithm is evaluated on the classic MovieLens data set. Compared with several groups of similar algorithms, it achieves a lower mean absolute error (MAE), which indicates higher recommendation accuracy, and the improved K-Means algorithm takes less running time, which indicates higher efficiency. The algorithm therefore alleviates the high-dimensional sparse data problem to a certain extent and improves the quality of collaborative filtering. Finally, the algorithm is applied to movie recommendation to verify that it is effective and feasible in practice.
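
The baseline that the abstract builds on can be illustrated with a small Python sketch. It is not the thesis's algorithm: it shows only a modified (mean-centered) cosine similarity between users, neighbor selection restricted to the target user's cluster, a similarity-weighted rating prediction, and the MAE metric. The toy ratings, the cluster labels (standing in for the output of a K-Means step), and all function names are assumptions made for illustration; the entropy, fuzzy-set, matrix-factorization, and trust components are not reproduced.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Hypothetical data, not taken from MovieLens.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 5, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
    "u4": {"a": 2, "b": 4, "c": 1, "d": 1},
}
# Assumed cluster labels, standing in for the result of a clustering step.
clusters = {"u1": 0, "u2": 0, "u3": 1, "u4": 1}


def mean_rating(user):
    """Average of all ratings given by one user."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)


def adjusted_cosine(u, v):
    """Mean-centered cosine similarity over the items both users rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    mu, mv = mean_rating(u), mean_rating(v)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    du = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common))
    dv = sqrt(sum((ratings[v][i] - mv) ** 2 for i in common))
    if du == 0 or dv == 0:
        return 0.0
    return num / (du * dv)


def predict(user, item, k=2):
    """Predict a rating from the k most similar same-cluster users who rated the item."""
    candidates = [
        (adjusted_cosine(user, v), v)
        for v in ratings
        if v != user and clusters[v] == clusters[user] and item in ratings[v]
    ]
    neighbors = sorted(candidates, reverse=True)[:k]
    norm = sum(abs(s) for s, _ in neighbors)
    if norm == 0:
        # No usable neighbor: fall back to the user's own mean rating.
        return mean_rating(user)
    offset = sum(s * (ratings[v][item] - mean_rating(v)) for s, v in neighbors)
    return mean_rating(user) + offset / norm


def mean_absolute_error(pairs):
    """MAE over (predicted, actual) rating pairs."""
    pairs = list(pairs)
    return sum(abs(p - a) for p, a in pairs) / len(pairs)
```

Restricting the candidate list to `clusters[v] == clusters[user]` is what shrinks the neighbor search from the whole data set to one cluster; a richer similarity function could replace `adjusted_cosine` without changing `predict`.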

Keywords/Search Tags:

Collaborative filtering, Similarity, Scoring difference, Interest Believability

Related items

1	The Research And Implementation Of Collaborative Filtering Recommendation Based On Genres And Scoring Matrix Filling
2	Collaborative Filtering Recommendation Algorithm Based On User Feedback And Its Timeliness Improvement
3	Research On Collaborative Filtering Recommendation Algorithm Based On User Rating And Interest
4	Recommendation Algorithm And System Implementation Based On Collaborative Filtering
5	Collaborative Filtering Algorithm Based On Best Similarity Weight And Localized Preference
6	Research On And Application Of Similarity Calculation And Users’ Interest Drifting Of Collaborative Filtering Algorithm
7	Research On Collaborative Filtering Algorithm Based On Similarity Calculation In Recommendation System
8	Research On Collaborative Filtering Recommendation Algorithm Based On Information Entropy
9	Research On Collaborative Filtering Recommendation Algorithm Based On User Interest Similarity And Trust Mechanism
10	Collaborative Multi-interest Preference Filtering Recommendation Research Based On User Rating Penalty Factors