Study On Semi-supervised Recommendation Method Based On Co-training

Posted on:2022-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:X K Sang

Full Text:PDF

GTID:2518306563976769

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of Internet technology, information overload has become a growing burden for users and hinders the continued development of Internet services. Advances in data mining have given rise to personalized recommendation systems, which discover a user's potential preferences in large volumes of data and surface the information that interests them. Collaborative filtering (CF) is the core technology for building such systems and has received extensive attention from industry and academia in recent years. However, data sparsity has always restricted its performance. To alleviate it, most existing work focuses on introducing side information, while few works exploit the large amount of cheap unlabeled data. This thesis addresses the data sparsity problem of traditional recommendation algorithms from two angles, both of which introduce unlabeled samples, and makes the following contributions.

The first contribution is a semi-supervised ensemble filtering method (SSEF) that improves recommendation performance by assembling three popular CF techniques in a co-training framework. Concretely, SSEF first initializes three weak predictors independently from the labeled examples, using three different CF algorithms. The two predictors generated by neighborhood methods are then merged into one base recommender, and the remaining predictor, generated by a latent factor model, serves as the other. During co-training, each base recommender labels unlabeled examples for the other. To exploit unlabeled data safely, the labeling confidence is estimated by validating the influence of the pseudo-labeled examples on the labeled ones. The final prediction blends the outputs of the three predictors enhanced with unlabeled data. Experiments are conducted to verify the effectiveness of the proposed scheme by comparing it with a variety of CF techniques, including semi-supervised, ensemble, and side-information-based solutions.

The second contribution is a review-aware semi-supervised co-training method named RSCF, which also targets the data sparsity issue. Specifically, we use a factorization model to capture user-item reviews. Then, to build a model that can boost recommendation performance by leveraging the reviews, we propose a semi-supervised ensemble learning algorithm. The algorithm constructs different (weak) prediction models using examples with different reviews and then employs the co-training strategy so that each (weak) prediction model learns from the others. The method has two notable advantages over standard recommendation methods in addressing data sparsity. First, it defines a review-aware factorization model that models user-item preference more accurately. Second, it naturally supports both supervised and semi-supervised learning, which provides a flexible way to incorporate unlabeled data.

The proposed algorithms are evaluated on two real-world datasets. The experimental results show that our method significantly improves recommendation accuracy compared with the standard algorithms and largely alleviates the data sparsity issue.
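The co-training loop described for SSEF can be illustrated with a minimal sketch. It is not the thesis's implementation. All names (MeanRecommender, FactorRecommender, confident_labels, co_train) and the toy ratings are hypothetical. The two base recommenders are simple stand-ins: a user/item mean predictor in place of the merged neighborhood predictors, and a small matrix factorization trained by SGD in place of the latent factor model. The confidence rule is one possible reading of "validating the influence of the pseudo-labeled examples on the labeled ones": a pseudo-label is kept only if adding it lowers the squared error on the labeled ratings. The final blend is a plain average, used here as a placeholder.

```python
import numpy as np

class MeanRecommender:
    """Stand-in for the merged neighborhood predictors (assumption)."""
    def fit(self, triples):
        self.mu = float(np.mean([r for _, _, r in triples]))
        self.by_user, self.by_item = {}, {}
        for u, i, r in triples:
            self.by_user.setdefault(u, []).append(r)
            self.by_item.setdefault(i, []).append(r)
        return self

    def predict(self, u, i):
        # Average of the user's mean rating and the item's mean rating.
        pu = np.mean(self.by_user[u]) if u in self.by_user else self.mu
        pi = np.mean(self.by_item[i]) if i in self.by_item else self.mu
        return float(0.5 * (pu + pi))

class FactorRecommender:
    """Stand-in latent factor model: matrix factorization via SGD (assumption)."""
    def __init__(self, n_users, n_items, k=4, epochs=30, lr=0.02, reg=0.05, seed=0):
        self.shape, self.k, self.epochs = (n_users, n_items), k, epochs
        self.lr, self.reg, self.seed = lr, reg, seed

    def fit(self, triples):
        rng = np.random.default_rng(self.seed)  # fixed seed: retraining is comparable
        self.mu = float(np.mean([r for _, _, r in triples]))
        self.P = 0.1 * rng.standard_normal((self.shape[0], self.k))
        self.Q = 0.1 * rng.standard_normal((self.shape[1], self.k))
        for _ in range(self.epochs):
            for u, i, r in triples:
                err = r - (self.mu + self.P[u] @ self.Q[i])
                pu = self.P[u].copy()
                self.P[u] += self.lr * (err * self.Q[i] - self.reg * pu)
                self.Q[i] += self.lr * (err * pu - self.reg * self.Q[i])
        return self

    def predict(self, u, i):
        return float(self.mu + self.P[u] @ self.Q[i])

def sq_error(model, triples):
    return sum((r - model.predict(u, i)) ** 2 for u, i, r in triples)

def confident_labels(make_model, train, labeled, pool, top_n):
    """Pseudo-label pool pairs; keep those that reduce error on labeled data."""
    base = make_model().fit(train)
    base_err = sq_error(base, labeled)
    scored = []
    for u, i in pool:
        r_hat = base.predict(u, i)
        trial = make_model().fit(train + [(u, i, r_hat)])
        gain = base_err - sq_error(trial, labeled)
        if gain > 0:  # the pseudo-label helped on the labeled ratings
            scored.append((gain, (u, i, r_hat)))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [example for _, example in scored[:top_n]]

def co_train(make_a, make_b, labeled, unlabeled, rounds=3, top_n=2):
    train_a, train_b, pool = list(labeled), list(labeled), list(unlabeled)
    for _ in range(rounds):
        for_b = confident_labels(make_a, train_a, labeled, pool, top_n)  # A teaches B
        for_a = confident_labels(make_b, train_b, labeled, pool, top_n)  # B teaches A
        if not for_a and not for_b:
            break
        train_a += for_a
        train_b += for_b
        used = {(u, i) for u, i, _ in for_a + for_b}
        pool = [p for p in pool if p not in used]
    model_a, model_b = make_a().fit(train_a), make_b().fit(train_b)
    # Placeholder blend: simple average of the two enhanced recommenders.
    return lambda u, i: 0.5 * (model_a.predict(u, i) + model_b.predict(u, i))

# Toy data (hypothetical): (user, item, rating) triples and unrated pairs.
labeled = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0),
           (2, 1, 2.0), (2, 2, 5.0), (3, 0, 4.0), (3, 3, 2.0)]
unlabeled = [(0, 2), (1, 1), (2, 0), (3, 1), (0, 3)]
predict = co_train(MeanRecommender, lambda: FactorRecommender(4, 4), labeled, unlabeled)
```

Retraining a model for every candidate pair is affordable only on toy data; a practical system would presumably restrict the check to the ratings near the candidate user and item, or update the model incrementally.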

Keywords/Search Tags:

Recommendation System, Semi-Supervised Learning, Co-training, Ensemble Learning, Unlabeled Data

Related items

1	A Study On Learning From Positive And Unlabeled Examples
2	Research On Semi-supervised Learning Algorithms Based On Ensemble Learning
3	Research On Multi-label Learning Algorithms With Ensemble Learning
4	Semi-supervised Learning Based On Information Theory And Functional Dependency Rules Of Probability
5	Biomedical Entity Relation Extraction Based On Semi-supervised Learning And Deep Learning
6	Research On Machine Learning Methods That Exploit Unlabeled Data
7	Research On Semi-Supervised Support Vector Machine Learning Methods
8	Semi-Supervised Learning Based On Ensemble Algorithm
9	Research On Semi-supervised Classification Algorithm Based On Integrated Neural Network
10	Research On Several Algorithms And Theories In Diversity-Based Semi-Supervised Learning