Scoring Prediction Based On Improved Multi-label Distribution Learning Algorithm

Posted on: 2020-01-20    Degree: Master    Type: Thesis
Country: China    Candidate: G X Lv
GTID: 2428330575477785    Subject: Computer software and theory
Abstract/Summary:
The emergence and spread of the Internet have made life more convenient, but its rapid development has also caused explosive growth in the amount of information online, which makes it harder for users to find what interests them. Recommendation systems arose to solve this problem: their role is to help users quickly locate information of interest within a mass of information.

Recommendation systems have many research directions. Broadly, they can be divided into recommendation based on collaborative filtering, content-based recommendation, hybrid recommendation, and recommendation based on score prediction. Each kind has advantages and disadvantages. Collaborative filtering and content-based recommendation share the cold start problem: when there are few users or products, recommendation quality is poor. Hybrid recommendation alleviates the cold start problem to some extent. Score-based recommendation generally studies the attributes of items and builds a rating prediction model for them with some algorithm, so as to make recommendations to users. Common score prediction algorithms include the average method, the neighborhood-based method, and the matrix factorization model. One problem with matrix factorization is that the rating matrix is very sparse, with more than 95% of its elements missing; another is its high computational complexity. The average method is relatively simple, but its recommendations may be unreasonable when the score distribution is sparse, that is, when a relatively large number of users give a product high scores and a relatively large number give it low scores.

Because of these problems with the average method and matrix factorization, we chose multi-label distribution learning for score prediction. Multi-label distribution learning handles the weakness of the average method well: a product rated mostly high and low is recommended ahead of a product rated mostly medium, because such a product is itself topical. Multi-label distribution learning is a new learning paradigm proposed in 2016, and research on algorithms for learning label distributions is still relatively scarce. Existing strategies fall into three kinds: problem transformation, algorithm adaptation, and specialized algorithms. The first two modify the problem or an existing algorithm to accommodate label distributions, whereas specialized algorithms are designed for label distributions directly and need no adaptation or transformation. Specialized algorithms also perform better than the other two kinds.

This thesis improves on the existing specialized algorithms. In multi-label distribution learning, each instance corresponds to a distribution. The method assumes a prediction model, uses the similarity or distance between the predicted distribution and the real distribution as the loss (objective) function, learns the prediction model from a data set, and then uses the model to predict the distribution of unseen instances. Existing multi-label distribution learning algorithms, however, do not take the characteristics of an instance's own distribution into account during prediction. Some distributions are quite flat, with nearly the same value for every label. Others are sparse, with large differences between label values: only a few labels have large values and most of the rest are small. For instances with sparse distributions, the objective function can be improved so that the predicted distribution is also sparse.

The first improvement adds a factor to the objective function that makes the predicted distribution sparser and more similar to the real distribution, thereby improving prediction accuracy. The improved algorithms are called P-IIS-LDL and P-BFGS-LDL; P-IIS-LDL improves on the original IIS-LDL, and P-BFGS-LDL improves on BFGS-LDL. Verification on the real distribution data sets Movie, JAFFE and SBU3DFE shows that the improved algorithms are more accurate to a certain extent.

The second improvement concerns the distance function. The original multi-label distribution learning algorithms use the K-L divergence, although many distance and similarity functions exist, and the D divergence has given better results in other experiments. We therefore replace the K-L divergence in the original algorithms with the D divergence. Tests on the same three data sets show that the resulting D-IIS-LDL and D-BFGS-LDL also improve accuracy to a certain extent.

Rating recommendation systems often take books or movies as examples, and the improved multi-label distribution learning is applied here to movie rating and book rating.
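
The learning scheme described in the abstract (assume a prediction model, then minimize a distance between predicted and real distributions) can be illustrated with a minimal sketch. It assumes a maximum-entropy (softmax) model and the K-L divergence as the loss, and it swaps in plain gradient descent for the IIS and BFGS optimizers named above. The function names, learning rate and step count are hypothetical, and the sparsity factor and D divergence proposed in the thesis are not specified in the abstract, so they are not implemented here.

```python
import numpy as np

def predict_distribution(theta, X):
    """Softmax model (an assumption): each output row is a label distribution."""
    scores = X @ theta
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    expd = np.exp(scores)
    return expd / expd.sum(axis=1, keepdims=True)

def kl_loss(theta, X, D, eps=1e-12):
    """Mean K-L divergence KL(D || P) between real (D) and predicted (P) rows."""
    P = predict_distribution(theta, X)
    return float(np.mean(np.sum(D * (np.log(D + eps) - np.log(P + eps)), axis=1)))

def kl_grad(theta, X, D):
    """Gradient of the mean K-L loss with respect to theta for the softmax model."""
    P = predict_distribution(theta, X)
    return X.T @ (P - D) / X.shape[0]

def fit(X, D, lr=0.5, steps=500):
    """Plain gradient descent, used here in place of IIS or BFGS."""
    theta = np.zeros((X.shape[1], D.shape[1]))
    for _ in range(steps):
        theta -= lr * kl_grad(theta, X, D)
    return theta

if __name__ == "__main__":
    # Synthetic data: 20 instances, 3 features, 5 rating levels.
    rng = np.random.default_rng(0)
    X = rng.random((20, 3))
    D = predict_distribution(rng.normal(size=(3, 5)), X)
    theta = fit(X, D)
    print("loss before:", kl_loss(np.zeros((3, 5)), X, D))
    print("loss after: ", kl_loss(theta, X, D))
```

In a rating setting, each row of `D` would hold the share of users who gave an item each score, so the model predicts a whole score distribution rather than a single average.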
Keywords: Machine Learning, Score Prediction, Multi-label Distribution, Sparseness of Data Distribution