Font Size: a A A

Research On Recommendation Algorithm Based On Constant Complexity Distance Function

Posted on:2018-03-18Degree:MasterType:Thesis
Country:ChinaCandidate:M ZhengFull Text:PDF
GTID:2358330515953946Subject:Software engineering
Abstract/Summary:PDF Full Text Request
The recommended system is to analyze the user's historical behavior,take the initiative to recommend to the user they may be interested in the information.It can greatly shorten the user screening information time,which is widely used in e-commerce,film and video networks,social networking and other fields.There are many methods of recommendation system,the most widely used is the collaborative filtering recommendation algorithm.The core idea of the algorithm is to find objects that are similar to the target hobbies as the neighbors of the sample,and then to predict the behavior of the target by analyzing the behavior of its neighbors.A large number of distance metric functions can be used to find the target neighbors,such as cosine similarity,Pearson correlation coefficient,Euclidean distance.However,the calculation process of these distance metric functions is quite complex,when the data scale is large,the distance calculation will be very time-consuming.This paper proposes a collaborative filtering recommendation algorithm MBR(M-distance based recommend)based on the average distance.The algorithm first defines a new distance function M-distance,the complexity of the function is only a constant.M-distance takes the average score difference between the object and the sample as the distance.When the average score of the object and the sample is known,the time complexity of calculating the distance between the two is only O(1).Secondly,the method of finding neighbors by radius is put forward.After calculating the distance between the objects,the kNN algorithm takes the nearest k object as the neighbor of the sample,but the method of setting the number of neighbors in advance is not flexible enough.The neighborhood of radius is used to determine the neighbor region of the sample by the average rating and 8.When the distance between the object and the sample is in the neighborhood domain,then the object is chosen as the neighbor to be selected.The number of neighbors chosen by such a method depends entirely on the degree of similarity between the object and the sample.The higher the similarity is,the more neighbors are chosen and the more accurate the prediction is.Then the method of scoring prediction is defined.When all the neighbors of the sample are selected,the behavior of the user can be predicted based on the user's behavior towards the neighbor.When there is a neighbor,the average score of the neighbor is taken as the final predicted value of the sample,and the average score of the user's sample is taken as the final predictor when there is no neighbor.Finally,a recommended threshold is defined on the basis of the MBR recommendation algorithm.When the calculated value of the MBR algorithm is greater than the recommended threshold,the sample is recommended to the user,and vice versa.Threshold is mainly determined by the false recommendation rate and the cost of false recommendation.This article selects four commonly used public data sets:MovieLens,DouBan,EachMovie,Netflix to test.This paper mainly compares the advantages of MBR recommendation algorithm and kNN and Slope One algorithm in operating efficiency and prediction accuracy.Through a large number of experiments we can see MBR recommendation algorithm to ensure the accuracy of the premise of greatly improving the recommended efficiency,especially in large data sets on the performance is more prominent.And the optimal threshold is typically between 3.4 and 3.5.
Keywords/Search Tags:Distance measure, neighborhood, computational complexity, recommender systems
PDF Full Text Request
Related items