| With the rapid development of the Internet and information technology, the amount of information presented to users online is growing explosively. This vast body of information meets users' diverse needs, but the information overload it brings has become an increasingly serious problem. Information retrieval and information filtering are the primary means of addressing information overload. Recommendation systems, a typical representative of information filtering technology, have in recent years become one of the most effective ways to address it. Since recommendation systems first appeared, many effective recommendation algorithms have been proposed for commercial recommendation systems. Collaborative filtering is one of the core recommendation algorithms and also the most widely used. Although it has received wide attention in both industry and academia, several problems remain to be solved. The traditional user-based collaborative filtering algorithm has the following shortcomings: data sparsity, poor scalability, and inaccurate similarity calculation. This paper studies these problems and proposes improvement measures for them; on the basis of these measures, an improved user-based collaborative filtering algorithm is proposed. The specific research work includes the following. Research on the problem of data sparsity. The sparse rating matrix with missing values is studied, and measures are taken to fill in the missing values. We put forward a random sampling method based on integrated user and item differences and a random filling method based on item similarity weighting. Compared with traditional filling methods, the experimental results show that the two proposed methods can effectively improve recommendation accuracy. Research on similarity computation between users. 
In this paper, the characteristics of similar samples in collaborative recommendation are studied. By improving the Pearson similarity formula, a similarity calculation method based on variable weights is put forward. In the experiments, this method performs better than both the traditional Pearson similarity formula and the modified Pearson correlation formula proposed by Herlocker. Research on differences in users' interests and on the scalability of the algorithm. The user-based collaborative filtering algorithm cannot distinguish a user's preferences across different styles of items, which seriously affects recommendation accuracy. To solve this problem, a nearest-neighbor search method based on item classification is proposed. Its principle is to partition the items and thereby segment the user-item rating matrix. The experimental results show that this measure can effectively improve recommendation accuracy. Proposal of an improved user-based collaborative filtering algorithm. We put forward an improved user-based collaborative filtering algorithm on the foundation of the three improvement measures above. The results of comparative experiments show that it can significantly improve recommendation accuracy. |
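
The abstract names the traditional Pearson similarity and the modified Pearson formula proposed by Herlocker as comparison baselines. The following is a minimal Python sketch of those two baselines and of a standard user-based prediction step; it is not the variable-weight method of this paper, whose formula is not given here. The function names, the toy `ratings` dictionary, the choice to compute means over co-rated items, and the threshold of 50 co-rated items (the value commonly associated with Herlocker's significance weighting) are assumptions made for illustration.

```python
import math


def pearson(ratings_u, ratings_v):
    """Return (Pearson correlation, number of co-rated items) for two users.

    Assumption: means are taken over the co-rated items only.
    """
    common = sorted(set(ratings_u) & set(ratings_v))
    n = len(common)
    if n < 2:
        return 0.0, n
    mean_u = sum(ratings_u[i] for i in common) / n
    mean_v = sum(ratings_v[i] for i in common) / n
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0, n
    return num / (den_u * den_v), n


def significance_weighted(ratings_u, ratings_v, threshold=50):
    """Pearson similarity scaled down when few items are co-rated.

    Assumption: the weight is min(n, threshold) / threshold.
    """
    sim, n = pearson(ratings_u, ratings_v)
    return sim * min(n, threshold) / threshold


def predict(all_ratings, target, item, k=2, threshold=50):
    """Predict the target user's rating of an item from the k most similar users."""
    target_ratings = all_ratings[target]
    mean_t = sum(target_ratings.values()) / len(target_ratings)
    neighbours = []
    for user, user_ratings in all_ratings.items():
        if user == target or item not in user_ratings:
            continue
        sim = significance_weighted(target_ratings, user_ratings, threshold)
        if sim > 0:  # keep positively correlated neighbours only
            neighbours.append((sim, user))
    neighbours.sort(reverse=True)
    top = neighbours[:k]
    if not top:
        return mean_t  # fall back to the user's own mean rating
    num = 0.0
    for sim, user in top:
        user_ratings = all_ratings[user]
        mean_n = sum(user_ratings.values()) / len(user_ratings)
        num += sim * (user_ratings[item] - mean_n)
    return mean_t + num / sum(sim for sim, _ in top)


# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 4},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
}
```

In this toy data, `predict(ratings, "alice", "d")` draws only on `bob`, because `carol` is negatively correlated with `alice` and is filtered out. The significance weight shrinks similarities computed from few co-rated items, which is the kind of inaccuracy in similarity calculation the abstract refers to.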