| With the emergence and spread of the Internet, people's demand for information has largely been met. However, as the Internet has developed rapidly, users are surrounded by so much information that they can no longer locate what interests them accurately, a problem known as information overload. Search engines (SE) such as Google and Baidu address part of this problem, but a search engine returns the same results to every user who enters the same query. Because all users are treated identically, they cannot obtain personalized results, and search engines alone cannot solve information overload effectively. A recommender system (RS) considers not only the user's input but also other information about the user (such as ratings), and therefore produces personalized results for different users, which makes it a better way to address the problem. The core of a recommender system is its recommendation algorithm. Collaborative filtering, the algorithm most widely used in engineering practice, relies on a similarity measure to find the target user's nearest neighbors. All existing similarity measures depend on co-rated items, so they cannot achieve satisfactory results on sparse data, where few items are co-rated. The massive scale of data in engineering practice also limits scalability, since the time cost increases as the data grows. To solve these problems, this paper proposes a collaborative filtering algorithm based on the Bhattacharyya coefficient to address sparsity, and integrates it with a clustering algorithm that pre-processes the users before collaborative filtering is executed, which improves the algorithm's scalability. The main contents of this paper are as follows. First, the Bhattacharyya coefficient is used to address the sparsity problem: it overcomes the weakness of traditional similarity 
measures, which rely too heavily on co-rated items. Second, an improved k-means algorithm is used to improve scalability. Traditional k-means is difficult to apply to sparse data; to make it applicable, this paper uses the number of rated items to optimize the distance measure and the silhouette coefficient to choose the initial cluster centers. The improved k-means is then combined with the improved collaborative filtering algorithm to improve scalability. Third, experiments verify the validity of the algorithm. Two experiments are conducted: the first verifies that the collaborative filtering algorithm based on the Bhattacharyya coefficient alleviates the sparsity problem, and the second verifies that combining k-means with Bhattacharyya-based collaborative filtering addresses the scalability problem. The experimental results show that the proposed algorithm greatly reduces the dependence on co-rated items, handles the sparsity problem well, and improves scalability. |
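
To illustrate why a Bhattacharyya-based measure does not need co-rated items, the following is a minimal sketch, not the exact formulation used in this paper. It assumes a 1 to 5 rating scale and compares two items through their rating distributions, BC(p, q) = Σ_h sqrt(p_h · q_h). The names `RATING_SCALE`, `rating_distribution`, `bhattacharyya_coefficient`, and the toy rating lists are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

# Assumption: ratings are integers on a 1-5 scale.
RATING_SCALE = (1, 2, 3, 4, 5)


def rating_distribution(ratings, scale=RATING_SCALE):
    """Return the fraction of ratings at each value of the scale."""
    total = len(ratings)
    if total == 0:
        return [0.0 for _ in scale]
    counts = Counter(ratings)
    return [counts[value] / total for value in scale]


def bhattacharyya_coefficient(ratings_a, ratings_b, scale=RATING_SCALE):
    """Bhattacharyya coefficient between two rating distributions.

    The ratings may come from entirely different users, so no
    co-rated entries are required.
    """
    p = rating_distribution(ratings_a, scale)
    q = rating_distribution(ratings_b, scale)
    return sum(sqrt(p_h * q_h) for p_h, q_h in zip(p, q))


# Hypothetical toy data: each list holds all ratings one item received.
item_x = [5, 5, 4, 5]
item_y = [4, 5, 5]   # rated by a different set of users than item_x
item_z = [1, 2, 1, 1]

similar = bhattacharyya_coefficient(item_x, item_y)
dissimilar = bhattacharyya_coefficient(item_x, item_z)
```

Here `similar` is close to 1 because both items are rated mostly 4 and 5, while `dissimilar` is 0 because the two distributions do not overlap. A measure such as Pearson correlation or cosine similarity would be undefined for these pairs if no user had rated both items.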