Research On Mixed Data Clustering Based On User’s Interest

Posted on: 2014-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: X K Weng
Full Text: PDF
GTID: 2268330401962381
Subject: Computer application technology
Abstract/Summary:
With the rapid development of technology, large amounts of data are being produced in many fields. E-commerce in particular generates a great deal of user information in the form of large-scale data sets with mixed attributes. Mining valuable knowledge and rules from such large-scale data is one of the active research topics in data mining. Clustering is an important data mining technique: it groups data by looking for similarities between data points and thereby uncovers implicit, useful information and knowledge. Because the role of the user has become far more prominent in the information age, finding the information related to user interests in large-scale data is a challenging task. This paper introduces user interest information into data processing when studying mixed data clustering, so that the clustering results can support information recommendation and the user's behavioral decisions. The main content of this paper covers the following three aspects:

(1) Given the increasingly important role of the user in the information age, user interest information is introduced into the clustering process, and a mixed data cluster labeling algorithm is proposed based on the concepts of the user's interest domain and the "data-user's interest domain" membership degree. The algorithm can effectively use small-scale user interest information to assign cluster labels to large-scale mixed data.

(2) Data labeling algorithms are limited in mixed data clustering because they assign only a single class label to each labeled data point. To overcome this limitation, the cluster labels in the UIMCL algorithm are adjusted through threshold control, which makes it possible to handle multi-label data samples. The results of multi-label clustering can be applied to e-commerce recommendation systems and decision-making to improve the user's behavior.

(3) User interest information is introduced into the distance metric for mixed data clustering, and the formula for computing the distance is modified accordingly. When the interest-based distance metric is used in different clustering algorithms, the same clustering quality as the original algorithm can be obtained. In addition, a distance metric based on user interest can reduce the dimensionality of the data and yields clustering results that better match what the user is interested in.

For mixed data clustering, the results of this paper help extend the use of user interest information in data analysis and processing, and provide a reference for further applying cluster analysis techniques in practical fields.
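The abstract gives no formulas, so the following Python sketch is only an illustration of the general ideas in aspects (2) and (3), not the UIMCL algorithm itself. It assumes a simple interest-weighted distance for mixed records (range-normalised difference for numeric attributes, 0/1 mismatch for categorical ones) and treats "membership" as one minus that distance. The function names, the weights, the example interest domains, and the threshold rule are all hypothetical.

```python
from typing import Dict, List, Sequence, Union

Value = Union[float, int, str]


def weighted_mixed_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    weights: Sequence[float],
    numeric_ranges: Dict[int, float],
) -> float:
    """Illustrative interest-weighted distance in [0, 1] for mixed records.

    Assumption: numeric attributes contribute a range-normalised absolute
    difference, categorical attributes contribute 0 (match) or 1 (mismatch).
    An attribute with weight 0 is skipped, which is one simple way user
    interest could reduce the effective dimensionality.
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one attribute needs a positive weight")
    dist = 0.0
    for i, (x, y, w) in enumerate(zip(a, b, weights)):
        if w == 0:
            continue  # attribute outside the user's interest
        if isinstance(x, str) or isinstance(y, str):
            d = 0.0 if x == y else 1.0
        else:
            span = numeric_ranges[i]
            d = min(abs(x - y) / span, 1.0) if span > 0 else 0.0
        dist += w * d
    return dist / total_weight


def multi_label(
    record: Sequence[Value],
    domains: Dict[str, Sequence[Value]],
    weights: Sequence[float],
    numeric_ranges: Dict[int, float],
    threshold: float,
) -> List[str]:
    """Assign every domain whose (hypothetical) membership >= threshold.

    Membership is assumed to be 1 - distance to a domain representative.
    If no domain passes the threshold, fall back to the single best one.
    """
    memberships = {
        name: 1.0 - weighted_mixed_distance(record, rep, weights, numeric_ranges)
        for name, rep in domains.items()
    }
    labels = [name for name, m in memberships.items() if m >= threshold]
    if not labels:
        labels = [max(memberships, key=memberships.get)]
    return labels


# Hypothetical example: attributes are (age, price, category).
weights = [0.0, 1.0, 2.0]            # the user does not care about age
numeric_ranges = {0: 50.0, 1: 100.0}  # assumed value ranges of numeric attributes
domains = {
    "books": (30, 20.0, "book"),
    "gadgets": (25, 80.0, "electronics"),
}
record = (60, 30.0, "book")

print(multi_label(record, domains, weights, numeric_ranges, threshold=0.5))
print(multi_label(record, domains, weights, numeric_ranges, threshold=0.1))
```

Lowering the threshold lets a record receive more than one label, which is the kind of threshold control the abstract describes. The actual membership definition and distance formula used in the thesis may differ.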
Keywords/Search Tags: Mixed data, Clustering, User’s interest domain, UIMCL algorithm, Distance metric