Sina Micro-blog Users' Feature Analysis Based On SSLOK-means Clustering Improved By Calinski-Harabasz

Posted on: 2020-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Tang
Full Text: PDF
GTID: 2428330575473823
Subject: International business
Abstract/Summary:
As a mainstream social network platform, Sina Micro-blog is also a main channel for releasing all kinds of information. Because it is real-time, open, and concise, it has attracted a huge user base, and its user activity leads the domestic social network platforms. The data generated by users on the platform accumulates continuously, and the resulting social big data can support business decision-making. The same volume of data also causes information overload: faced with complicated data, users find it increasingly difficult to locate information and content that match their interests and preferences, which greatly reduces the efficiency of big data utilization and harms the user experience. The key to easing information overload is to analyze user characteristics from microblog data and then provide users with high-quality personalized recommendations.

Data mining technology emerged to make effective use of massive data. Clustering, as a data mining technique, has been widely used in social networks and gives microblog operators new methods and ideas for analyzing user data. K-means is one of the most commonly used clustering methods, but its clustering efficiency is low on large-scale data. The recently proposed SSLOK-means clustering algorithm addresses this defect of K-means. However, it still requires the number of clusters to be set manually in advance, which makes the algorithm less convenient to use. The Calinski-Harabasz (CH) validity function removes the need to fix the value of K in advance.

This paper applies the SSLOK-means clustering algorithm, improved with the Calinski-Harabasz function, to Sina Micro-blog user data. It analyzes user characteristics, segments users, discovers different interest groups, discusses the characteristics of each type of user, and proposes corresponding personalized recommendations. The micro-blog user data is obtained with a web crawler and then preprocessed, standardized, and structured. The improved algorithm is then used to cluster the prepared user data. Six interest groups with clearly similar characteristics are found: balanced interests, literature and art, sports, fashion, IT, and career. Corresponding personalized recommendation suggestions are put forward for each type of interest group.

The research supports the optimization of personalized microblog services and the growth of marketing revenue, and offers a reference for other social networking platforms, so it has both theoretical significance and practical value. For micro-blog operators, high-quality personalized recommendation can increase users' dependence on, activity on, and stickiness to the platform, and further increase platform dividends, which strengthens the platform's competitiveness and improves its economic benefits and reputation. For merchants, the platform can provide user characteristic information, which offers new ideas for efficient and accurate advertising. For users, receiving recommendations that meet their own information needs saves search time, improves the efficiency of using the platform, and enhances the user experience. Analyzing the characteristics of Sina Micro-blog users is therefore of guiding significance for grasping user interest preferences, improving products and services, and providing more accurate personalized services, both for this platform and for other social network platforms.
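The idea of letting the Calinski-Harabasz function choose K can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not describe SSLOK-means, so plain K-means with a farthest-first initialization is used here as a stand-in, and the function names (`kmeans`, `calinski_harabasz`, `choose_k`) and the toy data are assumptions made for illustration. The score used is the standard CH index, CH(k) = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster and W the within-cluster sum of squares; the K with the highest score is kept.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def init_centers(points, k):
    """Farthest-first initialization (an assumption, chosen to be deterministic)."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means, a stand-in for SSLOK-means. Returns cluster labels."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster goes empty
                centers[j] = mean(members)
    return labels


def calinski_harabasz(points, labels):
    """Standard CH index: (B / (k - 1)) / (W / (n - k)); higher is better."""
    n, clusters = len(points), sorted(set(labels))
    k = len(clusters)
    if k < 2 or n <= k:
        return 0.0  # the index is undefined here; treat as worst score
    overall = mean(points)
    between = within = 0.0
    for j in clusters:
        members = [p for p, lab in zip(points, labels) if lab == j]
        center = mean(members)
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(p, center) for p in members)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def choose_k(points, k_values):
    """Cluster once per candidate K and return the K with the highest CH score."""
    scores = {k: calinski_harabasz(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): three well-separated groups of five 2-D points.
offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
blobs = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]
best_k, scores = choose_k(blobs, range(2, 7))
print(best_k, scores)
```

On this toy data the CH score peaks at K = 3, matching the three groups that were constructed. With real user data the feature vectors would first be standardized, as the abstract describes, and the candidate range for K is a choice left to the analyst.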
Keywords/Search Tags:Sina Micro-blog, Users' feature analysis, Semi-supervised clustering, CH function