Research And Application Of Multi-View Clustering Method Of News Data Mining

Posted on: 2022-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Hu
Full Text: PDF
GTID: 2518306575463674
Subject: Software engineering
Abstract/Summary:
Cluster analysis of multi-view news data can quickly extract valuable information and is useful in public opinion analysis, personalized news recommendation, sentiment analysis, early warning, and other fields. Current multi-view clustering of news data has two problems: (1) The multimedia content of news, such as text, pictures, audio, and video, is described by content concepts at different semantic levels and different granularities. Treating all views equally seriously degrades data mining performance. (2) Most existing methods have been evaluated under the assumption that every view is complete. In real applications it is often difficult to obtain every view of the news data, so traditional methods cannot cluster incomplete multi-view data well. To address these problems, this thesis proposes two methods: "Multi-view clustering of mixed-granularity news data" and "Incomplete multi-view clustering of news data". The main research work of this thesis is as follows:

1. Multi-view clustering of mixed-granularity news data. Existing clustering methods for news data do not consider that views differ in granularity and importance, so the existing methods are optimized. First, feature selection unifies the mixed-granularity features of each view to the same label granularity. Then the views are weighted by entropy and fused, which reduces the influence of views whose feature space does not separate the classes clearly. Finally, k-means is used to cluster the multi-view news data.

2. Multi-view clustering of incomplete news data. Most existing clustering methods assume that each view is complete and cannot handle incomplete view data, so the existing methods are improved. Building on the multi-view clustering of mixed-granularity news data, the similarity matrix of the incomplete multi-view data is calculated. Nonnegative matrix factorization is then used to obtain a subspace matrix of the original data similarity matrix. To reduce the impact of inaccurate filling, the method unifies the filling of missing view data with the clustering process and iteratively updates the missing view data from the other observable views.

3. News hotspot discovery based on multi-view clustering. Hot topic discovery on network news data must cope with news that spreads across many platforms, in varied content types, in large volume, and at high speed. This thesis applies the incomplete multi-view clustering method (IMVCN) to real hot topic discovery: it clusters the network news data and effectively mines the hot topics of the news.
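The entropy-weighted fusion and k-means steps of the first method can be illustrated with a minimal sketch. The abstract does not give the thesis's formulas, so everything below is an assumption made for illustration: the function names, the use of soft cluster assignments to measure how unclear a view is, the weight rule (one minus normalized entropy), and the synthetic two-view data. The feature selection step is omitted.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def standardize(X):
    """Zero mean and unit variance per feature."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

def view_entropy(Z, k, seed=0):
    """Mean normalized entropy of soft assignments in one view (k >= 2).

    Values near 0 mean crisp clusters, values near 1 mean unclear ones.
    Assumption: soft assignments are a softmax over scaled squared distances.
    """
    _, centers = kmeans(Z, k, seed=seed)
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / d2.mean()
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    ent = -(p * np.log(p + 1e-12)).sum(axis=1)
    return float(ent.mean() / np.log(k))

def fuse_and_cluster(views, k, seed=0):
    """Weight each view by (1 - entropy), fuse by scaled concatenation, cluster."""
    Zs = [standardize(X) for X in views]
    clarity = np.array([1.0 - view_entropy(Z, k, seed) for Z in Zs])
    clarity = np.clip(clarity, 1e-12, None)
    weights = clarity / clarity.sum()
    # Scaling by sqrt(w / dim) makes squared distances in the fused space a
    # weighted sum of per-view mean squared distances.
    fused = np.hstack([np.sqrt(w / Z.shape[1]) * Z for w, Z in zip(weights, Zs)])
    labels, _ = kmeans(fused, k, seed=seed)
    return weights, labels

# Synthetic demo (hypothetical data): one informative view, one pure-noise view.
rng = np.random.default_rng(42)
truth = np.repeat([0, 1], 100)
text_view = rng.normal(size=(200, 5)) + 4.0 * truth[:, None]
image_view = rng.normal(size=(200, 5))
weights, labels = fuse_and_cluster([text_view, image_view], k=2)
print("view weights:", weights)
```

In this sketch the noise view produces near-uniform soft assignments, so its entropy is high and its weight is small. This is one possible reading of "reducing the effect of views whose classification is not clear", not necessarily the weighting the thesis uses.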
Keywords/Search Tags: multi-view clustering, news data, mixed-granularity, view missing