Research On Hot Topic Discovery Of Network Education News Based On Distributed Framework

Posted on: 2019-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: D Liu
GTID: 2428330548483453
Subject: Software engineering
Abstract/Summary:
With growing public concern, education has become one of the hot topics in society. With the rapid development of the Internet and the arrival of the big data era, traditional topic detection methods can no longer quickly extract hot education topics from large volumes of news data. A distributed computing framework that processes data at high speed therefore has both theoretical value and practical significance for research on hot topics in education. This thesis studies a series of theories and algorithms for discovering hot topics in online education news. Preprocessed text data is obtained through secondary development of Nutch. Because the traditional vector space model does not account for semantic relations between terms, the topic model LDA, which can discover latent semantic relations, is applied to model the text data. A multi-level clustering method combining the Single-Pass and Chameleon algorithms is proposed to discover hot topics. To cope with the massive volume of news data on today's networks, the Hadoop distributed framework is used to implement the hot topic discovery methods and increase data processing speed. The effectiveness of the proposed method is verified through theoretical proofs and experiments on the efficiency of the multi-level clustering algorithm, its running time for different numbers of nodes and different data volumes, and the accuracy of the heat-value ranking.
Keywords/Search Tags: Hadoop distributed framework, Hot Topic, LDA, Multilevel clustering algorithm
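
The abstract names Single-Pass as the first stage of the multi-level clustering method but does not give its details. The following is a minimal sketch of a generic Single-Pass clustering step, not the thesis's implementation. It assumes each news document is already represented as a topic-distribution vector (for example, LDA output), and the cosine similarity measure, the `threshold` value, the running-mean centroid update, and the function names `cosine` and `single_pass` are all illustrative assumptions. The Chameleon refinement stage and the Hadoop parallelization are not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(vectors, threshold=0.8):
    """Assign each vector, in arrival order, to the most similar existing
    cluster if the similarity reaches `threshold`; otherwise start a new cluster.

    Returns a list of clusters, each a list of document indices.
    The threshold and centroid update rule are assumptions for illustration.
    """
    centroids = []  # one centroid vector per cluster
    members = []    # document indices per cluster
    for idx, vec in enumerate(vectors):
        best, best_sim = -1, -1.0
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = c, sim
        if best >= 0 and best_sim >= threshold:
            members[best].append(idx)
            n = len(members[best])
            # Update the centroid as the running mean of its members.
            centroids[best] = [
                (m * (n - 1) + v) / n for m, v in zip(centroids[best], vec)
            ]
        else:
            centroids.append(list(vec))
            members.append([idx])
    return members


# Hypothetical topic-distribution vectors for four documents.
docs = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
]
clusters = single_pass(docs, threshold=0.8)
```

Because each document is compared only against the current cluster centroids and is never revisited, the result depends on arrival order and on the chosen threshold. That sensitivity is one common reason to follow Single-Pass with a second, finer clustering stage, which is the role the abstract assigns to Chameleon.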