Design And Implementation Of Topic Discovery Sub System For Stock System

Posted on: 2016-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: S H Zhou
Full Text: PDF
GTID: 2308330479991527
Subject: Software engineering
Abstract/Summary:
As the amount of data on the internet grows ever larger, the age of big data is arriving, and the computing field is shifting from IT (information technology) to DT (data technology). Big data analysis is increasingly used to handle and improve everyday tasks, making people's lives more convenient and more intelligent.

The traditional approach to topic detection first filters the data with rules, then has editorial staff read the filtered data and pick out the hot topics among it. This approach consumes a large amount of human resources and updates information slowly. As data volumes keep growing, it is reaching its bottleneck. To address this problem, this paper presents an automated topic detection method based on big data analysis.

First, a web spider is implemented to collect financial news pages from the internet. Page extraction is then used to preprocess the collected pages, and the extracted output is clustered so that pages covering the same topic fall into the same cluster. Finally, the hot topics are identified by analyzing the clustered data. The method draws on web spiders, page extraction, LDA clustering, distributed computing, and naive Bayes.

With the adoption of this method, the company's human resources have been saved and the time needed for topic detection has become shorter. These results indicate that the method is useful.
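The extract, cluster, and rank steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say how LDA is trained, so collapsed Gibbs sampling is assumed here, and the names `extract_tokens`, `lda_gibbs`, and `hot_topics`, the hyperparameters, and the sample pages are all hypothetical. Crawling, naive Bayes, and distributed execution are left out, and the tokenizer only handles lowercase English words.

```python
import random
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def extract_tokens(html):
    """Page extraction stand-in: strip markup, return lowercase word tokens."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z]+", " ".join(parser.parts).lower())


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """LDA fitted by collapsed Gibbs sampling (assumed training method)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics
    assignments = []
    for d, doc in enumerate(docs):  # random initial topic for every token
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]  # remove the token's current assignment
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z  # record the resampled topic
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def hot_topics(doc_topic, topic_word, top_words=3):
    """Cluster each page by its dominant topic; rank topics by page count."""
    dominant = [max(range(len(row)), key=row.__getitem__) for row in doc_topic]
    sizes = Counter(dominant)
    return [
        (k, size, [w for w, c in topic_word[k].most_common(top_words) if c > 0])
        for k, size in sizes.most_common()
    ]


# Hypothetical sample pages standing in for crawled financial news.
pages = [
    "<html><body><p>Bank raises interest rates again</p><script>var a;</script></body></html>",
    "<html><body><p>Interest rates rise as bank tightens policy</p></body></html>",
    "<html><body><p>Oil prices fall on weak demand</p></body></html>",
    "<html><body><p>Weak demand pushes oil prices lower</p></body></html>",
]
docs = [extract_tokens(page) for page in pages]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
for topic, size, words in hot_topics(doc_topic, topic_word):
    print(topic, size, words)
```

Ranking clusters by page count is only one plausible reading of "analyzing the clustered data"; a production system would likely also weigh recency and source, and would need stop-word removal and language-appropriate tokenization before clustering.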
Keywords/Search Tags: naive Bayes, LDA, topic detection, web spider