
The Research Of Distributed Topic Detection Method

Posted on: 2013-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Xiao
Full Text: PDF
GTID: 2268330392469041
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid progress of social media, people interact with the Internet more frequently and easily than before, and Internet resources are growing explosively. This abundance makes resources more accessible but also makes useful ones harder to find. To address this problem, this thesis studies distributed topic detection, focusing on improving both the precision and the speed of topic detection. The main contributions are summarized as follows:

First, a preliminary survey covers the origin of topic detection and its related algorithms, and the original Single-Pass algorithm is implemented.

Second, this thesis proposes a two-pass topic detection method named Double-Pass, based on Single-Pass. The aim of the algorithm is to let the cluster information from the first detection pass guide the second. To strengthen the cluster information, the CFC-DP algorithm is proposed by applying CFC theory, which has been used for classification in other work and is applied to topic detection here. Experiments show that Double-Pass topic detection achieves a higher F value than previous methods, and that CFC-DP in turn outperforms Double-Pass.

Third, to improve the speed of topic detection and to handle large volumes of data, this thesis proposes a distributed topic detection method based on Hadoop and defines the operations for task decomposition and composition. Experiments show that distributed topic detection greatly improves speed while maintaining a good F value.

Finally, based on the above work, a prototype distributed topic detection system is designed and implemented. The prototype comprises five modules: data reading, data preprocessing, topic detection, distributed processing, and topic storage and display.
Keywords/Search Tags: topic detection, distributed, Hadoop
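
For readers unfamiliar with the Single-Pass algorithm that the abstract takes as its baseline, the general idea can be sketched as below. This is a minimal illustration of textbook single-pass clustering, not the thesis's implementation: the whitespace tokenizer, the raw term-count vectors, the cosine similarity measure, the default threshold of 0.3, and the function names `cosine` and `single_pass` are all assumptions made for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.3):
    """Assign each document, in arrival order, to the most similar existing
    cluster, or open a new cluster when no similarity reaches the threshold.

    The threshold value is an illustrative assumption, not a tuned setting.
    """
    centroids = []  # one summed term-count vector per cluster
    labels = []     # cluster index assigned to each document
    for text in docs:
        vec = Counter(text.lower().split())
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            # Fold the document into the matching cluster's vector.
            centroids[best].update(vec)
            labels.append(best)
        else:
            # No cluster is close enough: this document starts a new topic.
            centroids.append(vec)
            labels.append(len(centroids) - 1)
    return labels
```

Each document is compared once against the clusters seen so far, so the result depends on arrival order and on the threshold. That order sensitivity is a commonly cited weakness of single-pass clustering, and a second pass guided by the first pass's clusters, as the abstract describes for Double-Pass, is one way to address it.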