
Research And Implementation Of The Techniques Of Hotspot Analysis For Network Opinions

Posted on: 2019-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Shen
Full Text: PDF
GTID: 2428330545464758
Subject: Computer technology
Abstract/Summary:
With the advent of the "Internet Plus" era and the rapid development of Internet technology, the Internet has become the main channel through which people obtain information. Because network information spreads quickly and in large volume, finding hot topics in network public opinion rapidly and accurately has become an active research problem in data mining and natural language processing. Based on an analysis of hotspot analysis techniques for network public opinion, we designed and implemented a hotspot analysis system for network opinions. The system has five parts: user login, system management, public opinion data collection, hot public opinion analysis, and visualization of the results. Of these five parts, hot public opinion analysis is the core functional module. It covers public opinion text preprocessing, public opinion topic discovery, topic heat evaluation, and topic keyword extraction.

For feature selection during text preprocessing, this thesis adopts a method based on word association similarity to select the feature words of public opinion texts. The method uses the Synonym Word Forest semantic dictionary to calculate the semantic similarity between words, identifies the synonyms in the feature set, and then merges the synonyms by combining their weights. For public opinion topic discovery, a hierarchical clustering method based on a similarity threshold is adopted. The method first calculates the similarity between texts using cosine similarity, then builds a similarity distribution curve from those values and uses the curve to calculate a minimum threshold. If the similarity between texts is greater than the minimum threshold, they are merged into the same cluster, and the resulting clusters are the discovered topics.

A heat factor analysis method is used to calculate the heat value of each topic. The method evaluates topic heat from three factors: the number of reports related to the topic, the concentration of those reports, and the number of sources publishing them. For hot keyword extraction, this thesis adopts a method based on mining frequent feature word sets. The method uses the FP-growth algorithm to mine the frequent feature word sets, calculates the TF*IDF values of the items in those sets, and sorts the items by TF*IDF value to extract the hot keywords.

Three performance indicators, accuracy, recall rate, and F value, are used to evaluate the system on topic discovery and hot keyword extraction. The effectiveness of the methods selected for the system is verified by comparing their evaluation results with those of other methods.
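
The topic discovery step described above (cosine similarity between texts, then merging whenever the similarity exceeds a minimum threshold) can be illustrated with a minimal Python sketch. This is not the thesis implementation. It assumes that texts are already tokenized into feature words, that raw term counts serve as weights, and that clusters are compared by their summed term vectors. The function names `cosine_similarity` and `cluster_by_threshold` are hypothetical, and the threshold is passed in as a parameter rather than derived from a similarity distribution curve, since the abstract does not give that procedure.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_threshold(docs, threshold):
    """Agglomerative clustering with a similarity cut-off.

    docs: list of token lists (assumed already preprocessed).
    threshold: assumed to be supplied by the caller.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = [[i] for i in range(len(docs))]
    # Each cluster is represented by the sum of its members' term counts.
    # Cosine similarity ignores vector length, so the sum behaves like a mean.
    centroids = [Counter(d) for d in docs]

    while len(clusters) > 1:
        best, pair = threshold, None
        # Find the most similar pair of clusters above the threshold.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine_similarity(centroids[i], centroids[j])
                if s > best:
                    best, pair = s, (i, j)
        if pair is None:
            break  # no remaining pair exceeds the threshold
        i, j = pair
        clusters[i].extend(clusters[j])
        centroids[i] = centroids[i] + centroids[j]
        del clusters[j]
        del centroids[j]
    return clusters


if __name__ == "__main__":
    docs = [
        ["flood", "rescue", "city"],
        ["flood", "rescue", "river"],
        ["stock", "market", "rally"],
        ["stock", "market", "drop"],
    ]
    print(cluster_by_threshold(docs, 0.5))
```

A higher threshold yields more, smaller topics, and a lower one merges more texts together. The pairwise search makes each merge step quadratic in the number of clusters, which is acceptable for a sketch but would need optimization for a large collection.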
Keywords/Search Tags: Public Opinion Analysis, Hot Topic Discovery, Hot Keyword Extraction