Study Of Internet Public Opinion Hot-Topic Discovery Based On Semantic

Posted on: 2013-09-13 | Degree: Master | Type: Thesis
Country: China | Candidate: T Tian | Full Text: PDF
GTID: 2268330425992625 | Subject: Computer application technology
Abstract/Summary:
With the continuous progress of society, the internet has gradually become another important platform on which people express public opinion. Online information spreads quickly and can be posted anonymously. Because of these properties, on the one hand the network has a supervisory effect on public opinion in society; on the other hand, online information can contain reactionary, superstitious, and obscene content, which threatens social stability and national security. It is therefore of vital significance to detect the hot topics that netizens pay attention to, and to help the government understand or guide the direction of online public opinion.

This thesis studies how to find hot topics of public concern in massive amounts of information by automated methods. Existing approaches to discovering online public opinion hotspots use traditional text clustering techniques, but these techniques do not consider the relations between texts. As a result, the clustering is inaccurate, which in turn degrades the hot-topic analysis.

Based on the current state of research on hot-topic discovery, this thesis follows the research route below. First, information collection and web page preprocessing: how to collect and extract web information from loose, unstructured data. The thesis uses web crawler technology, web page purification technology, and Chinese word segmentation technology to acquire and store network information automatically. Second, topic discovery: a text clustering algorithm groups the acquired and processed texts into sets of documents that each represent a different topic, forming the topic clusters. Statistical methods are then combined with the clustering to build a model for hot-topic discovery and to analyze how hot each topic is. Third, a new method: the thesis builds a model framework for hot-topic discovery based on semantic analysis, using knowledge from Semantic Web and ontology technology to improve each submodule of text clustering. Fourth, the traditional and improved methods are compared on experimental data in terms of precision and recall. Fifth, building on the auxiliary modules for information collection and preprocessing, the study of traditional hot-topic clustering methods, and an in-depth analysis of the relevant semantic knowledge, the thesis designs an online public opinion hot-topic detection system that implements the collection and analysis of public opinion.

This thesis mainly improves the performance of text clustering through semantic analysis, thereby improving the accuracy of topic classification and topic analysis, and demonstrates the feasibility of the scheme through experiments.
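The abstract does not name the specific clustering algorithm or hot-degree formula, so the following is only a minimal sketch of the traditional (non-semantic) baseline under stated assumptions: documents are already segmented into tokens, single-pass clustering with cosine similarity stands in for the text clustering step, and a topic's hot degree is approximated by the number of documents in its cluster. The function names, the threshold value, and the sample data are all hypothetical, and the semantic (HowNet-based) improvement that the thesis proposes is not modeled here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(docs, threshold=0.3):
    """Assign each tokenized document to the most similar existing cluster,
    or open a new cluster when no similarity reaches the threshold.
    (Assumed baseline; the thesis may use a different algorithm.)"""
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc index]}
    for i, tokens in enumerate(docs):
        vec = Counter(tokens)
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            best["centroid"].update(vec)  # accumulate term counts
            best["members"].append(i)
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return clusters


def rank_hot_topics(clusters):
    """Rank clusters by size, used here as a stand-in for hot degree."""
    return sorted(clusters, key=lambda c: len(c["members"]), reverse=True)


# Hypothetical, already-segmented documents.
docs = [
    ["flood", "rescue", "city"],
    ["flood", "city", "rain"],
    ["football", "match", "goal"],
    ["rescue", "flood", "rain"],
]
clusters = single_pass_cluster(docs)
ranked = rank_hot_topics(clusters)
for topic in ranked:
    print(len(topic["members"]), topic["centroid"].most_common(3))
```

Because this baseline compares only surface tokens, two documents that describe the same event with different words would not be grouped together; that is the kind of limitation the semantic analysis in the thesis is meant to address.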
Keywords/Search Tags: network public opinion, hot-topic discovery, text clustering, semantic, HowNet