
Research Of Microblog Hot Words Based On New-word Identification And Time Period

Posted on: 2016-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: C Huang
Full Text: PDF
GTID: 2308330476953455
Subject: Electronic and communication engineering
Abstract/Summary:
Micro-blog hotspots are topics that attract a large number of micro-blog users within a given period of time. Research on micro-blog hotspots can help Internet users learn promptly what is popular and what is happening, help enterprises understand and evaluate their commercial reputation and the situation of their competitors, and give the government a clear view of public opinion. The study of micro-blog hotspots therefore has wide applicability and great value.
Because micro-blog texts are free in wording, not standardized in grammar, and full of the newest information, they are hard to analyze with traditional methods. In particular, traditional word-segmentation tools perform poorly on such texts. This paper adopts the Hadoop cloud computing platform for new-word identification in micro-blog texts, constructing new-word dictionaries and optimizing the word-segmentation results. It then finds hot words in micro-blog texts and divides them into groups according to the characteristics of their time periods. Finally, it judges the sentiment polarity of micro-blog texts using micro-blog expressions and a conditional random fields (CRFs) classifier.
The experiments show that the methods used in this paper achieve good results and are worth further study.
Keywords/Search Tags: hot-word identification, optimization in word-segmentation, Hadoop cloud computing platform, characteristics of time period, hotspot classification
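
The abstract does not say how hot words are detected or grouped by time period, so the following is only a minimal sketch of one possible approach, not the thesis's method. It assumes posts are already segmented into tokens and labeled with a period, and it flags a word as hot in a period when its count there clearly exceeds its average count in the other periods. The function name `hot_words_by_period` and the parameters `min_count` and `burst_ratio` are hypothetical.

```python
from collections import Counter, defaultdict


def hot_words_by_period(posts, min_count=2, burst_ratio=2.0):
    """Group candidate hot words by time period (illustrative heuristic only).

    posts: iterable of (period, tokens) pairs; tokens are assumed to be the
           output of a word-segmentation step that is not shown here.
    A word is flagged in a period when its count there is at least
    min_count and at least burst_ratio times its mean count over all
    other periods.
    """
    # Per-period word counts.
    counts = defaultdict(Counter)
    for period, tokens in posts:
        counts[period].update(tokens)

    periods = sorted(counts)
    hot = {}
    for p in periods:
        flagged = []
        for word, c in counts[p].items():
            if c < min_count:
                continue
            # Baseline: mean count of the same word in the other periods.
            others = [counts[q][word] for q in periods if q != p]
            baseline = sum(others) / len(others) if others else 0.0
            if c >= burst_ratio * baseline:
                flagged.append(word)
        hot[p] = sorted(flagged)
    return hot


if __name__ == "__main__":
    sample = [
        ("morning", ["traffic", "rain", "traffic"]),
        ("morning", ["traffic", "coffee"]),
        ("evening", ["concert", "concert", "rain"]),
        ("evening", ["concert", "traffic"]),
    ]
    print(hot_words_by_period(sample))
```

On a Hadoop cluster the per-period counting would naturally be split into a map step (emit each period and word pair) and a reduce step (sum the counts); the single-process version above only illustrates the grouping logic.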