Research On Hot Character And Event Analysis Techniques Oriented To Public Sentiment Monitoring

Posted on: 2013-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Sun
GTID: 2268330392967945
Subject: Computer Science and Technology
Abstract/Summary:
With the worldwide popularity of the Internet, the Internet has become the center of ideological culture and an amplifier of public opinion. Public opinion information reflects the state of the public mind, and studying it is extremely important given the powerful spread of Web 2.0. Faced with a flood of information updated daily, how to dig out hot news and public opinion trends efficiently and accurately has become an urgent problem. Generally speaking, the occurrence and development of events are related to characters, and the expansion of many hot events is influenced by characters. In this context, we take hot character analysis as a starting point, and find and analyze the events that happen to hot characters in order to grasp public opinion on the network. Centering on analysis techniques for hot characters and events, our study involves the following aspects:

(1) This paper presents a name recognition method based on combining the results of several lexical analysis tools, and a name disambiguation method based on the Lingo clustering strategy. We first use existing lexical analysis tools to mark names and integrate the results according to the maximum length principle. We also try several methods to remove noise names, and perform name disambiguation with the Lingo clustering algorithm. Experiments show that the integration strategy improves the recall of name recognition without reducing its precision, and that the noise reduction and name disambiguation methods meet the application requirements.

(2) This paper studies supervised character classification techniques and proposes a character classification method based on SVM. We first extract fixed-length text fragments that describe the character from the text, then use information gain to select useful attributes that represent the character, and finally use the SVM algorithm to classify the characters. Experiments show that this method can effectively predict the category of the character.

(3) We research a feature extraction technique based on the combination of information entropy and an emotional dictionary, and use it to analyze the sentiment of hot characters and events. Information entropy measures the distinguishing ability of the features, and the emotional dictionary solves the coverage problem. Features in this paper are extracted from the training set and the emotional dictionary separately. Features from the training set are related to the corpus or to a particular field. The emotional dictionary is universal and contains features that the training set does not contain. Experimental results show that integrating the features effectively improves the performance of event tendency analysis. Meanwhile, this paper attempts to cluster the candidate feature set using a synonym dictionary. Synonyms are mapped to one feature, which reduces the dimension of the vector space without losing semantic information and improves the accuracy of semantic similarity calculation. Adding the synonyms of features in the feature clustering process expands important features and improves feature recognition capability in event sentiment analysis.

(4) This paper proposes a hot character scheduling model oriented to public opinion monitoring. The model considers character exposure rate, hot degree trend and field weight to calculate a score, and then generates the hot character rank list. Character exposure rate is the number of news items and commentaries that contain the character in a single day; the hot degree trend is measured by a variant of the KL distance; the field weight is set according to the field's importance in monitoring public opinion about the character, and the field of a character can be predicted by automatic character classification. Experimental results show that the hot character scheduling model places the characters that matter for public opinion monitoring at the front of the hot character rank list.
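
The scoring model in item (4) can be illustrated with a minimal Python sketch. The abstract names the three inputs (exposure rate, hot degree trend, field weight) but does not give the formula that combines them or the exact KL variant, so everything below is an assumption made for illustration: the function names, the add-one smoothing, the use of a single signed KL term `p * log(p / q)` as the trend, and the product `weight * exposure * (1 + max(trend, 0))` as the score.

```python
import math

def mention_share(counts, name, vocab_size, alpha=1.0):
    """Smoothed share of one day's mentions that belong to `name` (assumed smoothing)."""
    total = sum(counts.values())
    return (counts.get(name, 0) + alpha) / (total + alpha * vocab_size)

def trend(today, yesterday, name, vocab_size):
    """Signed pointwise KL term p * log(p / q); positive when the share grows.

    This stands in for the thesis's KL variant, which the abstract does not define.
    """
    p = mention_share(today, name, vocab_size)
    q = mention_share(yesterday, name, vocab_size)
    return p * math.log(p / q)

def rank_hot_characters(today, yesterday, field_of, field_weight):
    """Return names sorted by a hypothetical score, highest first.

    today, yesterday: dicts mapping name -> number of news items and
                      commentaries mentioning that name on that day
    field_of:         dict mapping name -> predicted field (e.g. from a classifier)
    field_weight:     dict mapping field -> importance weight (default 1.0)
    """
    names = set(today) | set(yesterday)
    scores = {}
    for name in names:
        exposure = today.get(name, 0)                      # exposure rate
        t = trend(today, yesterday, name, len(names))      # hot degree trend
        weight = field_weight.get(field_of.get(name), 1.0) # field weight
        # Assumed combination rule: only a rising trend boosts the score.
        scores[name] = weight * exposure * (1.0 + max(t, 0.0))
    return sorted(scores, key=scores.get, reverse=True)
```

With this combination rule, a character whose share of daily mentions is rising outranks one with the same exposure and field weight whose share is flat or falling; a different rule would be needed to match the thesis's actual results.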
Keywords/Search Tags: public sentiment monitoring, person name recognition, character classification, trend analysis, hot character sorting