
Research On The Detection Platform Of Sensitive Topic In Internet-Mediated Public Sentiment

Posted on: 2010-07-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Feng
Full Text: PDF
GTID: 2178360275473245
Subject: Communication and Information System
Abstract/Summary:
The internet is an important communication channel, and the information it carries and transmits, especially on sensitive topics, strongly influences the formation and dissemination of public opinion and poses a latent security threat that is hard to estimate. Technology for proactively detecting sensitive topics is therefore urgently needed. The Detection Platform of Sensitive Topic in Internet-mediated Public Sentiment brings together the main techniques of network information analysis and processing, performs word segmentation and structured storage of the processed network information, and detects sensitive topics in internet-mediated public sentiment. This thesis designs and implements the detection platform based on matching segmentation results against sensitive words. For the word segmentation module, the thesis proposes an approach to Chinese lexical analysis using a Cascaded Hidden Markov Model (CHMM), which aims to integrate Chinese word segmentation, disambiguation, unknown word recognition, and part-of-speech tagging into one theoretical framework. The system manages sensitive words through a singly linked data structure and serialization, which ensures the integrity and transferability of the database. To detect sensitive topics, the system takes a retrospective approach: it matches the processed topics against the sensitive words, that is, it queries the data table of sensitive topics with the segmentation results and then identifies the sensitive topics. This method increases the efficiency of detection. Based on this work, the thesis makes a preliminary exploration of ways to improve the capability of the system, as follows. First, an experiment compares the recall and precision of segmentation using the Full Second-order Hidden Markov Model (FHMM2) and the Hidden Markov Model (HMM), and concludes that FHMM2 has a clear advantage in statistical effectiveness and accuracy. Second, as an improvement on existing segmentation dictionaries, the thesis puts forward a segmentation dictionary based on a Four-character Hash Mechanism. Third, to detect sensitive topics using semantic information, it presents a method for detecting sensitive topics and evaluating their sensitivity using Latent Semantic Indexing (LSI) and keywords. In summary, the thesis designs and implements the Detection Platform of Sensitive Topic in Internet-mediated Public Sentiment. In test runs in the laboratory environment and on the campus network, the system proved efficient and stable.
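The matching step described in the abstract (looking up segmentation results in a store of sensitive words) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names `detect_sensitive` and `is_sensitive_topic`, the use of an in-memory set in place of the database table, and the sample words are all assumptions made for illustration, and the input is assumed to be already segmented into tokens.

```python
from typing import Iterable, List, Set


def detect_sensitive(tokens: Iterable[str], sensitive_words: Set[str]) -> List[str]:
    """Return the sensitive words found among the tokens, in first-seen order.

    `tokens` stands in for the output of a word segmenter; `sensitive_words`
    stands in for the sensitive-word table (hypothetical, held in memory here).
    """
    hits: List[str] = []
    seen: Set[str] = set()
    for token in tokens:
        # Set membership is an average O(1) lookup, so one pass over the tokens suffices.
        if token in sensitive_words and token not in seen:
            seen.add(token)
            hits.append(token)
    return hits


def is_sensitive_topic(tokens: Iterable[str], sensitive_words: Set[str]) -> bool:
    """Flag a topic as sensitive if at least one token matches a sensitive word."""
    return len(detect_sensitive(tokens, sensitive_words)) > 0


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the thesis.
    words = {"riot", "leak"}
    topic = ["campus", "riot", "report", "riot", "leak"]
    print(detect_sensitive(topic, words))
    print(is_sensitive_topic(["weather", "forecast"], words))
```

A hash-based set keeps each lookup cheap regardless of how many sensitive words are stored, which matches the abstract's emphasis on detection efficiency, although the thesis itself may organize the lookup differently.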
Keywords/Search Tags:Internet-mediated Public Sentiment, Sensitive Topic, Chinese Segmentation, Cascaded Hidden Markov Model(CHMM), Full Second-order Hidden Markov Model(FHMM2), Latent Semantic Indexing(LSI)