
The Research On IWOM Monitoring System Based On Web Text Mining

Posted on: 2011-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Sha
Full Text: PDF
GTID: 2178330332979606
Subject: Computer application technology
Abstract/Summary:

IWOM (Internet Word of Mouth) refers to opinions about companies, products, or services expressed through BBS forums, blogs, and other online channels. With the rapid development of internet technology, IWOM directly influences a company's credibility. In an increasingly complex network environment, IWOM monitoring has become an urgent task for governments and companies. Constructing an IWOM Monitoring System (IWMS) can help a company keep its IWOM under control.

This thesis studies IWOM monitoring technology based on web text mining and presents a design model for an IWMS. In this system, metasearch is used to collect IWOM data from the internet. Web page analysis, Chinese word segmentation, and feature extraction are then used to preprocess the IWOM data. Finally, web text mining is used to cluster the IWOM data and extract the orientation of each text.

The thesis first discusses IWOM data collection and reviews the development and classification of search engines. Based on a requirements analysis of the IWMS, metasearch is chosen to collect IWOM data.

It then discusses web page preprocessing, which comprises web page analysis, Chinese word segmentation, and feature extraction. Regular expressions are used to extract the content of web pages, ICTCLAS is used for Chinese word segmentation, and the TF-IDF algorithm is used to extract features.

The main focus of the thesis is text clustering and text orientation analysis. The K-Means algorithm is improved using the HowNet semantic model and a multiple-sampling method. A sentiment dictionary based on HowNet is used to measure the orientation of feature words and of whole texts.

Finally, a design model of the IWMS and an implementation of an IWMS prototype are presented.
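
The preprocessing and orientation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the whitespace-style tokenizer (a stand-in for a Chinese segmenter such as ICTCLAS), and the tiny polarity lexicon (a stand-in for a HowNet-based sentiment dictionary) are all assumptions made for illustration, and the improved K-Means clustering is not shown.

```python
import math
import re
from collections import Counter


def strip_tags(html):
    """Crude regex-based content extraction: drop scripts/styles, then all tags."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Hypothetical stand-in for a real word segmenter (e.g. ICTCLAS for Chinese)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf(docs_tokens):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs_tokens:
        term_freq = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights


def orientation(tokens, lexicon):
    """Sum the polarity of every token found in the lexicon (0 if unknown)."""
    return sum(lexicon.get(token, 0) for token in tokens)


if __name__ == "__main__":
    pages = [
        "<html><body><p>Good <b>phone</b>, good battery</p></body></html>",
        "<html><script>var x = 1;</script><p>Bad phone, terrible screen</p></html>",
    ]
    # Assumed toy lexicon; a real system would derive polarities from a dictionary.
    toy_lexicon = {"good": 1, "bad": -1, "terrible": -1}
    docs = [tokenize(strip_tags(page)) for page in pages]
    for tokens, weights in zip(docs, tfidf(docs)):
        top_terms = sorted(weights, key=weights.get, reverse=True)[:2]
        print(top_terms, orientation(tokens, toy_lexicon))
```

Terms that occur in every document (here "phone") receive a TF-IDF weight of zero, which is why they drop out as features, while the lexicon score gives a coarse positive or negative orientation for each text.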
Keywords/Search Tags: IWOM Monitoring, Feature Extraction, Text Clustering, Semantic Orientation Analysis