
Automatic Detection Of Pseudo And Valuable Information From Internet Review

Posted on: 2013-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: W J Chen
GTID: 2218330362459270
Subject: Computer application technology
Abstract/Summary:
The goal of this research is to develop a systematic set of key techniques for identifying and evaluating pseudo-text information, using the differences between the content of Internet reviews and related text fragments. We focus on detecting spam and valuable information in Chinese reviews. As Internet applications become increasingly popular, evaluative texts such as product reviews, forum posts, and blogs have become a valuable source of opinions on products, services, and events. Meanwhile, many kinds of review spam have spread widely. We study this problem across several Internet domains, including news reviews, forum posts, and e-commerce; in this sense, the task considered here is domain independent. By mining linguistic characteristics, we propose several novel and feasible techniques: part-of-speech and word-frequency features, sentence-level sentiment analysis, computation of positional mutual information between lexical items, and mining of subordinate and parallel relationships among keywords. We then construct a tree model based on the relationships among features, derive the granularity of each feature from its depth in the tree, and finally build a supervised classifier, a C4.5 decision tree, for the detection task. We report experimental results for the automatic detection of review spam and valuable information.
Keywords/Search Tags:Opinion spam, feature mining, sentiment analysis, spam detection
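
The abstract names C4.5 as the final classifier. As a rough illustration only, the sketch below computes the gain ratio, the split criterion C4.5 uses to choose which feature to branch on. The feature names (`has_sentiment`, `length_bucket`) and the toy labels are hypothetical and are not taken from the thesis; the thesis's actual features and data are not given in the abstract.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of a categorical feature, the split criterion used by C4.5."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one entry per review (not from the thesis).
labels = ["spam", "spam", "valuable", "valuable"]
has_sentiment = ["no", "no", "yes", "yes"]
length_bucket = ["short", "long", "short", "long"]

print(gain_ratio(has_sentiment, labels))
print(gain_ratio(length_bucket, labels))
```

In this toy setup the first feature separates the two classes perfectly while the second carries no information about the label, so a C4.5-style learner would branch on the first. A full C4.5 implementation also handles continuous attributes, missing values, and pruning, which this sketch omits.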