Combating Web Spam Based On Both Trust And Distrust Propagation

Posted on:2012-12-26

Degree:Master

Type:Thesis

Country:China

Candidate:Y Wang

Full Text:PDF

GTID:2218330368988061

Subject:Computer application technology

Abstract/Summary:

With the rapid development of the World Wide Web, search engines have become the dominant way for people to find useful information on the Web. A higher ranking in search results brings more traffic, and more traffic means more profit for the owners of Web sites. This drives some Web site owners to manipulate the ranking results of search engines through unethical methods. This kind of unethical manipulation is termed Web spamming. Web spam not only wastes the resources of search engines but also degrades the user experience. Commercial search engines therefore have to take measures to eliminate the negative effects of spam.

Recently, anti-spam algorithms based on trust or distrust propagation have been widely used to combat Web spam. Compared with algorithms based on content or heuristic rules, they are more robust to attacks by spammers and more computationally efficient, because they deal only with page links. However, existing trust or distrust propagation algorithms all have two serious issues. On the one hand, trust/distrust is propagated in a non-differential way; that is, the propagation process treats authorities and spam pages alike. On the other hand, it has been noted that a combined use of good and bad seeds can lead to better results, but little work is known to have realized this insight successfully.

The TDR algorithm proposed in this paper takes the view that each Web page has both a trustworthy side and an untrustworthy side, and assigns two scores to each Web page: T-Rank, scoring its trustworthiness, and D-Rank, scoring its untrustworthiness. Starting from good and bad seeds, TDR simultaneously propagates T-Rank through links and D-Rank through inverse links. During propagation, the propagation of T-Rank (D-Rank) is penalized by the target's current D-Rank (T-Rank). In this way, both trust and distrust are propagated with target differentiation, and the two problems mentioned above are solved.

Experimental results on the WEBSPAM-UK2007 and ClueWeb09 datasets show that TDR outperforms other typical anti-spam algorithms under various criteria.
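
The propagation scheme described above can be illustrated with a small sketch. The abstract does not give the update formulas, so everything concrete here is an assumption made for illustration: the function name `tdr_sketch`, the damping factor `alpha`, the fixed iteration count, the even split of a score among a page's neighbors, and in particular the penalty factors `1 / (1 + d[p])` and `1 / (1 + t[p])`. It is not the thesis's actual TDR definition, only one way the idea of penalizing incoming trust by the target's distrust (and vice versa) might look.

```python
from typing import Dict, List, Set, Tuple


def tdr_sketch(
    links: Dict[str, List[str]],
    good_seeds: Set[str],
    bad_seeds: Set[str],
    alpha: float = 0.85,      # assumed damping factor, not from the thesis
    iterations: int = 20,     # assumed fixed number of rounds
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (T-Rank, D-Rank) dictionaries for a toy link graph."""
    # Collect every page that appears as a source, a target, or a seed.
    nodes = set(links) | set(good_seeds) | set(bad_seeds)
    for targets in links.values():
        nodes.update(targets)

    out_links = {u: list(links.get(u, [])) for u in nodes}
    in_links: Dict[str, List[str]] = {u: [] for u in nodes}
    for u, targets in out_links.items():
        for v in targets:
            in_links[v].append(u)

    # Seed vectors: uniform mass on good seeds (trust) and bad seeds (distrust).
    t_seed = {u: 1.0 / len(good_seeds) if u in good_seeds else 0.0 for u in nodes}
    d_seed = {u: 1.0 / len(bad_seeds) if u in bad_seeds else 0.0 for u in nodes}
    t, d = dict(t_seed), dict(d_seed)

    for _ in range(iterations):
        new_t, new_d = {}, {}
        for p in nodes:
            # Trust flows along links q -> p, split over q's out-links.
            trust_in = sum(t[q] / len(out_links[q]) for q in in_links[p])
            # Distrust flows along inverse links: from r back to p for each p -> r,
            # split over r's in-links.
            distrust_in = sum(d[r] / len(in_links[r]) for r in out_links[p])
            # Assumed penalty: incoming trust shrinks with the target's current
            # D-Rank, and incoming distrust shrinks with its current T-Rank.
            new_t[p] = alpha * trust_in / (1.0 + d[p]) + (1.0 - alpha) * t_seed[p]
            new_d[p] = alpha * distrust_in / (1.0 + t[p]) + (1.0 - alpha) * d_seed[p]
        t, d = new_t, new_d
    return t, d
```

For example, calling `tdr_sketch({"g": ["a", "m"], "m": ["s"]}, {"g"}, {"s"})` gives page `m`, which links to the bad seed, a lower T-Rank than page `a`, even though both receive the same share of trust from the good seed `g`.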

Keywords/Search Tags:

Web Spam, Trust Propagation, Distrust Propagation

Related items

1	Research On Personalized Recommendation Of Trust Propagation Model Under Social Network
2	The Recommendation System Of A Distrust Propagation Algorithm Based On The Breadth First Traversal
3	Combating Search Engine Spam Using Community Discovery
4	Trust Semirings Model Based On Similarity For Trust Propagation
5	Combating Link Spam Using Limited Label Propagation
6	The hybrid model of trust and distrust: Extending the nomological network
7	Research Of Subjective Information Propagation In Social Network
8	Research On Secure Network Coding Method Based On Trust Propagation
9	A Research On Trust And Distrust-based Collaborative Filtering Recommendation Model
10	Research On Personalized Pre-trust Based Trust Management