Combating Link Spam Using Limited Label Propagation

Posted on: 2014-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: N Mu
Full Text: PDF
GTID: 2248330395999158
Subject: Computer application technology
Abstract/Summary:
Web users today rely mainly on search engines to obtain information from the Internet. However, the development of search engines is challenged by search engine spam. One general definition of search engine spam is "the use of some purposely designed mechanisms to raise the ranking of web sites or pages in the search engine results". Websites that engage in search engine spam are called web spam. To achieve their goal, spammers carefully study ranking algorithms to find ways to exploit them. Search engine spam is therefore the most critical challenge facing search engines. If no action is taken, search results will be severely distorted and fewer people will trust search engines. Because spam techniques keep evolving and the volume of web data is enormous, anti-spam work is very difficult. Since search engines first emerged, people have proposed diverse approaches to cope with search engine spam.

This article describes common spam techniques by category, including content spam, link spam, and page-hiding spam. It also describes existing effective solutions to search engine spam. After analyzing the disadvantages of existing anti-spam algorithms based on label propagation, it proposes a limited label propagation method. The proposed algorithm limits the propagation of trust labels and distrust labels in order to improve on previous approaches.

Experimental results on a real data set indicate that the limited label propagation method is more effective than the baselines, and that it can significantly improve the quality of search engine spam detection.
Keywords: Web Spam, Label Propagation, Community Recognition
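
The abstract does not give the details of the limited label propagation method. As an illustration only, the Python sketch below assumes that "limited" means a bound on the number of hops a label may travel from its seed set, and that the label score is damped at every hop. The function names `limited_propagation` and `reverse_links`, the parameters `max_hops` and `decay`, and the toy link graph are all hypothetical and are not taken from the thesis. Trust is spread along outgoing links from known-good seeds and distrust along incoming links from known-spam seeds, in the manner of TrustRank-style and Anti-TrustRank-style schemes.

```python
from collections import deque


def limited_propagation(graph, seeds, max_hops, decay=0.5):
    """Spread a label score from `seeds` along the edges of `graph`.

    Assumption (not from the thesis): the "limit" is a hop bound, and the
    score is multiplied by `decay` at each hop. Nodes farther than
    `max_hops` from every seed receive no score at all.
    """
    scores = {seed: 1.0 for seed in seeds}
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # the label is not allowed to travel any farther
        passed = scores[node] * decay
        for neighbor in graph.get(node, ()):
            # Keep the strongest score seen so far for each node.
            if passed > scores.get(neighbor, 0.0):
                scores[neighbor] = passed
                frontier.append((neighbor, hops + 1))
    return scores


def reverse_links(out_links):
    """Build the incoming-link graph from an outgoing-link graph."""
    in_links = {}
    for source, targets in out_links.items():
        for target in targets:
            in_links.setdefault(target, []).append(source)
    return in_links


# Hypothetical toy web graph: page -> pages it links to.
out_links = {
    "good_seed": ["a"],
    "a": ["b"],
    "b": ["c"],
    "y": ["x"],
    "x": ["spam_seed"],
}

# Trust flows forward from good pages, at most 2 hops in this example.
trust = limited_propagation(out_links, ["good_seed"], max_hops=2)

# Distrust flows backward to pages that link to spam, at most 1 hop here.
distrust = limited_propagation(reverse_links(out_links), ["spam_seed"], max_hops=1)
```

In this toy graph, page `c` lies three hops from the trusted seed and page `y` lies two hops behind the spam seed, so neither receives a label under the chosen limits. A page with both a trust and a distrust score could then be classified by comparing the two, though how the thesis combines them is not stated in the abstract.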