PDF FILE – Commerce gives rise to spam, which in turn gives rise to the need for spam detection. These are the folks that totally misunderstood what spam is.
Most if not all of the SEO-generated pages exist solely to (mis)lead a search engine into directing traffic towards the “optimized” site; in other words, the SEO-generated pages are intended only for the search engine, and are completely useless to human visitors.
It’s almost as if these guys had a clue, but deliberately decided to ignore it. And it’s from 2004, so it’s not like they couldn’t find legit SEOs to help them with their paper. Anyway, they go on to describe how ‘spam’ can be detected using statistical analysis.
PDF FILE – Trust dampening, trust splitting and my favorite, propagating distrust (doesn’t that sound like a government idea?), are all covered in this paper that explores increasing the effectiveness of TrustRank. The abstract is enlightening:
Web spamming describes behavior that attempts to deceive search engine’s ranking algorithms. TrustRank is a recent algorithm that can combat web spam by propagating trust among web pages. However, TrustRank propagates trust among web pages based on the number of outgoing links, which is also how PageRank propagates authority scores among Web pages. This type of propagation may be suited for propagating authority, but it is not optimal for calculating trust scores for demoting spam sites.
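For the curious, here is roughly what "propagating trust based on the number of outgoing links" looks like in code. This is only a minimal sketch of the basic seed-biased, PageRank-style form of TrustRank, not the improved variants the paper proposes. The function name `trust_rank`, the 0.85 damping value, and the toy link graph are all my own assumptions for illustration.

```python
def trust_rank(links, seeds, alpha=0.85, iterations=50):
    """Toy trust propagation (hypothetical helper, not the paper's code).

    links: dict mapping each page to the list of pages it links to.
    seeds: set of hand-picked trusted pages.
    alpha: assumed damping factor; 0.85 is just the usual PageRank default.
    """
    pages = set(links) | {q for outs in links.values() for q in outs}
    # All of the initial trust sits on the seed pages.
    seed_score = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_score)
    for _ in range(iterations):
        # Each round, a fraction of trust is re-injected at the seeds...
        nxt = {p: (1 - alpha) * seed_score[p] for p in pages}
        for page, outs in links.items():
            if not outs:
                continue
            # ...and every page splits the rest equally among its outgoing links.
            share = alpha * trust[page] / len(outs)
            for target in outs:
                nxt[target] += share
        trust = nxt
    return trust


# Made-up graph: "good" is the trusted seed, the two spam pages only link to each other.
example_links = {
    "good": ["a", "b"],
    "a": ["b"],
    "b": [],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
scores = trust_rank(example_links, {"good"})
```

In this toy graph the spam pages never receive any trust because nothing reachable from the seed links to them. The equal split per outgoing link is exactly the part the abstract calls suboptimal for trust.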
Something that ‘propagates distrust’ for me is a little thing called ‘nofollow’. It has quickly become the most abused attribute on the web, easily surpassing the lowly keyword-stuffed meta tag.
PDF FILE – You guessed it, a paper on how to make bombs. Link bombs, that is. Anyone hear Carnivore go by?
We analyze the recent phenomenon termed a Link Bomb, and investigate the optimal attack pattern for a group of web pages attempting to link bomb a specific web page. The typical modus operandi of a link bomb is to associate a particular page with a search text and then boost that page’s pagerank. (The attacking pages can only control their own content and outgoing links.) Thus, when a search is initiated with the text, a high prominence will be given to the attacked page.
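To see why pointing a bunch of outgoing links at one page boosts it, here is a small sketch using a plain textbook PageRank. It only models the rank-boosting half of a link bomb (the anchor-text association is left out), and it is not the optimal attack pattern the paper derives. The `pagerank` helper, the damping value, and the six-page toy web are assumptions made up for this example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on a dict of page -> outgoing links."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Pages with no outgoing links spread their rank evenly over everyone.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        nxt = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, outs in links.items():
            for target in outs:
                nxt[target] += damping * rank[page] / len(outs)
        rank = nxt
    return rank


# Hypothetical web: "target" and "rival" start out in identical positions.
web_before = {
    "hub": ["target", "rival"],
    "target": ["hub"],
    "rival": ["hub"],
    "att1": ["hub"],
    "att2": ["hub"],
    "att3": ["hub"],
}

# The attackers can only change their own outgoing links, so they aim them at the target.
web_after = dict(web_before)
for attacker in ("att1", "att2", "att3"):
    web_after[attacker] = ["target"]

before = pagerank(web_before)
after = pagerank(web_after)
```

Before the attack the target and its rival score the same; afterwards the target pulls ahead, even though the attackers touched nothing but their own links.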
Now, where’s the Google research paper that effectively details the ways to combat link bombs?