Spam, Damn Spam, and Statistics: Using statistical analysis to locate spam web pages (2004)

PDF FILE – Commerce gives rise to spam, which in turn gives rise to the need for spam detection. These are the folks who totally misunderstood what spam is.

Most if not all of the SEO-generated pages exist solely to (mis)lead a search engine into directing traffic towards the “optimized” site; in other words, the SEO-generated pages are intended only for the search engine, and are completely useless to human visitors.

It’s almost as if these guys had a clue, but deliberately decided to ignore it. And it’s from 2004, so it’s not like they couldn’t find legit SEOs to help them with their paper. Anyway, they go on to describe how ‘spam’ can be detected using statistical analysis.

Propagating Trust and Distrust to Demote Web Spam (2006)

PDF FILE – Trust dampening, trust splitting and my favorite, propagating distrust (doesn’t that sound like a government idea?), are all covered in this paper, which explores increasing the effectiveness of TrustRank. The abstract is enlightening:

Web spamming describes behavior that attempts to deceive search engine’s ranking algorithms. TrustRank is a recent algorithm that can combat web spam by propagating trust among web pages. However, TrustRank propagates trust among web pages based on the number of outgoing links, which is also how PageRank propagates authority scores among Web pages. This type of propagation may be suited for propagating authority, but it is not optimal for calculating trust scores for demoting spam sites.
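To make the abstract's point concrete, here is a rough Python sketch of trust being propagated from a hand-picked seed set, with each page's trust divided by its number of outgoing links. This is not the paper's code: the function name `propagate_trust`, the toy graph, the damping value and the iteration count are all my own assumptions, chosen only to illustrate the PageRank-style splitting the authors are criticizing.

```python
def propagate_trust(links, seeds, damping=0.85, iterations=20):
    """Illustrative TrustRank-style propagation (hypothetical helper).

    links: dict mapping a page to the list of pages it links to.
    seeds: set of pages assumed to be trustworthy.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Seed pages share the initial trust equally; every other page starts at 0.
    seed_score = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_score)
    for _ in range(iterations):
        # Each round, seeds are topped up with their share of (1 - damping).
        nxt = {p: (1 - damping) * seed_score[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # A page's trust is divided by its number of outgoing links,
            # the same way PageRank divides authority.
            share = damping * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Toy graph (made up): one trusted seed, and two pages nobody trusted links to.
toy_links = {
    "seed": ["a", "b"],
    "a": ["c"],
    "spam": ["spam2"],
}
toy_trust = propagate_trust(toy_links, {"seed"})
```

In this toy graph, "a" and "b" each get half of what the seed passes on, "c" gets a damped share of "a", and "spam" and "spam2" stay at zero because no trusted page links to them. Note that the seed's trust is cut in half simply because it links to two pages, which is the behavior the abstract questions.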

Something that ‘propagates distrust’ for me is a little thing called ‘nofollow’. It has quickly become the most abused link attribute on the web, easily surpassing the lowly keyword-stuffed meta tag.

An Analysis of Optimal Link Bombs

PDF FILE – You guessed it, a paper on how to make bombs. Link bombs, that is. Anyone hear Carnivore go by?

We analyze the recent phenomenon termed a Link Bomb, and investigate the optimal attack pattern for a group of web pages attempting to link bomb a specific web page. The typical modus operandi of a link bomb is to associate a particular page with a search text and then boost that page’s pagerank. (The attacking pages can only control their own content and outgoing links.) Thus, when a search is initiated with the text, a high prominence will be given to the attacked page.
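For a feel of the mechanics, here is a small Python sketch of the rank-boosting half of that description. It is only an illustration under my own assumptions: `pagerank` is a plain textbook power iteration, `link_bomb` is a hypothetical helper that rewrites the attackers' outgoing links (the only thing they control), and the graph is made up. It does not model the anchor text association, and it makes no claim to be the optimal attack pattern the paper derives.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict of page -> list of links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        nxt = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                nxt[target] += damping * rank[page] / len(targets)
        rank = nxt
    return rank


def link_bomb(links, attackers, target):
    """Return a copy of the graph where every attacker links only to target."""
    bombed = {page: list(targets) for page, targets in links.items()}
    for attacker in attackers:
        bombed[attacker] = [target]
    return bombed


# Made-up graph: three attacker pages that nobody links to.
web = {
    "home": ["news", "target"],
    "news": ["home"],
    "target": ["home"],
    "a1": ["home"],
    "a2": ["news"],
    "a3": ["home"],
}
before = pagerank(web)["target"]
after = pagerank(link_bomb(web, ["a1", "a2", "a3"], "target"))["target"]
```

Comparing `before` and `after` shows the target's score going up once the attackers point their links at it, even though none of them can touch the target page or anyone else's links.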

Now, where’s the Google research paper that effectively details the ways to combat link bombs?



