What Is A Citation-Based Ranking System?
First, let me tell you what it isn’t. It is not a voting system. I’ve been guilty in the past of comparing links to votes, but it simply isn’t so. For one thing, in most voting systems, votes are considered equal. Links are far from being equal. In addition, when someone votes for someone, they are indicating their support. When someone links to a page, it can be for a variety of reasons: to express displeasure, to point out an error of fact, to draw a comparison, because they’ve been paid, to provide another resource, etc.
The citation system for search is based on academic peer-reviewed papers. The ‘importance’ of a peer-reviewed paper is often equated to the number of citations the paper receives. For instance, it is difficult to write about PageRank without citing ‘Bringing Order To The Web’. It is a seminal paper and cited often. However, it is important to realize that every citation the paper receives isn’t a vote, or an indicator of trust in the veracity of the document. Papers are often cited by others because the paper contains an error of fact, or because the author wants to draw a comparison or simply wants to challenge the document within their own paper.
With peer-reviewed papers, the citation system works extremely well. The most important papers receive numerous citations in other works and the ‘importance’ of the paper is thus established. Google simply took this system and applied it to links.
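To make the difference between counting citations and weighing them concrete, here is a minimal sketch in Python. It is a simplified, PageRank-style calculation, not Google’s actual implementation. The page names, the damping value of 0.85 and the iteration count are all assumptions chosen for illustration.

```python
def inbound_counts(links):
    """Count raw inbound links per page (the naive 'one link, one vote' view)."""
    counts = {}
    for source, targets in links.items():
        counts.setdefault(source, 0)
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank-style score; links maps a page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "d" and "f" each receive exactly one link,
# but "d" is linked from "c", which is itself linked from two pages.
example_links = {"a": ["c"], "b": ["c"], "c": ["d"], "e": ["f"]}
example_counts = inbound_counts(example_links)
example_ranks = pagerank(example_links)
```

In this toy graph, "d" and "f" have the same inbound link count, yet "d" ends up with the higher score because its single link comes from a better-cited page. That is the sense in which links are far from equal.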
The shortcomings of a citation-based system applied to links become apparent when ‘importance’ is involved. Determining the importance of a Britney Spears site using links as the determining factor is an exercise in futility. Further, within the academic community it would, or at least should, be anathema to cite a paper in return for payment. Additionally, the citation-based system is notoriously incestuous. Citations are rarely made outside the writers’ fields of expertise. In short, physicists don’t often cite papers on biochemistry and economic analysts don’t cite papers on theology. For this reason, citations in the academic world have an intrinsic value because educated opinions are being traded.
When the citation system is applied to the Web, several problems occur. The first is the assumed importance of off-topic links. A link from a PR8 authority page on linguistics, for example, should be considered very important if the link points to a page on a site about neologisms. However, that same link pointing to, say, a site about quilt making should carry very little, if any, importance. This is the reason SEOs recommend getting on-topic links. This assumes some contextual and/or link analysis is being performed. Given that assumption, it then becomes difficult to explain why Google Bombs are so effective.
One of the solutions to this problem is cluster analysis. When the Web map is analyzed, it becomes apparent that links are traded frequently between sites that share topics. Take a quick look at SEO blogs and you’ll notice the incestuous link patterns that appear. The problem then becomes determining the breadth of topics that are related. Is marketing related to SEO? What about copywriting? Or direct marketing? How closely is SEO related to SEM?
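As a rough illustration of on-topic versus off-topic links, the sketch below scales the value of a link by how much the two pages overlap in topic. This is a toy model under stated assumptions, not a description of how any search engine works. The term sets, the score of 8 (standing in for a PR8 page) and the use of Jaccard similarity as the topical measure are all hypothetical.

```python
def jaccard(terms_a, terms_b):
    """Share of terms the two pages have in common (0.0 = none, 1.0 = identical)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def link_value(source_score, source_terms, target_terms):
    """Assumed model: a link passes on the source score scaled by topical overlap."""
    return source_score * jaccard(source_terms, target_terms)


# Hypothetical topic terms for the three kinds of page mentioned above.
linguistics = {"language", "words", "grammar", "etymology"}
neologisms = {"language", "words", "etymology", "coinage"}
quilting = {"fabric", "stitching", "patterns"}

on_topic_value = link_value(8, linguistics, neologisms)
off_topic_value = link_value(8, linguistics, quilting)
```

Under this model the link to the neologisms page keeps most of its value, while the link to the quilting page is worth nothing. A real system would need a far richer notion of topic than shared keywords.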
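One very simple form of cluster analysis is to group sites that link to each other in both directions. The sketch below does only that, using a hypothetical set of sites. It says nothing about how broad a topic is, which is the harder question, and real clustering would also take content into account.

```python
from collections import defaultdict


def reciprocal_clusters(links):
    """Group sites that are connected through two-way (traded) links."""
    neighbors = defaultdict(set)
    for source, targets in links.items():
        for target in targets:
            # Keep only links that are returned by the target site.
            if source in links.get(target, ()):
                neighbors[source].add(target)
                neighbors[target].add(source)

    clusters, seen = [], set()
    for site in sorted(neighbors):
        if site in seen:
            continue
        # Walk the reciprocal links outward to collect one whole cluster.
        stack, cluster = [site], set()
        while stack:
            current = stack.pop()
            if current in cluster:
                continue
            cluster.add(current)
            stack.extend(neighbors[current] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters


# Hypothetical sites: three SEO blogs trading links, two quilting sites
# trading links, and a news site that receives a link but returns none.
example_sites = {
    "seo-blog-a": ["seo-blog-b", "news-site"],
    "seo-blog-b": ["seo-blog-a", "seo-blog-c"],
    "seo-blog-c": ["seo-blog-b"],
    "news-site": [],
    "quilt-a": ["quilt-b"],
    "quilt-b": ["quilt-a"],
}
example_clusters = reciprocal_clusters(example_sites)
```

Here the three SEO blogs fall into one cluster and the two quilting sites into another, while the news site belongs to neither because it does not link back.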
Another problem that exists in the citation-based system as applied to the Web is the popularity factor. Popularity is equated to importance. A charismatic blogger may receive many links but the importance of the blog content may be quite minimal.
And what about the ease with which links can be created? In an academic setting, a paper has to be written in order to cite another paper, and usually a great deal of preparation and work goes into writing one. After all, it will be reviewed by the author’s peers. Links do not require much effort to create.
And there’s the issue of the importance of links versus the number of links. I can guarantee you that a single link to my ‘Google Papers’ page, from Google’s homepage, with “Google Papers” in the anchor text, would assure my page the number one spot in Google, Yahoo and MSN for the query. But does that make my page the most important or worthy? Hardly.
Everything I’ve mentioned is obvious to SEOs and search engine engineers alike and you can be sure that the engineers are working on the problems. But in what direction are they moving? More refined contextual analysis? LSA? A better link value deprecation system? Better clustering analysis? Pattern analysis? All of the above? Where do you think search is headed?