If you want to understand PageRank – really, really understand it – a new paper from David Austin will help you do just that. The trouble is, you’ll need to be comfortable with some fairly high-level math in order to get through the paper – it’s actually being featured by the American Mathematical Society.
To be honest, I can’t process all of it in what we’ll call a “timely fashion.” But Austin has augmented the equations with about 3,500 words of text, and those I can handle. He starts with the basics, writing, “Google’s PageRank algorithm assesses the importance of web pages without human evaluation of the content.”
Austin goes on to describe the “monthly popularity contest among all pages on the web to decide which pages are most important,” and here’s where matrices (not the automobiles, sadly) and all manner of other math tools come into play.
After working through several steps and arriving at an intermediate solution, the paper drops a disclaimer: “there can be no absolute measure of a page’s importance, only relative measures for comparing the importance of two pages through statements such as ‘Page A is twice as important as Page B.’”
In another blessedly understandable snippet of English, Austin concludes that we can “interpret a web page’s PageRank as the fraction of time that a random surfer spends on that web page.”
The mathematician is even kind enough to include an explanation. “This may make sense if you have ever surfed around for information about a topic you were unfamiliar with: if you follow links for a while, you find yourself coming back to some pages more often than others. Just as ‘All roads lead to Rome,’ these are typically more important pages.”
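For readers who think better in code than in matrices, here is a minimal sketch of the random-surfer idea in Python. It is not Austin's code or Google's implementation: the four-page link graph, the function name `pagerank`, the damping value of 0.85 (the chance the surfer follows a link rather than jumping to a random page) and the fixed iteration count are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate the fraction of time a random surfer spends on each page.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions, not values from the article.
    """
    pages = sorted(links)
    n = len(pages)
    # Start with the surfer equally likely to be on any page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links sends the surfer anywhere at random.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: most links point at C.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, page C comes out on top because three of the four pages link to it, and page D lands at the bottom because nothing links to it. The scores add up to 1, so only their ratios mean anything, which is the “relative measures” point from the paper.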
To get into the details of the article, however, you’ll need to set aside a couple of hours and possibly call the American Mathematical Society for help.
Doug is a staff writer for Murdok. Visit Murdok for the latest eBusiness news.