When the original PageRank algorithm was conceived, it was built around a mathematical formalization called the random walk, or RW.
Treating an ordinary surfer as a random walker, the model assumed that an individual would start on a randomly chosen page and then move on to further pages by following the hyperlinks on each page they landed on. By the time the process ended, the visitor would have surfed all the pages.
Out of all the pages, the most visited ones would get the highest rank, since they are the ones with the most incoming links.
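The random-surfer idea above can be sketched in a few lines of Python. This is only a minimal illustration, not Google's implementation: the function name `pagerank`, the four-page `toy_web` and the iteration count are made up for the example, and the 0.85 damping value (the chance that the surfer follows a link instead of jumping to a random page) is the commonly cited default, assumed here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate how often a random surfer lands on each page.

    links maps each page to the list of pages it links to.
    damping is the assumed probability of following a link
    rather than jumping to a random page.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal ranks

    for _ in range(iterations):
        # Every page receives the share coming from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Each link on the page is treated as equally likely.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page without links passes its rank to all pages evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C has the most incoming links.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web, C comes out on top because three pages link to it, while D, which nobody links to, ends up last. Note that the sketch splits a page's rank evenly across its links, which is exactly the kind of simplification the rest of this post questions.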
The problems arise because an ordinary surfer doesn’t follow the random walk formulation, so the assumptions behind PageRank might not hold for a surfer in the real world.
Bill Slawski has written an intriguing post that highlights how the assumptions of PageRank are flawed and how the recent patent application for “User-sensitive pagerank” addresses some of them.
The flaws he highlights center on the following assumptions made by PageRank.
All the links are created equal: The problem here is that surfers don’t click on hyperlinks at random. We read the hyperlink, mentally weigh its value and only then click on it. If that is so, how can all the links be equal?
Bored Surfers Go to Random Pages: It is true that we sometimes get bored with a page and move on to another. But the next page is not picked at random; it is chosen carefully.
Bored Surfers Only Go to Trusted Pages: While it’s true that bored surfers think before moving on instead of going to random pages, the page they choose is not necessarily a trusted one.
Pages Change and Lose Value at Same Rates: Pages can lose value during their lifetime, but do they all lose it at the same rate? A myriad of factors govern each page’s popularity, and each page changes in value differently.
PageRank Calculations are reliable: This assumption concerns the “blocked” PageRank; however, the patent application suggests that these aggregations are not perfect.
The User-sensitive pagerank patent application would bring various aspects of user behavior into the calculation to produce a better PageRank. These aspects are Link Weight, Likelihood of Randomly Leaving to a New Page, and Satisfaction with Found Pages.
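To make the first two of these aspects concrete, here is a rough Python sketch of a random walk in which links carry different weights and each page has its own likelihood of the surfer leaving for a random page. This is not the method described in the patent application, whose details are not given here. The function `weighted_pagerank`, the weights, the `leave_prob` values and the three-page example are all invented for illustration, and Satisfaction with Found Pages is left out entirely.

```python
def weighted_pagerank(weighted_links, leave_prob, iterations=100):
    """Random-walk ranking with per-link weights and per-page leave rates.

    weighted_links maps each page to a dict of {target: weight}; a higher
    weight stands for a link that surfers are assumed to click more often.
    leave_prob maps each page to the assumed probability that a surfer
    abandons it for a random page instead of following one of its links.
    """
    pages = list(weighted_links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: 0.0 for page in pages}
        for page, outlinks in weighted_links.items():
            total_weight = sum(outlinks.values())
            # With no usable links the surfer can only jump away.
            jump = leave_prob[page] if total_weight > 0 else 1.0
            # Rank carried by surfers who leave for a random page.
            for target in pages:
                new_rank[target] += rank[page] * jump / n
            # Rank carried along links, in proportion to link weight.
            if total_weight > 0:
                stay = rank[page] * (1.0 - jump)
                for target, weight in outlinks.items():
                    new_rank[target] += stay * weight / total_weight
        rank = new_rank
    return rank


# Hypothetical data: A's link to B is assumed far more enticing than its link to C.
example_links = {
    "A": {"B": 9.0, "C": 1.0},
    "B": {"A": 1.0},
    "C": {"A": 1.0},
}
example_leave = {"A": 0.15, "B": 0.15, "C": 0.15}

scores = weighted_pagerank(example_links, example_leave)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

With equal weights, B and C would tie because each has a single incoming link from A. Once the link to B is weighted more heavily, B pulls ahead of C, which is the effect a link-weight signal is meant to capture.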
This patent application again underscores the need to design websites that entice visitors to spend more time on the site and explore it. Other aspects, such as usability, are important too. There have been many discussions about PageRank, and many people seem to think it needs to be replaced with something superior. It seems that we may have it already.