Friday, September 20, 2024

Googlebot, the “Spider of Doom”

A funny story is circulating in tech circles about how Googlebot inadvertently destroyed the database of a site, built on a content management system (CMS), that had taken months of work to build.

As the story goes, a web development firm was given a contract to rebuild an existing site using a CMS. As the client already had a site with a significant amount of content, they took it slow and fully populated the site with all the content from the previous site. When they had finally uploaded everything, they took the site live.

“Things went pretty well for a few days after going live. But, on day six, things went not-so-well: all of the content on the website had completely vanished and all pages led to the default “please enter content” page. Whoops.”

After painstaking investigation, Googlebot, the spider Google uses to find information on the web, was found to be the cause.

When one of the users entered information into the CMS (using copy and paste), he or she included an EDIT hyperlink that had been left in a multi-user document. On its own, this human error wouldn’t normally be a problem, because users are required to log in with a password before they can make changes.

“But, the CMS authentication subsystem didn’t take into account the sophisticated hacking techniques of Google’s spider. As it turns out, Google’s spider doesn’t use cookies, which means that it can easily bypass a check for the “isLoggedOn” cookie to be “false”. It also doesn’t pay attention to Javascript, which would normally prompt and redirect users who are not logged on. It does, however, follow every hyperlink on every page it finds, including those with “Delete Page” in the title.”

In short, Googlebot muscled its way into the CMS and followed the edit link. The rest was history, or at least that’s what became of months of work. Fortunately, a recent backup of the full site was available for uploading.
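The failure described in the quote can be illustrated with a small sketch. This is a reconstruction from the quoted description only, not the CMS's actual code: the function names and the `session_user` parameter are hypothetical, and only the “isLoggedOn” cookie name comes from the story. The first function models a check that denies access only when the cookie is explicitly “false”; the second shows one common safer approach, assuming the server keeps its own session state and accepts destructive actions only via POST, which crawlers following hyperlinks do not issue.

```python
from typing import Mapping, Optional


def flawed_is_allowed(cookies: Mapping[str, str]) -> bool:
    """Deny only when the cookie explicitly says "false".

    A client that sends no cookies at all (such as a crawler) has no
    "isLoggedOn" value, so the comparison passes and access is granted.
    """
    return cookies.get("isLoggedOn") != "false"


def safer_is_allowed(method: str, session_user: Optional[str]) -> bool:
    """Allow a destructive action only for POST requests from a known user.

    session_user stands in for a server-side session lookup (hypothetical);
    None means the server has no authenticated session for this request.
    """
    return method.upper() == "POST" and session_user is not None


if __name__ == "__main__":
    crawler_cookies: dict = {}  # a crawler that ignores cookies sends none
    print(flawed_is_allowed(crawler_cookies))  # the flawed check lets it in
    print(safer_is_allowed("GET", None))       # the safer check refuses
```

The design point is that the decision is made on the server from state the client cannot omit or forge simply by ignoring cookies or JavaScript, and that a plain link (a GET request) can never trigger a delete.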


Jim Hedger is the SEO Manager of StepForth Search Engine Placement Inc. Based in Victoria, BC, Canada, StepForth is the result of the consolidation of BraveArt Website Management, Promotion Experts, and Phoenix Creative Works, and has provided professional search engine placement and management services since 1997. http://www.stepforth.com/ Tel – 250-385-1190 Toll Free – 877-385-5526 Fax – 250-385-1198
