Blogspot.com and Googlebot

Q: Just how quickly does Googlebot visit a page after it is linked to from a post on blogspot.com?
A: Pretty darn quickly.

I am curious about just what data is being passed to pages by user-agents. To try to gather some of this data, I created a test page at Aleph Studios. This page records the keys and values of the $_GET, $_POST, and $_COOKIE arrays, along with the values that belong to the keys starting with HTTP_ from the $_SERVER array, and the value of $_SERVER[‘REQUEST_TIME’]. It bundles all this up into an email and sends it to me each time the page is accessed.

In order to get this page into Google, I posted a link to it on my throwaway blog at Aleph Studios on Blogger. The timestamp of that post is 9:13 pm. The email from the page as triggered by Googlebot is timestamped 9:14 pm. That’s pretty slick.

The HTTP headers sent by Googlebot and recorded by the page are:

HTTP_ACCEPT = */*
HTTP_ACCEPT_ENCODING = gzip,deflate
HTTP_CONNECTION = Keep-alive
HTTP_FROM = googlebot(at)googlebot.com
HTTP_HOST = alephstudios.com
HTTP_USER_AGENT = Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

The page’s filename, title tag, and H1 tag are all “WvVdWfwgMcmypDqv7t”, so it’ll be easy to find in Google later. As I’ve observed in the past, within 20 minutes, the page on blogspot.com containing the string shows up in Google for a search on the string.

New pages don’t really take twenty minutes to show up in Google, but I don’t check Google very quickly after hitting the Publish button. The fastest I’ve personally witnessed a new page on ardamis.com appearing in Google was under 4 minutes.