Monthly Archives: January 2007

I was tired of seeing the majority of my posts’ comments feeds show up in Google’s Supplemental Index, so I changed all the individual posts’ comments RSS links to rel="nofollow". This should at least stop Googlebot from passing PageRank through those links, but what I really want is for Googlebot to stop spidering the individual posts’ comment feeds, in the hope that they’ll eventually be removed from the index.

To see only those pages of a site that are in the Supplemental Index, use this neat little search feature: site:DOMAIN.com *** -view. For example, to see which pages of Ardamis.com are in the SI, I’d search for: site:ardamis.com *** -view. This is much easier than the old way of scanning all of the indexed pages and picking them out by hand.

To change all the individual posts’ comments feed links to rel="nofollow", open ‘wp-includes/feed-functions.php’ and add rel="nofollow" to line 84 (in WordPress version 2.0.6), like so:

echo "<a href=\"$url\" rel=\"nofollow\">$link_text</a>";

One could use the robots.txt file to disallow Googlebot from all /feed/ directories, but this would also restrict it from the general site’s feed and the all-inclusive /comments/feed/, and I’d like both of these feeds to continue to be spidered. Another, minor consequence of using robots.txt to restrict Googlebot is that Google Sitemaps will warn you of “URLs restricted by robots.txt”.

To deny all spiders from the /feed/ directory, add the following to your robots.txt file:

User-agent: *
Disallow: /feed/

To deny just Googlebot from the /feed/ directory, use:

User-agent: Googlebot
Disallow: /feed/
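
A note on how these rules match: under the original robots.txt convention, a Disallow value is compared against the start of the URL path, so it is worth checking which feed URLs a rule actually covers. The sketch below is a simplified illustration of that prefix matching in plain PHP. The function name is_disallowed() and the sample post path are made up for the example, and the sketch ignores the wildcard extensions that Googlebot understands.

```php
<?php
// Simplified sketch of robots.txt prefix matching (no wildcard support).
// is_disallowed() is a hypothetical helper, not part of WordPress.
function is_disallowed($path, array $disallow_rules) {
    foreach ($disallow_rules as $rule) {
        // An empty Disallow value blocks nothing.
        if ($rule !== '' && strpos($path, $rule) === 0) {
            return true;
        }
    }
    return false;
}

$rules = array('/feed/');

// Sample paths; the post slug is invented for illustration.
$paths = array('/feed/', '/comments/feed/', '/some-post/feed/');
foreach ($paths as $path) {
    echo $path . ' => ' . (is_disallowed($path, $rules) ? 'blocked' : 'allowed') . "\n";
}
```

Under this simplified model only the first path is blocked, which suggests that a crawler following plain prefix matching would treat Disallow: /feed/ as covering the top-level feed only. Googlebot also accepts * wildcards in Disallow values, so a pattern-based rule may be what is needed to reach the per-post feeds. Test any rule before relying on it.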

For whatever reason, the whole-site comments feed at //ardamis.com/comments/feed/ does not appear among my indexed pages, while the nearly empty individual post feeds are indexed. Also, the general site feed at //ardamis.com/feed/ is in the Supplemental Index. It’s a mystery to me why.

I’ll occasionally return to a post and revise it for improved methodology, test results or whatever. But the ‘posted on’ date always remains the same, even after the post has been updated. I feel that displaying only the ‘posted on’ date could be somewhat confusing, particularly when I state that I’ve updated the post in a comment dated months later. So, in the interest of full disclosure, I have added a few lines of code to the WordPress ‘single.php’ template file to supplement each post’s meta data with the date it was last modified. If the post has never been modified or if it was last modified within 24 hours of the ‘posted on’ date, only the ‘posted on’ date is shown.

Of course, this could be used anywhere inside the WordPress loop, not just in the meta data section. The code I use to show a WordPress post’s last modified date and time is as follows.

The default Kubrick template’s meta data section:

<p class="postmetadata alt">
<small>
This entry was posted 
...
on <?php the_time('l, F jS, Y') ?> 
at <?php the_time() ?>
and is filed under <?php the_category(', ') ?>.
You can follow any responses to 
this entry through the 
<?php comments_rss_link('RSS 2.0'); ?> feed. 

The new code, modified to selectively display the last modified date:

<p class="postmetadata alt">
<small>
This entry was posted
...
on <?php the_time('F jS, Y') ?> 
at <?php the_time() ?>
						
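<?php
// Compare the publish and last-modified times as Unix timestamps.
$u_time = get_the_time('U');
$u_modified_time = get_the_modified_time('U');
// Show the modified date only if it is at least 24 hours (86400 seconds) after publication.
if ($u_modified_time >= $u_time + 86400) {
    echo "and last modified on ";
    the_modified_time('F jS, Y');
    echo " at ";
    the_modified_time();
    echo ", ";
}
?>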
	
and is filed under <?php the_category(', ') ?>.
You can follow any responses to 
this entry through the 
<?php comments_rss_link('RSS 2.0'); ?> feed. 

You can see how this works in the meta data section of this post.

Further customization

I’m using a grace period of 24 hours from the time the post was published, but you could change this by replacing 86400 with however much time you want, specified in seconds.
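
To make the grace period easier to adjust, the comparison could be pulled into a small helper. This is only a sketch in plain PHP so that it runs outside WordPress. The function name show_modified_date() and its $grace_seconds parameter are hypothetical, and in a template the two timestamps would come from get_the_time('U') and get_the_modified_time('U') as in the code above.

```php
<?php
// Hypothetical helper: decide whether the "last modified" date should be shown.
// $posted and $modified are Unix timestamps; $grace_seconds defaults to 24 hours.
function show_modified_date($posted, $modified, $grace_seconds = 86400) {
    return $modified >= $posted + $grace_seconds;
}

$posted = 1167609600; // arbitrary example timestamp for the original post time

// Modified 1 hour later: inside the default 24-hour grace period.
var_dump(show_modified_date($posted, $posted + 3600));

// Modified 3 days later: outside the grace period.
var_dump(show_modified_date($posted, $posted + 3 * 86400));

// The same 1-hour edit checked against a 30-minute grace period (30 * 60 seconds).
var_dump(show_modified_date($posted, $posted + 3600, 30 * 60));
```

Writing the period as a product such as 30 * 60 or 7 * 86400 keeps the unit obvious, which may be easier to maintain than a bare number of seconds.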