I was tired of seeing the majority of my posts’ comments feeds show up in Google’s Supplemental Index, so I changed all the individual posts’ comments RSS links to rel="nofollow". This should at least stop Googlebot from passing PageRank through those links, but what I really want is for Googlebot to stop spidering the individual posts’ comment feeds entirely, in hopes that they’ll eventually be dropped from the index. To see only those pages of a site that are in the Supplemental Index, use this neat little search trick:
site:DOMAIN.com *** -view

For example, to see which pages of Ardamis.com are in the SI, I’d search for:

site:ardamis.com *** -view

This is much easier than the old way of scanning all of the indexed pages and picking them out by hand.
To change all the individual posts’ comments feed links to rel="nofollow", open ‘wp-includes/feed-functions.php’ and add rel="nofollow" to the echo statement at line 84 (in WordPress version 2.0.6), like so:
echo "<a href=\"$url\" rel=\"nofollow\">$link_text</a>";
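For context, the edited line sits inside the function that prints the per-post comments feed link. The sketch below is my approximation of that function after the edit; the exact signature and surrounding code in your copy of WordPress 2.0.6 may differ slightly, so treat it as a guide rather than a verbatim excerpt:

```php
<?php
// Rough sketch of comments_rss_link() in wp-includes/feed-functions.php
// after the edit; only the echo line is the actual change.
function comments_rss_link($link_text = 'Comments RSS', $commentsrssfilename = '') {
	// comments_rss() builds the URL of the current post's comments feed.
	$url = comments_rss($commentsrssfilename);
	// Escaped double quotes keep the string valid PHP;
	// rel="nofollow" is the attribute being added.
	echo "<a href=\"$url\" rel=\"nofollow\">$link_text</a>";
}
```

The backslash escapes matter: without them, the inner double quotes terminate the PHP string early and the line won’t parse.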
One could use the robots.txt file to disallow Googlebot from all /feed/ URLs, but this would also restrict it from the general site’s feed and the all-inclusive /comments/feed/, and I’d like both of these feeds to continue to be spidered. Another, minor consequence of using robots.txt to restrict Googlebot is that Google Sitemaps will warn you of "URLs restricted by robots.txt".
To deny all spiders from the /feed/ directory, add the following to your robots.txt file:

User-agent: *
Disallow: /feed/
To deny just Googlebot from the /feed/ directory, use:

User-agent: Googlebot
Disallow: /feed/
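One caveat worth knowing: standard robots.txt rules are simple prefix matches, so Disallow: /feed/ only blocks URLs whose paths begin with /feed/ and won’t touch the per-post feeds at addresses like /2007/01/some-post/feed/. Googlebot (though not every spider) also understands the * wildcard and the Allow directive, so a rule set that blocks only the nested comment feeds while keeping /feed/ and /comments/feed/ crawlable might look like the sketch below. This is an assumption to verify with Google’s robots.txt analysis tool, not a guaranteed recipe:

```
User-agent: Googlebot
# Allow is a Google extension; the more specific match wins,
# so the site-wide comments feed stays crawlable.
Allow: /comments/feed/
# The * wildcard (also a Google extension) matches the nearly
# empty per-post feeds, e.g. /2007/01/some-post/feed/.
Disallow: /*/feed/
```

Since the general site feed at /feed/ doesn’t match /*/feed/, it needs no Allow rule of its own.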
For whatever reason, the whole-site comments feed at http://www.ardamis.com/comments/feed/ does not appear among my indexed pages, while the nearly empty individual post feeds are indexed. Also, the general site feed at http://www.ardamis.com/feed/ is in the Supplemental Index. It’s a mystery to me why.