Tag Archives: WordPress

In August, 2010, I described a simple method for dramatically reducing the number of spam comments that are submitted to a WordPress blog. The spam comments are rejected before they are checked by Akismet, so they never make it into the database at all.

Now, a few months later, I’m posting a screenshot of the Akismet stats graph from the WordPress dashboard showing the number of spam comments identified by Akismet before and after the system was implemented.

Akismet stats for August - December, 2010

The spike in spam comments detected around November 3rd occurred after an update to WordPress overwrote my altered wp-comments.php file. I replaced the file and the spam dropped back down to single digits per day.

As of early October, Ardamis.com has its Google sitelinks back. I first noticed them back in July of 2009, when Ardamis had a toolbar PageRank of 6. Changes to Google’s algorithm later cost the site the sitelinks and reduced the PR to 5, which is how the site has appeared for the last year or so. Three months ago, in July of 2010 and one year after the sitelinks appeared, I noticed that all of the pages combined had over one million inbound links.

This is what a Google search for ardamis returns:

Ardamis' Google sitelinks

Ardamis' Google sitelinks

The second result returned, Final Fantasy XIII freezing on Xbox 360, is among my longest posts, has 91 comments, and enjoys some of the best inbound links of any page on the site, including from the forums at Xbox.com, Kotaku and GamesRadar.

The third result is my primary competition for the term ardamis, which briefly held the number one ranking a few months ago. That site has some one-line site links.

The actual mechanics of obtaining sitelinks remains a mystery, but there are plenty of people who are willing to speculate (and a few brave enough to promise they can deliver them for a price).

I’ve been posting more frequently, the site uses the WP-Paginate plugin and according to Google’s Webmaster Tools, the home page alone now has well over one million inbound links, but otherwise it’s been business as usual here.

Ardamis home page one million links

Over one million inbound links to the home page of Ardamis.com

I’m not going to speculate about how to get sitelinks or whether one or more of the changes in the last year was the catalyst, but Google does say to use descriptive and non-repetitive anchor and alt text in a site’s internal links and to keep important pages within a few clicks of the home page. These are very basic, fundamental things that any site should do, but it bears repeating.

I’ve written a number of posts on ways to reduce the number of spam comments a blog receives. In this post, I’ll revisit an old method that has almost completely stopped spam comments at ardamis.com before they get to the database.

My first system for blocking WordPress comment spam was an overly complex combination of JavaScript and a challenge-response to test that the comment was being submitted by a person. The value of the action attribute in the form was not in the HTML when the page was loaded, so the form couldn’t be immediately submitted, then JavaScript was used to write the path to a renamed wp-comments-post.php file only after a certain user action was performed. I was never really satisfied with it. I didn’t like relying on JavaScript, I had doubts that any human being (meaning of any mental or physical capacity, speaking any language, etc.) could correctly answer the question, and I was concerned that any obstacle to submitting a form discourages legitimate commenting.

A few months later, I posted a simpler timestamp method for reducing WordPress comment spam that compares two timestamps and then rejects any form submission that occurrs within 60 seconds of the post page being loaded. The visitor wasn’t bothered by an additional form field solely for anti-spam and there was no JavaScript involved.

Both methods were very effective at blocking spam before it made it to the database. In the five months leading up to the implementation of the first method, Akismet was catching an average of 1418 spam comments per month. In the first five months after these methods were put in place, Akismet was catching only 54 spam comments per month. But I also noticed a reduction in legitimate comments, from an average of 26 per month to 20 per month, which led me to suspect that real visitors attempting to leave comments were being discouraged from doing so.

The timestamp method required changing a core file, which was overwritten each time WordPress was updated. As time went on, I forgot to replace the file after upgrading WordPress, so the protection was lost and I once again had only Akismet blocking spam. A few months later, while doing work on the database in an attempt to speed up WordPress, I happened to check my historical stats and found that Akismet had detected 4,144 comments in July, 2010. Yikes. It was time to revisit these old methods.

At 2:30 AM on August 1, 2010, I again implemented my timestamp method, but this time I also renamed the wp-comments-post.php file that processes the form. I changed my theme’s comments.php file to submit the form to the new page, deleted the wp-comments-post.php file from the server and tested to make sure that comments could still be submitted. And then I waited to see what would happen.

The effect was pretty amazing. The spam had almost completely stopped.

My Akismet stats look like this:

Date Spam
7.30.10 192
7.31.10 196
8.1.10 32
8.2.10 0
8.5.10 4
8.8.10 4
8.10.10 4
8.11.10 4
8.13.10 0
8.14.10 0

(I don’t know why so many dates in August are skipped in the log, but whatever.)

Fast, but only partial protection

The quick and easy way to reduce the number of spam comments that your WordPress blog receives is to merely change the location of the comment form processing script.

  1. Rename wp-comments-post.php to anything else. I like using a string of random hexadecimal characters, like: z1t0zVGuaCZEi.php.
  2. Edit your current theme’s comments.php so that the form is submitted to this new file.
  3. Upload these files to their respective directories, then delete the wp-comments-post.php file from your server.

This method works well to stop spam submitted by bots that assume the comment form processing script used by WordPress is always at the same location. More advanced bots will read the actual location of the file from the action attribute of the form element, but that can be countered by using either the JavaScript or timestamp method.

Access log analysis

To illustrate the effectiveness of the renamed wp-comments-post file + timestamp check, below are some events from my 06 August 2010 access log.

Bot defeated by renamed file alone

Here is a form submission to the non-existent wp-comments-post file that occurs 2 seconds after the post page is requested.

173.242.112.44 - - [06/Aug/2010:23:21:37 -0700] "GET www.ardamis.com/2007/07/12/defeating-contact-form-spam/ HTTP/1.0" 200 32530 "http://www.google.com" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50"
173.242.112.44 - - [06/Aug/2010:23:21:39 -0700] "POST www.ardamis.com/wp-comments-post.php HTTP/1.0" 404 15529 "//ardamis.com/2007/07/12/defeating-contact-form-spam/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50"

The bot is sent a 404 HTTP status code, which is widely understood to mean that the page isn’t there and you can stop asking for it. But that doesn’t stop this bot! Two minutes later, it’s back at another page, trying again.

173.242.112.44 - - [06/Aug/2010:23:23:01 -0700] "GET www.ardamis.com/2007/03/29/xbox-360-gamercard-wordpress-plugin/ HTTP/1.0" 200 101259 "http://www.google.com" "Opera/9.64(Windows NT 5.1; U; en) Presto/2.1.1"
173.242.112.44 - - [06/Aug/2010:23:23:05 -0700] "POST www.ardamis.com/wp-comments-post.php HTTP/1.0" 404 15529 "//ardamis.com/2007/03/29/xbox-360-gamercard-wordpress-plugin/" "Opera/9.64(Windows NT 5.1; U; en) Presto/2.1.1"

Again, it gets a 404 back. Some bots never learn.

Bot defeated by timestamp check

Here is a form submission to the renamed wp-comments-post file that occurs 4 seconds after the post page is requested.

91.201.66.6 - - [06/Aug/2010:23:30:41 -0700] "GET www.ardamis.com/2007/03/29/xbox-360-gamercard-wordpress-plugin/ HTTP/1.1" 200 21787 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729)"
91.201.66.6 - - [06/Aug/2010:23:30:45 -0700] "POST www.ardamis.com/wp-comments-post-timestamp-3.0.1.php HTTP/1.1" 500 1227 "//ardamis.com/2007/03/29/xbox-360-gamercard-wordpress-plugin/" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729)"

The 500 HTTP header indicates that this submission was denied and the comment never made it to the database. This access log doesn’t indicate which check stopped the POST (eg: the email validation or the timestamp function), but my money is on the timestamp.

Here’s another form submission to the renamed wp-comments-post file that occurs one second after the post page is requested. Speed reader or bot?

95.220.185.210 - - [06/Aug/2010:23:56:54 -0700] "GET www.ardamis.com/2010/02/26/fixing-word-2007-add-in-issues/ HTTP/1.1" 200 23977 "-" "Opera/9.01 (Windows NT 5.0; U; en)"
95.220.185.210 - - [06/Aug/2010:23:56:55 -0700] "POST www.ardamis.com/wp-comments-post-timestamp-3.0.1.php HTTP/1.1" 500 1213 "//ardamis.com/2010/02/26/fixing-word-2007-add-in-issues/" "Opera/9.01 (Windows NT 5.0; U; en)"

The submission is rejected.

Taking the method even further

To take this method even further, one could send a 200 OK header even when the comment is blocked, so the bots never know their mission failed. But this seems unnecessary at this point, as it doesn’t appear that they change their behavior after being sent a 404 error, or that they try again after being sent a 500 error. It also makes it harder to figure out from the access logs which comments were rejected and for what reason.

If you still want to do this, first implement the timestamp method, then make the following modifications.

Sending a 200 header

$comment_timestamp    = trim($_POST['timestamp']);
$submitted_timestamp  = time();

if ( $comment_timestamp == '' ) {
// If the value for $_POST['timestamp'] is an empty string, exit (the form wasn't submitted by the theme's comments.php)
	header('HTTP/1.1 200 OK');
	echo '<p style="text-align:center;">Error: It looks like this form was not submitted by the form at ' . get_option('siteurl') . '.</p>';
	exit;
}
if ( $submitted_timestamp - $comment_timestamp < 60 ) {
// If the form was submitted within 60 seconds of page load, exit
	header('HTTP/1.1 200 OK');
	echo '<p style="text-align:center;">Error: The comment was posted too soon after the page was loaded.  Please press the Back button on your browser and try again in a few seconds.</p>'; 
	exit;
}
// If the form was submitted more than 10 minutes after page load, die
if ( $submitted_timestamp - $comment_timestamp > 600 ) {
	header('HTTP/1.1 200 OK');
	echo '<p style="text-align:center;">Error: You waited too long before posting a comment.</p>';
	exit;
}

One could also write a record to a database each time the old wp-comments-post.php file is requested or any of the timestamp checks block a form submission, and pretty quickly generate a list of IP addresses for a black list. At the same time, one could log which timestamp check caught the spam attempt, which is interesting enough that I’ll probably do it eventually.

I’ve written a few posts on how to speed up web sites by sending the correct headers to leverage browser caching and compressing .php, .css and .js files without mod_gzip or mod_deflate.

The intended audience for this post is developers who have already applied most or all of Google’s Web Performance Best Practices and Yahoo’s Best Practices for Speeding Up Your Web Site and are wondering how to speed up WordPress sites in particular.

I assume you’re familiar with WordPress caching and are already using a caching plugin, such as WP Super Cache, W3 Total Cache or the like.

Reduce HTTP requests

Reducing HTTP requests should be the very first thing step in speeding up any site. If you are using plugins, watch them carefully for inefficiencies like added CSS and JavaScript files. Combine, minify and compress these files. Some plugins allow you to turn off their bundled CSS in the plugin’s settings page. Where possible, copy the plugin’s CSS into the current theme’s style.css and turn off the extra file.

Delete deactivated plugins

Remove any plugins you’re not using. Deactivated plugins can be deleted from the Plugins page.

Speed up the mod_rewrite code

jdMorgan from WebmasterWorld.com has written a replacement for the .htaccess rewrite rule used by WordPress. This will speed up the WordPress mod_rewrite code by a factor of more than two.

This is a total replacement for the code supplied with WP as bounded by the “Begin WP” and “End WP” comments, and fixes several performance-affecting problems. Notably, the unnecessary and potentially-problematic container is completely removed, and code is added and re-structured to both prevent completely-unnecessary file- and directory- exists checks and to reduce the number of necessary -exists checks to one-half the original count (due to the way mod_rewrite behaves recursively in .htaccess context).

http://www.webmasterworld.com/apache/4053973.htm

The following code is adapted from the original to add png, flv and swf to the list of static file formats:

# BEGIN WordPress
# adapted from http://www.webmasterworld.com/apache/4053973.htm
#
RewriteEngine on
#
# Unless you have set a different RewriteBase preceding this point,
# you may delete or comment-out the following RewriteBase directive
# RewriteBase /
#
# if this request is for "/" or has already been rewritten to WP
RewriteCond $1 ^(index\.php)?$ [OR]
# or if request is for image, css, or js file
RewriteCond $1 \.(gif|jpg|png|ico|css|js|flv|swf)$ [NC,OR]
# or if URL resolves to existing file
RewriteCond %{REQUEST_FILENAME} -f [OR]
# or if URL resolves to existing directory
RewriteCond %{REQUEST_FILENAME} -d
# then skip the rewrite to WP
RewriteRule ^(.*)$ - [S=1]
# else rewrite the request to WP
RewriteRule . /index.php [L]
#
# END wordpress 

Only load the comment-reply.js when needed

In the default WordPress template, the comment-reply.js script is included on all single post pages, regardless of whether nested/threaded comments is enabled. A simple tweak changes the theme to only include comment-reply.js on single post pages only when it makes sense to do so: if threaded comments are enabled, commenting on that post is allowed, and a comment already exists.

Remove the following line from your theme’s header.php:

<?php if ( is_singular() ) wp_enqueue_script( 'comment-reply' ); ?>

Add the following lines to your theme’s functions.php.

// Don't add the wp-includes/js/comment-reply.js?ver=20090102 script to single post pages unless threaded comments are enabled
// adapted from http://bigredtin.com/behind-the-websites/including-wordpress-comment-reply-js/
function theme_queue_js(){
  if (!is_admin()){
    if (is_singular() && (get_option('thread_comments') == 1) && comments_open() && have_comments())
      wp_enqueue_script('comment-reply');
  }
}
add_action('wp_print_scripts', 'theme_queue_js');

Only load the l10n.js when needed

In WordPress 3.1, a l10n.js script is loaded. It is “mostly used for scripts that send over localization data from PHP to the JS side using wp_localize_script().” Whether it’s safe to remove this file seems to be a matter of debate, but should you decide you want to remove it…

Add the following lines to your theme’s functions.php.

// Don't add the wp-includes/js/l10n.js?ver=20101110 script to non-admin pages
// adapted from http://wordpress.stackexchange.com/questions/5451/what-does-l10n-js-do-in-wordpress-3-1-and-how-do-i-remove-it
function remove_l10n_js(){
  if (!is_admin()){
    wp_deregister_script('l10n');
  }
}
add_action('wp_print_scripts', 'remove_l10n_js');

Replace unecessary code executions and database queries

WordPress saves settings specific to your blog in the database. These settings include what language your blog is written in, whether text is read left-to-right or vice versa, the path to the template directory, etc.

The default WordPress theme contains a number of database queries in order to figure out these things and build the correct page. The default theme needs this flexibility, but your theme does not. Joost de Valk recommends replacing these database queries with static text in your theme files in his post Speed up WordPress, and clean it up too!

An easy way to do this is to browse to your web site and then view the source code. Copy the content that won’t change from page to page and paste it into your theme, overwriting the PHP with the generated HTML.

For example, my theme’s header.php file contains this line:

<html xmlns="http://www.w3.org/1999/xhtml" <?php language_attributes(); ?>>

The source code of the rendered page displays this line:

<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US">

On my blog, this HTML output is never going to be anything different, so why make WordPress look these settings up in the database each time a page is loaded? This line is an excellent candidate for replacement. The footer.php file contains a handful more opportunities for replacement, but go through each of your theme’s files and look for references to the template directory and other data that won’t change as long as you’re using the theme. All told, I was able to replace 12 database queries with static HTML.

Joost also recommends checking for unnecessary or slow database queries, and installing a helpful debugging plugin, in his post on Optimizing WordPress database performance.

Clean out your MySQL database

Delete spam comments

From the Dashboard, click Comments, then click on Empty Spam.

Delete post revisions

If you don’t use post revisions, you may want to delete them from the wp_posts table. Back up your database, then run the following SQL query:

DELETE FROM wp_posts WHERE post_type = "revision";

Before: 683 records
After: 165 records

This does not delete the latest draft of unpublished posts. It’s a good idea to optimize the table after deleting the revisions.

You can stop WordPress from saving post revisions by adding the following code to wp-config.php:

define('WP_POST_REVISIONS', false );

Optimize your MySQL database

Optimizing your MyISAM tables is comparable to defragmenting your hard drive. It’s probably been a while since you’ve done that, too.

If you’re using phpMyAdmin, browse to your WordPress database. Under the Structure tab, at the bottom of the list of tables, click on the link “Check all”. In the “With selected” menu, choose “Optimize table”. (I would have recommended just optimizing tables that have overhead, but the wp_posts table can be optimized even when it doesn’t show any overhead.

For MyISAM tables, OPTIMIZE TABLE works as follows:

  • If the table has deleted or split rows, repair the table.
  • If the index pages are not sorted, sort them.
  • If the table’s statistics are not up to date (and the repair could not be accomplished by sorting the index), update them.

http://dev.mysql.com/doc/refman/5.1/en/optimize-table.html

Flush the Buffer Early

When users request a page, it can take anywhere from 200 to 500ms for the backend server to stitch together the HTML page. During this time, the browser is idle as it waits for the data to arrive. In PHP you have the function flush(). It allows you to send your partially ready HTML response to the browser so that the browser can start fetching components while your backend is busy with the rest of the HTML page. The benefit is mainly seen on busy backends or light frontends.

http://developer.yahoo.com/performance/rules.html#flush

Add flush() between the closing tag and the opening tag in header.php.

</head>
<?php flush(); // http://developer.yahoo.com/performance/rules.html#flush ?>
<body>

OK, so this isn’t technically a WordPress-specific tweak, but it’s a good idea.

File compression is possible on Apache web hosts that do not have mod_gzip or mod_deflate enabled, and it’s easier than you might think.

A great explanation of why compression helps the web deliver a better user experience is at betterexplained.com.

Two authoritative articles on the subject are Google’s Performance Best Practices documentation | Enable compression and Yahoo’s Best Practices for Speeding Up Your Web Site | Gzip Components.

Compressing PHP files

If your Apache server does not have mod_gzip or mod_deflate enabled (Godaddy and JustHost shared hosting, for example), you can use PHP to compress pages on-the-fly. This is still preferable to sending uncompressed files to the browser, so don’t worry about the additional work the server has to do to compress the files at each request.

Option 1: PHP.INI using zlib.output_compression

The zlib extension can be used to transparently compress PHP pages on-the-fly if the browser sends an “Accept-Encoding: gzip” or “deflate” header. Compression with zlib.output_compression seems to be disabled on most hosts by default, but can be enabled with a custom php.ini file:

[PHP]

zlib.output_compression = On

Credit: http://php.net/manual/en/zlib.configuration.php

Check with your host for instructions on how to implement this, and whether you need a php.ini file in each directory.

Option 2: PHP using ob_gzhandler

If your host does not allow custom php.ini files, you can add the following line of code to the top of your PHP pages, above the DOCTYPE declaration or first line of output:

<?php if (substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler"); else ob_start(); ?>

Credit: GoDaddy.com

For WordPress sites, this code would be added to the top of the theme’s header.php file.

According to php.net, using zlib.output_compression is preferred over ob_gzhandler().

For WordPress or other CMS sites, an advantage of zlib.output_compression over the ob_gzhandler method is that all of the .php pages served will be compressed, not just those that contain the global include (eg.: header.php, etc.).

Running both ob_gzhandler and zlib.output_compression at the same time will throw a warning, similar to:

Warning: ob_start() [ref.outcontrol]: output handler ‘ob_gzhandler’ conflicts with ‘zlib output compression’ in /home/path/public_html/ardamis.com/wp-content/themes/mytheme/header.php on line 7

Compressing CSS and JavaScript files

Because the on-the-fly methods above only work for PHP pages, you’ll need something else to compress CSS and JavaScript files. Furthermore, these files typically don’t change, so there isn’t a need to compress them at each request. A better method is to serve pre-compressed versions of these files. I’ll describe a few different ways to do this, but in both cases, you’ll need to add some lines to your .htaccess file to send user agents the gzipped versions if they support the encoding. This requires that Apache’s mod_rewrite be enabled (and I think it’s almost universally enabled).

Option 1: Compress locally and upload

CSS and JavaScript files can be gzipped on the workstation, then uploaded along with the uncompressed files. Use a utility like 7-Zip (quite possibly the best compression software around, and it’s free) to compress the CSS and JavaScript files using the gzip format (with extension *.gz), then upload them to your server.

For Windows users, here is a handy command to compress all the .css and .js files in the current directory and all sub directories (adjust the path to the 7-Zip executable, 7z.exe, as necessary):

for /R %i in (*.css *.js) do "C:\Program Files (x86)\7-Zip\7z.exe" a -tgzip "%i.gz" "%i" -mx9

Note that the above command is to be run from the command line. The batch file equivalent would be:

for /R %%i in (*.css *.js) do "C:\Program Files (x86)\7-Zip\7z.exe" a -tgzip "%%i.gz" "%%i" -mx9

Option 2: Compress on the server

If you have shell access, you can run a command to create a gzip copy of each CSS and JavaScript file on your site (or, if you are developing on Linux, you can run it locally):

find . -regex ".*\(css\|js\)$" -exec bash -c 'echo Compressing "{}" && gzip -c --best "{}" > "{}.gz"' \;

This may be a bit too technical for many people, but is also much more convenient. It is particularly useful when you need to compress a large number of files (as in the case of a WordPress installation with multiple plugins). Remember to run it every time you automatically update WordPress, your theme, or any plugins, so as to replace the gzip’d versions of any updated CSS and JavaScript files.

The .htaccess (for both options)

Add the following lines to your .htaccess file to identify the user agents that can accept the gzip encoded versions of these files.

<files *.js.gz>
  AddType "text/javascript" .gz
  AddEncoding gzip .gz
</files>
<files *.css.gz>
  AddType "text/css" .gz
  AddEncoding gzip .gz
</files>
RewriteEngine on
#Check to see if browser can accept gzip files.
ReWriteCond %{HTTP:accept-encoding} gzip
RewriteCond %{HTTP_USER_AGENT} !Safari
#make sure there's no trailing .gz on the url
ReWriteCond %{REQUEST_FILENAME} !^.+\.gz$
#check to see if a .gz version of the file exists.
RewriteCond %{REQUEST_FILENAME}.gz -f
#All conditions met so add .gz to URL filename (invisibly)
RewriteRule ^(.+) $1.gz [QSA,L]

Credit: opensourcetutor.com

I’m not sure it’s still necessary to exclude Safari.

For added benefit, minify the CSS and JavaScript files before gzipping them. Google’s excellent Page Speed Firefox/Firebug Add-on makes this very easy. Yahoo’s YUI Compressor is also quite good.

Verify that your content is being compressed

Use the nifty Web Page Content Compression Verification tool at http://www.whatsmyip.org/http_compression/ to confirm that your server is sending the compressed files.

Speed up page load times for returning visitors

Compression is only part of the story. In order to further speed page load times for your returning visitors, you will want to send the correct headers to leverage browser caching.

I’ve been collecting links to good WordPress templates for a long time. Because I’ve been asked a few times recently if I can recommend some templates, I’m putting them together here.

Huge, isolated, animated slideshow

I really like these themes for focusing almost exclusively on one’s portfolio. They feature a massive, animated slideshow that keeps the visitor’s attention on the images. In some cases, a shadow underneath the slideshow, combined with the animation, creates a 3-D effect to put the portfolio right in the visitor’s face.

Confined slideshow

These themes are a bit more modest than the first collection. They still accentuate a portfolio of images, but in a less flashy way. The slideshow exists within a defined banner area, so the look is a little more traditional. These templates would be a better fit for businesses.

Partial-banner image rotation

These themes use only part of the banner to display a rotating group of images, leaving an area for text next to the images. This allows a short message or tagline to get equal placement. These are the most conservative themes.

Creative themes

This last group of themes break with the conventions of the above themes and use full-screen-width or otherwise larger-than-normal images that rotate behind the other elements of the site. Undeniably attention-grabbing, but also slower to load.

Single page

These are single-page web sites. These types of pages are great as business cards or resumes, or as place holders while a larger site is developed.

The rest

I like these, too, but they don’t fit in any of the above categories.

Update 8.27.11: The method described in this post uses PHP to generate the timestamps. If your site is using a caching plugin, the timestamps in the HTML will be stale, and this method will not work. Please see my updated post at A cache-friendly method for reducing WordPress comment spam for a new method using JavaScript for sites that use page caching.

In this post, I’ll explain how to reduce the amount of comment spam your WordPress blog receives by using an unobtrusive ‘handshake’ between the two files necessary for a valid comment submission to take place. I’ve written a few different articles on reducing comment spam by means of a challenge response test that the visitor must complete before submitting a comment, but I’m now looking for ways to achieve the same results while keeping the anti-spam method invisible to the visitor.

I’m a big fan of Akismet, but I also want to block as much spam as possible before it is caught by Akisment in order to reduce the number of database entries.

One thing this method does not do is rename and hide the path to the form processing script, but it makes that technique obsolete, anyway.

In the timestamp handshake method, a first timestamp is generated and written as a hidden input field when the post page loads. When the comment is submitted, a second timestamp is generated by the comment-processing script and it and the page-load timestamp are saved as variables. If the page-load timestamp variable is blank, which should be the case if the spambot uses any other page to populate the comment, the script will die. The page-load timestamp is then subtracted from the comment-submission timestamp. If the comment was submitted less than 60 seconds after the post page was loaded, the script dies with a descriptive error message. Hopefully, this will separate the bots’ comments from those left by thoughtful human visitors who have taken the time to read your post. If a human visitor does happen to submit a comment within 60 seconds of the page loading, he or she can click his or her browser’s back button and try resubmitting the comment again in a few seconds.

One drawback is that this method does involve editing a core file – wp-comments-post.php. You’ll have to re-edit it each time you upgrade WordPress, which is a nuisance, I know. The good thing is that if you forget to do this, people can still comment – you just won’t have the anti-spam protection.

Note that the instructions in the following steps are based on the code in WordPress version 2.3 and the Kubrick theme included with that release. You may need to adjust for your version of WordPress.

Step 1 – Add the hidden timestamp input field to the comment form

Open the comments.php file in your current theme’s folder and find the following lines:

<p><textarea name="comment" id="comment" cols="100%" rows="10" tabindex="4"></textarea></p>

<p><input name="submit" type="submit" id="submit" tabindex="5" value="Submit Comment" />

Add the following line between them:

<p><input type="hidden" name="timestamp" id="timestamp" value="<?php echo time(); ?>" size="22" /></p>

Step 2 – Modify the wp-comments-post.php file to create the second timestamp and perform the comparison

Open wp-comments-post.php and find the lines:

$comment_author       = trim(strip_tags($_POST['author']));
$comment_author_email = trim($_POST['email']);
$comment_author_url   = trim($_POST['url']);
$comment_content      = trim($_POST['comment']);

Immediately after them, add the following lines:

$comment_timestamp    = trim($_POST['timestamp']);
$submitted_timestamp  = time();

if ( $comment_timestamp == '' )
	wp_die( __('Hello, spam bot!') );
	
if ( $submitted_timestamp - $comment_timestamp < 60 )
	wp_die( __('Error: you must wait at least 1 minute before posting a comment.') );

That’s it; you’re done.

Credits

Thanks to Jonathan Bailey for suggesting the handshake in his post at http://www.plagiarismtoday.com/2007/07/24/wordpress-and-comment-spam/.

Comment spam comes from humans who are paid to post it and robots/scripts that do it automatically. The majority of spam comes from the bots. There’s very little one can do to defend against a determined human being, but bots tend to behave predictably, and that allows us to develop countermeasures.

From my observations, it seems that the spambots are first given a keyword phrase to hunt down. They go through the search engine results for pages with that keyword phrase, and follow the link to each page. If the page happens to be a WordPress post, they pass their spammy content to the comment form. Apparently, they do this in a few different ways. Some bots seem to inject their content directly into the form processing agent, a file in the blog root named wp-comments-post.php, using the WordPress default form field IDs. Other bots seem to fill in any fields they come across before submitting the form. Still others seem to fill out the form, but ignore any unexpected text input fields. All of these behaviors can be used against the spammers.

One anti-spam technique that has been used for years is to rename the script that handles the form processing. If you look at the HTML of a WordPress post with comments enabled, you’ll see a line that reads:

<form action="http://yourdomain.com/wp-comments-post.php" method="post" id="commentform">

The ‘wp-comments-post.php’ file handles the comment form processing, and a good amount of spam can be avoided by simply renaming it. Many bots will try to pass their content to directly to that script, even if it no longer exists, to no effect.

More sophisticated bots will look through the HTML for the URL of the form processing script, and in doing so will learn the URL of the newly renamed script and pass their contents to that script instead. The trick is to prevent these smarter bots from discovering the URL of the new script. Because it seems that the bots aren’t looking through external JavaScript files yet, that’s where we will hide the URL. (If you do use this technique, it would be very considerate to tell your visitors that JavaScript is required to post comments.)

Step 1

Rename the wp-comments-post.php file to anything else. Using a string of random hexadecimal characters would be ideal. Once you’ve renamed the file, enter the address of the file in your browser. The page should be blank; if you get a 404 error, something is wrong. Make a note of that address, because you’ll need it later. Verify that the wp-comments-post.php file is empty or is no longer on your server.

(Because I was curious about how many bots were hitting the wp-comments-post.php file directly, I replaced the code with a hitcounter. Sure enough, bots are still hitting the file directly, even though there is no longer any path leading to it.)

Step 2

Open up the ‘comments.php’ file in your theme directory. Find the line:

<form action="http://yourdomain.com/wp-comments-post.php" method="post" id="commentform">

and change the value of the action attribute to a number sign (or pound sign, or hash), like so:

<form action="#" method="post" id="commentform">

Any bots that come to the page and search for the path to your comment processing script will just see the hash, so they will never discover the URL to the real script. This change also means that if a bot or a visitor tries to submit the form, the form will fail, because a WordPress single post page isn’t designed to process forms. We want the bots to fail, but we’ll need to put things right for humans.

If you are tempted to designate a separate page for the action value, note that the only people likely to ever see this page are visitors without JavaScript enabled who fill out the form.

Step 3

Create a new JavaScript file with the following code.

function commentScriptReveal() {
	// enter the URL of your renamed wp-comments-post.php file below
	var scriptPath = "http://yourdomain.com/renamed-wp-comments-post.php";
	document.getElementById("commentform").setAttribute("action", scriptPath);
}

Enter the address of your renamed file as the value for the variable scriptPath. The function commentScriptReveal, when called, will find the element with the ID ‘commentform’ (that’s the comment form) and change its action attribute to the URL of the renamed file, allowing the form to be successfully sent to the processing agent.

Save the file as ‘commentrevealer.js’ and upload it to the /scripts/ directory in your blog’s root. Add the script to your theme’s header.php file:

<script src="<?php echo bloginfo('url'); ?>/scripts/commentrevealer.js" type="text/javascript"></script>

Now we just need to decide how to call the commentScriptReveal function.

Step 4

The ideal method of calling the function would be the one where the human visitor always calls the function, and the bot never calls it. To do this, we need to know something about how the bots work.

Step 4a — For spam bots that ignore unexpected text input fields:

If the bots ignore unexpected text input fields, we can simply add a field, label it ‘required’, and attach the script revealer to that field with one of the following event handlers:

  • onchange triggered when the user changes the content of a field
  • onkeypress triggered when a keyboard key is pressed or held down
  • onkeydown triggered when a keyboard key is pressed
  • onkeyup triggered when a keyboard key is released
  • onfocus triggered when an element gets focus
  • onblur triggered when an element loses focus

I’m intentionally vague about how to trigger the function commentScriptReveal because this technique will be efficacious longer if different people use different events. Furthermore, the text input field doesn’t necessarily need to do anything, its contents will just be discarded when the form is processed. In fact, it doesn’t even need to be a text input field. It can be any form control—a button, a checkbox, a radio button, a menu, etc. We just need human visitors to interact with it somehow. Those bots that skip over the control won’t trigger the revealer event, and your visitors (who always follow directions) will.

If everyone goes about implementing this method in a slightly different way, the spammers should find it much more difficult to counter.

For further reading on JavaScript events: QuirksMode – Javascript – Introduction to Events.

Step 4b — For spam bots that add text to every input field they come across:

If the bots are hitting every text input field with some text, follow Step 4a, and then create a second JavaScript file, named ‘commentconcealer.js’, with the following code:

function commentScriptConceal() {
	// enter the URL of your renamed wp-comments-post.php file below
	var scriptPath = "#";
	document.getElementById("commentform").setAttribute("action", scriptPath);
}

The function commentScriptConceal re-rewrites the action attribute back to “#“.

Upload the file and add the script to your theme’s header.php file:

<script src="<?php echo bloginfo('url'); ?>/scripts/commentconcealer.js" type="text/javascript"></script>

Add another text input field somewhere below the one you added in Step 4a. Hide this field from visitors with {display: none;}. Call the function with an onfocus (or onblur, etc.) event on the second input field:

<p style="display: none;"><input type="text" name="reconceal" id="reconceal" value="" size="22" onfocus="return commentScriptConceal()" />
<label for="reconceal"><small>OMG Don't Touch This Field!?!?</small></label></p>

The legitimate visitors will never trigger this, but any bot that interacts with every field on a page will.

Step 5

If you chose to use a text input field in Step 4a, consider making that a challenge-response test. It’s always a good idea to use a server-side check to back up a client-side check. I like to ask the question “What color is an orange?” as a tip of the hat to Eric Meyer. This challenge-response test can be integrated into wp-comments-post.php, so that if the user fails the test, the form submission dies in the same way it would if a required field were left blank.

For this example, let’s ask the question “What color is Kermit the Frog?” The answer, of course, is green.

Open your wp-comments-post.php file, and (in WP version 2.2.2) somewhere around line 31 add:

$comment_verify = strtolower(trim($_POST['verify']));

This will trim any whitespace from the answer to the challenge question, and convert the characters to lowercase.

Around line 45, find the lines:

} else {
	if ( get_option('comment_registration') )
		wp_die( __('Sorry, you must be logged in to post a comment.') );
}

And add the following lines to them, as so:

} else {
	if ( get_option('comment_registration') )
		wp_die( __('Sorry, you must be logged in to post a comment.') );
	if ( $comment_verify != 'green' )
		wp_die( __('Sorry, you must correctly answer the "Kermit" question to post a comment.') );
}

This will check the visitor’s answer against the correct answer, and if they don’t match, the script will stop and the comment won’t be submitted. The visitor can hit the back button to change the answer.

Now we need to add the challenge question to the theme file comments.php. Find the lines:

<p><input type="text" name="url" id="url" value="<?php echo $comment_author_url; ?>" size="22" tabindex="3" />
<label for="url"><small>Website</small></label></p>

And right below them, add the following lines:

<p><input type="text" name="verify" id="verify" value="" size="22" tabindex="4" />
<label for="verify"><small>What color is Kermit the Frog? (anti-spam) (required)</small></label></p>

(You may need to update all the tabindex numbers after you add this field.)

That’s it, your visitors will now have to correctly answer the Kermit the Frog question to submit the form.

Further customization

The methods described here just scratch the surface of what can be done to obfuscate the comment handling script. For example, the URL could be broken up into parts and saved as multiple variables. It could have a name comprised of numbers, and the commentScriptReveal JavaScript could perform math to assemble it.

I can’t imagine I’m the first person to come up with this, but… the user could be required to complete a challenge-response question, the correct answer to which would be used as part of the name of the script—the URL to the script doesn’t exist anywhere until the visitor creates it.

But, there are some people who don’t want to impose even a minor inconvenience on their visitors. If you don’t like challenge-response tests, what about using JavaScript to invisibly check the screen resolution of the user agent? And we haven’t even considered using cookies yet.

Credits

Many thanks to Jeff Barr for demonstrating how to put a challenge-response test into the wp-comments-post.php file in his post WordPress Comment Verification (With Source Code). I had been using only JavaScript for validation up until that point, shame on me.

Many thanks also to Will Bontrager for writing a provocative explanation of how to temporarily hide the form processing script using JavaScript in Spamming You Through Your Own Forms.

Huge thanks to Eric Meyer, who wrote a pre-plugin, challenge-response test script called WP-Gatekeeper for a very early version of WordPress.

I’ve described two similar methods for defeating contact form spam by hiding the webmail script in an eariler post.

This plugin is once again actively supported. Please download the latest version from http://wordpress.org/extend/plugins/sociable/.

I’ve fixed some errors that I was experiencing with version 2.0 (dated 2007-02-02) of the Sociable WordPress plugin by Peter Harkins. Specifically, when running WordPress 2.2+, I would get the following warnings when saving changes:

Warning: implode() [function.implode]: Bad arguments. in PATH\wp-content\plugins\sociable\sociable.php on line 651

Warning: Invalid argument supplied for foreach() in PATH\wp-content\plugins\sociable\sociable.php on line 762

For each button, I’d get another error:

Warning: in_array() [function.in-array]: Wrong datatype for second argument in PATH\wp-content\plugins\sociable\sociable.php on line 797

I think I’ve narrowed the cause down to the way the plugin was writing the newly selected options to the database.

In addition to fixing those warnings, I also corrected a few behaviors:

  1. The StumbleUpon button didn’t send the URL to the correct address at http://www.stumbleupon.com/. The button now works correctly, and also sends the page’s title.
  2. In Options -> Sociable, leaving the field for “text displayed in front of the icons” blank now results in no extraneous code being inserted into the page. Leaving the field blank also eliminates the popup “These icons link to social bookmarking sites…” text.
  3. The links to the social networking sites are now all rel="nofollow".

The WordPress function wp_get_archives(‘type=postbypost’) displays a lovely list of posts, but won’t show the date of each post. This plugin adds each post’s date to those ‘postbypost’ lists, like so:

Add dates to wp_get_archives

Add dates to wp_get_archives

Usage

  1. Upload and activate the plugin
  2. Edit your theme, replacing wp_get_archives('type=postbypost') with if (function_exists('ard_get_archives')) ard_get_archives();

The function ard_get_archives(); replaces wp_get_archives('type=postbypost'), meaning you don’t need to specify type=postbypost. You can use all of the wp_get_archives() parameters except ‘type’ and ‘show_post_count’ (limit, format, before, and after). In addition, there’s a new parameter: show_post_date, that you can use to hide the date, but the plugin will show the date by default.

show_post_date
(boolean) Display date of posts in an archive (1 – true) or do not (0 – false). For use with ard_get_archives(). Defaults to 1 (true).

Customizing the date

By default, the plugin displays the date as “(MM/DD/YYYY)”, but you can change this to use any standard PHP date characters by editing the plugin at the line:

$arc_date = date('m/d/Y', strtotime($arcresult->post_date));  // new

The date is wrapped in tags, so you can style the date independently of the link.

How does it work?

The plugin replaces the ‘postbypost’ part of the function wp_get_archives, and adds the date to $before. The relevant code is below. You can compare it to the corresponding lines in general-template.php.

	} elseif ( ( 'postbypost' == $type ) || ('alpha' == $type) ) {
		('alpha' == $type) ? $orderby = "post_title ASC " : $orderby = "post_date DESC ";
		$arcresults = $wpdb->get_results("SELECT * FROM $wpdb->posts $join $where ORDER BY $orderby $limit");
		if ( $arcresults ) {
			$beforebefore = $before;  // new
			foreach ( $arcresults as $arcresult ) {
				if ( $arcresult->post_date != '0000-00-00 00:00:00' ) {
					$url  = get_permalink($arcresult);
					$arc_title = $arcresult->post_title;
					$arc_date = date('m/d/Y', strtotime($arcresult->post_date));  // new
					if ( $show_post_date )  // new
						$before = $beforebefore . '<span class="recentdate">' . $arc_date . '</span>';  // new
					if ( $arc_title )
						$text = strip_tags(apply_filters('the_title', $arc_title));
					else
						$text = $arcresult->ID;
					echo get_archives_link($url, $text, $format, $before, $after);
				}
			}
		}
	}

The lines ending in ‘// new’ are the only changes.

So you want the date to appear after the title? Edit the plugin to modify $after, instead:

	} elseif ( ( 'postbypost' == $type ) || ('alpha' == $type) ) {
		('alpha' == $type) ? $orderby = "post_title ASC " : $orderby = "post_date DESC ";
		$arcresults = $wpdb->get_results("SELECT * FROM $wpdb->posts $join $where ORDER BY $orderby $limit");
		if ( $arcresults ) {
			$afterafter = $after;  // new
			foreach ( $arcresults as $arcresult ) {
				if ( $arcresult->post_date != '0000-00-00 00:00:00' ) {
					$url  = get_permalink($arcresult);
					$arc_title = $arcresult->post_title;
					$arc_date = date('j F Y', strtotime($arcresult->post_date));  // new
					if ( $show_post_date )  // new
						$after = '&nbsp;(' . $arc_date . ')' . $afterafter;  // new
					if ( $arc_title )
						$text = strip_tags(apply_filters('the_title', $arc_title));
					else
						$text = $arcresult->ID;
					echo get_archives_link($url, $text, $format, $before, $after);
				}
			}
		}
	}

Download

Get the files here: (Current version: 0.1 beta)

Download the Ardamis DateMe WordPress Plugin