
File compression is possible on Apache web hosts that do not have mod_gzip or mod_deflate enabled, and it’s easier than you might think.

A great explanation of why compression helps the web deliver a better user experience is at betterexplained.com.

Two authoritative articles on the subject are Google’s Performance Best Practices documentation (Enable compression) and Yahoo’s Best Practices for Speeding Up Your Web Site (Gzip Components).

Compressing PHP files

If your Apache server does not have mod_gzip or mod_deflate enabled (GoDaddy and JustHost shared hosting, for example), you can use PHP to compress pages on-the-fly. This is still preferable to sending uncompressed files to the browser, so don’t worry about the additional work the server has to do to compress the files at each request.

Option 1: PHP.INI using zlib.output_compression

The zlib extension can be used to transparently compress PHP pages on-the-fly if the browser sends an “Accept-Encoding: gzip” or “deflate” header. Compression with zlib.output_compression seems to be disabled on most hosts by default, but can be enabled with a custom php.ini file:

[PHP]

zlib.output_compression = On

Credit: http://php.net/manual/en/zlib.configuration.php

Check with your host for instructions on how to implement this, and whether you need a php.ini file in each directory.

Option 2: PHP using ob_gzhandler

If your host does not allow custom php.ini files, you can add the following line of code to the top of your PHP pages, above the DOCTYPE declaration or first line of output:

<?php if (isset($_SERVER['HTTP_ACCEPT_ENCODING']) && substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler"); else ob_start(); ?>

Credit: GoDaddy.com

For WordPress sites, this code would be added to the top of the theme’s header.php file.

According to php.net, using zlib.output_compression is preferred over ob_gzhandler().

For WordPress or other CMS sites, an advantage of zlib.output_compression over the ob_gzhandler method is that all of the .php pages served will be compressed, not just those that contain the global include (e.g., header.php).

Running both ob_gzhandler and zlib.output_compression at the same time will throw a warning, similar to:

Warning: ob_start() [ref.outcontrol]: output handler ‘ob_gzhandler’ conflicts with ‘zlib output compression’ in /home/path/public_html/ardamis.com/wp-content/themes/mytheme/header.php on line 7

Compressing CSS and JavaScript files

Because the on-the-fly methods above only work for PHP pages, you’ll need something else to compress CSS and JavaScript files. Furthermore, these files typically don’t change, so there isn’t a need to compress them at each request. A better method is to serve pre-compressed versions of these files. I’ll describe two ways to do this, but in both cases, you’ll need to add some lines to your .htaccess file to send user agents the gzipped versions if they support the encoding. This requires that Apache’s mod_rewrite be enabled (and I think it’s almost universally enabled).

Option 1: Compress locally and upload

CSS and JavaScript files can be gzipped on the workstation, then uploaded along with the uncompressed files. Use a utility like 7-Zip (quite possibly the best compression software around, and it’s free) to compress the CSS and JavaScript files using the gzip format (with extension *.gz), then upload them to your server.

For Windows users, here is a handy command to compress all the .css and .js files in the current directory and all subdirectories (adjust the path to the 7-Zip executable, 7z.exe, as necessary):

for /R %i in (*.css *.js) do "C:\Program Files (x86)\7-Zip\7z.exe" a -tgzip "%i.gz" "%i" -mx9

Note that the above command is to be run from the command line. The batch file equivalent would be:

for /R %%i in (*.css *.js) do "C:\Program Files (x86)\7-Zip\7z.exe" a -tgzip "%%i.gz" "%%i" -mx9

Option 2: Compress on the server

If you have shell access, you can run a command to create a gzip copy of each CSS and JavaScript file on your site (or, if you are developing on Linux, you can run it locally):

find . -regex ".*\.\(css\|js\)$" -exec bash -c 'echo Compressing "{}" && gzip -c --best "{}" > "{}.gz"' \;

This may be a bit too technical for many people, but it is also much more convenient. It is particularly useful when you need to compress a large number of files (as in the case of a WordPress installation with multiple plugins). Remember to run it every time you update WordPress, your theme, or any plugins, so that the gzipped versions of any updated CSS and JavaScript files are regenerated.

The .htaccess (for both options)

Add the following lines to your .htaccess file to identify the user agents that can accept the gzip encoded versions of these files.

<Files *.js.gz>
  AddType "text/javascript" .gz
  AddEncoding gzip .gz
</Files>
<Files *.css.gz>
  AddType "text/css" .gz
  AddEncoding gzip .gz
</Files>
RewriteEngine on
# Check whether the browser can accept gzipped files
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{HTTP_USER_AGENT} !Safari
# Make sure there's no trailing .gz on the URL
RewriteCond %{REQUEST_FILENAME} !^.+\.gz$
# Check whether a .gz version of the file exists
RewriteCond %{REQUEST_FILENAME}.gz -f
# All conditions met, so add .gz to the filename (invisibly)
RewriteRule ^(.+) $1.gz [QSA,L]

Credit: opensourcetutor.com

I’m not sure it’s still necessary to exclude Safari.

For added benefit, minify the CSS and JavaScript files before gzipping them. Google’s excellent Page Speed Firefox/Firebug Add-on makes this very easy. Yahoo’s YUI Compressor is also quite good.
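For example, a typical YUI Compressor run from the command line looks something like this (the jar version number is just whichever one you’ve downloaded; Java is required):

java -jar yuicompressor-2.4.2.jar style.css -o style.min.css
java -jar yuicompressor-2.4.2.jar script.js -o script.min.js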

Verify that your content is being compressed

Use the nifty Web Page Content Compression Verification tool at http://www.whatsmyip.org/http_compression/ to confirm that your server is sending the compressed files.
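If you have shell access, you can also check from the command line with curl; request one of your own CSS or JavaScript files, and a compressed response will include a Content-Encoding: gzip header:

curl -s -I -H "Accept-Encoding: gzip" http://www.yourdomain.com/style.css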

Speed up page load times for returning visitors

Compression is only part of the story. In order to further speed page load times for your returning visitors, you will want to send the correct headers to leverage browser caching.
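For example, if your host has mod_expires enabled, a few lines in .htaccess will instruct browsers to cache static files (the one-month lifetime here is only a suggestion):

<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType text/css "access plus 1 month"
  ExpiresByType text/javascript "access plus 1 month"
  ExpiresByType image/png "access plus 1 month"
  ExpiresByType image/jpeg "access plus 1 month"
</IfModule>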

I had noticed mentions of the analytics information provided by Compete.com often enough that I was curious about what it could do for me.

Compete already had some information about ardamis.com, but it was stunningly wrong. For example, it was telling me that the phrase “godaddy referral program” was responsible for 20.35% of the total traffic sent to my site by search engines. Until recently, I did have a page that mentioned godaddy referral programs, but according to Google Analytics, it was barely ever visited (7 page views in the last 30 days – it was the 78th most popular page on my site, which does get a few thousand visitors a month). Even more strange, it told me that I was getting traffic for the search phrase “myspace”. I have never written anything about myspace before.

I figured that once I installed their JavaScript tracking code, the analytics information would be much more accurate. So I installed the code, confirmed it appeared at the bottom of the home page, and attempted to verify my site at //ardamis.com/, but was unable to. I had read somewhere that the free account does not support tracking subdomains, and the verification process seemed to get hung up on the use of .htaccess to redirect non-www traffic to www.ardamis.com. I was mystified that Compete apparently could not recognize this was happening via a 301 redirect header and compensate.

Sorry,

It looks like the CompeteXL code has not been correctly placed on the homepage of your site.

This could be because either the code was not copied over correctly, or because it has been placed on the wrong page.

We think your homepage is

ardamis.com

I went so far as to email my amazement to their support staff, who promptly and politely wrote me back. (Thumbs up to the guys answering the emails.)

I had made a few other changes to my site at the same time, so I ran a Page Speed check on it. Page Speed told me that I was linking to a resource at trgc.opt.fimserve.com that was throwing a 404 error. I was pretty sure I didn’t intentionally link to anything at that domain, so I Googled it. Surprisingly, there’s not much out there on trgc.opt.fimserve.com other than this and this. As it turns out, fimserve.com is part of something called the FOX Audience Network, and FAN’s parent company is News Corporation, which also owns myspace.com.

Here’s the WHOIS on fimserve.com:

Domain Name: FIMSERVE.COM
Registrar: REGISTER.COM, INC.
Whois Server: whois.register.com
Referral URL: http://www.register.com
Name Server: NS1.MYSPACE.COM
Name Server: NS2.MYSPACE.COM
Status: clientTransferProhibited
Updated Date: 17-oct-2006
Creation Date: 17-oct-2006
Expiration Date: 17-oct-2011

And here’s the WHOIS on foxaudiencenetwork.com:

Domain Name: FOXAUDIENCENETWORK.COM
Registrar: MARKMONITOR INC.
Whois Server: whois.markmonitor.com
Referral URL: http://www.markmonitor.com
Name Server: NS1.MYSPACE.COM
Name Server: NS2.MYSPACE.COM
Status: clientDeleteProhibited
Status: clientTransferProhibited
Status: clientUpdateProhibited
Updated Date: 03-may-2010
Creation Date: 03-jun-2008
Expiration Date: 03-jun-2011

I didn’t like the idea that information about my visitors was being shared with anyone but the site I had signed up for, so I started looking through the Compete FAQs and found this:

Currently, the CompeteXL code tracks ONLY self-reported Audience Profile data through a partnership with the FOX Audience Network.

The CompeteXL code DOES NOT track traffic or user engagement metrics, that information continues to be provided through our multi-sourced panel and requires NO addition of code to your site.

http://www.compete.com/help/q225

What the hell? Why am I installing a tracking code if it’s not used to track traffic?

Oh, and this was a fun discovery, too:

The FOX Audience Network (FAN) is a unit of News Corporation that supports monetization efforts across the company’s online content portfolio, as well as third-party publisher sites.

FAN leverages proprietary advertising technology to create highly-targeted advertising campaigns for a wide range of marketers, while also delivering cutting-edge tools and services to third-party publisher partners. FAN works directly with hundreds of advertisers to develop customized marketing programs that optimize both branded and performance-based strategies.

http://www.foxaudiencenetwork.com/aboutus.php

I use a very popular HOSTS file to block a huge number of servers that are known to host advertisements, tracking scripts (including Google Analytics), parasites, hijackers and unwanted Adware/Spyware programs. The 404 error in Page Speed was caused by the inclusion of trgc.opt.fimserve.com in this custom HOSTS file, which then led me to finding all this out the next day. I’ve removed the tracking code because the information I wanted was on traffic – who’s coming to my site, why, and through what means – and not user demographics.

A friend who does a lot of selling on eBay asked me to develop a web application for generating HTML templates that could be copied and pasted into his auctions. He wanted to be able to add a title, a description, and some terms and conditions, and also to be able to upload images so the auction template would display thumbnails that could be clicked to display full-sized versions in a new window or tab. The more we talked, the more it sounded like something that would be both useful for a good number of people, and potentially a source of ad revenue for us. And so was born Simple Auction Wizard, an online HTML template generator for auction websites like eBay.

I was looking for just this kind of a project because I wanted a reason to play with TinyMCE, an Open Source, platform independent, web-based JavaScript HTML WYSIWYG editor control, and SWFUpload, a small JavaScript/Flash library that does very neat things with file uploads.

The most difficult part was actually getting the templates to look good in a variety of browsers. Because the template HTML appears inside of a larger page, I couldn’t rely on Doctype switching to place IE into standards mode. This meant the templates would have to be developed so that they would look approximately the same in standards-compliant browsers like Firefox, Chrome, and Safari, as well as horrible browsers like IE6 in quirks-mode. I tried very hard to minimize the use of tables, but they eventually crept in. I was able to use CSS, for the most part, though if any more issues come up, I’m going to throw in the towel and just do inline styles.

The web app is open to the public, but realize that it’s very early in its development and I intend to continue making changes.

I’ve finally opened a Twitter account, so you can follow me at http://twitter.com/ardamis. As a social experiment, it’s interesting to watch and discuss, but I haven’t really participated much. On the other hand, I’m very interested in the phenomenon of URL shortening. So, without too much trouble, I put together yet another URL shortening service: Minifi.de (‘minified’, in the jargon). It does pretty much the same thing as tinyurl.com, bit.ly and is.gd – you enter a long URL and it returns a shorter one. I’m about 50% done with the API (it works, so long as the URL is valid), so if anyone knows of any clients that allow the user to specify a shortening service, I’d like to test that functionality. I have plans to make it account-based, so that you can track usage statistics and such, but that will only happen after I’m confident that the thing will survive in the wild.

So, feel free to give it a whirl, but know that it’s still in development and that I’ll probably have to wipe the database a few more times before I get it just right.

A client wanted me to implement a drop down contact form that was triggered when the cursor hovered over a button in the site’s navigation. This was no problem, and I delivered some code using script.aculo.us‘s blind effect. A few hours later, the client contacted me and said that as he moved his cursor around the screen, he kept accidentally triggering the contact form as the cursor moved through the element, and couldn’t I find a way to make the action a little more deliberate.

As far as I can tell, there is a way to delay a blind effect in script.aculo.us, but no way to stop the effect once the delay’s countdown has begun.

I searched around and came upon a w3schools JavaScript tutorial about timing events and the setTimeout() and clearTimeout() methods. This was exactly what I needed. By calling a function that started a countdown to the event rather than the event itself, and later calling a function that cancelled the countdown, it is possible to abort the drop down action if the mouse was moved out of the element within a specified time interval.

The code snippet below assumes you have links to the script.aculo.us and prototype libraries in place already. As the cursor enters the link, the startForm() function is called and a 500 ms timer begins. If the cursor moves out of the link, the stopForm() function is called, and the timer is cancelled. If stopForm() is called before the timer finishes counting, the effect never happens.

The client’s happy and I learned something useful.

The Code

<script type="text/javascript"><!--//--><![CDATA[//><!--
var t;
function startForm() {
	// start a 500 ms countdown to the blind effect
	t = setTimeout("Effect.toggle('contactform','blind')", 500);
}
function stopForm() {
	// cancel the countdown before it finishes
	clearTimeout(t);
}
//--><!]]></script>

A simple demonstration of how to implement this using inline JavaScript (a no-no).

<ul id="nav">
	<li><a href="#">a link</a></li>
	<li><a href="#">a link</a></li>
	<li><a href="#" onmouseover="startForm()" onmouseout="stopForm()">Contact Us</a></li>
</ul>
<div id="contactform">
        <!-- the form goes here -->
</div>

Comment spam comes from humans who are paid to post it and robots/scripts that do it automatically. The majority of spam comes from the bots. There’s very little one can do to defend against a determined human being, but bots tend to behave predictably, and that allows us to develop countermeasures.

From my observations, it seems that the spambots are first given a keyword phrase to hunt down. They go through the search engine results for pages with that keyword phrase, and follow the link to each page. If the page happens to be a WordPress post, they pass their spammy content to the comment form. Apparently, they do this in a few different ways. Some bots seem to inject their content directly into the form processing agent, a file in the blog root named wp-comments-post.php, using the WordPress default form field IDs. Other bots seem to fill in any fields they come across before submitting the form. Still others seem to fill out the form, but ignore any unexpected text input fields. All of these behaviors can be used against the spammers.

One anti-spam technique that has been used for years is to rename the script that handles the form processing. If you look at the HTML of a WordPress post with comments enabled, you’ll see a line that reads:

<form action="http://yourdomain.com/wp-comments-post.php" method="post" id="commentform">

The ‘wp-comments-post.php’ file handles the comment form processing, and a good amount of spam can be avoided by simply renaming it. Many bots will try to pass their content directly to that script, even if it no longer exists, to no effect.

More sophisticated bots will look through the HTML for the URL of the form processing script, and in doing so will learn the URL of the newly renamed script and pass their contents to that script instead. The trick is to prevent these smarter bots from discovering the URL of the new script. Because it seems that the bots aren’t looking through external JavaScript files yet, that’s where we will hide the URL. (If you do use this technique, it would be very considerate to tell your visitors that JavaScript is required to post comments.)

Step 1

Rename the wp-comments-post.php file to anything else. Using a string of random hexadecimal characters would be ideal. Once you’ve renamed the file, enter the address of the file in your browser. The page should be blank; if you get a 404 error, something is wrong. Make a note of that address, because you’ll need it later. Verify that the wp-comments-post.php file is empty or is no longer on your server.

(Because I was curious about how many bots were hitting the wp-comments-post.php file directly, I replaced the code with a hitcounter. Sure enough, bots are still hitting the file directly, even though there is no longer any path leading to it.)
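A minimal sketch of such a hit counter, left in place of the old wp-comments-post.php (the counter file name is an assumption):

<?php
// count direct hits to the old comment-processing URL
$hits = (int) @file_get_contents('bot-hits.txt');
file_put_contents('bot-hits.txt', (string) ($hits + 1));
?>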

Step 2

Open up the ‘comments.php’ file in your theme directory. Find the line:

<form action="http://yourdomain.com/wp-comments-post.php" method="post" id="commentform">

and change the value of the action attribute to a number sign (or pound sign, or hash), like so:

<form action="#" method="post" id="commentform">

Any bots that come to the page and search for the path to your comment processing script will just see the hash, so they will never discover the URL to the real script. This change also means that if a bot or a visitor tries to submit the form, the form will fail, because a WordPress single post page isn’t designed to process forms. We want the bots to fail, but we’ll need to put things right for humans.

If you are tempted to designate a separate page for the action value, note that the only people likely to ever see this page are visitors without JavaScript enabled who fill out the form.

Step 3

Create a new JavaScript file with the following code.

function commentScriptReveal() {
	// enter the URL of your renamed wp-comments-post.php file below
	var scriptPath = "http://yourdomain.com/renamed-wp-comments-post.php";
	document.getElementById("commentform").setAttribute("action", scriptPath);
}

Enter the address of your renamed file as the value for the variable scriptPath. The function commentScriptReveal, when called, will find the element with the ID ‘commentform’ (that’s the comment form) and change its action attribute to the URL of the renamed file, allowing the form to be successfully sent to the processing agent.

Save the file as ‘commentrevealer.js’ and upload it to the /scripts/ directory in your blog’s root. Add the script to your theme’s header.php file:

<script src="<?php bloginfo('url'); ?>/scripts/commentrevealer.js" type="text/javascript"></script>

Now we just need to decide how to call the commentScriptReveal function.

Step 4

The ideal method of calling the function would be the one where the human visitor always calls the function, and the bot never calls it. To do this, we need to know something about how the bots work.

Step 4a — For spam bots that ignore unexpected text input fields:

If the bots ignore unexpected text input fields, we can simply add a field, label it ‘required’, and attach the script revealer to that field with one of the following event handlers (a sample wiring follows the list):

  • onchange triggered when the user changes the content of a field
  • onkeypress triggered when a keyboard key is pressed or held down
  • onkeydown triggered when a keyboard key is pressed
  • onkeyup triggered when a keyboard key is released
  • onfocus triggered when an element gets focus
  • onblur triggered when an element loses focus
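For example, here is one possible wiring, using onfocus (any of the events above would work; the field name ‘verify’ anticipates the challenge question added in Step 5):

<p><input type="text" name="verify" id="verify" value="" size="22" onfocus="commentScriptReveal();" />
<label for="verify"><small>What color is Kermit the Frog? (required)</small></label></p>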

I’m intentionally vague about how to trigger the function commentScriptReveal because this technique will be efficacious longer if different people use different events. Furthermore, the text input field doesn’t necessarily need to do anything; its contents will just be discarded when the form is processed. In fact, it doesn’t even need to be a text input field. It can be any form control: a button, a checkbox, a radio button, a menu, etc. We just need human visitors to interact with it somehow. Those bots that skip over the control won’t trigger the revealer event, and your visitors (who always follow directions) will.

If everyone goes about implementing this method in a slightly different way, the spammers should find it much more difficult to counter.

For further reading on JavaScript events: QuirksMode – Javascript – Introduction to Events.

Step 4b — For spam bots that add text to every input field they come across:

If the bots are hitting every text input field with some text, follow Step 4a, and then create a second JavaScript file, named ‘commentconcealer.js’, with the following code:

function commentScriptConceal() {
	// reset the action attribute back to "#" to re-conceal the renamed script
	var scriptPath = "#";
	document.getElementById("commentform").setAttribute("action", scriptPath);
}

The function commentScriptConceal rewrites the action attribute back to “#”.

Upload the file and add the script to your theme’s header.php file:

<script src="<?php bloginfo('url'); ?>/scripts/commentconcealer.js" type="text/javascript"></script>

Add another text input field somewhere below the one you added in Step 4a. Hide this field from visitors with {display: none;}. Call the function with an onfocus (or onblur, etc.) event on the second input field:

<p style="display: none;"><input type="text" name="reconceal" id="reconceal" value="" size="22" onfocus="return commentScriptConceal()" />
<label for="reconceal"><small>OMG Don't Touch This Field!?!?</small></label></p>

The legitimate visitors will never trigger this, but any bot that interacts with every field on a page will.

Step 5

If you chose to use a text input field in Step 4a, consider making that a challenge-response test. It’s always a good idea to use a server-side check to back up a client-side check. I like to ask the question “What color is an orange?” as a tip of the hat to Eric Meyer. This challenge-response test can be integrated into wp-comments-post.php, so that if the user fails the test, the form submission dies in the same way it would if a required field were left blank.

For this example, let’s ask the question “What color is Kermit the Frog?” The answer, of course, is green.

Open your wp-comments-post.php file, and (in WP version 2.2.2) somewhere around line 31 add:

$comment_verify = strtolower(trim($_POST['verify']));

This will trim any whitespace from the answer to the challenge question, and convert the characters to lowercase.

Around line 45, find the lines:

} else {
	if ( get_option('comment_registration') )
		wp_die( __('Sorry, you must be logged in to post a comment.') );
}

And add the following lines to them, like so:

} else {
	if ( get_option('comment_registration') )
		wp_die( __('Sorry, you must be logged in to post a comment.') );
	if ( $comment_verify != 'green' )
		wp_die( __('Sorry, you must correctly answer the "Kermit" question to post a comment.') );
}

This will check the visitor’s answer against the correct answer, and if they don’t match, the script will stop and the comment won’t be submitted. The visitor can hit the back button to change the answer.

Now we need to add the challenge question to the theme file comments.php. Find the lines:

<p><input type="text" name="url" id="url" value="<?php echo $comment_author_url; ?>" size="22" tabindex="3" />
<label for="url"><small>Website</small></label></p>

And right below them, add the following lines:

<p><input type="text" name="verify" id="verify" value="" size="22" tabindex="4" />
<label for="verify"><small>What color is Kermit the Frog? (anti-spam) (required)</small></label></p>

(You may need to update all the tabindex numbers after you add this field.)

That’s it, your visitors will now have to correctly answer the Kermit the Frog question to submit the form.

Further customization

The methods described here just scratch the surface of what can be done to obfuscate the comment handling script. For example, the URL could be broken up into parts and saved as multiple variables. It could have a name composed of numbers, and the commentScriptReveal JavaScript could perform math to assemble it.
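A minimal sketch of that idea (the numbers and the renamed file are hypothetical):

function commentScriptReveal() {
	// the comment script is assumed to be renamed 1764.php;
	// computing 42 * 42 = 1764 keeps the name out of the source as a literal
	var a = 42;
	document.getElementById("commentform").setAttribute("action", "http://yourdomain.com/" + (a * a) + ".php");
}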

I can’t imagine I’m the first person to come up with this, but… the user could be required to complete a challenge-response question, the correct answer to which would be used as part of the name of the script—the URL to the script doesn’t exist anywhere until the visitor creates it.

But, there are some people who don’t want to impose even a minor inconvenience on their visitors. If you don’t like challenge-response tests, what about using JavaScript to invisibly check the screen resolution of the user agent? And we haven’t even considered using cookies yet.
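A sketch of the screen resolution idea (the threshold is arbitrary; most graphical browsers will pass it, while many bots report no screen at all):

function commentScriptRevealIfHuman() {
	// only reveal the real script for user agents reporting a plausible screen
	if (screen.width >= 640 && screen.height >= 480) {
		commentScriptReveal();
	}
}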

Credits

Many thanks to Jeff Barr for demonstrating how to put a challenge-response test into the wp-comments-post.php file in his post WordPress Comment Verification (With Source Code). I had been using only JavaScript for validation up until that point, shame on me.

Many thanks also to Will Bontrager for writing a provocative explanation of how to temporarily hide the form processing script using JavaScript in Spamming You Through Your Own Forms.

Huge thanks to Eric Meyer, who wrote a pre-plugin, challenge-response test script called WP-Gatekeeper for a very early version of WordPress.

I’ve described two similar methods for defeating contact form spam by hiding the webmail script in an earlier post.

My clients and I have been receiving increasing amounts of spam sent through our own contact forms. Not being a spammer myself, I’m left to speculate on how one sends spam through a webmail form, but I’ve come up with two ways of preventing it from happening. Both of these methods involve editing the contact form’s HTML and adding a JavaScript file. They also require that legitimate users of the contact form have DOM-compliant browsers with JavaScript enabled.

Defeating human-like robots

For a very long time, I suspected that the spammers’ bots were filling out and submitting forms just like regular human visitors. They would look for input fields with labels like ‘name’ and ’email’, and, of course, for textarea elements. The bots would enter values into the fields and hit the submit button and move on to the next form.

To combat this, one could institute a challenge-response test in the form of a question that must be correctly answered before the form is submitted. Eric Meyer wrote a very inspiring piece at WP-Gatekeeper on the use of easily human-comprehensible challenge questions like “What is Eric’s first name?” as a way to defeat spambots. There are a number of accessibility concerns and limitations with this method, mostly with respect to choosing a challenge question that any human being (of any mental or physical capacity, speaking any language, etc.) could answer, but that a robot would be unable to recognize as a challenge question or be unable to correctly answer. However, these issues also exist with the CAPTCHA method.

In this case, the challenge question will be “What color is an orange?” If answered correctly, the form is submitted. If answered incorrectly, the user is prompted to try again.

Here’s how to implement a challenge question method of form validation:

First, create a JavaScript file named ‘validate.js’ with the following lines:

function validateForm()
{
	var valid = true;

	if ( document.getElementById('verify').value != "orange" )
	{
		alert ( "You must answer the 'orange' question to submit this form." );
		document.getElementById('verify').value = "";
		document.getElementById('verify').focus();
		valid = false;
	}

	return valid;
}

This script gets the value of the input field with an ID of ‘verify’ and if the value is not the word ‘orange’, the script returns ‘false’ and doesn’t allow the form to post. Instead, it pops up a helpful alert, erases the contents of the ‘verify’ field, and sets the cursor at the beginning of the field.

Add the JavaScript to your HTML with something like:

<head>
...
<script src="validate.js" type="text/javascript"></script>
...
</head>

Next, modify the form to call the function with an onSubmit event. This event will be triggered when the form’s Submit button is activated. Add an input field with the ID ‘verify’ and an onChange event to convert the value to lowercase. Add the actual challenge question as a label.

<form id="contactform" action="../webmail.php" method="post" onsubmit="return validateForm();">
...
	<div><input type="text" name="verify" id="verify" value="" size="22" tabindex="1" onchange="javascript:this.value=this.value.toLowerCase();" /></div>
	<label for="verify">What color is an orange?</label>
...
</form>

A visitor to the site who fills out the form but does not correctly answer the challenge question will not be able to submit the form.

Defeating non-human-like robots

I believe that the challenge-response method is becoming less effective, however. According to the article ‘Spamming You Through Your Own Forms’ by Will Bontrager, the spammers’ bots are not using the form as it is intended.

This is what appears to be happening: Spammers’ robots are crawling the web looking for forms. When the robot finds a form:

  1. It makes a note of the form field names and types.
  2. It makes a note of the form action= URL, converting it into an absolute URL if needed.
  3. It then sends the information home where a database is updated.

Dedicated software uses the database information to insert the spammer’s spew into your form and automatically submit it to you.

His response is to stop the process at step 2 by eliminating the bots’ access to the webmail script. He suggests doing this by hiding the URL of the webmail script in an external JavaScript file, then using JavaScript to delay the writing of the form’s action attribute for a moment. The robots parsing just the page’s HTML never locate the URL to the webmail script, so it is never available for the spammers to exploit.

While I like the idea, I think I’ve come up with a better way of implementing it.

First, rename the webmail script, because the spammers already know the name and location of that script. For example, if GoDaddy is your host, contact forms on your site may be handled by ‘gdform.php’, located in the server root. You’ll need to rename that to something else. For purposes of illustration, I’ll rename the script ‘safemail.php’, but a string of random hexadecimal characters would be even better.

Next, give your contact form an ID. If you are running WordPress or other blogging software, be sure to give the contact form a different ID than the comment form, or else the JavaScript will cause the comment form to post to the webmail script. I’ll give my contact form the ID ‘contactform’.

<form id="contactform" action="../gdform.php" method="post">

We want to prevent the spammers from learning about the newly renamed script. This is done by giving the URL to a fake webmail script as the form’s action attribute and using JavaScript to change the action attribute of the form to the real webmail script only after some user interaction has occurred. I’ll use ‘no-javascript.php’ as my fake script.

To accommodate visitors who aren’t using JavaScript, the fake script could instead be a page explaining that JavaScript is required to submit the contact form and offering an alternate way to contact the author.
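For example, no-javascript.php might contain nothing more than this (the wording is just a suggestion):

<p>JavaScript is required to submit the contact form. Please enable JavaScript and try again, or contact the author by other means.</p>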

Edit the contact form’s action attribute to point to the fake script.

<form id="contactform" action="no-javascript.php" method="post">

Create a new, external JavaScript file called ‘protect.js’, with the following lines:

function formProtect() {
	document.getElementById("contactform").setAttribute("action","safemail.php");
}

The function formProtect, when called, finds the HTML element with ID ‘contactform’ and changes its ‘action’ attribute to ‘safemail.php’. Obviously, one could make this script more complex and potentially more difficult for spammers to parse through the use of variables, but I don’t see that as necessary at this point.
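As a sketch of that variation, the name could be split into fragments so that the string ‘safemail.php’ never appears whole in the file:

function formProtect() {
	// assemble the script name from fragments at run time
	var parts = ["safe", "mail", ".php"];
	document.getElementById("contactform").setAttribute("action", parts.join(""));
}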

Add the JavaScript to your HTML with something like:

<head>
...
<script src="protect.js" type="text/javascript"></script>
...
</head>

Finally, call the script at some point during the process of filling out the form. Exactly how you want to do this is up to you, and it’ll be effective longer if you don’t share how you do it. Perhaps the most straight-forward way would be to call the script at the point of submission by adding onsubmit="return formProtect();" to the <form> element.

<form id="contactform" action="no-javascript.php" method="post" onsubmit="return formProtect();">

If you want to use both the challenge question and the action rewriting functions, you may want to combine them into a single file or trigger formProtect separately with an event on one of the required input fields. If you decide to trigger formProtect with an event other than onsubmit, consider usability/accessibility issues—not everyone uses a mouse.
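If you do combine them into a single file, the onsubmit handler might look something like this sketch, which reuses the two functions defined above:

function validateAndProtect() {
	// run the challenge-question check first; only reveal the
	// real webmail script if the visitor answered correctly
	if (!validateForm()) {
		return false;
	}
	formProtect();
	return true;
}

<form id="contactform" action="no-javascript.php" method="post" onsubmit="return validateAndProtect();">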

In conclusion

By implementing both of these methods, it is possible to dramatically reduce or even completely stop contact form spam. In the two months since I implemented this system, I haven’t received a single spam email from any of my contact forms.

The challenge-response test should deter or at least hinder human spammers and robots that fill out forms as though they were human. The trade-off is some added work for legitimate users of the form.

The action attribute rewriting method should immediately eliminate all spam sent directly to your form by spammers who have the URL of your webmail script in their databases. It should also prevent the rediscovery of the URL. Visitors with JavaScript enabled won’t be aware of the anti-spam measures.

For WordPress users

Defeating WordPress comment spam explains how to apply the attribute rewriting method to your WordPress site.