Tag Archives: javascript

I like Akismet, and it’s undeniably effective in stopping the vast majority of spam, but it adds a huge number of comments to the database and a very small percentage of comments still get through to my moderation queue.

It’s annoying to find comments in my moderation queue, but what I really object to is the thousands of records that are added to the database each month that I don’t see.

In the screenshot below, January through April show very few spam comments being detected by Akismet. This is because I was using my cache-friendly method for reducing WordPress comment spam to block spam comments even before Akismet analyzed them.

Akismet stats

In May, I moved hosting providers to asmallorange.com and started with a fresh install of WordPress without implementing my custom spam method, which admittedly was not ideal because it involved changing core files. This left only Akismet between the spammers and my WordPress database. Since that time, instead of 150 or fewer spam comments per month making it into my WordPress database, Akismet was on pace to let in over 10,000.

So, in the spirit of fresh starts and doing things the right way, I created a WordPress plug-in that uses the same timestamp method. It’s actually exactly the same JavaScript and PHP code, just in plug-in form, so it’s not bound to any core files or theme files.

Google, what is up with your Google+ profile badges and your Google +1 buttons being different sizes?

There is a neat wizard for creating the code snippet for a Google+ profile badge. We get to pick an image size from one of four options, if we want the image to be hosted on Google’s server. (And why wouldn’t we?)

Small (16px)
Standard (32px)
Medium (44px)
Tall (64px)

OK, those are pretty acceptable, I guess, but I would really like to see something in the 20 to 24 pixel range.

What about the wizard for the Google +1 button?

Small (15px)
Standard (24px)
Medium (20px)
Tall (60px)

Wait, what?!? You’re using the same labeling, but the sizes are totally different. Not only that, but none of the sizes are shared between the two buttons. The Standard profile button is 75% larger than the +1 button. Grrr.

The Google +1 button wizard has more options, including a field where you may specify the path to an image, and the button itself is more dynamic. The profile button wizard is very basic, but it is easy to edit the HTML output to use any image. If forced to make my own image and host it, coming up with a custom profile button is clearly less involved.

As a third option, the configuration tool for the Google+ brand page badge makes it possible to essentially combine the functions of both buttons. While the badge takes up a large chunk of real estate, it is probably the best choice, as it looks good, has some bold colors, and adds some extra Google+ stuff (thumbnail images, a counter) that you can’t get from the basic generator.

Just a few weeks behind schedule, but a long time in the works, I’ve finally pushed the new WordPress theme for Ardamis live. Basic and elegant (I’m trying to establish a trend here), the theme also should outperform its predecessors in both page load times and SEO-potential. The index and archive pages should appear more consistent, and all pages should provide more complete structured data markup (schema.org as well as microformats.org). The comment form has been outfitted with an improved approach to reducing comment spam.

The new theme is pretty light on the graphics, due to increased browser support for and subsequently greater use of CSS3 goodness for box shadows and gradients. I’ve reduced the number of image files to two: a background and a sprites file.

Only half-implemented in the previous theme, the new look, “Joy”, makes much better use of structured data markup, or microdata. Google is absolutely looking for ways to display your pages’ semantic markup in its results, so you may as well get on board.

The frequency of spam comments increased dramatically over the past two months, according to my Akismet stats, so I’ve gone back to the drawing board and developed a better front-line defense against them. The new method should be more opaque to bots that parse JavaScript while still being invisible to human visitors leaving legitimate comments.

In sum, I think Ardamis should be leaner, faster, and smarter (and maybe prettier) in 2012 than ever before.

For a recent project, I needed to create a form that would perform a look up of people names in a MySQL database, but I wanted to use a single input field. To make it easy on the users of the form, I wanted the input field to accept names in either “Firstname Lastname” or “Lastname, Firstname” format, and I wanted it to autocomplete matches as the users typed, including when they typed both names separated by a space or a comma followed by a space.

The Ajax lookup was quick work with jQuery UI’s Autocomplete widget. The harder part was figuring out the most simple table structure and an appropriate SQL query.

A flawed beginning

My people table contains a “first_name” column and a “last_name” column, nothing uncommon there. To get the project out the door, I wrote a PHP function that ran two ALTER TABLE queries on the people table to create two additional columns for pre-formatted strings (column “firstlast”, to be formatted as “Firstname Lastname”, and column “lastfirst”, to be formatted as “Lastname, Firstname”), added indexes on these columns, and then walked through each record in the table, populating these new fields. I then wrote a very straight forward SQL query to perform a lookup on both fields. The PHP and query looked something like this:

// The jQuery UI Autocomplete widget passes the user input as a value for the parameter "name"
$name= $_GET['name'];

// This SQL query uses argument swapping
$query = sprintf("SELECT * FROM people WHERE (`firstlast` LIKE '%1\$s' OR `lastfirst` LIKE '%1\$s') ORDER BY `lastfirst` ASC",
mysql_real_escape_string($name. "%", $link));

This was effective, accurate, and pretty fast, but the addition of columns bothered me and I didn’t like that I needed to run a process to generate those pre-formatted fields each time a record was added to the table (or if a change was made to an existing record). One possible alternative was to watch the input and match either lastname or firstname until the user entered a comma or a space, then explode the string on the comma or space and search more precisely. Once a comma or a space was encountered, I felt pretty sure that I would be able to accurately determine which part of the input was the first name and which was the last name. But this had that same inefficient, clunky bad-code-smell as the extra columns. (Explode is one of those functions that I try to avoid using.) Writing lots of extra PHP didn’t seem necessary or right.

I’m much more comfortable with PHP than with MySQL queries, but I realize that one can do some amazing things within the SQL query, and that it’s probably faster to use SQL to perform some functions. So, I decided that I’d try to work up a query that solved my problem, rather than write more lines of PHP.

CONCAT_WS to the rescue

I Googled around for a bit and settled on using CONCAT_WS to concatenate the first names and last names into a single string be matched, but found it a bit confusing to work with. I kept trying to use it to create an alias, “lastfirst”, and then use the alias in the WHERE clause, which doesn’t work, or I was getting the literal column names back instead of the values. Eventually, I hit upon the correct usage.

The PHP and query now looks like this:

// The jQuery UI Autocomplete widget passes the user input as a value for the parameter "name"
$name= $_GET['name'];

// This SQL query uses argument swapping
$query = sprintf("SELECT *, CONCAT_WS(  ', ',  `last_name`,  `first_name` ) as lastfirst FROM people WHERE (CONCAT_WS(  ', ',  `last_name`,  `first_name` ) LIKE '%1\$s' OR CONCAT_WS(  ' ',  `first_name`,  `last_name` ) LIKE '%1\$s') ORDER BY lastfirst ASC",
mysql_real_escape_string($name. "%", $link));

The first instance of CONCAT_WS isn’t needed for the lookup. The first instance allows me to order the results alphabetically and provides me an array key of “lastfirst” with a value of the person’s name already formatted as “Lastname, Firstname”, so I don’t have to do it later with PHP. The lookup comes from the two instances of CONCAT_WS in the WHERE clause. I haven’t done any performance measuring here, but the results of the lookup get back to the user plenty fast enough, if not just as quickly as the method using dedicated columns.

The result of the query is output back to the page as JSON-formatted data for use in the jQuery Autocomplete.

The end result works exactly as I had hoped. A user of the form is able to type a person’s name in whatever way is comfortable to them, as “Bob Smith” or “Smith, Bob”, and the matches are found either way. The only thing it doesn’t do is output the matches back to the autocompleter in the same format that the user is using. But I can live with that for now.

Update 2015-01-02: About a month ago, in early December, 2014, Google announced that it was working on a new anti-spam API that is intended to replace the traditional CAPTCHA challenge as a method for humans to prove that they are not robots. This is very good news.
This week, I noticed that Akismet is adding a hidden input field to the comment form that contains a timestamp (although the plugin’s PHP puts the initial INPUT element within a P element set to DISPLAY:NONE, when the plugin’s JavaScript updates the value with the current timestamp, the INPUT element jumps outside of that P element). The injected code looks something like this:
<input type=”hidden” id=”ak_js” name=”ak_js” value=”1420256728989″>
I haven’t yet dug into the Akismet code to discover what it’s doing with the timestamp, but I’d be pleased if Akismet is attempting to differentiate humans from bots based on behavior.
Update 2015-01-10: To test the effectiveness of the current version of Akismet, I disabled the anti-spam plugin described in this post on 1/2/2015 and re-enabled it on 1/10/2015. In the span of 8 days, Akismet identified 1,153 spam comments and missed 15 more. These latest numbers continue to support my position that Akismet is not enough to stop spam comments.

In the endless battle against WordPress comment spam, I’ve developed and then refined a few different methods for preventing spam from getting to the database to begin with. My philosophy has always been that a human visitor and a spam bot behave differently (after all, the bots we’re dealing with are not Nexus-6 model androids here), and an effective spam-prevention method should be able to recognize the differences. I also have a dislike for CAPTCHA methods that require a human visitor to prove, via an intentionally difficult test, that they aren’t a bot. The ideal method, I feel, would be invisible to a human visitor, but still accurately identify comments submitted by bots.

Spam on ardamis.com in early 2012 - before and after

Spam on ardamis.com - before and after

A brief history of spam fighting

The most successful and simple method I found was a server-side system for reducing comment spam by using a handshake method involving timestamps on hidden form fields that I implemented in 2007. The general idea was that a bot would submit a comment more quickly than a human visitor, so if the comment was submitted too soon after the post page was loaded, the comment was rejected. A human caught in this trap would be able to click the Back button on the browser, wait a few seconds, and resubmit. This proved to be very effective on ardamis.com, cutting the number of spam comments intercepted by Akismet per day to nearly zero. For a long time, the only problem was that it required modifying a core WordPress file: wp-comments-post.php. Each time WordPress was updated, the core file was replaced. If I didn’t then go back and make my modifications again, I would lose the spam protection until I made the changes. As it became easier to update WordPress (via a single click in the admin panel) and I updated it more frequently, editing the core file became more of a nuisance.

A huge facepalm

When Google began weighting page load times as part of its ranking algorithm, I implemented the WP Super Cache caching plugin on ardamis.com and configured it to use .htaccess and mod_rewrite to serve cache files. Page load times certainly decreased, but the amount of spam detected by Akismet increased. After a while, I realized that this was because the spam bots were submitting comments from static, cached pages, and the timestamps on those pages, which had been generated server-side with PHP, were already minutes old when the page was requested. The form processing script, which normally rejects comments that are submitted too quickly to be written by a human visitor, happily accepted the timestamps. Even worse, a second function of my anti-spam method also rejected comments that were submitted 10 minutes or more after the page was loaded. Of course, most of the visitors were being served cached pages that were already more than 10 minutes old, so even legitimate comments were being rejected. Using PHP to generate my timestamps obviously was not going to work if I wanted to keep serving cached pages.

JavaScript to the rescue

Generating real-time timestamps on cached pages requires JavaScript. But instead of a reliable server clock setting the timestamp, the time is coming from the visitor’s system, which can’t be trusted to be accurate. Merely changing the comment form to use JavaScript to generate the first timestamp wouldn’t work, because verifying a timestamp generated on the client-side against one generated server-side would be disastrous.

Replacing the PHP-generated timestamps with JavaScript-generated timestamps would require substantial changes to the system.

Traditional client-side form validation using JavaScript happens when the form is submitted. If the validation fails, the form is not submitted, and the visitor typically gets an alert with suggestions on how to make the form acceptable. If the validation passes, the form submission continues without bothering the visitor. To get our two timestamps, we can generate a first timestamp when the page loads and compare it to a second timestamp generated when the form is submitted. If the visitor submits the form too quickly, we can display an alert showing the number of seconds remaining until the form can be successfully submitted. This client-side validation should hopefully be invisible to most visitors who choose to leave comments, but at the very least, far less irritating than a CAPTCHA system.

It took me two tries to get it right, but I’m going to discuss the less successful method first to point out its flaws.

Method One (not good enough)

Here’s how the original system flowed.

  1. Generate a first JS timestamp when the page is loaded.
  2. Generate a second JS timestamp when the form is submitted.
  3. Before the form contents are sent to the server, compare the two timestamps, and if enough time has passed, write a pre-determined passcode to a hidden INPUT element, then submit the form.
  4. After the form contents are sent to the server, use server-side logic to verify that the passcode is present and valid.

The problem was that it seemed that certain bots could parse JavaScript enough to drop the pre-determined passcode into the hidden form field before submitting the form, circumventing the timestamps completely and defeating the system.

Because the timestamps were only compared on the client-side, it also failed to adhere to one of the basic tenants of form validation – that the input must be checked on both the client-side and the server-side.

Method Two (better)

Rather than having the server-side validation be merely a check to confirm that the passcode is present, method two compares the timestamps a second time on the server side. Instead of a single hidden input, we now have two – one for each timestamp. This is intended to prevent a bot from figuring out the ultimate validation mechanism by simply parsing the JavaScript. Finally, the hidden fields are not in the HTML of the page when it’s sent to the browser, but are added to the form via jQuery, which makes it easier to implement and may act as another layer of obfuscation.

  1. Generate a first JS timestamp when the page is loaded and write it to a hidden form field.
  2. Generate a second JS timestamp when the form is submitted and write it to a hidden form field.
  3. Before the form contents are sent to the server, compare the two timestamps, and if enough time has passed, submit the form (client-side validation).
  4. On the form processing page, use server-side logic to compare the timestamps a second time (server-side validation).

This timestamp handshake works more like it did in the proven-effective server-side-only method. We still have to pass something from the comment form to the processing script, but it’s not too obvious from the HTML what is being done with it. Furthermore, even if a bot suspects that the timestamps are being compared, there is no telling from the HTML what the threshold is for distinguishing a valid comment from one that is invalid. (The JavaScript could be parsed by a bot, but the server-side check cannot be, making it possible to require a slightly longer amount of time to elapse in order to pass the server-side check.)

The same downside plagued me

For a long time, far longer than I care to admit, I stubbornly continued to modify the core file wp-comments-post.php to provide the server-side processing. But creating the timestamps and parsing them with a plug-in turned out to be a simple matter of two functions, and in June of 2013 I finally got around to doing it the right way.

The code

The plugin, in all its simplicity, is only 100 lines. Just copy this code into a text editor, save it as a .php file (the name isn’t important) and upload it to the /wp-content/plugins directory and activate it. Feel free to edit it however you like to suit your needs.

<?php

/*
Plugin Name: Timestamp Comment Filter
Plugin URI: //ardamis.com/2011/08/27/a-cache-proof-method-for-reducing-comment-spam/
Description: This plugin measures the amount of time between when the post page loads and the comment is submitted, then rejects any comment that was submitted faster than a human probably would or could.
Version: 0.1
Author: Oliver Baty
Author URI: //ardamis.com

    Copyright 2013  Oliver Baty  (email : [email protected])

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
*/

// http://wordpress.stackexchange.com/questions/6723/how-to-add-a-policy-text-just-before-the-comments
function ard_add_javascript(){

	?>
	
<script type="text/javascript" src="//ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){
    ardGenTS1();
});
 
function ardGenTS1() {
    // prepare the form
    $('#commentform').append('<input type="hidden" name="ardTS1" id="ardTS1" value="1" />');
    $('#commentform').append('<input type="hidden" name="ardTS2" id="ardTS2" value="1" />');
    $('#commentform').attr('onsubmit', 'return validate()');
    // set a first timestamp when the page loads
    var ardTS1 = (new Date).getTime();
    document.getElementById("ardTS1").value = ardTS1;
}
 
function validate() {
    // read the first timestamp
    var ardTS1 = document.getElementById("ardTS1").value;
//  alert ('ardTS1: ' + ardTS1);
    // generate the second timestamp
    var ardTS2 = (new Date).getTime();
    document.getElementById("ardTS2").value = ardTS2;
//  alert ('ardTS2: ' + document.getElementById("ardTS2").value);
    // find the difference
    var diff = ardTS2 - ardTS1;
    var elapsed = Math.round(diff / 1000);
    var remaining = 10 - elapsed;
//  alert ('diff: ' + diff + '\n\n elapsed:' + elapsed);
    // check whether enough time has elapsed
    if (diff > 10000) {
        // submit the form
        return true;
    }else{
        // display an alert if the form is submitted within 10 seconds
        alert("This site is protected by an anti-spam feature that requires 10 seconds to have elapsed between the page load and the form submission. \n\n Please close this alert window.  The form may be resubmitted successfully in " + remaining + " seconds.");
        // prevent the form from being submitted
        return false;
    }
}
</script>
	
	<?php
}

add_action('comment_form_before','ard_add_javascript');

// http://wordpress.stackexchange.com/questions/89236/disable-wordpress-comments-api
function ard_parse_timestamps(){

	// Set up the elapsed time, in miliseconds, that is the threshold for determining whether a comment was submitted by a human
	$intThreshold = 10000;
	
	// Set up a message to be displayed if the comment is blocked
	$strMessage = '<strong>ERROR</strong>:  this site uses JavaScript validation to reduce comment spam by rejecting comments that appear to be submitted by an automated method.  Either your browser has JavaScript disabled or the comment appeared to be submitted by a bot.';
	
	$ardTS1 = ( isset($_POST['ardTS1']) ) ? trim($_POST['ardTS1']) : 1;
	$ardTS2 = ( isset($_POST['ardTS2']) ) ? trim($_POST['ardTS2']) : 2;
	$ardTS = $ardTS2 - $ardTS1;
	 
	if ( $ardTS < $intThreshold ) {
	// If the difference of the timestamps is not more than 10 seconds, exit
		wp_die( __($strMessage) );
	}
}
add_action('pre_comment_on_post', 'ard_parse_timestamps');

?>

That’s it. Not so bad, right?

Final thoughts

The screen-shot at the beginning of the post shows the number of spam comments submitted to ardamis.com and detected by Akismet each day from the end of January, 2012, to the beginning of March, 2012. The dramatic drop-off around Jan 20 was when I implemented the method described in this post. The flare-up around Feb 20 was when I updated WordPress and forgot to replace the modified core file for about a week, illustrating one of the hazards of changing core files.

If you would rather not add any hidden form fields to the comment form, you could consider appending the two timestamps to the end of the comment_post_ID field. Because its contents are cast as an integer in wp-comments-post.php when value of the $comment_post_ID variable is set, WordPress won’t be bothered by the extra data at the end of the field, so long as the post ID comes first and is followed by a space. You could then just explode the contents of the comment_post_ID field on the space character, then compare the last two elements of the array.

If you don’t object to meddling with a core file in order to obtain a little extra protection, you can rename the wp-comments-post.php file and change the path in the comment form’s action attribute. I’ve posted logs showing that some bots just try to post spam directly to the wp-comments-post.php file, so renaming that file is an easy way to cut down on spam. Just remember to come back and delete the wp-comments-post.php file each time you update WordPress.

So I finally watched The Social Network over the weekend, and it’s made me feel jealous and a bit guilty.

In a meager effort to console myself for so far failing to be a billionaire, I’m assembling the short list of web-application type things I’ve built here.

  1. A dice roller: rollforit. Enter a name, create a room, invite your friends, and start rolling dice. For people who want to play pen and paper, table-top RPG dice games with their distant friends.
  2. A URL shortener: Minifi.de. Minifi.de comes with an API and a bookmarklet. It really works, too! The technical explanation has more details.
  3. A social networking site: Snapbase. Snapbase is a social site that shows you what’s going on in your city or anywhere in the world as pictures are uploaded by your friends and neighbors. The application extracts location information from the EXIF data embedded in images and displays recent images taken near your present location.
  4. A trouble-ticketing system for an IT help desk or technical support center. It’s really pretty extensive, with asset management, user accounts, salted encrypted passwords, and all sorts of nifty things. I really must write a full description of it at some point, but until then, the documentation is the next best thing.
  5. An account-based invoice tracking and access system for grouping invoices according to clients, then sharing invoice history with those clients and allowing them to easily pay outstanding invoices via Paypal.
  6. An account-based invoice access system where clients can view paid and unpaid invoices, and even easily pay an outstanding invoice via Paypal. I actually use this almost every day.
  7. A simple method for protecting a download using a unique URL that can be emailed to authorized users. The URL can be set to expire after a certain amount of time or any number of downloads.
  8. An update to the above download protection script to protect multiple downloads, generate batches of keys, leave notes about who received the key, the ability to specify per-key the allowable number of downloads and age, and some basic reporting.
  9. An HTML auction template generator called Simple Auction Wizard. It helps you create HTML auction templates for eBay, and uses SWFUpload and tinyMCE.

I have another project in the works that promises to be more financially viable, but the most clever thing on that list is Snapbase. It’s in something akin to alpha right now; barely usable. I really wish I had the time to pursue it.

I’ve written a few tutorials lately on how to reduce page load times. While I use Google’s Page Speed Firefox/Firebug plugin for evaluating pages for load times, there are times when I want a second opinion, or want to point a client to a tool. This post is a collection of links to online tools for testing web page performance.

Page Speed Online

http://pagespeed.googlelabs.com/

Google’s wonderful Page Speed tool, once only available as a Firefox browser Add-on, finally arrives as an online tool. Achieving a high score (ardamis.com is a 96/100) should be on every web developer’s list of things to do before the culmination of a project.

Enter a URL and Page Speed Online will run performance tests based on a set of best practices known to reduce page load times.

  • Optimizing caching – keeping your application’s data and logic off the network altogether
  • Minimizing round-trip times – reducing the number of serial request-response cycles
  • Minimizing request overhead – reducing upload size
  • Minimizing payload size – reducing the size of responses, downloads, and cached pages
  • Optimizing browser rendering – improving the browser’s layout of a page

WebPagetest

http://www.webpagetest.org/

WebPagetest is an excellent application for users who want the same sort of detailed reporting that one gets with Page Speed.

  • Load time speed test on first view (cold cache) and repeat view (hot cache), first byte and start render
  • Optimization checklist
  • Enable keep-alive, HTML compression, image compression, cache static content, combine JavaScript and CSS, and use of CDN
  • Waterfall
  • Response headers for each request

Load Impact

http://loadimpact.com/pageanalyzer.php

Load Impact is an online load testing service that lets you load- and stress test your website over the Internet. The page analyzer analyzes your web page performance by emulating how a web browser would load your page and all resources referenced in it. The page and its referenced resources are loaded and important performance metrics are measured and displayed in a load-bar diagram along with other per-resource attributes such as URL, size, compression ratio and HTTP status code.

ByteCheck

http://www.bytecheck.com/

ByteCheck is a super minimal site that return your page’s all-important time to first byte (TTFB). Time to first byte is the time it takes for a browser to start receiving information after it has started to make the request to the server, and is responsible for a visitor’s first impression that a page is fast- or slow-loading.

Web Page Analyzer

http://websiteoptimization.com/services/analyze/

My opinion is that the Web Page Analyzer report is good for beginners without much technical knowledge of things like gzip compression and Expires headers. It’s a bit dated, and is primarily concerned with basics like how many images a page contains. It tells you how fast you can expect your page to load for dial-up visitors, which strikes me as quaint and not particularly useful.

  • Total HTTP requests
  • Total size
  • Total size per object type (CSS, JavaScript, images, etc.)
  • Analysis of number of files and file size as compared to recommended limits

The Performance Grader

http://www.joomlaperformance.com/component/option,com_performance/Itemid,52/

This is another simplistic analysis of a site, like Web Page Analyzer, that returns its analysis in the form of pass/fail grades on about 14 different tests. I expect that it would be useful for developers who want to show a client a third-party’s analysis of their work, if the third-party is not terribly technically savvy.

One unique thing about this tool, though, is that it totals up the size of all images referenced in CSS files (even those that the current page isn’t using).

  • HTML Size
  • Total Size
  • Total Requests
  • Generation Time
  • Number of Hosts
  • Number of Images
  • Size of Images
  • Number of CSS Files
  • Size of CSS Files
  • Number of Script Files
  • Size of Script Files
  • HTML Encoding
  • Valid HTML
  • Frames

A collection of web development tools for building better sites more easily.

Frameworks and scripts

HTML5 Boilerplate is the professional badass’s base HTML/CSS/JS template for a fast, robust and future-proof site.

scriptsrc.net is a collection of script tags of the latest versions of a range of JavaScript libraries.

Modernizr adds classes to the <html> element which allow you to target specific browser functionality in your stylesheet. You don’t actually need to write any Javascript to use it.

Images

placehold.it is a quick and simple image placeholder service.

Text manipulation

TextFixer is a collection of online text tools. Remove line breaks from text, alphabetize text, capitalize the first letter of sentences, remove whitespaces, and uppercase text or lowercase text.

XHTML

HTML Minifier will minify HTML (or XHTML), and any CSS or JS included in your markup.

CSS

CSS3 Generator is an awesome code generator for CSS3 snippets, and shows the minimum browser versions required to display the effects.

proCSSor is an advanced CSS prettifier with tons of formatting options.

JavaScript

Online javascript beautifier will reformat and reindent bookmarklets, ugly javascript, unpack scripts packed by the popular Dean Edward’s packer, as well as deobfuscate scripts processed by javascriptobfuscator.com. The source code for the latest version is always available on github, and you can download the beautifier for local use (zip, tar.gz) as well.

Fonts and Typography

Fontshop.com has written A Field Guide to Typography to get you excited about fonts and typography.

Typetester is an online app for comparing different fonts for the screen, you can test up to three fonts at a time and choose the one you like. Its primary role is to make web designer’s life easier.

A quick chart of the fonts common to all versions of Windows and the Mac equivalents, or a more extensive matrix of fonts bundled with Mac and Windows operating systems, Microsoft Office and Adobe Creative Suite.

<html>ipsum has Lorem ipsum already wrapped in HTML tags. Pre-made paragraphs, lists, etc…

More resources at: 50 Useful Design Tools For Beautiful Web Typography and 21 Typography and Font Web Apps You Can’t Live Without.

Colors

Color Scheme Designer.

Markup

Google Webmaster Tools’ Rich Snippets Testing Tool.

Use the Rich Snippets Testing Tool to check that Google can correctly parse your structured data markup and display it in search results.

I’ve written a few posts on how to speed up web sites by sending the correct headers to leverage browser caching and compressing .php, .css and .js files without mod_gzip or mod_deflate.

The intended audience for this post is developers who have already applied most or all of Google’s Web Performance Best Practices and Yahoo’s Best Practices for Speeding Up Your Web Site and are wondering how to speed up WordPress sites in particular.

I assume you’re familiar with WordPress caching and are already using a caching plugin, such as WP Super Cache, W3 Total Cache or the like.

Reduce HTTP requests

Reducing HTTP requests should be the very first thing step in speeding up any site. If you are using plugins, watch them carefully for inefficiencies like added CSS and JavaScript files. Combine, minify and compress these files. Some plugins allow you to turn off their bundled CSS in the plugin’s settings page. Where possible, copy the plugin’s CSS into the current theme’s style.css and turn off the extra file.

Delete deactivated plugins

Remove any plugins you’re not using. Deactivated plugins can be deleted from the Plugins page.

Speed up the mod_rewrite code

jdMorgan from WebmasterWorld.com has written a replacement for the .htaccess rewrite rule used by WordPress. This will speed up the WordPress mod_rewrite code by a factor of more than two.

This is a total replacement for the code supplied with WP as bounded by the “Begin WP” and “End WP” comments, and fixes several performance-affecting problems. Notably, the unnecessary and potentially-problematic container is completely removed, and code is added and re-structured to both prevent completely-unnecessary file- and directory- exists checks and to reduce the number of necessary -exists checks to one-half the original count (due to the way mod_rewrite behaves recursively in .htaccess context).

http://www.webmasterworld.com/apache/4053973.htm

The following code is adapted from the original to add png, flv and swf to the list of static file formats:

# BEGIN WordPress
# adapted from http://www.webmasterworld.com/apache/4053973.htm
#
RewriteEngine on
#
# Unless you have set a different RewriteBase preceding this point,
# you may delete or comment-out the following RewriteBase directive
# RewriteBase /
#
# if this request is for "/" or has already been rewritten to WP
RewriteCond $1 ^(index\.php)?$ [OR]
# or if request is for image, css, or js file
RewriteCond $1 \.(gif|jpg|png|ico|css|js|flv|swf)$ [NC,OR]
# or if URL resolves to existing file
RewriteCond %{REQUEST_FILENAME} -f [OR]
# or if URL resolves to existing directory
RewriteCond %{REQUEST_FILENAME} -d
# then skip the rewrite to WP
RewriteRule ^(.*)$ - [S=1]
# else rewrite the request to WP
RewriteRule . /index.php [L]
#
# END wordpress 

Only load the comment-reply.js when needed

In the default WordPress template, the comment-reply.js script is included on all single post pages, regardless of whether nested/threaded comments is enabled. A simple tweak changes the theme to only include comment-reply.js on single post pages only when it makes sense to do so: if threaded comments are enabled, commenting on that post is allowed, and a comment already exists.

Remove the following line from your theme’s header.php:

<?php if ( is_singular() ) wp_enqueue_script( 'comment-reply' ); ?>

Add the following lines to your theme’s functions.php.

// Don't add the wp-includes/js/comment-reply.js?ver=20090102 script to single post pages unless threaded comments are enabled
// adapted from http://bigredtin.com/behind-the-websites/including-wordpress-comment-reply-js/
function theme_queue_js(){
  if (!is_admin()){
    if (is_singular() && (get_option('thread_comments') == 1) && comments_open() && have_comments())
      wp_enqueue_script('comment-reply');
  }
}
add_action('wp_print_scripts', 'theme_queue_js');

Only load the l10n.js when needed

In WordPress 3.1, a l10n.js script is loaded. It is “mostly used for scripts that send over localization data from PHP to the JS side using wp_localize_script().” Whether it’s safe to remove this file seems to be a matter of debate, but should you decide you want to remove it…

Add the following lines to your theme’s functions.php.

// Don't add the wp-includes/js/l10n.js?ver=20101110 script to non-admin pages
// adapted from http://wordpress.stackexchange.com/questions/5451/what-does-l10n-js-do-in-wordpress-3-1-and-how-do-i-remove-it
function remove_l10n_js(){
  if (!is_admin()){
    wp_deregister_script('l10n');
  }
}
add_action('wp_print_scripts', 'remove_l10n_js');

Replace unecessary code executions and database queries

WordPress saves settings specific to your blog in the database. These settings include what language your blog is written in, whether text is read left-to-right or vice versa, the path to the template directory, etc.

The default WordPress theme contains a number of database queries in order to figure out these things and build the correct page. The default theme needs this flexibility, but your theme does not. Joost de Valk recommends replacing these database queries with static text in your theme files in his post Speed up WordPress, and clean it up too!

An easy way to do this is to browse to your web site and then view the source code. Copy the content that won’t change from page to page and paste it into your theme, overwriting the PHP with the generated HTML.

For example, my theme’s header.php file contains this line:

<html xmlns="http://www.w3.org/1999/xhtml" <?php language_attributes(); ?>>

The source code of the rendered page displays this line:

<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US">

On my blog, this HTML output is never going to be anything different, so why make WordPress look these settings up in the database each time a page is loaded? This line is an excellent candidate for replacement. The footer.php file contains a handful more opportunities for replacement, but go through each of your theme’s files and look for references to the template directory and other data that won’t change as long as you’re using the theme. All told, I was able to replace 12 database queries with static HTML.

Joost also recommends checking for unnecessary or slow database queries, and installing a helpful debugging plugin, in his post on Optimizing WordPress database performance.

Clean out your MySQL database

Delete spam comments

From the Dashboard, click Comments, then click on Empty Spam.

Delete post revisions

If you don’t use post revisions, you may want to delete them from the wp_posts table. Back up your database, then run the following SQL query:

DELETE FROM wp_posts WHERE post_type = "revision";

Before: 683 records
After: 165 records

This does not delete the latest draft of unpublished posts. It’s a good idea to optimize the table after deleting the revisions.

You can stop WordPress from saving post revisions by adding the following code to wp-config.php:

define('WP_POST_REVISIONS', false );

Optimize your MySQL database

Optimizing your MyISAM tables is comparable to defragmenting your hard drive. It’s probably been a while since you’ve done that, too.

If you’re using phpMyAdmin, browse to your WordPress database. Under the Structure tab, at the bottom of the list of tables, click on the link “Check all”. In the “With selected” menu, choose “Optimize table”. (I would have recommended just optimizing tables that have overhead, but the wp_posts table can be optimized even when it doesn’t show any overhead.

For MyISAM tables, OPTIMIZE TABLE works as follows:

  • If the table has deleted or split rows, repair the table.
  • If the index pages are not sorted, sort them.
  • If the table’s statistics are not up to date (and the repair could not be accomplished by sorting the index), update them.

http://dev.mysql.com/doc/refman/5.1/en/optimize-table.html

Flush the Buffer Early

When users request a page, it can take anywhere from 200 to 500ms for the backend server to stitch together the HTML page. During this time, the browser is idle as it waits for the data to arrive. In PHP you have the function flush(). It allows you to send your partially ready HTML response to the browser so that the browser can start fetching components while your backend is busy with the rest of the HTML page. The benefit is mainly seen on busy backends or light frontends.

http://developer.yahoo.com/performance/rules.html#flush

Add flush() between the closing tag and the opening tag in header.php.

</head>
<?php flush(); // http://developer.yahoo.com/performance/rules.html#flush ?>
<body>

OK, so this isn’t technically a WordPress-specific tweak, but it’s a good idea.

As a follow-up to my post on compressing .php, .css and .js files without mod_gzip or mod_deflate, I’m documenting the changes I made to the .htaccess file on ardamis.com in order to speed up page load times for returning visitors and satisfy the Leverage browser caching recommendation of Google’s Page Speed Firefox/Firebug Add-on.

A great explanation of why browser caching helps the web deliver a better user experience is at betterexplained.com.

Two authoritative articles on the subject are Google’s Performance Best Practices | Optimize caching and Yahoo’s Best Practices for Speeding Up Your Web Site | Add an Expires or a Cache-Control Header.

I’d like to point out that in researching browser cashing, I came across a lot of information that contradicted the rather clear instructions from Google:

It is important to specify one of Expires or Cache-Control max-age, and one of Last-Modified or ETag, for all cacheable resources. It is redundant to specify both Expires and Cache-Control: max-age, or to specify both Last-Modified and ETag.

I’m not sure that this recommendation is entirely correct, as the W3C states that Expires and Cache-Control max-age are used in different situations, with Cache-Control max-age overriding Expires in the event of conflicts.

If a response includes both an Expires header and a max-age directive, the max-age directive overrides the Expires header, even if the Expires header is more restrictive. This rule allows an origin server to provide, for a given response, a longer expiration time to an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

It would seem that Cache-Control is the preferred method of controlling browser caching going forward.

HTTP 1.1 clients will honour “Cache-Control” (which is easier to use and much more flexible).
HTTP 1.0 clients will ignore “Cache-Control” but honour “Expires”. With “Expires” you get thus at least a bit control for these old clients.

http://www.peterbe.com/plog/cache-control_or_expires

In any event, Page Speed won’t protest if you do end up sending both Expires and Cache-Control max-age, or if you remove both Last-Modified and ETag, but I was able to get the best results with just setting Cache-Control max-age and removing the ETag.

Setting the headers in .htaccess

On Apache, configuring the proper headers can be done in the .htaccess file, using the Header directive. The Header directive requires the mod_headers module to be enabled.

I’m choosing to set a far future Expires header of one year on my images files, because I tweak the CSS and JavaScript pretty often, and don’t want those file types to be cached as long.

Add the following code to your .htaccess file to set your Cache-Control and Expires headers, adjusting the date to be one year from today.

# Set Cache-Control and Expires headers
<filesMatch "\\.(ico|pdf|flv|jpg|jpeg|png|gif|swf|mp3|mp4)$">
Header set Cache-Control "max-age=2592000, private"
Header set Expires "Sun, 17 July 2011 20:00:00 GMT"
</filesMatch>
<filesMatch "\\.(css|css.gz)$">
Header set Cache-Control "max-age=604800, private"
</filesMatch>
<filesMatch "\\.(js|js.gz)$">
Header set Cache-Control "max-age=604800, private"
</filesMatch>
<filesMatch "\\.(xml|txt)$">
Header set Cache-Control "max-age=216000, private, must-revalidate"
</filesMatch>
<filesMatch "\\.(html|htm)$">
Header set Cache-Control "max-age=7200, private, must-revalidate"
</filesMatch>

Removing ETags in .htaccess

Most sources recommend simply removing ETags if they are not required.

Entity tags (ETags) are a mechanism that web servers and browsers use to determine whether the component in the browser’s cache matches the one on the origin server.

If you’re not taking advantage of the flexible validation model that ETags provide, it’s better to just remove the ETag altogether.

http://developer.yahoo.com/performance/rules.html#etags

Add the following code to your .htaccess file to remove ETag headers.

# Turn off ETags
FileETag None
Header unset ETag

Set Expires headers with ExpiresByType (optional)

If your host has the mod_expires module enabled, you can specify Expires headers by file type. Godaddy does not have this module enabled.

# Set Expires headers
ExpiresActive On
ExpiresDefault "access plus 1 year"
ExpiresByType text/html "access plus 1 second"
ExpiresByType image/gif "access plus 2592000 seconds"
ExpiresByType image/jpeg "access plus 2592000 seconds"
ExpiresByType image/png "access plus 2592000 seconds"
ExpiresByType image/x-icon "access plus 2592000 seconds"
ExpiresByType text/css "access plus 604800 seconds"
ExpiresByType text/javascript "access plus 604800 seconds"
ExpiresByType application/x-javascript "access plus 604800 seconds"

Removing the Last-Modified header in .htaccess (optional)

I’m following Google’s instructions and not removing the Last-Modified header, but if you wanted to do so, you could use:

# Remove Last-Modified header
Header unset Last-Modified

Busting the cache when files change

What happens when you change files and need to force browsers to load the new files? Christian Johansen offers two methods in his post on Using a far future expires header.