Tag Archives: mysql

I like Akismet, and it’s undeniably effective in stopping the vast majority of spam, but it adds a huge number of comments to the database and a very small percentage of comments still get through to my moderation queue.

It’s annoying to find comments in my moderation queue, but what I really object to is the thousands of records that are added to the database each month that I don’t see.

In the screenshot below, January through April show very few spam comments being detected by Akismet. This is because I was using my cache-friendly method for reducing WordPress comment spam to block spam comments even before Akismet analyzed them.

Akismet stats

In May, I moved hosting providers to asmallorange.com and started with a fresh install of WordPress without implementing my custom spam method, which admittedly was not ideal because it involved changing core files. This left only Akismet between the spammers and my WordPress database. Since then, instead of 150 or fewer spam comments per month making it into my WordPress database, Akismet has been on pace to let in over 10,000.

So, in the spirit of fresh starts and doing things the right way, I created a WordPress plug-in that uses the same timestamp method. It’s actually exactly the same JavaScript and PHP code, just in plug-in form, so it’s not bound to any core files or theme files.
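
The plug-in’s actual source isn’t reproduced here, but the gist of a timestamp check is easy to sketch. In the hypothetical sketch below, a bit of JavaScript on the comment form writes the page-load time into a hidden field (which is what keeps the method cache-friendly), and PHP rejects any comment that arrives without the field or too quickly after it was set. The field name, threshold, and hook are my illustrative assumptions, not the plugin’s actual code.

// Hypothetical sketch of a timestamp-based spam check, hooked in
// before Akismet ever sees the comment. A JavaScript snippet on the
// comment form writes the current Unix time into a hidden input
// (here called "comment_timestamp"); most spam bots don't run
// JavaScript, so for them the field stays empty.
function ardamis_timestamp_check($commentdata) {
    $minimum_seconds = 5; // illustrative threshold
    // (A real implementation would need to tolerate some
    // client/server clock skew.)

    // No timestamp at all means no JavaScript ran: reject.
    if (empty($_POST['comment_timestamp'])) {
        wp_die('Comment rejected as probable spam.');
    }

    // Submitted faster than a human could type a comment: reject.
    if (time() - (int) $_POST['comment_timestamp'] < $minimum_seconds) {
        wp_die('Comment rejected as probable spam.');
    }

    return $commentdata;
}
add_filter('preprocess_comment', 'ardamis_timestamp_check');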

I just picked up an old Dell Precision 690 workstation, which I intend to turn into a file server, a Windows IIS server, and an Ubuntu LAMP server. This monster was built in 2006, but it still has some neat specs and tons of capacity (7 PCI slots, 4 hard drive bays, etc.), should I want to expand further.

Dell Precision 690

Dell Precision 690 Workstation

The main specs

CPU: Dual Core Intel Xeon 5060 3.2GHz, 4M Cache, 1066 MHz FSB
RAM: 2GB DDR2 PC2-5300, CL=5, Fully Buffered, ECC, DDR2-667
HD: SAS Fujitsu MAX3073RC 73GB, 15000 RPM, 16MB Cache
Video: Nvidia Quadro NVS 285 PCI-Express, 128MB

This is not a normal tower

Right away, the size of this thing suggests it isn’t a normal tower. It comes up to about my knee and weighs 70 lbs. It feels like it’s made with heavier gauge steel than the typical chassis, but that may be me projecting.

I immediately shopped around for more RAM, obviously. 2GB seems a little thin, even by 2006 standards, considering how high-end everything else is. The mainboard has 8 slots and supports up to 32GB, but I figure 6GB is a safe place to start.

The workstation has three enormous fans, like, big-as-your-hand big. Running it with the chassis open causes some sort of thermal protection system to kick in, spinning the fans up to the point that they blow stuff on the floor halfway across the room.

The CPU has a big, passive heat sink with six copper pipes and sits between two of those fans. I’m tempted to buy a second CPU, but I’ll hold off.

I’m still on the fence about the SAS drive. It should be super fast, but I’m a little spoiled by the SSD in my machine at work, so it’s hard to get excited about a mechanical drive, even one running at 15k RPM.

The Nvidia Quadro card is also fanless, and has a bizarre DMS-59 connector. An adapter converts the DMS-59 connector into two DVI outputs.

For a recent project, I needed to create a form that would look up people’s names in a MySQL database, but I wanted to use a single input field. To make it easy on the users of the form, I wanted the input field to accept names in either “Firstname Lastname” or “Lastname, Firstname” format, and I wanted it to autocomplete matches as the users typed, including when they typed both names separated by a space or by a comma followed by a space.

The Ajax lookup was quick work with jQuery UI’s Autocomplete widget. The harder part was figuring out the simplest table structure and an appropriate SQL query.

A flawed beginning

My people table contains a “first_name” column and a “last_name” column, nothing uncommon there. To get the project out the door, I wrote a PHP function that ran two ALTER TABLE queries on the people table to create two additional columns for pre-formatted strings (column “firstlast”, to be formatted as “Firstname Lastname”, and column “lastfirst”, to be formatted as “Lastname, Firstname”), added indexes on these columns, and then walked through each record in the table, populating the new fields. I then wrote a very straightforward SQL query to perform a lookup on both fields. The PHP and query looked something like this:

// The jQuery UI Autocomplete widget passes the user input as a value for the parameter "name"
$name = $_GET['name'];

// This SQL query uses argument swapping
$query = sprintf("SELECT * FROM people WHERE (`firstlast` LIKE '%1\$s' OR `lastfirst` LIKE '%1\$s') ORDER BY `lastfirst` ASC",
	mysql_real_escape_string($name . "%", $link));

This was effective, accurate, and pretty fast, but the addition of columns bothered me and I didn’t like that I needed to run a process to generate those pre-formatted fields each time a record was added to the table (or if a change was made to an existing record). One possible alternative was to watch the input and match either lastname or firstname until the user entered a comma or a space, then explode the string on the comma or space and search more precisely. Once a comma or a space was encountered, I felt pretty sure that I would be able to accurately determine which part of the input was the first name and which was the last name. But this had that same inefficient, clunky bad-code-smell as the extra columns. (Explode is one of those functions that I try to avoid using.) Writing lots of extra PHP didn’t seem necessary or right.

I’m much more comfortable with PHP than with MySQL queries, but I realize that one can do some amazing things within the SQL query, and that it’s probably faster to use SQL to perform some functions. So, I decided that I’d try to work up a query that solved my problem, rather than write more lines of PHP.

CONCAT_WS to the rescue

I Googled around for a bit and settled on using CONCAT_WS to concatenate the first names and last names into a single string to be matched, but found it a bit confusing to work with. I kept trying to use it to create an alias, “lastfirst”, and then use the alias in the WHERE clause, which doesn’t work, or I was getting the literal column names back instead of the values. Eventually, I hit upon the correct usage.

The PHP and query now look like this:

// The jQuery UI Autocomplete widget passes the user input as a value for the parameter "name"
$name = $_GET['name'];

// This SQL query uses argument swapping
$query = sprintf("SELECT *, CONCAT_WS(', ', `last_name`, `first_name`) AS lastfirst FROM people WHERE (CONCAT_WS(', ', `last_name`, `first_name`) LIKE '%1\$s' OR CONCAT_WS(' ', `first_name`, `last_name`) LIKE '%1\$s') ORDER BY lastfirst ASC",
	mysql_real_escape_string($name . "%", $link));

The first instance of CONCAT_WS isn’t needed for the lookup; it lets me order the results alphabetically and gives me an array key of “lastfirst” with a value of the person’s name already formatted as “Lastname, Firstname”, so I don’t have to do that later with PHP. The lookup itself comes from the two instances of CONCAT_WS in the WHERE clause. I haven’t done any performance measuring here, but the results of the lookup get back to the user plenty fast, if not just as quickly as with the dedicated-columns method.

The result of the query is output back to the page as JSON-formatted data for use in the jQuery Autocomplete.
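
For completeness, the output step might look something like the sketch below, where $result is assumed to be the resource returned by running $query through mysql_query() (jQuery UI’s Autocomplete accepts an array of objects with “label” and “value” properties):

// Sketch: return the matches as JSON for the Autocomplete widget.
// Assumes $result came from mysql_query($query, $link).
$matches = array();
while ($row = mysql_fetch_assoc($result)) {
    $matches[] = array(
        'label' => $row['lastfirst'], // e.g. "Smith, Bob"
        'value' => $row['lastfirst'],
    );
}
header('Content-Type: application/json');
echo json_encode($matches);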

The end result works exactly as I had hoped. A user of the form is able to type a person’s name in whatever way is comfortable to them, as “Bob Smith” or “Smith, Bob”, and the matches are found either way. The only thing it doesn’t do is output the matches back to the autocompleter in the same format that the user is using. But I can live with that for now.

While setting up a new MySQL account at a GoDaddy hosted web site, I kept getting an error when logging in to phpMyAdmin.

Error
#1045 – Access denied for user

For things like database usernames/passwords and other things that I’ll never have to remember or type, I like to use a long string of random characters. One excellent source of such strings is GRC’s Ultra High Security Password Generator. I typically use a subset of the 63 random alpha-numeric characters (a-z, A-Z, 0-9) in the bottom box. This gives me a good mix of uppercase, lowercase, and numbers, which satisfies the requirements of most password systems that require even minimum complexity.
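
(If you’d rather roll your own strings than visit GRC, something like the following quick-and-dirty PHP will do for database names. It’s my own illustration, and mt_rand() isn’t cryptographically strong, so I wouldn’t lean on it for passwords guarding anything important.)

// Illustration only: build a random alphanumeric string.
// mt_rand() is not a cryptographically secure source.
function random_alphanumeric($length = 16) {
    $chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    $string = '';
    for ($i = 0; $i < $length; $i++) {
        $string .= $chars[mt_rand(0, strlen($chars) - 1)];
    }
    return $string;
}

echo random_alphanumeric(20);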

So, I picked a string of characters for the database name and a different string for the password (making sure the password contained at least 1 uppercase character and 1 number), pasted them into the config.php file I was going to use on the project, then pasted them into the database setup form and created my database. No problem.

I gave it 10 or 15 minutes to get all set up and then launched phpMyAdmin. I copied and pasted the username and password from my config file into the login fields and wham, I got the #1045.

After much second guessing and more copying and pasting, all with no luck, I tried resetting the password back in the Hosting Control Center. I waited a few more minutes for good measure and tried again. Still, #1045 – Access denied for user.

Then it was time to Google, which turned up a thread full of people with the same experience at http://community.godaddy.com/groups/web-hosting/forum/topic/mysql-login-error-1045-access-denied-for-user/?sid&sp=1&topic_page=1&num=15.

Back in the Control Center, I noticed that the mixed case characters I’d used for the database/username had been converted to lowercase. So I tried using the lowercase version at phpMyAdmin and still no luck.

I submitted a support ticket, as recommended in the thread, and then called Customer Support for good measure.

The guy confirmed that the database was in good shape and that the last password reset took effect, then had me reset it again. And of course, when I tried to log into phpMyAdmin a moment later with the lowercase username, it went right in.

The fix (or a plausible explanation, at least)

The lesson learned here is that even though the new MySQL database setup form will accept mixed-case characters as the database name/username, it will silently convert them to lowercase on you. The phpMyAdmin login, then, is case sensitive, so you may want to copy and paste from the Control Center into phpMyAdmin to be sure you’re feeding it the correct username.

I’m running XAMPP 1.7.4 [PHP: 5.3.5] (not as a service) on 64-bit Windows 7 Professional.

I installed XAMPP to E:\xampp, and I have pinned the XAMPP Control Panel (xampp-control.exe) to the taskbar for easier access, but starting up xampp-control.exe from that shortcut throws an error:

XAMPP Control

XAMPP Component Status Check failure [3].

Current directory: E:\xampp

Run this program only from your XAMPP root directory.

[OK] [Cancel]

Strangely enough, I get this error even when running xampp-control.exe from my XAMPP root directory, which really is E:\xampp.

The last post in the thread at http://www.apachefriends.org/f/viewtopic.php?f=16&t=44320&sid=a41029c6a36bbf5b3bb5817f37842340&start=60 offers a simple solution: change the Install_Dir value under HKEY_LOCAL_MACHINE to point to C:\xampp. According to the thread, the error message is due to a bug where the Install_Dir is checked against a hard-coded path on C:\. That may or may not be the case, but the suggested workaround seems to be effective.

Here’s a registry merge for Windows 7 64-bit that will make the change for you.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\xampp]
"Install_Dir"="C:\\xampp"

Now xampp-control.exe launches without the error, and I haven’t noticed anything (PHP, MySQL, etc.) not working because of the bogus path.

So I finally watched The Social Network over the weekend, and it’s made me feel jealous and a bit guilty.

In a meager effort to console myself for so far failing to be a billionaire, I’m assembling the short list of web-application type things I’ve built here.

  1. A dice roller: rollforit. Enter a name, create a room, invite your friends, and start rolling dice. It’s for people who want to play pen-and-paper, tabletop RPG dice games with their distant friends.
  2. A URL shortener: Minifi.de. Minifi.de comes with an API and a bookmarklet. It really works, too! The technical explanation has more details.
  3. A social networking site: Snapbase. Snapbase is a social site that shows you what’s going on in your city or anywhere in the world as pictures are uploaded by your friends and neighbors. The application extracts location information from the EXIF data embedded in images and displays recent images taken near your present location.
  4. A trouble-ticketing system for an IT help desk or technical support center. It’s really pretty extensive, with asset management, user accounts, salted encrypted passwords, and all sorts of nifty things. I really must write a full description of it at some point, but until then, the documentation is the next best thing.
  5. An account-based invoice tracking and access system for grouping invoices according to clients, then sharing invoice history with those clients and allowing them to easily pay outstanding invoices via Paypal.
  6. An account-based invoice access system where clients can view paid and unpaid invoices, and even easily pay an outstanding invoice via Paypal. I actually use this almost every day.
  7. A simple method for protecting a download using a unique URL that can be emailed to authorized users. The URL can be set to expire after a certain amount of time or any number of downloads.
  8. An update to the above download protection script to protect multiple downloads, generate batches of keys, leave notes about who received the key, the ability to specify per-key the allowable number of downloads and age, and some basic reporting.
  9. An HTML auction template generator called Simple Auction Wizard. It helps you create HTML auction templates for eBay, and uses SWFUpload and tinyMCE.

I have another project in the works that promises to be more financially viable, but the most clever thing on that list is Snapbase. It’s in something akin to alpha right now; barely usable. I really wish I had the time to pursue it.

I’ve written a few tutorials lately on how to reduce page load times. While I use Google’s Page Speed Firefox/Firebug plugin for evaluating pages for load times, there are times when I want a second opinion, or want to point a client to a tool. This post is a collection of links to online tools for testing web page performance.

Page Speed Online

http://pagespeed.googlelabs.com/

Google’s wonderful Page Speed tool, once only available as a Firefox browser add-on, finally arrives as an online tool. Achieving a high score (ardamis.com scores 96/100) should be on every web developer’s list of things to do before the culmination of a project.

Enter a URL and Page Speed Online will run performance tests based on a set of best practices known to reduce page load times.

  • Optimizing caching – keeping your application’s data and logic off the network altogether
  • Minimizing round-trip times – reducing the number of serial request-response cycles
  • Minimizing request overhead – reducing upload size
  • Minimizing payload size – reducing the size of responses, downloads, and cached pages
  • Optimizing browser rendering – improving the browser’s layout of a page

WebPagetest

http://www.webpagetest.org/

WebPagetest is an excellent application for users who want the same sort of detailed reporting that one gets with Page Speed.

  • Load time speed test on first view (cold cache) and repeat view (hot cache), first byte and start render
  • Optimization checklist
  • Enable keep-alive, HTML compression, image compression, cache static content, combine JavaScript and CSS, and use of CDN
  • Waterfall
  • Response headers for each request

Load Impact

http://loadimpact.com/pageanalyzer.php

Load Impact is an online load testing service that lets you load- and stress test your website over the Internet. The page analyzer analyzes your web page performance by emulating how a web browser would load your page and all resources referenced in it. The page and its referenced resources are loaded and important performance metrics are measured and displayed in a load-bar diagram along with other per-resource attributes such as URL, size, compression ratio and HTTP status code.

ByteCheck

http://www.bytecheck.com/

ByteCheck is a super minimal site that returns your page’s all-important time to first byte (TTFB). Time to first byte is the time it takes for a browser to start receiving information after it has made the request to the server, and it is responsible for a visitor’s first impression that a page is fast- or slow-loading.
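
If you want to check TTFB yourself without a third-party site, PHP’s cURL extension can report it. The sketch below is my own rough approach, not how ByteCheck measures it.

// Rough sketch: measure time to first byte with PHP's cURL extension.
$ch = curl_init('http://ardamis.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // capture instead of echoing
curl_exec($ch);

// Seconds from the start of the request until the first byte arrived.
$ttfb = curl_getinfo($ch, CURLINFO_STARTTRANSFER_TIME);
curl_close($ch);

printf("Time to first byte: %.3f seconds\n", $ttfb);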

Web Page Analyzer

http://websiteoptimization.com/services/analyze/

My opinion is that the Web Page Analyzer report is good for beginners without much technical knowledge of things like gzip compression and Expires headers. It’s a bit dated, and is primarily concerned with basics like how many images a page contains. It tells you how fast you can expect your page to load for dial-up visitors, which strikes me as quaint and not particularly useful.

  • Total HTTP requests
  • Total size
  • Total size per object type (CSS, JavaScript, images, etc.)
  • Analysis of number of files and file size as compared to recommended limits

The Performance Grader

http://www.joomlaperformance.com/component/option,com_performance/Itemid,52/

This is another simplistic analysis of a site, like Web Page Analyzer, that returns its analysis in the form of pass/fail grades on about 14 different tests. I expect that it would be useful for developers who want to show a client a third party’s analysis of their work, if the client is not terribly technically savvy.

One unique thing about this tool, though, is that it totals up the size of all images referenced in CSS files (even those that the current page isn’t using).

  • HTML Size
  • Total Size
  • Total Requests
  • Generation Time
  • Number of Hosts
  • Number of Images
  • Size of Images
  • Number of CSS Files
  • Size of CSS Files
  • Number of Script Files
  • Size of Script Files
  • HTML Encoding
  • Valid HTML
  • Frames

In August 2010, I described a simple method for dramatically reducing the number of spam comments that are submitted to a WordPress blog. The spam comments are rejected before they are checked by Akismet, so they never make it into the database at all.

Now, a few months later, I’m posting a screenshot of the Akismet stats graph from the WordPress dashboard showing the number of spam comments identified by Akismet before and after the system was implemented.

Akismet stats for August – December 2010

The spike in spam comments detected around November 3rd occurred after an update to WordPress overwrote my altered wp-comments.php file. I replaced the file and the spam dropped back down to single digits per day.

A little over a year ago, I wrote a post about a PHP script I had created for protecting a download using a unique URL. The post turned out to be pretty popular, and many of the comments included requests to extend the script in useful ways. So, I’ve finally gotten around to updating the script to generate multiple URLs (up to 20) at a time, to allow different files to be associated with different keys, and to allow brief notes to be attached to the download key.

I’ve also added a simple page that prints out a list of all of the keys generated, the date and time that each key was created, the filename of the download on the server that the key accesses, the number of times the key was used, and any attached note. This should make it easier to generate gobs of keys, drop them into an Excel spreadsheet, and help the files’ owner keep track of who’s getting which file, and how often.

The scripts themselves are a little more involved this time around, but the general idea is the same. A unique key is generated and combined with a URL that allows access to a single file on the server. Share the URL/key instead of the URL to the file itself to allow a visitor to download the file, but not to know the location of the file. The key will be valid for a certain length of time and number of downloads, and will stop working once the first limiting condition is met. This should prevent unauthorized downloading due to people sharing the keys.

How it works

There are six main components to this system:

  1. the MySQL database that holds each key, the key creation date and time, the maximum age of the key, the number of times the key has been used, the maximum times the key may be used, the file associated with the key, and the note attached to the key, if any
  2. a generatekey.php page that generates the keys and outputs the corresponding unique URLs
  3. a download.php page that accepts the key, checks its validity, and either initiates the download or rejects the key as invalid
  4. a report.php page that returns all of the data in the database
  5. a config.php file that contains variables such as number of downloads allowed, the maximum allowable age of the key, and the filenames of the downloads, along with the database connection information
  6. the .zip file(s) to be protected

The files

The files, along with two example downloads, are available for download as a .zip file.

Download the protecting multiple downloads PHP script

The MySQL database

Using whatever method you’re comfortable with, create a new MySQL database named “download” and add the following table:

CREATE TABLE `downloadkeys` (
  `uniqueid` varchar(12) NOT NULL default '',
  `timestamp` INT UNSIGNED,
  `lifetime` INT UNSIGNED,
  `maxdownloads` SMALLINT UNSIGNED, 
  `downloads` SMALLINT UNSIGNED default '0',
  `filename` varchar(60) NOT NULL default '',
  `note` varchar(255) NOT NULL default '',
  PRIMARY KEY (uniqueid)
);
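
To show how the table gets populated, here is a stripped-down sketch of the INSERT at the heart of generatekey.php. The 12-character key generation and the URL format are my simplifications, not the script verbatim.

// Stripped-down sketch of creating one key. $lifetime, $maxdownloads,
// $realfilename, $note, and $link are assumed to come from config.php
// and the key-generation form.
$uniqueid = substr(md5(uniqid(mt_rand(), true)), 0, 12); // 12-character key

$query = sprintf("INSERT INTO `downloadkeys` (`uniqueid`, `timestamp`, `lifetime`, `maxdownloads`, `filename`, `note`) VALUES ('%s', %d, %d, %d, '%s', '%s')",
	$uniqueid,
	time(),          // key creation time
	$lifetime,       // maximum age, in seconds
	$maxdownloads,   // maximum number of downloads
	mysql_real_escape_string($realfilename, $link),
	mysql_real_escape_string($note, $link));
mysql_query($query, $link);

// The unique URL to share (the parameter name is illustrative).
echo 'http://example.com/download.php?id=' . $uniqueid;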

How to use the scripts

The scripts require a little setup before they’re ready to be used, so open config.php in your text editor of choice.

Change the values for $db_host, $db_username, $db_password, $db_name to point to your database.

Set the variable $maxdownloads equal to the maximum number of downloads (actually, the number of page loads).

Set the variable $lifetime equal to the keys’ viable duration in seconds (86400 seconds = 24 hours).

Set the variable $realfilenames to the names of the actual download files on the server (this is optional; you can also use a single filename or just leave it as empty double-quotes: “”). If you have more than one file to protect, enter the names as a comma-separated list and the script will create a drop-down menu as the Filename field. If you leave the variable blank, the form will display an empty input box as the Filename field.

Set the variable $fakefilename to anything – this is what the visitor’s file will be named when the download is initiated.
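
Put together, a filled-out config.php might look something like this (the variable names are from the script; the values are examples only):

// Example config.php values -- substitute your own.
$db_host     = 'localhost';
$db_username = 'download_user';
$db_password = 'secret';
$db_name     = 'download';

$maxdownloads = 3;          // downloads (page loads) allowed per key
$lifetime     = 86400;      // key lifetime in seconds (24 hours)

// Comma-separated list of real files on the server, a single
// filename, or "" for a plain input box.
$realfilenames = 'product-v1.zip,product-v2.zip';

// The filename the visitor sees when the download starts.
$fakefilename = 'download.zip';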

I would strongly recommend renaming generatekey.php, as anyone who can view it will be able to create unlimited numbers of keys, and worse, they’ll be able to see the filenames (if you set them in config.php). I would also recommend that the directory you put these files into, and each directory on your site (/images, /css, /js, etc.), contain an index.html file. This is a simple security measure that will prevent visitors from snooping around a directory and viewing its contents (though access to the directory contents is usually prohibited by a setting on the server).

Place all the PHP scripts and your .zip file(s) into the same directory on your server.

That’s all there is to it. Whenever you want to give someone access to the download, visit the generatekey.php page and fill out the form. It will generate a key code, save it to a database, and print out a unique link that you can copy and paste into an email or whatever. The page that the unique link points to checks to see if the key code is legitimate, then checks to see if the code is less than X hours old, then checks to see if it has been used less than X times. The visitor will get a descriptive message for the first unmet condition and the script will terminate. If all three conditions are met, the download starts automatically.
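
In outline, those three checks in download.php amount to something like the following condensed sketch (the query-string parameter name is my assumption):

// Condensed sketch of download.php's validity checks.
$key = mysql_real_escape_string($_GET['id'], $link); // parameter name assumed
$result = mysql_query("SELECT * FROM `downloadkeys` WHERE `uniqueid` = '$key'", $link);
$row = mysql_fetch_assoc($result);

if (!$row) {
    die('Invalid download key.');                    // 1: key must exist
}
if (time() - $row['timestamp'] > $row['lifetime']) {
    die('This download key has expired.');           // 2: key must be young enough
}
if ($row['downloads'] >= $row['maxdownloads']) {
    die('This download key has reached its download limit.'); // 3: uses must remain
}

// All checks passed: count this use, then initiate the download.
mysql_query("UPDATE `downloadkeys` SET `downloads` = `downloads` + 1 WHERE `uniqueid` = '$key'", $link);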

Errors and issues

Note: The download will not initiate automatically, and will actually be output as text on the page, if the download.php page is changed to send headers or any output to the browser. Be careful when making modifications or incorporating this script into another page.

Check the HTTP headers (Google for an online service that does this, or install the LiveHTTPHeaders Firefox plugin) of the download link. If the script is working correctly, you should see Content-Transfer-Encoding: binary and Content-Type: application/octet-stream in the headers. If you’re getting a page of text instead of the zip file, you’ll probably see Content-Type: text/html.

Example HTTP headers for a correctly working download

If the script is working correctly, the HTTP headers will look something like this:

HTTP/1.1 200 OK
Date: Sun, 20 Jun 2010 13:31:50 GMT
Server: Apache
Cache-Control: must-revalidate, post-check=0, pre-check=0, private
Content-Disposition: attachment; filename="bogus_download_name.zip"
Content-Transfer-Encoding: binary
Pragma: public
Content-Length: 132
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/octet-stream
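
Those headers are produced by a handful of header() calls made before any other output. A sketch consistent with the headers above (not the script verbatim) looks like this:

// Sketch of the header() calls behind a working download. These must
// run before ANY other output -- even a stray space before <?php will
// turn the zip into a page of text.
$realfile = 'example-download.zip';    // actual file on the server
$fakename = 'bogus_download_name.zip'; // name the visitor sees

header('Cache-Control: must-revalidate, post-check=0, pre-check=0, private');
header('Content-Disposition: attachment; filename="' . $fakename . '"');
header('Content-Transfer-Encoding: binary');
header('Pragma: public');
header('Content-Length: ' . filesize($realfile));
header('Content-Type: application/octet-stream');

readfile($realfile); // stream the file and stop
exit;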

I’ve finally opened a Twitter account, so you can follow me at http://twitter.com/ardamis. As a social experiment, it’s interesting to watch and discuss, but I haven’t really participated much. On the other hand, I’m very interested in the phenomenon of URL shortening. So, without too much trouble, I put together yet another URL shortening service: Minifi.de (‘minified’, in the jargon). It does pretty much the same thing as tinyurl.com, bit.ly and is.gd – you enter a long URL and it returns a shorter one. I’m about 50% done with the API (it works, so long as the URL is valid), so if anyone knows of any clients that allow the user to specify a shortening service, I’d like to test that functionality. I have plans to make it account-based, so that you can track usage statistics and such, but that will only happen after I’m confident that the thing will survive in the wild.

So, feel free to give it a whirl, but know that it’s still in development and that I’ll probably have to wipe the database a few more times before I get it just right.
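
For the curious, the usual trick behind this sort of service (and I’m not claiming it’s exactly how Minifi.de works) is to store the long URL in a table and base-62 encode its auto-increment row ID to get the short code:

// Illustration of the common base-62 shortening approach, not
// necessarily Minifi.de's implementation. The long URL is stored
// in a table and its auto-increment ID becomes the short code.
function base62_encode($id) {
    $chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $code = '';
    do {
        $code = $chars[$id % 62] . $code;
        $id = (int) floor($id / 62);
    } while ($id > 0);
    return $code;
}

echo base62_encode(125); // row 125 becomes the two-character code "21"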