Tag Archives: programming

In our corporate environment, our Windows 7 workstations can be powered off or restarted remotely in order to deploy updates, patches, or new software. For those of us running virtual machines in VMware Workstation, this means a running guest operating system would experience an abrupt power-off as the host machine is reset. At the very minimum, this causes the ‘Windows was not shut down properly’ message to appear when the guest OS is powered on, and it may cause serious problems with the integrity of the guest OS or the virtual machine files.

I wanted to improve the situation through the use of shutdown/logoff and startup/logon scripts on the host and the vmrun command line utility that ships with VMware Workstation and VMware Server, and I had three goals in mind.

  1. Any running guest OS would be allowed to shut down or suspend before the host powered off
  2. An event would be written to the Application log on the host for each guest that was shut down or suspended
  3. A complementary process would start or resume each guest that was running when the host restarted

The VBScripts are written for use on a 64-bit Windows 7 host.

The challenge of correct timing

I soon ran into a problem when trying to use Local Group Policy to deploy the shutdown/logoff script on my Windows 7 host. The order of events is such that the shutdown/logoff process is halted by the still-running vmware.exe process (the VMware Workstation UI). I’ve added some notes about this behavior to the bottom of the post, but I have not yet solved this problem.

A word about networking

If the network adapter in the guest OS is not reconnected upon resuming from suspend (in Windows, this can be resolved with ipconfig /renew), it may be that the VMware Tools scripts are not running at start up/resume. Disconnecting from the network is a normal process when the VM receives a suspend command with a soft parameter. I have found that I can ensure that the network adapter is reconnected upon resuming by changing the Power Options for the VM to use “Start Up Guest” instead of “Power On”.

The shutdown/logoff script

This is what I came up with for the shutdown/logoff script.

' This script passes the suspend command to each running VMware virtual machine, allowing it to gracefully sleep/hibernate
' It also saves the list of running VMs to a text file in %TEMP%, which may be parsed by a startup/logon script to resume the VMs
' It can be used as a shutdown/logoff script
' //ardamis.com/2012/03/08/managing-vmware-workstation-virtual-machines-with-vbscript/

Option Explicit

Dim objShell, objScriptExec, objFSO, WshShell, strRunCmd
Dim TEMP, strFileName, vmList, objFile, ForWriting, result, textLines, textLine, isFirstLine

'Initialize the objShell
Set objShell = CreateObject("WScript.Shell")

'Execute vmrun and create the list of running virtual machines
Set objScriptExec = objShell.Exec("""C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe"" list")

'Write the list to a variable
vmList = objScriptExec.StdOut.ReadAll()

'Debug
'WScript.Echo vmList

'Initialize the wshShell
Set WshShell = WScript.CreateObject("WSCript.shell")

TEMP = WshShell.ExpandEnvironmentStrings("%TEMP%")

'Enter the path to the file that will hold the names of the running VMs
strFileName = TEMP & "\vms.txt"

'Debug
'WScript.Echo strFileName

'Initialize the objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")

'Create the file
Set objFile = objFSO.CreateTextFile(strFileName)

'Write the list to the file
objFile.Write vmList

'Close the file
objFile.Close

'Split the list into lines
textLines = Split(vmList,vbCrLf)

'Loop through the lines
For Each textLine in textLines

	'Check whether this line is the "Total running VMs:" header that vmrun prints first
	isFirstLine = StrComp(Mid(textLine, 1, 18), "Total running VMs:")

	'If the line is not blank and is not the header line
	If Len(textLine) > 0 And isFirstLine <> 0 Then
	
		'Write to the application log
		WshShell.LogEvent 4, "Event: VMware is attempting to suspend the VM at " & textLine

		'Save the command as a variable
		strRunCmd = """C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe"" -T ws suspend """ & textLine & """ soft"

		'Run the command
		result = WshShell.Run(strRunCmd, 0, True)

		'Write to the application log
		If result = 0 Then 
			WshShell.LogEvent 4, "Event: VMware successfully suspended the VM at " & textLine
		Else
			WshShell.LogEvent 1, "Event: VMware was unable to suspend the VM at " & textLine
		End If

'Debug
'WScript.Echo result
		
'Debug
'WScript.Echo textLine

	End If

Next

The vms.txt file that the script creates will contain something like the following, if it finds a running VM:

Total running VMs: 1
C:\Virtual Machines\Windows XP Professional\Windows XP Professional.vmx
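
The header-skipping logic is easier to see in isolation. The following is a minimal sketch in Python rather than VBScript, assuming the output always has the "Total running VMs:" header followed by one .vmx path per line, as in the sample above. The function name parse_vmrun_list is hypothetical.

```python
def parse_vmrun_list(output):
    """Return the .vmx paths from `vmrun list` output, skipping the header."""
    paths = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "Total running VMs: N" header line.
        if not line or line.startswith("Total running VMs:"):
            continue
        paths.append(line)
    return paths


# Sample text in the same shape as the vms.txt contents shown above.
sample = (
    "Total running VMs: 1\r\n"
    "C:\\Virtual Machines\\Windows XP Professional\\Windows XP Professional.vmx\r\n"
)
print(parse_vmrun_list(sample))
```

An empty result corresponds to the "Total running VMs: 0" case, so a caller can treat "no file" and "no VMs" the same way if it wants to.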

I have chosen to suspend the virtual machine, rather than shut it down, because I don’t want to lose any work that may be unsaved. The official explanation of the suspend power command from VMware:

Suspends a virtual machine (.vmx file) or team (.vmtm) without shutting down, so local work can resume later. The soft option suspends the guest after running system scripts. On Windows guests, these scripts release the IP address. On Linux guests, the scripts suspend networking. The hard option suspends the guest without running the scripts. The default is to use the powerType value specified in the .vmx file, if present.
To resume virtual machine operation after suspend, use the start command. On Windows, the IP address is retrieved. On Linux, networking is restarted.
http://www.vmware.com/support/developer/vix-api/vix110_vmrun_command.pdf
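
For comparison, here is a rough Python sketch of how the same suspend and start commands could be assembled as argument lists, which sidesteps the nested quoting the VBScript needs for paths with spaces. The install path is copied from the scripts in this post and is an assumption about the host. The helper names are hypothetical.

```python
import subprocess

# Assumed install path, taken from the VBScript in this post.
VMRUN = r"C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe"


def build_suspend_command(vmx_path, mode="soft"):
    """Build the argument list for suspending one Workstation VM."""
    if mode not in ("soft", "hard"):
        raise ValueError("mode must be 'soft' or 'hard'")
    return [VMRUN, "-T", "ws", "suspend", vmx_path, mode]


def build_start_command(vmx_path):
    """Build the argument list for starting or resuming one Workstation VM."""
    return [VMRUN, "-T", "ws", "start", vmx_path]


def run(command):
    """Run a vmrun command and return its exit code (0 is treated as success)."""
    return subprocess.run(command).returncode
```

Passing a list to subprocess.run means each path is handed over as a single argument, so no manual quoting is required.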

The startup/logon script

This is the startup/logon script that complements the shutdown/logoff script.

' This script reads a list of VMware virtual machines from a text file and passes the start command to each VM, allowing it to resume from sleep/hibernate/shutdown
' It can be used as a startup/logon script
' //ardamis.com/2012/03/08/managing-vmware-workstation-virtual-machines-with-vbscript/

Option Explicit

Dim objFSO, WshShell, strRunCmd
Dim TEMP, strFileName, objTextStream, vmList, ForReading, result, textLines, textLine, isFirstLine

'Initialize the wshShell
Set WshShell = WScript.CreateObject("WSCript.shell")

TEMP = WshShell.ExpandEnvironmentStrings("%TEMP%")

'Enter the path to the text file that will hold the names of the running VMs
strFileName = TEMP & "\vms.txt"

WshShell.LogEvent 4, "Event: VMware is attempting to find a list of VMs to restart in " & strFileName

'Initialize the objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")

'Check to see if the text file exists
If objFSO.FileExists(strFileName) Then

	'Open the text file
	Set objTextStream = objFSO.OpenTextFile(strFileName, 1)
	
	'Read the contents into a variable
	vmList = objTextStream.ReadAll()

'Debug
'WScript.Echo vmList

	'Close the text file
	objTextStream.Close

	'Split the list into lines
	textLines = Split(vmList,vbCrLf)

	'Loop through the lines
	For Each textLine in textLines

		'Compare the first line in the file to the text "Total running VMs: 0"
		isFirstLine = StrComp(Mid(textLine, 1, 20), "Total running VMs: 0")
		
		'Check to see if the first line of the text file reports 0 running VMs
		If isFirstLine = 0 Then
		
			'Write to the application log
			WshShell.LogEvent 4, "Event: VMware found no running VMs were enumerated in " & strFileName

		End If

		'Check whether this line is the "Total running VMs:" header
		isFirstLine = StrComp(Mid(textLine, 1, 18), "Total running VMs:")
		
		'If the line is not blank and is not the header line
		If Len(textLine) > 0 And isFirstLine <> 0 Then
		
			'Write to the application log
			WshShell.LogEvent 4, "Event: VMware is attempting to start the VM at " & textLine

			'Save the command as a variable (the .vmx path is wrapped in quotes because it may contain spaces)
			strRunCmd = """C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe"" -T ws start """ & textLine & """"

			'Run the command
			result = WshShell.Run(strRunCmd, 0, True)

			'Write to the application log
			If result = 0 Then 
				WshShell.LogEvent 4, "Event: VMware successfully started the VM at " & textLine
			Else
				WshShell.LogEvent 1, "Event: VMware was unable to start the VM at " & textLine
			End If

'Debug
'WScript.Echo result
			
'Debug
'WScript.Echo textLine

		End If

	Next

Else
	WshShell.LogEvent 4, "Event: VMware did not find a list of VMs to restart at " & strFileName
End If

This script starts/resumes the virtual machine and launches the Workstation user interface.

Timing of the shutdown/logoff events

Using Group Policy shutdown/logoff scripts seemed a natural way to power off and resume the virtual machines, but there is a timing problem that prevents this from working as desired. Instead of running any logoff scripts immediately when the user chooses to log off, Windows first tries to close any open applications by ending running processes. When it encounters vmware.exe, which is the VMware Workstation GUI, it pauses the log off process and asks the user whether the log off should force the applications to close, or if the log off should be cancelled.

On Windows 7, the screen will dim and the programs that are preventing Windows from logging off the user or shutting down are listed.

Windows 7 - VMware Workstation prevents shutdown or logoff

1 program still needs to close:

(Waiting for) [VM name] – VMware Workstation
1 virtual machine is in use.

To close the program that is preventing Windows from logging off, click Cancel, and then close the program.
[Force log off] [Cancel]

As pointed out on the vmware.com community forums, this only happens when the Workstation UI process is running at the time.

We don’t support running Workstation a service. I assume you’re using some third-party tool for that?

Anyway, that error only appears if the Workstation UI is running when you try to log off. If you kill the UI process (vmware.exe) and let the VM run in the background, you shouldn’t get that. Alternatively you could try running VMware Player instead of VMware Workstation a service.
http://communities.vmware.com/message/1189261

Quitting the Workstation process and allowing the scripts to close out the actual VMs seemed like an acceptable compromise. It still required some user interaction on the host to prepare the guest to be powered off, but I figured that there may be ways to end the Workstation UI programmatically prior to the logoff.

I decided to consult the Workstation 7.1 manual:

You can set a virtual machine that is powered on to continue running in the background when you close a virtual machine or team tab, or when you exit Workstation. You can still interact with it through VNC or another service.
From the VMware Workstation menu bar, choose Edit > Preferences. On the Workspace tab, select Keep VMs running after Workstation closes and click OK.
http://www.vmware.com/pdf/ws71_manual.pdf

I found that if the VMware Workstation application is already closed, the shutdown/logoff proceeds smoothly and the scripts fire. But there is another problem. By the time the logoff script runs, the vmware-vmx.exe process (the actual virtual machine) has already been terminated, so the vmrun list command finds no running VMs and you end up with a vms.txt file that contains this:

Total running VMs: 0

At this point, running VMware Player like a service under the Local System account, which presumably would allow the VMs to keep running even while users on the host log out, becomes the best option. In theory, it avoids both problems: a) requiring the user to close the UI, and b) the vmware-vmx.exe process being ended as the user logs off. VMware Player is included with Workstation, but we’re not quite out of the woods yet. According to another VMware employee:

VMware Player is not built to run as a service. However, there are different discussions and possible solutions using srvany.exe.
If you google for site:vmware.com srvany player you will find some interesting posts for this issue.
http://communities.vmware.com/message/1595588

I followed through on this suggestion, and while it didn’t solve my problem, I’m including some detail here in the hopes that it will further someone else’s exploration.

To run an application as though it were a service, you need two executables from the Windows Server 2003 Resource Kit Tools:

  • Instsrv.exe: Service Installer
  • Srvany.exe: Applications as Services Utility

The Windows Server 2003 Resource Kit Tools are not officially supported on Windows 7, and in fact the installer will cause the Program Compatibility Assistant to warn that “This program has known compatibility issues”, but my observations seem to support other people’s reports that they work fine.

A Windows .Net magazine article from 2004 referencing Workstation 4.0 is still a good guide to follow in setting this up. My adjustments for using Player are below:

  1. Install the Windows Server 2003 Resource Kit Tools and reboot
  2. Locate srvany.exe (the default location is C:\Program Files (x86)\Windows Resource Kits\Tools\srvany.exe)
  3. Open an elevated command prompt and enter instsrv [service name] [srvany.exe location], using anything you want for the service name (ex: instsrv vmplayer "C:\Program Files (x86)\Windows Resource Kits\Tools\srvany.exe")
  4. Open an elevated instance of the Windows Services snap-in (services.msc), right-click the newly created service, choose Properties, select the Log On tab, and check the box next to “Allow service to interact with desktop”
  5. Open an elevated instance of Registry Editor (regedit.exe)
  6. Locate vmplayer.exe (the default location is C:\Program Files (x86)\VMware\VMware Workstation\vmplayer.exe)
  7. Navigate to your service’s key at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\[service name]
  8. Create a new subkey named Parameters under your service’s key
  9. Create a new String Value named Application under the Parameters key
  10. Double-click the Application value and enter the path to vmplayer.exe as the value’s data

You should now be able to start the vmplayer service (or whatever you chose to name it) from the Services snap-in.
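
The registry edits in steps 7 through 10 could also be written down as a .reg file. What follows is a hedged sketch in Python that only generates the text of such a file. It does not touch the registry, and the function name srvany_reg_file is hypothetical. The service name and vmplayer.exe path in the usage line are the example values from the steps above.

```python
def srvany_reg_file(service_name, application_path):
    """Return .reg file text that sets Parameters\\Application for a srvany service."""
    key = (
        "HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\services\\"
        + service_name
        + "\\Parameters"
    )
    # String values in a .reg file need backslashes and double quotes escaped.
    escaped = application_path.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        "[" + key + "]\r\n"
        '"Application"="' + escaped + '"\r\n'
    )


print(srvany_reg_file(
    "vmplayer",
    r"C:\Program Files (x86)\VMware\VMware Workstation\vmplayer.exe",
))
```

Saving and importing the result is left out on purpose. It seems wise to read the generated text before merging anything into the registry.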

But, we’re still not home free. This doesn’t magically allow any instance of VMware Player to persist through a user logoff (which is really what I was hoping to get).

The harsh reality set in when I came across this thread, wherein continuum (a guy with incredible insight into VMware) bursts the vm-as-a-service balloon:

there are 2 ways to run the service …
– run it with “local system account” plus “allow to interact …” checked
– run it as a user – needs the password of this user

In first case the VM starts after a user is logged in – the VM is visible and you can interact with it but you can NOT log off.
It use process vmplayer.exe plus vmware-vmx.exe.

In second case the VM is invisible and only process vmware-vmx.exe runs but no user has to be logged in
http://communities.vmware.com/message/1471897#1471897

What I need is the best of both worlds: a VMware GUI environment, be it Workstation or Player, that is able to load a VM when a user logs into the host, and at the same time is able to keep the VM running while that user logs off.

Ultimately, I’m left with the same feelings as expressed toward the end of the thread at http://communities.vmware.com/message/1402590: why should useful and highly sought-after functionality that is present in the free but no-longer-actively developed Server product be absent from the non-free and actively developed Workstation product?

The answer, if there is one, may be that VMware doesn’t want to get involved.

One of the nastier corner cases is, what happens if there is a failure suspending the VM? Do we decide the user really wanted to log off and forcibly kill the VM, or do we veto the log-off and go back to the user for input (which, if you are using a laptop, means closing the lid leaves the VM running and kills the battery)? What if the VM process crashes during this – who initiates the log-off then? What if the VM is busy doing something expensive (like disk consolidation) and cannot suspend? Getting involved in the log-off path is, realistically, just a mess of bugs.
http://communities.vmware.com/thread/233117

Final thoughts

As with pretty much anything I do, this is far from finished. I’m not ready to give up on the goal of using scripts to start and suspend VMs without any user interaction. But it seems that it’s going to be much more difficult than one might reasonably expect.

As for the scripts themselves, I’m slightly bothered by the empty command prompt window that is opened momentarily by objShell.Exec. I’m not sure that I like saving the list of running VMs to %TEMP%, where it may be deleted by other processes that clean that location at login/logout. But they are a good start, and they seem to serve their purpose.

7:21 PM 2/26/2012

I recently ran the spider at www.xml-sitemaps.com against www.ardamis.com and it returned a list of URLs that included a few pages with some suspicious-looking parameters. This is the second time I’ve come across these URLs, so I decided to document what was going on. The first time, I just cleared the cache, spidered the site to preload the cache, and confirmed that the spider didn’t encounter the pages. And then I forgot all about it. But now I’m mad.

Normally, a URL list for a WordPress site includes the various pages of the site, like so:

//ardamis.com/
//ardamis.com/page/2/
//ardamis.com/page/3/

But in the suspicious URL list, there are additional URLs for the pages directly off of the site’s root.

//ardamis.com/
//ardamis.com/?option=com_google&controller=..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F%2Fproc%2Fself%2Fenviron%0000
//ardamis.com/page/2/
//ardamis.com/page/2/?option=com_google&controller=..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F%2Fproc%2Fself%2Fenviron%0000
//ardamis.com/page/3/
//ardamis.com/page/3/?option=com_google&controller=..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F..%2F%2F%2Fproc%2Fself%2Fenviron%0000

This occurs only for the pagination of the main site’s pages. I did not find URLs containing the parameter ?option=com_google&controller= for any pages that exist under a category or tag, but that also use the /page/2/ convention.

The parameter is the urlencoded version of the text:

?option=com_google&controller=..//..//..//..//..//..//..//..///proc/self/environ00
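
The decoding can be reproduced with a few lines of Python, shown here only as an illustration. Note that the trailing %0000 is really %00 (a NUL byte) followed by the literal characters 00. The NUL is not printable, which is why it does not show up in the decoded text above.

```python
from urllib.parse import unquote

# The query string exactly as it appeared in the suspicious URLs.
encoded = (
    "?option=com_google&controller=..%2F%2F..%2F%2F..%2F%2F..%2F%2F"
    "..%2F%2F..%2F%2F..%2F%2F..%2F%2F%2Fproc%2Fself%2Fenviron%0000"
)

decoded = unquote(encoded)

# repr() makes the otherwise invisible NUL byte visible as \x00.
print(repr(decoded))
```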

Exploration

I compared the source code of the pages at the clean URLs vs that of the pages at the bad URLs and found that there was a difference in the pagination code generated by the WP-Paginate plugin.

The good pages had normal-looking pagination links.

<div class="navigation">
<ol class="wp-paginate">
<li><span class="title">Navigation:</span></li>
<li><a href="//ardamis.com/page/2/" class="prev">&laquo;</a></li>
<li><a href='//ardamis.com/' title='1' class='page'>1</a></li>
<li><a href='//ardamis.com/page/2/' title='2' class='page'>2</a></li>
<li><span class='page current'>3</span></li>
<li><a href='//ardamis.com/page/4/' title='4' class='page'>4</a></li>
<li><a href='//ardamis.com/page/5/' title='5' class='page'>5</a></li>
<li><a href='//ardamis.com/page/6/' title='6' class='page'>6</a></li>
<li><a href='//ardamis.com/page/7/' title='7' class='page'>7</a></li>
<li><span class='gap'>...</span></li>
<li><a href='//ardamis.com/page/17/' title='17' class='page'>17</a></li>
<li><a href="//ardamis.com/page/4/" class="next">&raquo;</a></li>
</ol>
</div>    

The bad pages had the suspicious URLs, but were otherwise identical. Other than the URLs in the navigation, there was nothing alarming about the HTML on the bad pages.

I downloaded the entire site and ran a malware scan against the files, which turned up nothing. I also did some full-text searching of the files for the usual base64 decode eval type stuff, but nothing was found. I searched through the tables in my database, but didn’t see any instances of com_google or proc or environ that I could connect to the suspicious URLs.

Google it

Google has turned up a few good links about this problem, including:

  1. http://www.exploitsdownload.com/search/com_/36 – AntiSecurity/Joomla Component Contact Us Google Map com_google Local File Inclusion Vulnerability
  2. http://forums.oscommerce.com/topic/369813-silly-hacker/ – “On a poorly-secured LAMP stack, that would read out your server’s environment variables. That is one step in a process that would grant the hacker root access to your box. Be thankful it’s not working. Hacker is a bad term for this. This is more on the Script Kiddie level.”

    The poster also provided a few lines of code for blocking these URLs in an .htaccess file.

    # Block another hacker
    RewriteCond %{QUERY_STRING} ^(.*)/self/(.*)$ [NC]
    RewriteRule ^.* - [F]
    
  3. http://forums.oscommerce.com/topic/369813-silly-hacker/ – “This was trying for Local File Inclusion vulnerabilities via the Joomla/Mambo script.”
  4. http://core.trac.wordpress.org/ticket/14556 – a bug ticket submitted to WordPress over a year earlier identifying a security hole if the function that generates the pagination isn’t wrapped in the esc_url function that sanitizes the URL. WP-Paginate’s author submits a comment to the thread, and the plugin does use esc_url.
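
One detail about the quoted pattern is worth noting. The URLs found on this site carry the path in urlencoded form (%2Fself%2F), so a check that looks for a literal /self/ would need to see the decoded query string. A minimal Python sketch of that idea follows. The function name and the pattern are my own assumptions and are not taken from the forum post.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern: a "/self/" path segment or a "../" traversal step.
SUSPICIOUS = re.compile(r"/self/|\.\./", re.IGNORECASE)


def looks_like_lfi_probe(query_string):
    """Return True if the decoded query string looks like a file inclusion probe."""
    return bool(SUSPICIOUS.search(unquote(query_string)))


raw = "option=com_google&controller=..%2F%2F..%2F%2F%2Fproc%2Fself%2Fenviron%0000"
print(looks_like_lfi_probe(raw))       # decoded text contains /self/ and ../
print(looks_like_lfi_probe("foobar"))  # harmless parameter
```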

So, what would evidence of an old Joomla exploit be doing on my WordPress site? And what is happening within the WP-Paginate plugin to cause these parameters to appear?

Plugins

It seemed prudent to take a closer look at two of the plugins used on the site.

Ardamis uses the WP-Paginate plugin. The business of generating the /page/2/, /page/3/ URLs is a native WordPress function, so it’s strange to see how those URLs become subject to some sort of injection by way of the WP-Paginate plugin. I tried passing a nonsense parameter in a URL (//ardamis.com/page/3/?foobar) and confirmed that the navigation links created by WP-Paginate contained that ?foobar parameter within each link. This happens on category pages, too. This behavior of adding any parameters passed in the URL to the links it is writing into the page, even if they are urlencoded, is certainly unsettling.

The site also uses the WP Super Cache plugin. While this plugin seems to have been acting up lately, in that it’s not reliably preloading the cache, I can’t make a connection between it and the problem. I also downloaded the cache folder and didn’t see cached copies of these URLs. I turned off caching in WP Super Cache but left the plugin activated, cleared the cache, and then sent the spider against the site again. This time, the URL list didn’t contain any of the bad URLs. Otherwise, the lists were identical. I re-enabled the plugin, attempted to preload the cache (it got through about 70 pages and then stopped), and then ran a few spiders against the site to finish up the preloading. I generated another URL list and the bad URLs didn’t appear in it, either.

A simple fix for the WP-Paginate behavior

The unwanted behavior of the WP-Paginate plugin can be corrected by changing a few lines of code to strip off the GET parameters from the URL. The lines to be changed all reference the function get_pagenum_link. I’m wrapping that function in the string tokenizing function strtok to strip the question mark and everything that follows.

The relevant snippets of the plugin are below.

			
$prevlink = ($this->type === 'posts')
? esc_url(strtok(get_pagenum_link($page - 1), '?'))
: get_comments_pagenum_link($page - 1);
$nextlink = ($this->type === 'posts')
? esc_url(strtok(get_pagenum_link($page + 1), '?'))
: get_comments_pagenum_link($page + 1);
			
function paginate_loop($start, $max, $page = 0) {
    $output = "";
    for ($i = $start; $i <= $max; $i++) {
        $p = ($this->type === 'posts') ? esc_url(strtok(get_pagenum_link($i), '?')) : get_comments_pagenum_link($i);
        $output .= ($page == intval($i))
        ? "<li><span class='page current'>$i</span></li>"
        : "<li><a href='$p' title='$i' class='page'>$i</a></li>";
    }
    return $output;
}

Once these changes are made, WP-Paginate will no longer insert any passed GET parameters into the links it’s writing into that page.
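
The effect of wrapping the link in strtok can be illustrated outside of PHP. This is a small Python sketch of the same idea, keeping everything before the first question mark. The function name strip_query is hypothetical.

```python
def strip_query(url):
    """Return the URL with the first '?' and everything after it removed."""
    return url.split("?", 1)[0]


print(strip_query("//ardamis.com/page/3/?foobar"))
print(strip_query("//ardamis.com/page/3/"))
```

URLs without a query string pass through unchanged, so the wrapper is safe to apply to every pagination link.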

Bandaid

The change to the WP-Paginate plugin is what we tend to call a bandaid – it doesn’t fix the problem, it just suppresses the symptom.

I’ve found that once the site picks up the bad URLs, they can be temporarily cleaned by clearing the cache and then using a spider to recreate it. The only thing left to do is determine where they are coming from in the first place.

The facts

Let’s pause to review the facts.

  1. The http://www.xml-sitemaps.com spider sent against //ardamis.com discovers pages with odd parameters that shouldn’t be naturally occurring on the pages
  2. The behavior of the WP-Paginate plugin is to accept any parameters passed and tack them onto the URLs it is generating
  3. Deleting the cached pages created by WP Super Cache and respidering produces a clean list – the bad URLs are absent

So how is the spider finding pages with these bad URLs? How are they first getting added to a page on the site? It would seem likely that they are originating only on the home page, and the absence of the parameters on other pages that use pagination seems to support that theory.

An unsatisfying ending

Well, the day is over. I’ve added my updated WP-Paginate plugin to the site, so hopefully Ardamis has seen the last of the problem, but I’m deeply unsatisfied that I haven’t been able to get to the root cause. I’ve scoured the site and the database, and I can’t find any evidence of the URLs anywhere. If the bad URLs come back again, I’ll not be so quick to clean up the damage, and will instead try to preserve it long enough to make a determination as to their origin.

Update 07 April 2012: It’s happened again. When I spider the site, two pages have the com_google URL. These pages have the code appended to the end of the URL created by the WordPress function cancel_comment_reply_link(). This function generates the anchor link in the comments area with an ID of cancel-comment-reply-link. This time, though, I see the hijacked URL used in the link even when I visit the clean URL of the page.

This code is somehow getting onto the site in such a way that it only shows up in the WP Super Cache’d pages. Clearing the cache and revisiting the page returns a clean page. My suspicion is that someone is visiting my pages with the com_google code as part of the URL. WordPress puts the code into a self-referencing link in the comment area. WP Super Cache then updates the cache with this page. I don’t think WordPress can help but work this way with nested comments, but WP Super Cache should know better than to create a cached page from anything but the content from the server.
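
To make that suspicion concrete, here is a toy model in Python. It is not WP Super Cache’s actual logic, which I have not examined. It only assumes two things: the page echoes the requested URL into a self-referencing link, and the cache keys pages by path alone. All names are hypothetical.

```python
def render_page(path, query=""):
    """Toy renderer: echoes the requested URL into a self-referencing link."""
    url = path + ("?" + query if query else "")
    return '<a id="cancel-comment-reply-link" href="' + url + '#respond">Cancel</a>'


class PathKeyedCache:
    """Toy cache that keys on the path only and ignores the query string."""

    def __init__(self):
        self.pages = {}

    def get(self, path, query=""):
        # The first request for a path decides what every later visitor sees.
        if path not in self.pages:
            self.pages[path] = render_page(path, query)
        return self.pages[path]

    def clear(self):
        self.pages = {}


cache = PathKeyedCache()
cache.get("/some-post/", "option=com_google")  # a bot visits with the bad query
poisoned = cache.get("/some-post/")            # a later visit to the clean URL
cache.clear()
clean = cache.get("/some-post/")               # after clearing, the page is clean
print("com_google" in poisoned, "com_google" in clean)
```

Under those two assumptions, the model reproduces what I observed: the bad link appears on the clean URL until the cache is cleared.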

In the end, because I wasn’t using nested comments to begin with, I chose to remove the block of code that was inserting the link from my theme’s comments.php file.

    <div class="cancel_comment_reply">
        <small><?php cancel_comment_reply_link(); ?></small>
    </div>

I expect that this will be the last time I find this type of exploit on ardamis.com, as I don’t think there is any other mechanism that will echo out on the page the contents of a parameter passed in the URL.

For a recent project, I needed to create a form that would perform a lookup of people’s names in a MySQL database, but I wanted to use a single input field. To make it easy on the users of the form, I wanted the input field to accept names in either “Firstname Lastname” or “Lastname, Firstname” format, and I wanted it to autocomplete matches as the users typed, including when they typed both names separated by a space or a comma followed by a space.

The Ajax lookup was quick work with jQuery UI’s Autocomplete widget. The harder part was figuring out the most simple table structure and an appropriate SQL query.

A flawed beginning

My people table contains a “first_name” column and a “last_name” column, nothing uncommon there. To get the project out the door, I wrote a PHP function that ran two ALTER TABLE queries on the people table to create two additional columns for pre-formatted strings (column “firstlast”, to be formatted as “Firstname Lastname”, and column “lastfirst”, to be formatted as “Lastname, Firstname”), added indexes on these columns, and then walked through each record in the table, populating these new fields. I then wrote a very straightforward SQL query to perform a lookup on both fields. The PHP and query looked something like this:

// The jQuery UI Autocomplete widget passes the user input as a value for the parameter "name"
$name= $_GET['name'];

// This SQL query uses argument swapping
$query = sprintf("SELECT * FROM people WHERE (`firstlast` LIKE '%1\$s' OR `lastfirst` LIKE '%1\$s') ORDER BY `lastfirst` ASC",
mysql_real_escape_string($name. "%", $link));

This was effective, accurate, and pretty fast, but the addition of columns bothered me and I didn’t like that I needed to run a process to generate those pre-formatted fields each time a record was added to the table (or if a change was made to an existing record). One possible alternative was to watch the input and match either lastname or firstname until the user entered a comma or a space, then explode the string on the comma or space and search more precisely. Once a comma or a space was encountered, I felt pretty sure that I would be able to accurately determine which part of the input was the first name and which was the last name. But this had that same inefficient, clunky bad-code-smell as the extra columns. (Explode is one of those functions that I try to avoid using.) Writing lots of extra PHP didn’t seem necessary or right.

I’m much more comfortable with PHP than with MySQL queries, but I realize that one can do some amazing things within the SQL query, and that it’s probably faster to use SQL to perform some functions. So, I decided that I’d try to work up a query that solved my problem, rather than write more lines of PHP.

CONCAT_WS to the rescue

I Googled around for a bit and settled on using CONCAT_WS to concatenate the first names and last names into a single string to be matched, but found it a bit confusing to work with. I kept trying to use it to create an alias, “lastfirst”, and then use the alias in the WHERE clause, which doesn’t work, or I was getting the literal column names back instead of the values. Eventually, I hit upon the correct usage.

The PHP and query now look like this:

// The jQuery UI Autocomplete widget passes the user input as a value for the parameter "name"
$name= $_GET['name'];

// This SQL query uses argument swapping
$query = sprintf("SELECT *, CONCAT_WS(  ', ',  `last_name`,  `first_name` ) as lastfirst FROM people WHERE (CONCAT_WS(  ', ',  `last_name`,  `first_name` ) LIKE '%1\$s' OR CONCAT_WS(  ' ',  `first_name`,  `last_name` ) LIKE '%1\$s') ORDER BY lastfirst ASC",
mysql_real_escape_string($name. "%", $link));

The first instance of CONCAT_WS isn’t needed for the lookup. The first instance allows me to order the results alphabetically and provides me an array key of “lastfirst” with a value of the person’s name already formatted as “Lastname, Firstname”, so I don’t have to do it later with PHP. The lookup comes from the two instances of CONCAT_WS in the WHERE clause. I haven’t done any performance measuring here, but the results of the lookup get back to the user plenty fast enough, if not just as quickly as the method using dedicated columns.
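
The shape of the lookup can be tried out without a MySQL server. The sketch below uses Python’s sqlite3 module with an in-memory table. It is not the query from this project: SQLite’s || operator stands in for CONCAT_WS, the sample rows are invented, and the function name find_people is hypothetical. It also binds the search term as a parameter rather than escaping it into the string.

```python
import sqlite3


def find_people(conn, name):
    """Return 'Lastname, Firstname' strings matching either name format."""
    pattern = name + "%"  # prefix match, as in the autocomplete query
    sql = (
        "SELECT first_name, last_name, "
        "last_name || ', ' || first_name AS lastfirst "
        "FROM people "
        "WHERE last_name || ', ' || first_name LIKE ? "
        "OR first_name || ' ' || last_name LIKE ? "
        "ORDER BY lastfirst ASC"
    )
    return [row[2] for row in conn.execute(sql, (pattern, pattern))]


# Invented sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Bob", "Smith"), ("Alice", "Smithers"), ("Carol", "Jones")],
)

print(find_people(conn, "Bob S"))
print(find_people(conn, "Smith, B"))
print(find_people(conn, "smi"))
```

One caveat applies to both versions: a % or _ typed by the user still acts as a LIKE wildcard unless it is escaped separately.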

The result of the query is output back to the page as JSON-formatted data for use in the jQuery Autocomplete.

The end result works exactly as I had hoped. A user of the form is able to type a person’s name in whatever way is comfortable to them, as “Bob Smith” or “Smith, Bob”, and the matches are found either way. The only thing it doesn’t do is output the matches back to the autocompleter in the same format that the user is using. But I can live with that for now.

Update 2015-01-02: About a month ago, in early December, 2014, Google announced that it was working on a new anti-spam API that is intended to replace the traditional CAPTCHA challenge as a method for humans to prove that they are not robots. This is very good news.
This week, I noticed that Akismet is adding a hidden input field to the comment form that contains a timestamp (although the plugin’s PHP puts the initial INPUT element within a P element set to DISPLAY:NONE, when the plugin’s JavaScript updates the value with the current timestamp, the INPUT element jumps outside of that P element). The injected code looks something like this:
<input type="hidden" id="ak_js" name="ak_js" value="1420256728989">
I haven’t yet dug into the Akismet code to discover what it’s doing with the timestamp, but I’d be pleased if Akismet is attempting to differentiate humans from bots based on behavior.
Update 2015-01-10: To test the effectiveness of the current version of Akismet, I disabled the anti-spam plugin described in this post on 1/2/2015 and re-enabled it on 1/10/2015. In the span of 8 days, Akismet identified 1,153 spam comments and missed 15 more. These latest numbers continue to support my position that Akismet is not enough to stop spam comments.

In the endless battle against WordPress comment spam, I’ve developed and then refined a few different methods for preventing spam from getting to the database to begin with. My philosophy has always been that a human visitor and a spam bot behave differently (after all, the bots we’re dealing with are not Nexus-6 model androids here), and an effective spam-prevention method should be able to recognize the differences. I also have a dislike for CAPTCHA methods that require a human visitor to prove, via an intentionally difficult test, that they aren’t a bot. The ideal method, I feel, would be invisible to a human visitor, but still accurately identify comments submitted by bots.

Spam on ardamis.com in early 2012 - before and after

A brief history of spam fighting

The most successful and simple method I found was a server-side system for reducing comment spam by using a handshake method involving timestamps on hidden form fields that I implemented in 2007. The general idea was that a bot would submit a comment more quickly than a human visitor, so if the comment was submitted too soon after the post page was loaded, the comment was rejected. A human caught in this trap would be able to click the Back button on the browser, wait a few seconds, and resubmit. This proved to be very effective on ardamis.com, cutting the number of spam comments intercepted by Akismet per day to nearly zero. For a long time, the only problem was that it required modifying a core WordPress file: wp-comments-post.php. Each time WordPress was updated, the core file was replaced. If I didn’t then go back and make my modifications again, I would lose the spam protection until I made the changes. As it became easier to update WordPress (via a single click in the admin panel) and I updated it more frequently, editing the core file became more of a nuisance.

A huge facepalm

When Google began weighting page load times as part of its ranking algorithm, I implemented the WP Super Cache caching plugin on ardamis.com and configured it to use .htaccess and mod_rewrite to serve cache files. Page load times certainly decreased, but the amount of spam detected by Akismet increased. After a while, I realized that this was because the spam bots were submitting comments from static, cached pages, and the timestamps on those pages, which had been generated server-side with PHP, were already minutes old when the page was requested. The form processing script, which normally rejects comments that are submitted too quickly to be written by a human visitor, happily accepted the timestamps. Even worse, a second function of my anti-spam method also rejected comments that were submitted 10 minutes or more after the page was loaded. Of course, most of the visitors were being served cached pages that were already more than 10 minutes old, so even legitimate comments were being rejected. Using PHP to generate my timestamps obviously was not going to work if I wanted to keep serving cached pages.

JavaScript to the rescue

Generating real-time timestamps on cached pages requires JavaScript. But instead of a reliable server clock setting the timestamp, the time is coming from the visitor’s system, which can’t be trusted to be accurate. Merely changing the comment form to use JavaScript to generate the first timestamp wouldn’t work, because verifying a timestamp generated on the client-side against one generated server-side would be disastrous.

Replacing the PHP-generated timestamps with JavaScript-generated timestamps would require substantial changes to the system.

Traditional client-side form validation using JavaScript happens when the form is submitted. If the validation fails, the form is not submitted, and the visitor typically gets an alert with suggestions on how to make the form acceptable. If the validation passes, the form submission continues without bothering the visitor. To get our two timestamps, we can generate a first timestamp when the page loads and compare it to a second timestamp generated when the form is submitted. If the visitor submits the form too quickly, we can display an alert showing the number of seconds remaining until the form can be successfully submitted. This client-side validation should hopefully be invisible to most visitors who choose to leave comments, but at the very least, far less irritating than a CAPTCHA system.

It took me two tries to get it right, but I’m going to discuss the less successful method first to point out its flaws.

Method One (not good enough)

Here’s how the original system flowed.

  1. Generate a first JS timestamp when the page is loaded.
  2. Generate a second JS timestamp when the form is submitted.
  3. Before the form contents are sent to the server, compare the two timestamps, and if enough time has passed, write a pre-determined passcode to a hidden INPUT element, then submit the form.
  4. After the form contents are sent to the server, use server-side logic to verify that the passcode is present and valid.

The problem was that it seemed that certain bots could parse JavaScript enough to drop the pre-determined passcode into the hidden form field before submitting the form, circumventing the timestamps completely and defeating the system.

Because the timestamps were only compared on the client-side, it also failed to adhere to one of the basic tenets of form validation – that the input must be checked on both the client-side and the server-side.

Method Two (better)

Rather than having the server-side validation be merely a check to confirm that the passcode is present, method two compares the timestamps a second time on the server side. Instead of a single hidden input, we now have two – one for each timestamp. This is intended to prevent a bot from figuring out the ultimate validation mechanism by simply parsing the JavaScript. Finally, the hidden fields are not in the HTML of the page when it’s sent to the browser, but are added to the form via jQuery, which makes it easier to implement and may act as another layer of obfuscation.

  1. Generate a first JS timestamp when the page is loaded and write it to a hidden form field.
  2. Generate a second JS timestamp when the form is submitted and write it to a hidden form field.
  3. Before the form contents are sent to the server, compare the two timestamps, and if enough time has passed, submit the form (client-side validation).
  4. On the form processing page, use server-side logic to compare the timestamps a second time (server-side validation).

This timestamp handshake works more like it did in the proven-effective server-side-only method. We still have to pass something from the comment form to the processing script, but it’s not too obvious from the HTML what is being done with it. Furthermore, even if a bot suspects that the timestamps are being compared, there is no telling from the HTML what the threshold is for distinguishing a valid comment from one that is invalid. (The JavaScript could be parsed by a bot, but the server-side check cannot be, making it possible to require a slightly longer amount of time to elapse in order to pass the server-side check.)
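The server-side half of the handshake can be sketched as a small, self-contained function. This is only an illustration under stated assumptions: the function name ard_enough_time_elapsed() is hypothetical, and the 10000 ms default simply mirrors the 10-second threshold used elsewhere in this post. Unlike a bare subtraction, it also rejects values that are missing or are not plain digit strings, since both fields arrive from the client and cannot be trusted.

```php
<?php
// Hypothetical helper: decide whether two client-supplied millisecond
// timestamps are far enough apart for the comment to be accepted.
function ard_enough_time_elapsed($ts1, $ts2, int $thresholdMs = 10000): bool
{
    // Reject anything that is not a plain string of digits
    // (missing fields, arrays, negative numbers, text).
    if (!is_string($ts1) || !is_string($ts2)
        || !ctype_digit($ts1) || !ctype_digit($ts2)) {
        return false;
    }

    // Millisecond timestamps can exceed a 32-bit integer, so compare as floats.
    return ((float) $ts2 - (float) $ts1) >= $thresholdMs;
}

// Page loaded at ...728989, form submitted about 11 seconds later
var_dump(ard_enough_time_elapsed('1420256728989', '1420256740000')); // bool(true)

// Submitted about 1 second after the page loaded
var_dump(ard_enough_time_elapsed('1420256728989', '1420256730000')); // bool(false)
```

Because the threshold lives only in the server-side code, it can be changed without the new value ever appearing in the page source.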

The same downside plagued me

For a long time, far longer than I care to admit, I stubbornly continued to modify the core file wp-comments-post.php to provide the server-side processing. But creating the timestamps and parsing them with a plug-in turned out to be a simple matter of two functions, and in June of 2013 I finally got around to doing it the right way.

The code

The plugin, in all its simplicity, is only 100 lines. Just copy this code into a text editor, save it as a .php file (the name isn’t important) and upload it to the /wp-content/plugins directory and activate it. Feel free to edit it however you like to suit your needs.

<?php

/*
Plugin Name: Timestamp Comment Filter
Plugin URI: //ardamis.com/2011/08/27/a-cache-proof-method-for-reducing-comment-spam/
Description: This plugin measures the amount of time between when the post page loads and the comment is submitted, then rejects any comment that was submitted faster than a human probably would or could.
Version: 0.1
Author: Oliver Baty
Author URI: //ardamis.com

    Copyright 2013  Oliver Baty  (email : obbaty@gmail.com)

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
*/

// http://wordpress.stackexchange.com/questions/6723/how-to-add-a-policy-text-just-before-the-comments
function ard_add_javascript(){

	?>
	
<script type="text/javascript" src="//ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){
    ardGenTS1();
});
 
function ardGenTS1() {
    // prepare the form
    $('#commentform').append('<input type="hidden" name="ardTS1" id="ardTS1" value="1" />');
    $('#commentform').append('<input type="hidden" name="ardTS2" id="ardTS2" value="1" />');
    $('#commentform').attr('onsubmit', 'return validate()');
    // set a first timestamp when the page loads
    var ardTS1 = (new Date).getTime();
    document.getElementById("ardTS1").value = ardTS1;
}
 
function validate() {
    // read the first timestamp
    var ardTS1 = document.getElementById("ardTS1").value;
//  alert ('ardTS1: ' + ardTS1);
    // generate the second timestamp
    var ardTS2 = (new Date).getTime();
    document.getElementById("ardTS2").value = ardTS2;
//  alert ('ardTS2: ' + document.getElementById("ardTS2").value);
    // find the difference
    var diff = ardTS2 - ardTS1;
    var elapsed = Math.round(diff / 1000);
    var remaining = 10 - elapsed;
//  alert ('diff: ' + diff + '\n\n elapsed:' + elapsed);
    // check whether enough time has elapsed
    if (diff > 10000) {
        // submit the form
        return true;
    }else{
        // display an alert if the form is submitted within 10 seconds
        alert("This site is protected by an anti-spam feature that requires 10 seconds to have elapsed between the page load and the form submission. \n\n Please close this alert window.  The form may be resubmitted successfully in " + remaining + " seconds.");
        // prevent the form from being submitted
        return false;
    }
}
</script>
	
	<?php
}

add_action('comment_form_before','ard_add_javascript');

// http://wordpress.stackexchange.com/questions/89236/disable-wordpress-comments-api
function ard_parse_timestamps(){

	// Set up the elapsed time, in milliseconds, that is the threshold for determining whether a comment was submitted by a human
	$intThreshold = 10000;
	
	// Set up a message to be displayed if the comment is blocked
	$strMessage = '<strong>ERROR</strong>:  this site uses JavaScript validation to reduce comment spam by rejecting comments that appear to be submitted by an automated method.  Either your browser has JavaScript disabled or the comment appeared to be submitted by a bot.';
	
	$ardTS1 = ( isset($_POST['ardTS1']) ) ? trim($_POST['ardTS1']) : 1;
	$ardTS2 = ( isset($_POST['ardTS2']) ) ? trim($_POST['ardTS2']) : 2;
	$ardTS = $ardTS2 - $ardTS1;
	 
	if ( $ardTS < $intThreshold ) {
	// If the difference of the timestamps is not more than 10 seconds, exit
		wp_die( __($strMessage) );
	}
}
add_action('pre_comment_on_post', 'ard_parse_timestamps');

?>

That’s it. Not so bad, right?

Final thoughts

The screen-shot at the beginning of the post shows the number of spam comments submitted to ardamis.com and detected by Akismet each day from the end of January, 2012, to the beginning of March, 2012. The dramatic drop-off around Jan 20 was when I implemented the method described in this post. The flare-up around Feb 20 was when I updated WordPress and forgot to replace the modified core file for about a week, illustrating one of the hazards of changing core files.

If you would rather not add any hidden form fields to the comment form, you could consider appending the two timestamps to the end of the comment_post_ID field. Because its contents are cast as an integer in wp-comments-post.php when the value of the $comment_post_ID variable is set, WordPress won’t be bothered by the extra data at the end of the field, so long as the post ID comes first and is followed by a space. You could then just explode the contents of the comment_post_ID field on the space character, then compare the last two elements of the array.
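As a rough sketch of that idea, assuming the field arrives in the form “post ID, space, first timestamp, space, second timestamp”, the unpacking might look like the following. The function name ard_unpack_post_id_field() and the shape of the returned array are hypothetical.

```php
<?php
// Hypothetical helper: split a comment_post_ID value of the assumed form
// "<post id> <ts1> <ts2>" into its parts. Returns null if it is malformed.
function ard_unpack_post_id_field(string $field): ?array
{
    $parts = explode(' ', trim($field));
    if (count($parts) < 3) {
        return null;
    }

    // The last two elements are the timestamps; the first is the post ID.
    $ts2 = array_pop($parts);
    $ts1 = array_pop($parts);

    if (!ctype_digit($parts[0]) || !ctype_digit($ts1) || !ctype_digit($ts2)) {
        return null;
    }

    return [
        'post_id'    => (int) $parts[0],
        'elapsed_ms' => (float) $ts2 - (float) $ts1,
    ];
}

print_r(ard_unpack_post_id_field('42 1420256728989 1420256740000'));
// post_id is 42, elapsed_ms is 11011
```

The elapsed time can then be compared against the same threshold the plugin uses.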

If you don’t object to meddling with a core file in order to obtain a little extra protection, you can rename the wp-comments-post.php file and change the path in the comment form’s action attribute. I’ve posted logs showing that some bots just try to post spam directly to the wp-comments-post.php file, so renaming that file is an easy way to cut down on spam. Just remember to come back and delete the wp-comments-post.php file each time you update WordPress.

Overview of classes and objects

Objects are the building blocks of the application (i.e., the workers in a factory).
Classes can be thought of as blueprints for the objects. Classes describe the objects, which are created in memory.
So, the programmer writes the classes and the PHP interpreter creates the objects from the classes.

A class may contain both variables and functions.
A variable inside a class is called a property.
A function inside a class is called a method.

Instantiation

To create an object, you instantiate a class (you create an instance of the class as an object).
For example, if we have a class named ‘person’ and want to instantiate it as the variable $oliver:

$oliver = new person();

The variable $oliver is referred to as the ‘handle’.

Accessing properties and methods

To access the properties and methods of a class, we use the object’s handle, followed by the arrow operator “->”.
For example, if our class has a method ‘get_name’, we can echo that to the page with:

echo $oliver->get_name();

Note that there are no single or double quotes used in instantiating a class or accessing properties and methods of a class.

Constructors

A class may have a special method called a constructor. The constructor method is called automatically when the object is instantiated.
The constructor method begins with two underscores and the word ‘construct’:

function __construct($variable) { }

One can pass values to the constructor method by providing arguments after the class name.
For example, to pass the name “John Doe” to the constructor method in the ‘person’ class:

$john = new person("John Doe");

! If a constructor exists and expects arguments, you must instantiate the class with the arguments expected by the constructor.
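Putting the pieces so far together, here is a minimal sketch of what the ‘person’ class might look like. The class body is an assumption for illustration; these notes only name the class, the get_name method and the constructor.

```php
<?php
class person
{
    // A property (a variable inside the class)
    public $name;

    // The constructor runs automatically when the object is instantiated
    function __construct($persons_name)
    {
        $this->name = $persons_name;
    }

    // A method (a function inside the class)
    function get_name()
    {
        return $this->name;
    }
}

// Instantiate the class, passing the argument the constructor expects
$john = new person("John Doe");

// Access a method through the handle and the arrow operator
echo $john->get_name(); // John Doe
```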

Access modifiers and visibility declarations

Properties must, and methods may, have one of three access modifiers (visibility declarations): public, protected, and private.
Public: can be accessed from outside the class, eg: $myclass->secret_variable;
Protected: can be accessed within the class and by classes derived from the class
Private: can be accessed only within the class

Declaring a property with var makes the property public.

Methods declared without an explicit access modifier are considered public.

! If you call a protected method from outside the class, any PHP output before the call is still processed, but you get an error message when the interpreter gets to that call:

Fatal error: Call to protected method...
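A small sketch of the three visibility levels follows. The property and method names are made up for illustration, and the try/catch assumes PHP 7 or later, where this particular fatal error is thrown as an Error that can be caught.

```php
<?php
class person
{
    public $name = 'John Doe';      // accessible from anywhere
    protected $pin = '1234';        // this class and classes derived from it
    private $secret = 'hidden';     // this class only

    public function describe()
    {
        // Inside the class, all three properties are accessible
        return $this->name . ' / ' . $this->pin . ' / ' . $this->secret;
    }

    protected function get_pin()
    {
        return $this->pin;
    }
}

$john = new person();
echo $john->name . "\n";        // works: public property
echo $john->describe() . "\n";  // works: public method

try {
    $john->get_pin();           // protected method called from outside the class
} catch (Error $e) {
    echo 'Caught: ' . $e->getMessage() . "\n";
}
```

Note that the two echo lines before the failing call still produce output, which matches the behavior described above.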

Inheritance

Inheritance allows a child class to be created from a parent class, whereby the child has all of the public and protected properties and methods of the parent.

A child class extends a parent class:

class employee extends person {
}

A child class can redefine/override/replace a method in the parent class by reusing the method name.

! The access modifier of a child class’s method cannot be more restrictive than that of the parent class’s method. For example, if the parent class has a public set_name() method and the child class’s set_name() method is protected, the class definition itself will generate a fatal error, and no prior PHP output will be rendered. (In the error below, employee is the child class of person):

Fatal error: Access level to employee::set_name() must be public (as in class person) in E:xampphtdocstesteroopclass_lib.php on line 38

To differentiate between a method in a parent class vs the method as redefined in a child class, one must specifically name the class that contains the method you want to call using the scope resolution operator (::):

person::set_name($new_name);

The scope resolution operator allows access to static, constant, and overridden properties or methods of a class, generally, a parent class. This would be done inside the child class, after redefining a parent’s method of the same name.

It’s also possible to use ‘parent’ to refer to the child’s parent class:

parent::set_name($new_name);

(I’m still a bit vague on this and am looking for examples of situations in which this would be used.)
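One situation where this seems useful is when the child wants to add a step to a method without duplicating the parent’s code. The sketch below is a hypothetical example: the name-tidying step is invented purely to show the child doing some work of its own before handing off to the parent’s version with parent::.

```php
<?php
class person
{
    protected $name;

    public function set_name($new_name)
    {
        $this->name = $new_name;
    }

    public function get_name()
    {
        return $this->name;
    }
}

class employee extends person
{
    // Override the parent's method by reusing its name
    public function set_name($new_name)
    {
        // Child-specific behavior: tidy up the input first
        $new_name = ucwords(strtolower(trim($new_name)));

        // Then hand the cleaned value to the parent's original method
        parent::set_name($new_name);
    }
}

$worker = new employee();
$worker->set_name('  jOHN doe ');
echo $worker->get_name(); // John Doe
```

If the parent’s set_name() later changes, the child picks up the change automatically, because it never copied the parent’s logic.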

Classes inside classes

Just as it’s possible to instantiate a class and use the object in a view file, it’s possible to instantiate an object and call its methods from inside another class.
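A minimal sketch of one class using another, assuming a hypothetical ‘address’ class that is not mentioned elsewhere in these notes:

```php
<?php
class address
{
    public $city;

    function __construct($city)
    {
        $this->city = $city;
    }

    function get_city()
    {
        return $this->city;
    }
}

class person
{
    private $address;

    function __construct($city)
    {
        // Instantiate another class from inside this one
        $this->address = new address($city);
    }

    function get_city()
    {
        // Call the inner object's method through its handle
        return $this->address->get_city();
    }
}

$oliver = new person('Chicago');
echo $oliver->get_city(); // Chicago
```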

Static properties and methods

Declaring class properties or methods as static makes them accessible without needing an instantiation of the class. A property declared as static can not be accessed with an instantiated class object (though a static method can).
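A short sketch of static members, using a hypothetical instance counter. Inside the class, static members are reached with self:: rather than $this->.

```php
<?php
class person
{
    // Shared by the class itself, not stored per object
    public static $count = 0;

    public function __construct()
    {
        // Inside the class, static members are reached with self::
        self::$count++;
    }

    public static function how_many()
    {
        return self::$count;
    }
}

echo person::how_many() . "\n"; // 0 (no instance needed)

$a = new person();
$b = new person();

echo person::$count . "\n";     // 2 (static property via the class name)
echo $b->how_many() . "\n";     // 2 (a static method can be called through an object)
```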

Resources
http://us2.php.net/manual/en/language.oop5.php
http://net.tutsplus.com/tutorials/php/oop-in-php/
http://www.phpfreaks.com/tutorial/oo-php-part-1-oop-in-full-effect
http://www.killerphp.com/tutorials/object-oriented-php/

So I finally watched The Social Network over the weekend, and it’s made me feel jealous and a bit guilty.

In a meager effort to console myself for so far failing to be a billionaire, I’m assembling the short list of web-application type things I’ve built here.

  1. A dice roller: rollforit. Enter a name, create a room, invite your friends, and start rolling dice. For people who want to play pen and paper, table-top RPG dice games with their distant friends.
  2. A URL shortener: Minifi.de. Minifi.de comes with an API and a bookmarklet. It really works, too! The technical explanation has more details.
  3. A social networking site: Snapbase. Snapbase is a social site that shows you what’s going on in your city or anywhere in the world as pictures are uploaded by your friends and neighbors. The application extracts location information from the EXIF data embedded in images and displays recent images taken near your present location.
  4. A trouble-ticketing system for an IT help desk or technical support center. It’s really pretty extensive, with asset management, user accounts, salted encrypted passwords, and all sorts of nifty things. I really must write a full description of it at some point, but until then, the documentation is the next best thing.
  5. An account-based invoice tracking and access system for grouping invoices according to clients, then sharing invoice history with those clients and allowing them to easily pay outstanding invoices via Paypal.
  6. An account-based invoice access system where clients can view paid and unpaid invoices, and even easily pay an outstanding invoice via Paypal. I actually use this almost every day.
  7. A simple method for protecting a download using a unique URL that can be emailed to authorized users. The URL can be set to expire after a certain amount of time or any number of downloads.
  8. An update to the above download protection script to protect multiple downloads, generate batches of keys, leave notes about who received the key, the ability to specify per-key the allowable number of downloads and age, and some basic reporting.
  9. An HTML auction template generator called Simple Auction Wizard. It helps you create HTML auction templates for eBay, and uses SWFUpload and tinyMCE.

I have another project in the works that promises to be more financially viable, but the most clever thing on that list is Snapbase. It’s in something akin to alpha right now; barely usable. I really wish I had the time to pursue it.

While making changes to my WordPress theme, I noticed that the error_log file in my theme folder contained dozens of PHP Fatal error lines:

...
[01-Jun-2011 14:25:15] PHP Fatal error:  Call to undefined function  get_header() in /home/accountname/public_html/ardamis.com/wp-content/themes/ars/index.php on line 7
[01-Jun-2011 20:58:23] PHP Fatal error:  Call to undefined function  get_header() in /home/accountname/public_html/ardamis.com/wp-content/themes/ars/index.php on line 7
...

The first seven lines of my theme’s index.php file:

<?php ini_set('display_errors', 0); ?>
<?php
/**
 * @package WordPress
 * @subpackage Ars_Theme
*/
get_header(); ?>

I realized that the error was being generated each time that my theme’s index.php file was called directly, and that the error was caused by the theme’s inability to locate the WordPress get_header function (which is completely normal). Thankfully, the descriptive error wasn’t being output to the browser, but was only being logged to the error_log file, due to the inclusion of the ini_set('display_errors', 0); line. I had learned this the hard way a few months ago when I found that calling the theme’s index.php file directly would generate an error message, output to the browser, that would reveal my hosting account username as part of the absolute path to the file throwing the error.

I decided the best way to handle this would be to check to see if the file could find the get_header function, and if it could not, simply redirect the visitor to the site’s home page. The code I used to do this:

<?php ini_set('display_errors', 0); ?>
<?php
/**
* @package WordPress
* @subpackage Ars_Theme
*/
if (function_exists('get_header')) {
	get_header();
}else{
    /* Redirect browser */
    header("Location: http://" . $_SERVER['HTTP_HOST'] . "");
    /* Make sure that code below does not get executed when we redirect. */
    exit;
}; ?>

So there you have it. No more fatal errors due to get_header when loading the WordPress theme’s index.php file directly. And if something else in the file should throw an error, ini_set('display_errors', 0); means it still won’t be sent to the browser.

Just a few notes to myself about monitoring web sites for infections/malware and potential vulnerabilities.

Tools for detecting infections on web sites

Google Webmaster Tools

Your first stop should be here, as I’ve personally witnessed alerts show up in Webmaster Tools, even when all the following tools gave the site a passing grade. If your site is registered here, and Google finds weird pages on your site, an alert will appear. You can also have the messages forwarded to your email account on file, by choosing the Forward option under the All Messages area of the Home page.

Google Webmaster Tools Hack Alert

Google Safe Browsing

The Google Safe Browsing report for ardamis.com: http://safebrowsing.clients.google.com/safebrowsing/diagnostic?site=ardamis.com

Norton Safe Web

https://safeweb.norton.com/

The Norton Safe Web report for ardamis.com: https://safeweb.norton.com/report/show?url=ardamis.com

Tools for analyzing a site for vulnerabilities

Sucuri Site Check

http://sitecheck.sucuri.net/scanner/

The Sucuri report for ardamis.com: http://sitecheck.sucuri.net/scanner/?scan=www.ardamis.com.

Nearly a year ago, I wrote a post on how to detect and fix Word add-in problems with a macro and batch file, in a Windows XP and Office 2007 environment.

This was sufficiently effective, but it was also overly complicated, requiring four separate components:

  1. an autoexec Word 2007 macro that runs each time Word is opened
  2. a batch file that runs the registry merge file and writes an entry to a log file
  3. the registry merge file that contains the correct LoadBehavior settings for the add-ins
  4. a text file that acts as a log

This month, I decided to rewrite the macro to handle the registry changes and write to the log file. It was also a good opportunity to dig a bit deeper into VBA, and I also wanted to confirm that it would work in a more modern environment of Windows 7 and Office 2010 (that code is near the bottom of the post). The new system has only two components:

  1. an autoexec Word 2007 macro that runs each time Word is opened
  2. a text file that acts as a log

Background

First, a bit of background.

Many of the problems with Word 2007 are due to Word’s handling of add-ins. When something unexpected happens in Word, and Word attributes the problem to an add-in, Word will react by flagging it and prompting the user for a decision the next time Word opens. Depending on the severity of the problem and the user’s response, the add-in can be either ‘hard-disabled’ or ‘soft-disabled’.

Microsoft explains the differences between Hard Disabled vs Soft Disabled in a MSDN article at: http://msdn.microsoft.com/en-us/library/ms268871(VS.80).aspx.

I’ve explained a bit about the process by which Word disables add-ins at the end of this post, and I’ve written a shorter post about the basics behind the registry keys responsible for disabling add-ins.

Handling disabled add-ins programmatically

A Word macro can access the condition of an add-in via an Application.COMAddIns object, and it can read and write to the registry. This allows us to tell when an add-in has been disabled and re-enable it.

My macro has some admittedly hackish parts that need to be cleaned up, there is the matter of unsetting variables to be addressed, and it could certainly be made more elegant, but it works. Note that a file named addinslog.txt must exist in the %TEMP% directory in order for the macro to write the log file. This is what the Word 2007 macro looks like, using the COM add-in installed with Adobe Acrobat 8 Standard as the required add-in…

Option Explicit

' Set up a function to search for a key and return true or false
Public Function KeyExists(key)
    Dim objShell
    On Error Resume Next
    Set objShell = CreateObject("WScript.Shell")
        objShell.RegRead (key)
    Set objShell = Nothing
    If Err = 0 Then KeyExists = True
End Function
    

Sub AutoExec()
'
' FixMissingAddins Macro
' Display a message box with any critical but not 'Connected' COM add-ins, then fix them programmatically
'
' Oliver Baty
' June, 2010 - April, 2011
'
' Information on the Application.COMAddIns array
' http://msdn.microsoft.com/en-us/library/aa831759(v=office.10).aspx
'
' Running macros automatically
' http://support.microsoft.com/kb/286310
'
' Using Windows Scripting Shell (WshShell) to read from and write to the local registry
' http://technet.microsoft.com/en-us/library/ee156602.aspx

   
' Declare the WshShell variable (this is used to edit the registry)
    Dim WshShell
    
' Declare the fso and logFile variables (these are used to write to a txt file)
    Dim fso
    Dim logFile

' Create an instance of the WScript Shell object
    Set WshShell = CreateObject("WScript.Shell")
   
' Declare some other variables
   Dim MyAddin As COMAddIn
   Dim stringOfAddins As String
   Dim listOfDisconnectedAddins As String
   Dim requiredAddIn As Variant
   Dim msg As String

   
' Notes on deleting registry keys and values in VB
' http://www.vbforums.com/showthread.php?t=425483
' http://www.tek-tips.com/viewthread.cfm?qid=674375
' http://www.robvanderwoude.com/vbstech_registry_wshshell.php

' Create a string containing the names of all 'Connected' COM add-ins named "stringOfAddins"
   For Each MyAddin In Application.COMAddIns
      If MyAddin.Connect = True Then
          stringOfAddins = stringOfAddins & MyAddin.ProgID & " - "
      End If
   Next
   
' Create an array to hold the names of the critical (required) add-ins named "requiredAddIns"
' Example: change to "Dim requiredAddIns(0 To 4)" if the macro is checking 5 total add-ins)
   Dim requiredAddIns(0 To 0) As String
   
' Add each required AddIn to the array
   requiredAddIns(0) = "PDFMaker.OfficeAddin"
'   requiredAddIns(1) = ""
'   requiredAddIns(2) = ""
'   requiredAddIns(3) = ""
'   requiredAddIns(4) = ""
   
' Cycle through the array of required add-ins, and see if they exist in the connected add-ins list
   For Each requiredAddIn In requiredAddIns
      If InStr(stringOfAddins, requiredAddIn) Then
        ' The required add-in is in the string of connected add-ins
         msg = msg
      Else
        ' The required add-in is not in the string of connected add-ins, so add the add-in name to a string named "listOfDisconnectedAddins"
         msg = msg & requiredAddIn & vbCrLf
         listOfDisconnectedAddins = requiredAddIn & " " & listOfDisconnectedAddins
         listOfDisconnectedAddins = Trim(listOfDisconnectedAddins)
      End If
   Next
   
' If the msg variable is not blank (it contains at least one add-in's name) handle it, otherwise, do nothing
   If msg = "" Then
        ' There are no critical, unconnected add-ins (yay!)
        ' The script can now exit
   Else
        ' There are critical add-ins that are not connected, so handle this
        MsgBox "The following critical Word Add-In(s) are disabled: " & vbCrLf & vbCrLf & msg & vbCrLf & vbCrLf & "To correct this problem, please save any documents you are working on, then close Word and reopen Word."

            ' I find it extremely hackish to check for each possible key and delete it if found... need to research how to delete the tree
            ' One potential obstacle to this method is that I've seen a DocumentRecovery subkey under Resiliency (only once, while editing this macro), that I haven't researched yet
            
            
            ' Note: Since the WSH Shell has no Enumeration functionality, you cannot
            '       use the WSH Shell object to delete an entire "tree" unless you
            '       know the exact name of every subkey.
            '       If you don't, use the WMI StdRegProv instead.
            ' http://www.robvanderwoude.com/vbstech_registry_wshshell.php

            ' More info on WMI StdRegProv at:
            ' http://msdn.microsoft.com/en-us/library/aa393664(v=vs.85).aspx
            
        ' This is hackish, but it effectively deletes a registry key, if it exists
        If KeyExists("HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Word\Resiliency\DisabledItems\") Then
            WshShell.RegDelete "HKCU\Software\Microsoft\Office\12.0\Word\Resiliency\DisabledItems\"
        ElseIf KeyExists("HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Word\Resiliency\StartupItems\") Then
            WshShell.RegDelete "HKCU\Software\Microsoft\Office\12.0\Word\Resiliency\StartupItems\"
        ElseIf KeyExists("HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Word\Resiliency\") Then
            WshShell.RegDelete "HKCU\Software\Microsoft\Office\12.0\Word\Resiliency\"
        End If
        
        ' To be completely thorough, we can also set the desired LoadBehavior for certain add-ins
        ' This can be done selectively, and only if the LoadBehavior was incorrect, but the quick and dirty way would be to just force the values
        
        WshShell.RegWrite "HKLM\SOFTWARE\Microsoft\Office\Word\Addins\PDFMaker.OfficeAddin\LoadBehavior", 3, "REG_DWORD"

        ' Release the WshShell object
        Set WshShell = Nothing
        
        ' Declare a few variables for the log file
        Dim user, machine, temp, datetime, output
        
        Set WshShell = CreateObject("WScript.Shell")
        user = WshShell.ExpandEnvironmentStrings("%USERNAME%")
        machine = WshShell.ExpandEnvironmentStrings("%COMPUTERNAME%")
        temp = WshShell.ExpandEnvironmentStrings("%TEMP%")
        ' Convert the slashes in Now to hyphens to prevent a fatal error
        datetime = Replace(Now, "/", "-")
        ' Create the string that will be written to the log file
        output = datetime + ", " + user + ", " + machine + ", " + listOfDisconnectedAddins

        ' Write the event to a log file
        logfile = temp + "\addinslog.txt"
        ' http://msdn.microsoft.com/en-us/library/2z9ffy99(v=vs.85).aspx
        ' http://www.devguru.com/technologies/vbscript/quickref/filesystemobject_opentextfile.html
        Set fso = CreateObject("Scripting.FileSystemObject")
        Set logFile = fso.OpenTextFile(logfile, 8, True)
        logFile.WriteLine (output)
        logFile.Close
        Set logFile = Nothing
        Set fso = Nothing
        
        ' Should we clear the variables?
        
        ' Release the WshShell object
        Set WshShell = Nothing
   End If
   
   ' Ardamis.com - We're in your macros, fixing your COM add-ins.
End Sub

While working on this, I found that there were some gaps in my understanding of the sequence of events that occur when Word 2007 disables a COM add-in. Please comment if you find that any of this is inaccurate or incomplete.

What happens when Word launches

A critical key to the whole business of Word add-ins is HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Word\Resiliency

When Word launches, it looks for data under the Resiliency key and a subkey: HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Word\Resiliency\StartupItems

If the StartupItems subkey contains a REG_BINARY value that corresponds to an add-in, Word throws the familiar warning:

Microsoft Office Word
Word experienced a serious problem with the ‘[addin name]’ add-in. If you have seen this message multiple times, you should disable this add-in and check to see if an update is available. Do you want to disable this add-in?
[Yes] [No]

Choosing No at the prompt removes the Resiliency key and allows Word to continue to launch, leaving the LoadBehavior for that add-in unchanged.

Choosing No also writes an Error event to the Application Event Viewer log:

Event Type:	Error
Event Source:	Microsoft Office 12
Event Category:	None
Event ID:	2000
Date:		5/23/2011
Time:		3:15:29 PM
User:		N/A
Computer:	[WORKSTATION_NAME]
Description:
Accepted Safe Mode action : Microsoft Office Word.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

Choosing Yes at the prompt removes the StartupItems subkey and creates a new DisabledItems subkey. This DisabledItems subkey will contain a different REG_BINARY value, the data of which contains information about the disabled add-in.

Choosing Yes also writes an Error event to the Application Event Viewer log:

Event Type:	Error
Event Source:	Microsoft Office 12
Event Category:	None
Event ID:	2001
Date:		5/23/2011
Time:		3:12:36 PM
User:		N/A
Computer:	[WORKSTATION_NAME]
Description:
Rejected Safe Mode action : Microsoft Office Word.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

At this point, the add-in is ‘hard-disabled’ (it is listed under the DisabledItems subkey), but not ‘soft-disabled’ (its LoadBehavior value has not been changed).

Word then continues to launch, but without loading the add-in.
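To see what Word has recorded for a hard-disabled add-in, the values under DisabledItems can be listed from a macro. The following is only a sketch under a few assumptions: it uses the WMI StdRegProv provider rather than WshShell (which cannot enumerate values), it targets the Word 2007 key (12.0), and it assumes the REG_BINARY data contains readable text such as the add-in's path, which I have not verified for every add-in. PrintableText and ListDisabledItems are names made up for this example.

```vba
Option Explicit

Private Const HKEY_CURRENT_USER As Long = &H80000001

' Keep only printable ASCII characters from an array of byte values.
' Zero bytes (as found between UTF-16 characters) and control codes are skipped.
Public Function PrintableText(ByVal bytes As Variant) As String
    Dim i As Long
    Dim result As String
    If Not IsArray(bytes) Then Exit Function
    For i = LBound(bytes) To UBound(bytes)
        If bytes(i) >= 32 And bytes(i) <= 126 Then result = result & Chr(bytes(i))
    Next
    PrintableText = result
End Function

' Print each value name under DisabledItems with the readable part of its data.
Public Sub ListDisabledItems()
    ' Assumption: Word 2007 (12.0); use 14.0 for Word 2010.
    Const KEY_PATH As String = "Software\Microsoft\Office\12.0\Word\Resiliency\DisabledItems"
    Dim reg As Object
    Dim names As Variant, types As Variant, data As Variant
    Dim i As Long

    Set reg = GetObject("winmgmts:\\.\root\default:StdRegProv")

    ' EnumValues returns 0 on success; names is Null when the key has no values
    If reg.EnumValues(HKEY_CURRENT_USER, KEY_PATH, names, types) <> 0 Or Not IsArray(names) Then
        Debug.Print "No DisabledItems values found."
        Exit Sub
    End If

    For i = LBound(names) To UBound(names)
        If reg.GetBinaryValue(HKEY_CURRENT_USER, KEY_PATH, names(i), data) = 0 Then
            Debug.Print names(i) & ": " & PrintableText(data)
        End If
    Next
End Sub
```

Run ListDisabledItems from the VBA editor and read the results in the Immediate window. The sketch only reads from the registry; it does not change anything.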

To see which add-ins have been hard-disabled, click on the Office Button | Word Options | Add-Ins, and scroll down to “Disabled Application Add-ins”.

To see which add-ins have been soft-disabled, click on the Office Button | Word Options | Add-Ins. Select “COM Add-Ins” in the Manage menu and click Go.

Word is somewhat tricky in this regard, as the add-in will not have a checkmark, but the LoadBehavior registry value will be unchanged. At any other time, the presence of a checkmark is an indication of the LoadBehavior, but when an add-in has been hard-disabled, the box will always be unchecked.
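One way to see the mismatch between the checkbox and the registry is to print both for every COM add-in. This is a rough sketch, not part of the macro above: DescribeLoadBehavior, ReadLoadBehavior and ListAddInStates are hypothetical helper names, the descriptions follow the LoadBehavior values Microsoft documents for Office COM add-ins, and looking in HKCU first and then HKLM is an assumption about where a given add-in is registered.

```vba
Option Explicit

' Translate the common LoadBehavior values into readable text.
Public Function DescribeLoadBehavior(ByVal value As Long) As String
    Select Case value
        Case -1: DescribeLoadBehavior = "LoadBehavior not found"
        Case 0: DescribeLoadBehavior = "Unloaded, do not load automatically"
        Case 1: DescribeLoadBehavior = "Loaded, do not load automatically"
        Case 2: DescribeLoadBehavior = "Unloaded, load at startup"
        Case 3: DescribeLoadBehavior = "Loaded, load at startup"
        Case 8: DescribeLoadBehavior = "Unloaded, load on demand"
        Case 9: DescribeLoadBehavior = "Loaded, load on demand"
        Case 16: DescribeLoadBehavior = "Load first time, then load on demand"
        Case Else: DescribeLoadBehavior = "Unrecognized value " & value
    End Select
End Function

' Read LoadBehavior for an add-in, trying HKCU and then HKLM.
' Returns -1 if the value is not found in either hive.
Public Function ReadLoadBehavior(ByVal addInProgId As String) As Long
    Dim wsh As Object
    Dim hive As Variant

    Set wsh = CreateObject("WScript.Shell")
    ReadLoadBehavior = -1

    On Error Resume Next
    For Each hive In Array("HKCU", "HKLM")
        Err.Clear
        ReadLoadBehavior = wsh.RegRead(hive & "\Software\Microsoft\Office\Word\Addins\" & addInProgId & "\LoadBehavior")
        If Err.Number = 0 Then Exit For
        ReadLoadBehavior = -1
    Next
    On Error GoTo 0
End Function

' Print the Connect state next to the registry LoadBehavior for each COM add-in.
Public Sub ListAddInStates()
    Dim thisAddIn As COMAddIn
    For Each thisAddIn In Application.COMAddIns
        Debug.Print thisAddIn.ProgID & " | Connect=" & thisAddIn.Connect & _
            " | " & DescribeLoadBehavior(ReadLoadBehavior(thisAddIn.ProgID))
    Next
End Sub
```

A hard-disabled add-in should show up here as Connect=False alongside a LoadBehavior that still says it loads at startup.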

What users can do at this point

Going through Word Options and enabling the hard-disabled COM add-in will remove the Resiliency key. This may not make the add-in immediately available in Word, however.

To immediately load the add-in and gain its functionality, you can check the box. Otherwise, close and reopen Word, which will cause Word to launch with the add-in’s specified LoadBehavior.

In case you were curious about the keyboard shortcuts used to enable the first disabled add-in in the list of disabled add-ins (maybe you wanted to do something with SendKeys, for example), they are:
Alt+F, I, A, A, Tab, Tab, Tab, D, Enter, G, Space, Alt+E, C, Alt+F4.

In summary, deleting the Resiliency key after the “serious problem” prompt, then closing and reopening Word, returns Word to a normal operating state.
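The comments in the macro mention that WshShell cannot delete a key that still has subkeys, and point to WMI's StdRegProv as the alternative. Here is a minimal sketch of that approach, which I have not put through the same testing as the macro itself. DeleteResiliencyTree, DeleteKeyTree and IsResiliencyPath are names invented for this example, and the check that the path ends in "\Resiliency" is a safety measure of my own so that a typo cannot delete some other part of HKEY_CURRENT_USER.

```vba
Option Explicit

Private Const HKCU As Long = &H80000001

' Returns True only for paths that end in "\Resiliency" (case-insensitive).
Public Function IsResiliencyPath(ByVal keyPath As String) As Boolean
    IsResiliencyPath = (StrComp(Right$(keyPath, 11), "\Resiliency", vbTextCompare) = 0)
End Function

' Delete a Resiliency key under HKEY_CURRENT_USER along with all of its subkeys.
Public Sub DeleteResiliencyTree(ByVal keyPath As String)
    Dim reg As Object

    ' Refuse to touch anything that is not a Resiliency key
    If Not IsResiliencyPath(keyPath) Then Exit Sub

    Set reg = GetObject("winmgmts:\\.\root\default:StdRegProv")
    DeleteKeyTree reg, keyPath
End Sub

' Recursively delete the subkeys of keyPath, then keyPath itself.
Private Sub DeleteKeyTree(ByVal reg As Object, ByVal keyPath As String)
    Dim subkeys As Variant
    Dim i As Long

    If Len(keyPath) = 0 Then Exit Sub

    ' EnumKey returns a non-zero code if the key does not exist
    If reg.EnumKey(HKCU, keyPath, subkeys) <> 0 Then Exit Sub

    ' subkeys is Null when the key has no subkeys
    If IsArray(subkeys) Then
        For i = LBound(subkeys) To UBound(subkeys)
            DeleteKeyTree reg, keyPath & "\" & subkeys(i)
        Next
    End If

    reg.DeleteKey HKCU, keyPath
End Sub
```

For Word 2007 the call would be DeleteResiliencyTree "Software\Microsoft\Office\12.0\Word\Resiliency" (note that StdRegProv takes the path without the HKCU prefix). Because this removes every subkey, it would also remove the DocumentRecovery subkey mentioned in the macro's comments, so it may be worth checking what that subkey is for before relying on this.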

What I intend to accomplish with the macro is to re-enable the hard-disabled add-in, return any LoadBehavior values back to the desired settings, then prompt the user to save their work and close and reopen Word.

This should return Word to a working state.

Word 2010 on 64-bit Windows 7

As a bonus, here’s the same macro, with some minor adjustments to run in Word 2010 on Windows 7 64-bit, with Adobe Acrobat 9 Pro’s COM add-in acting as one of the required add-ins. The OneNote add-in is not enabled in Word by default, and the macro below does not attempt to enable it, but does consider it a required add-in. This is done to demonstrate the pop-up window. Note that the macro appends to a file named addinslog.txt in the %TEMP% directory; because OpenTextFile is called with its create argument set to True, the file should be created if it does not already exist.

Option Explicit

' Set up a function to search for a key and return true or false
Public Function KeyExists(key)
    Dim objShell
    On Error Resume Next
    Set objShell = CreateObject("WScript.Shell")
        objShell.RegRead (key)
    Set objShell = Nothing
    If Err = 0 Then KeyExists = True
End Function

Sub AutoExec()
'
' FixMissingAddins Macro
' Display a message box with any critical but not 'Connected' COM add-ins, then fix them programmatically
'
' Oliver Baty
' June, 2010 - April, 2011
'
' Information on the Application.COMAddIns array
' http://msdn.microsoft.com/en-us/library/aa831759(v=office.10).aspx
'
' Running macros automatically
' http://support.microsoft.com/kb/286310
'
' Using Windows Scripting Shell (WshShell) to read from and write to the local registry
' http://technet.microsoft.com/en-us/library/ee156602.aspx

' Declare the WshShell variable (this is used to edit the registry)
    Dim WshShell

' Declare the fso and logFile variables (these are used to write to a txt file)
    Dim fso
    Dim logfile

' Create an instance of the WScript Shell object
    Set WshShell = CreateObject("WScript.Shell")

' Declare some other variables
   Dim MyAddin As COMAddIn
   Dim stringOfAddins As String
   Dim listOfDisconnectedAddins As String
   Dim requiredAddIn As Variant
   Dim msg As String

' Notes on deleting registry keys and values in VB
' http://www.vbforums.com/showthread.php?t=425483
' http://www.tek-tips.com/viewthread.cfm?qid=674375
' http://www.robvanderwoude.com/vbstech_registry_wshshell.php

' Create a string containing the names of all 'Connected' COM add-ins named "stringOfAddins"
   For Each MyAddin In Application.COMAddIns
      If MyAddin.Connect = True Then
          stringOfAddins = stringOfAddins & MyAddin.ProgID & " - "
      End If
   Next

' Create an array to hold the names of the critical (required) add-ins named "requiredAddIns"
' Example: change to "Dim requiredAddIns(0 To 4)" if the macro is checking 5 total add-ins
   Dim requiredAddIns(0 To 1) As String

' Add each required AddIn to the array
   requiredAddIns(0) = "PDFMaker.OfficeAddin"
   requiredAddIns(1) = "OneNote.WordAddinTakeNotesService"
'   requiredAddIns(2) = ""
'   requiredAddIns(3) = ""
'   requiredAddIns(4) = ""

' Cycle through the array of required add-ins, and see if they exist in the connected add-ins list
   For Each requiredAddIn In requiredAddIns
      If InStr(stringOfAddins, requiredAddIn) Then
        ' The required add-in is in the string of connected add-ins
         msg = msg
      Else
        ' The required add-in is not in the string of connected add-ins, so add the add-in name to a string named "listOfDisconnectedAddins"
         msg = msg & requiredAddIn & vbCrLf
         listOfDisconnectedAddins = requiredAddIn & " " & listOfDisconnectedAddins
         listOfDisconnectedAddins = Trim(listOfDisconnectedAddins)
      End If
   Next

' If the msg variable is not blank (it contains at least one add-in's name) handle it, otherwise, do nothing
   If msg = "" Then
        ' There are no critical, unconnected add-ins (yay!)
        ' The script can now exit
   Else
        ' There are critical add-ins that are not connected, so handle this
        MsgBox "The following critical Word Add-In(s) are disabled: " & vbCrLf & vbCrLf & msg & vbCrLf & vbCrLf & "To correct this problem, please save any documents you are working on, then close Word and reopen Word."

            ' I find it extremely hackish to check for each possible key and delete it if found... need to research how to delete the tree
            ' One potential obstacle to this method is that I've seen a DocumentRecovery subkey under Resiliency (only once, while editing this macro), that I haven't researched yet

            ' Note: Since the WSH Shell has no Enumeration functionality, you cannot
            '       use the WSH Shell object to delete an entire "tree" unless you
            '       know the exact name of every subkey.
            '       If you don't, use the WMI StdRegProv instead.
            ' http://www.robvanderwoude.com/vbstech_registry_wshshell.php

            ' More info on WMI StdRegProv at:
            ' http://msdn.microsoft.com/en-us/library/aa393664(v=vs.85).aspx

        ' This is hackish, but it effectively deletes a registry key, if it exists
        If KeyExists("HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Word\Resiliency\DisabledItems\") Then
            WshShell.RegDelete "HKCU\Software\Microsoft\Office\14.0\Word\Resiliency\DisabledItems\"
        ElseIf KeyExists("HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Word\Resiliency\StartupItems\") Then
            WshShell.RegDelete "HKCU\Software\Microsoft\Office\14.0\Word\Resiliency\StartupItems\"
        ElseIf KeyExists("HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Word\Resiliency\") Then
            WshShell.RegDelete "HKCU\Software\Microsoft\Office\14.0\Word\Resiliency\"
        End If

        ' To be completely thorough, we can also set the desired LoadBehavior for certain add-ins
        ' This can be done selectively, and only if the LoadBehavior was incorrect, but the quick and dirty way would be to just force the values

        WshShell.RegWrite "HKCU\Software\Microsoft\Office\Word\Addins\PDFMaker.OfficeAddin\LoadBehavior", 3, "REG_DWORD"

        ' Release the WshShell object
        Set WshShell = Nothing

        ' Declare a few variables for the log file
        Dim user, machine, temp, datetime, output

        Set WshShell = CreateObject("WScript.Shell")
        user = WshShell.ExpandEnvironmentStrings("%USERNAME%")
        machine = WshShell.ExpandEnvironmentStrings("%COMPUTERNAME%")
        temp = WshShell.ExpandEnvironmentStrings("%TEMP%")
        ' Convert the slashes in Now to hyphens to prevent a fatal error
        datetime = Replace(Now, "/", "-")
        ' Create the string that will be written to the log file
        output = datetime + ", " + user + ", " + machine + ", " + listOfDisconnectedAddins

        ' Write the event to a log file
        logfile = temp + "\addinslog.txt"
        ' http://msdn.microsoft.com/en-us/library/2z9ffy99(v=vs.85).aspx
        ' http://www.devguru.com/technologies/vbscript/quickref/filesystemobject_opentextfile.html
        Set fso = CreateObject("Scripting.FileSystemObject")
        Set logfile = fso.OpenTextFile(logfile, 8, True)
        logfile.WriteLine (output)
        logfile.Close
        Set logfile = Nothing
        Set fso = Nothing
        
        ' Should we clear the variables?

        ' Release the WshShell object
        Set WshShell = Nothing
   End If

   ' Ardamis.com - We're in your macros, fixing your COM add-ins.
End Sub

I’ve written a few tutorials lately on how to reduce page load times. While I use Google’s Page Speed Firefox/Firebug plugin for evaluating pages for load times, there are times when I want a second opinion, or want to point a client to a tool. This post is a collection of links to online tools for testing web page performance.

Page Speed Online

http://pagespeed.googlelabs.com/

Google’s wonderful Page Speed tool, once only available as a Firefox browser Add-on, finally arrives as an online tool. Achieving a high score (ardamis.com is a 96/100) should be on every web developer’s list of things to do before the culmination of a project.

Enter a URL and Page Speed Online will run performance tests based on a set of best practices known to reduce page load times.

  • Optimizing caching – keeping your application’s data and logic off the network altogether
  • Minimizing round-trip times – reducing the number of serial request-response cycles
  • Minimizing request overhead – reducing upload size
  • Minimizing payload size – reducing the size of responses, downloads, and cached pages
  • Optimizing browser rendering – improving the browser’s layout of a page

WebPagetest

http://www.webpagetest.org/

WebPagetest is an excellent application for users who want the same sort of detailed reporting that one gets with Page Speed.

  • Load time speed test on first view (cold cache) and repeat view (hot cache), first byte and start render
  • Optimization checklist: enable keep-alive, HTML compression, image compression, caching of static content, combined JavaScript and CSS, and use of a CDN
  • Waterfall
  • Response headers for each request

Load Impact

http://loadimpact.com/pageanalyzer.php

Load Impact is an online load testing service that lets you load test and stress test your website over the Internet. The page analyzer analyzes your web page performance by emulating how a web browser would load your page and all resources referenced in it. The page and its referenced resources are loaded, and important performance metrics are measured and displayed in a load-bar diagram along with other per-resource attributes such as URL, size, compression ratio and HTTP status code.

ByteCheck

http://www.bytecheck.com/

ByteCheck is a super minimal site that returns your page’s all-important time to first byte (TTFB). Time to first byte is the time it takes for a browser to start receiving information after it has started to make the request to the server, and is responsible for a visitor’s first impression that a page is fast- or slow-loading.

Web Page Analyzer

http://websiteoptimization.com/services/analyze/

My opinion is that the Web Page Analyzer report is good for beginners without much technical knowledge of things like gzip compression and Expires headers. It’s a bit dated, and is primarily concerned with basics like how many images a page contains. It tells you how fast you can expect your page to load for dial-up visitors, which strikes me as quaint and not particularly useful.

  • Total HTTP requests
  • Total size
  • Total size per object type (CSS, JavaScript, images, etc.)
  • Analysis of number of files and file size as compared to recommended limits

The Performance Grader

http://www.joomlaperformance.com/component/option,com_performance/Itemid,52/

This is another simplistic analysis of a site, like Web Page Analyzer, that returns its analysis in the form of pass/fail grades on about 14 different tests. I expect that it would be useful for developers who want to show a client a third party’s analysis of their work, if the third party is not terribly technically savvy.

One unique thing about this tool, though, is that it totals up the size of all images referenced in CSS files (even those that the current page isn’t using).

  • HTML Size
  • Total Size
  • Total Requests
  • Generation Time
  • Number of Hosts
  • Number of Images
  • Size of Images
  • Number of CSS Files
  • Size of CSS Files
  • Number of Script Files
  • Size of Script Files
  • HTML Encoding
  • Valid HTML
  • Frames