I’ve been developing a PHP/MySQL web application that will be accessed by multiple users. These users will be both viewing and editing records in the database. Obviously, any situation in which multiple users may be performing operations on the same record puts the integrity of the data at risk.
In the case of my application, there is a very real possibility (a certainty, actually) that two or more people will open the same record at the same time, make changes, and attempt to save these changes. This common concurrent execution problem is known as the “lost update”.
User A opens record “A”.
User B opens record “A”.
User A saves changes to record “A”.
User B saves changes to record “A”.
In this example, User A’s changes are lost – replaced by User B’s changes.
I started to look for a method of preventing this sort of data loss. At first, I wanted to lock the record using a sort of check-in/check-out system. User A would check-out record “A”, and then have exclusive write access to that record until it was checked back in as part of the saving process. There were a number of problems with this method, foremost that User A may decide to not make any changes and so not save the record, which would leave the record in a checked-out state until further administrative action was taken to unlock it.
For awhile, I tried to come up with some ingenious way around this, which usually boiled down to somehow automatically unlocking the record after a period of time. But this is not a satisfactory solution. For one thing, a user may have a legitimate reason to keep the record checked-out for longer periods. For another, User B shouldn’t have to wait for a time-out event to occur if User A is no longer in the record.
So what I eventually came up with is a method of checking whether a record has been changed since it was accessed by the user each time a save is initiated. My particular way of doing this involves comparing timestamps, but other ways exist.
Here’s how I’m implementing my solution to the lost update concurrency issue:
User A creates record “A” and saves it at 9:00 AM. A “last-saved timestamp” of 9:00 AM is generated and saved to the record.
User A opens record “A” at 10:00 AM. An “opened timestamp” of 10:00 AM is generated and written to a hidden (or readonly) input field on the HTML page.
User B opens record “A” at 10:00 AM. An “opened timestamp” of 10:00 AM is generated and written to a hidden (or readonly) input field on the HTML page.
At 10:30 AM, User A attempts to save the record. The “last-saved timestamp” is retrieved from the record. The “opened timestamp” of 10:00 AM is compared to the “last-saved timestamp” of 9:00 AM. Because the record has not been changed since it was opened, the record is saved. A new “last-saved timestamp” of 10:30 AM is generated and saved to the record.
At 11:00 AM, User B attempts to save the record. The “last-saved timestamp” is retrieved from the record. The “opened timestamp” of 10:00 AM is compared to the “last-saved timestamp” of 10:30 AM (User A’s timestamp). Because the record has been changed since it was opened by User B, User B is not allowed to save the record.
User B will have to re-open record “A”, consider the effect that User A’s changes may have, and then make any desired changes.
The main drawback, that I see, is extra work for User B, who has to now review the record as saved by User A before deciding what changes to make. But this is going to be necessary anyway, as changes made between when User B opened and attempted to save the record may influence User B’s update.
The strange thing is that I haven’t seen this offered as a solution on any of the pages I found while Googling for solutions to the access control, lost update and other concurrency-related data loss problems. Lots of people acknowledge the problem and potential for data loss, but few offer solutions on the application level – preferring to rely on a database engine’s ability to lock rows.