Why You Shouldn’t Rely on ActiveRecord Validations for Data Integrity

I started trying to answer questions on stackoverflow today. It’s a new thing so that’s exciting unto itself but I came across this question that made me feel there was something worth blogging about. To recap if you’re “too focused on time efficiency” to click through to the question, the asker:

  1. Created a Rails migration that added a column to a table. The column had a limit of 40 characters.
  2. He was still able to add records that exceeded the limit.

The accepted solution was to use an ActiveRecord validation to enforce the limit, because the underlying RDBMS (SQLite3) didn’t actually support character limits on fields. The asker raised a valid concern about this though. Why bother putting the limit in the migration then? We’re putting the limit in two places?! That’s not very DRY! While, yes, it’s not very DRY it is still necessary. Here’s why:

Database Error Messages Leave Even Developers Scratching Their Head

A robust RDBMS does a GREAT job of keeping your data integrity in check. It DOES NOT always do a good job of telling you why the record you just tried to insert was invalid. Even some of the better error messages I’ve seen from a violated database constraint would be incredibly cryptic to an end user.

Not All Databases Are Created Equal

In an ideal world we could just put all the constraints in the database and ActiveRecord would pick up on them and create user friendly error messages when they were violated. In the real world different RDBMSs report on their schema at varying levels of accuracy. You could probe the schema all you want and for some reason it just won’t tell you about the unique constraint on those three columns in your ORDERS table. Also not all databases are going to support the same constraint types.

Concurrency

You could say concurrency is a valid answer any time someone says “you can’t do that” in web development and it’s usually right. In this case, especially when dealing with uniqueness constraints, consider two users each entering a record that would cause the other record to become invalid.

  1. First user saves, ActiveRecord checks the database to make sure all validations pass.
  2. Second user saves, ActiveRecord checks the database to make sure all validations pass.
  3. ActiveRecord begins writing the first users record to the database.
  4. ActiveRecord begins writing the second users record to the database.

If you don’t have constraints in your database to backup your ActiveRecord validations then both records successfully get written to the database. Database level constraints are the last line of defence when it comes to data validations. If you have database constraints we get a fifth step. When the last record gets committed the database does its own checking to see if any constraints are violated.

ActiveRecord Validations Are Easily Circumvented

It’s remarkably easy for ActiveRecord validations to be skipped, as per the Rails validations guide. There are plenty of valid use cases for doing this as well, but if you want to maintain data integrity you better let your database do it’s thing.

Did I miss anything? Can you think of any other reasons you should back your ActiveRecord validations with database constraints?