Add a Unique Constraint to Google App Engine

30Jan09

Google App Engine doesn’t have a unique constraint in the classical sense of relational databases. This is a favourite construct of application developers and it’s unfortunate that it’s not present. At the same time, a basic understanding of the underlying datastore points to why it’s tough, or at least inefficient.

There are places where you’re willing to sacrifice some performance in order to guarantee a unique value. Datastore does guarantee uniqueness on its keys, so we can use a secondary helper model to guarantee uniqueness.


class Unique(db.Model):
  @classmethod
  def check(cls, scope, value):
    def tx(scope, value):
      key_name = "U%s:%s" % (scope, value,)
      ue = Unique.get_by_key_name(key_name)
      if ue:
        raise UniqueConstraintViolation(scope, value)
      ue = Unique(key_name=key_name)
      ue.put()
    db.run_in_transaction(tx, scope, value)
class UniqueConstraintViolation(Exception):
  def __init__(self, scope, value):
    super(UniqueConstraintViolation, self).__init__("Value '%s' is not unique within scope '%s'." % (value, scope, ))

This class simply leverages the key uniqueness aspect of datastore to ensure that the value doesn’t exist. A call to check() where a matching scope and value already exists will raise a UniqueConstraintViolation.

To use this, you need to build out a common create method on your models. Below, I’ve used a classmethod to achieve this:


class Account(db.Model):
  name = db.StringProperty()
  email = db.StringProperty() # unique=True - wouldn't that be great?
  @classmethod
  def create(cls, name, email):
    Unique.check("email", email)
    a = Account(name=name, email=email)
    a.put()
    return a

I need to make a call to Unique.check() first to ensure uniqueness (which causes at least a lookup, but likely a lookup and a put – both are a performance hit relative to doing nothing at all), and then I create my own account. Unique.check() will throw if the value is not unique, preventing the Account from being created.

Note that this technique lugs around an additional dictionary in datastore (costing some $$, though realistically not much). Also note that if you jam too many scopes into this class, you’ll get degrading performance (though it’s still just a key lookup which is very efficient).

About these ads


14 Responses to “Add a Unique Constraint to Google App Engine”

  1. This looks very useful. Is the code public domain? Thank you.

  2. 2 Jason Collins

    Sure – have fun!

  3. Thank you!

  4. Isn’t there still a race condition between Unique.check and a.put()? The code formatting is a bit off for me so I might have missed it?

    • 5 Jason Collins

      The call to Unique.check() “reserves” (and stores) the unique value in the Unique model, so if it succeeds the value is all yours. If another process attempts to do a Unique.check() immediately after, it will fail.

      A caveat of this is that if your a.put() fails, you’ve lost your unique value unless you take some other compensating action (e.g., removing the unique value from the Unique model).

  5. 6 Tonny Staunsbrink

    Interrestring solution. But wouldn’t you usually incorporate the uniqueness in the key of the primary entity.

    I think I would use this solution for mutable data, which is not part of the primary (since primary keys should not be mutable anyhow).

  6. While occasionally it’s useful to have a unique value that isn’t also the primary key, in most cases, including this one, it would make more sense to just make the unique value the key name for the model. Then, you’d ensure uniqueness with no extra overhead, and you could look up an entity by its key with a single get, instead of a query.

    • 8 Jason Collins

      I agree Nick and Tonny, it’s always best to work the unique, immutable key into the key of the entity.

      However, I think this is a good, and common, example as people’s email address changes over time, and that doesn’t mean the Account should have to change. A similar example is how you can adjust your Twitter screen_name (which is your login/username), but the rest of the Account is unchanged.

      Something that isn’t mentioned in this post is that if you do allow someone to change this value (e.g., change their email on the Account), you have to explicitly remove the old email from Unique – that is, if you want it to be able to to be used again in the future.

  7. Will this mechanism still work with the high replication datastore?

    • 10 Jason Collins

      Joshua, yes it does work with HRD. The technique is built on .get() / .put() and transactions, so it does not run into any eventual consistency problems.

      • Jason, do you by chance use Django on App Engine? One of the documented limitations of the “Django nonrel” port is that Django’s “unique” database constraint isn’t supported on App Engine. It seems to me that it would be pretty easy to add support for it using this new Unique class of yours. Have you even given it any thought?

      • 12 Jason Collins

        Joshua, we do use some Django on App Engine, though it was long before the nonrel stuff, so we didn’t use any of the database support directly. Generally-speaking, we are moving away from Django to webapp2 (all of our stuff except 1 project are webapp2) because we prefer the very thin transparent pipeline.

      • Jason, we were sold on Django nonrel for its support of Django’s rich Admin functionality. It saves a ton of time building the usual CRUD forms for our models. Does webapp2 have something similar? I haven’t looked into webapp2 at all.

      • 14 Jason Collins

        webapp2 does not have what you describe, though some package might exist that does it. We use wtforms for form processing, though it does not auto-generate forms/handlers from the models; we need to build a form class and associated handler.

        That said, I personally don’t think that auto-gen’d CRUD stuff works well in NoSQL world – proper NoSQL models don’t have a great correlation to something meaningful in user land, not to mention data denormalization.

        Though, I can understand using this sort of tool for developer convenience.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.

%d bloggers like this: