Reducing Data Store Contention in App Engine get_or_insert

I was giving a keynote at the Apple Academix Conference, and as part of it I demoed a Google App Engine application that simulates a clicker-like activity in a course. It is a simple multi-user number guessing game. Whenever audience participation was too good and too many people started clicking at once, I would get errors. It cost the demo a bit of credibility in the middle of my talk, but it was also funny.

I finally saw the error and spent a few hours tracking the issue down. I was getting the following message:

TransactionFailedError: too much contention on these datastore
entities. please try again.

I had been using get_or_insert() pretty much with reckless abandon, and wondered whether it was smart enough to do an optimistic get() first and only start a transaction (get() and then put() if necessary) when the object was missing. Given the behavior I was seeing, it looked like it was starting a transaction every time.

I had 20 participants. After the first click the objects already existed, so no new objects were being created, yet 20 people clicking within a 10-second interval were enough to cause the error. I suspect, but cannot confirm, that at some level the App Engine framework marks you as abusing transactions and then makes you wait until load drops before it lets you use transactions again.

So I decided to reduce my transaction abuse and build an optimistic get-or-insert that does the get() first and only calls get_or_insert() if the record does not exist. This way, once the objects exist, there are no more transactions. And yes, my next optimization is to use memcache.

So this was my approach.

In my bad code, the pattern looked like this:

course = LTI_OrgCourse.get_or_insert("key:"+course_id, parent=org)

So I wrote this function:

import logging

def opt_get_or_insert(obj, key, parent=None):
    # Try a plain (non-transactional) get first; only fall back to the
    # transactional get_or_insert() when the entity does not exist yet.
    logging.info("OPT key="+str(key))
    if parent is None:
        mod = obj.get_by_key_name(key)
        if mod is not None:
            return mod
        mod = obj.get_or_insert(key)
    else:
        mod = obj.get_by_key_name(key, parent=parent)
        if mod is not None:
            return mod
        mod = obj.get_or_insert(key, parent=parent)
    return mod

And then changed all my calls to be:

course = opt_get_or_insert(LTI_OrgCourse, "key:"+course_id, parent=org)
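
To make the difference concrete, here is a rough sketch that counts how many transactions each pattern would start for 20 clicks on the same course. It does not touch App Engine at all: FakeModel is a hypothetical in-memory stand-in I made up for illustration, and it simply assumes that every get_or_insert() call opens a transaction, which is what the behavior above suggested. The opt_get_or_insert() here is a condensed version of the helper above, and the key and parent strings are placeholders.

```python
class FakeModel(object):
    """Hypothetical in-memory stand-in for a datastore model (not App Engine)."""
    _store = {}        # maps (key_name, parent) -> entity
    transactions = 0   # number of simulated transactions started

    def __init__(self, key_name, parent=None):
        self.key_name = key_name
        self.parent = parent

    @classmethod
    def get_by_key_name(cls, key_name, parent=None):
        # Plain read, no transaction.
        return cls._store.get((key_name, parent))

    @classmethod
    def get_or_insert(cls, key_name, parent=None):
        # Assumption: every call starts a transaction, even when the
        # entity already exists.
        cls.transactions += 1
        entity = cls._store.get((key_name, parent))
        if entity is None:
            entity = cls(key_name, parent)
            cls._store[(key_name, parent)] = entity
        return entity

    @classmethod
    def reset(cls):
        cls._store = {}
        cls.transactions = 0


def opt_get_or_insert(obj, key, parent=None):
    # Optimistic read first; fall back to get_or_insert() only on a miss.
    mod = obj.get_by_key_name(key, parent=parent)
    if mod is not None:
        return mod
    return obj.get_or_insert(key, parent=parent)


def simulate(clicks, use_optimistic):
    """Replay `clicks` sequential requests and return the transaction count."""
    FakeModel.reset()
    for _ in range(clicks):
        if use_optimistic:
            opt_get_or_insert(FakeModel, "key:course-1", parent="org-1")
        else:
            FakeModel.get_or_insert("key:course-1", parent="org-1")
    return FakeModel.transactions


plain = simulate(20, use_optimistic=False)
optimistic = simulate(20, use_optimistic=True)
print("plain get_or_insert: %d transactions" % plain)
print("optimistic version:  %d transactions" % optimistic)
```

The clicks here are replayed one after another, so only the very first one falls through to get_or_insert(). With truly simultaneous first requests, several could miss on the get() and each start a transaction, but that window closes as soon as the object exists.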

I wish this were part of the App Engine library instead of something I had to write. This may not be the most general approach, and there may be better ways to do it, but for now I hope it avoids my dreaded “TransactionFailedError: too much contention on these datastore entities. please try again.” message, which I was hitting even at low load levels.

Now off to write a load test for this to see if I can reproduce it.