Today, I fought with (and conquered!) getting third-party cookies enabled in Safari and Internet Explorer. There was a bunch of scattered information about this problem on the Internet, so I will either add to the noise or help someone out with this post.
Both Safari and Internet Explorer have default privacy settings that are quite harsh on third-party cookies, and both browsers need some consideration in order to get them working correctly. With the proliferation of widgets on the internet, third-party cookies are quite common, so I’m surprised there isn’t a more thorough treatment of the issue (though don’t expect one here!).
Internet Explorer
Internet Explorer (I tested with IE7) has a default privacy setting of “Medium”, which requires sites issuing third-party cookies to send a P3P compact privacy policy header. The best documents I’ve found on this subject are here and here. Pay particular attention to the “Unsatisfactory Cookies” section in the first document; use the second document to determine your appropriate P3P header. Once you are correctly serving an appropriate P3P header, third-party cookies should start to work in Internet Explorer.
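As a rough illustration of "serving a P3P header", here is a minimal sketch of a WSGI wrapper that appends the header to every response. The names add_p3p_header and widget_app are made up for this example, and the compact policy token string is only a placeholder: pick the tokens that actually describe your own privacy policy.

```python
# Sketch only: a WSGI wrapper that adds a P3P compact-policy header.
# The token string is a placeholder, not a recommended policy.
EXAMPLE_COMPACT_POLICY = 'CP="NOI DSP COR NID"'


def add_p3p_header(app, compact_policy=EXAMPLE_COMPACT_POLICY):
    """Wrap a WSGI app so each response carries a P3P header."""
    def wrapped(environ, start_response):
        def start_with_p3p(status, headers, exc_info=None):
            # Append the P3P header to whatever the wrapped app already set.
            headers = list(headers) + [("P3P", compact_policy)]
            return start_response(status, headers, exc_info)
        return app(environ, start_with_p3p)
    return wrapped


def widget_app(environ, start_response):
    """Hypothetical third-party widget endpoint that sets a cookie."""
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Set-Cookie", "widget_session=abc123; Path=/"),
    ])
    return [b"widget content"]


application = add_p3p_header(widget_app)
```

Any framework that lets you set response headers would do the same job; the wrapper is just the smallest self-contained way to show it.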
Safari
Safari (I tested with Safari 4) has a default third-party cookie setting of “Accept cookies: only from sites I visit.” I took this to mean sites that are the topmost frame (e.g., the URL showing in the address box). A colleague of mine, Dave Mosher, saw me struggling with this and casually mentioned that the user must first interact with the frame in order to set third-party cookies. This is an interesting, Apple-ish approach; I suppose if I interact with the frame, I have explicitly “visited” it. Further, this solution would work out for many third-party cookie scenarios (e.g., a login) because the cookie isn’t set until after some sort of interaction.
In our case, however, we need to set the cookie on the very first load, before the user has a chance to interact. The best I could come up with was this: for Safari, if our cookie wasn’t present, I introduced an interstitial screen that asked the user to click a link, which allowed us to set a third-party cookie. Unfortunately, I could not find anything more elegant, but this is better than many of the other proposed solutions I found (including popping new windows, using HTML5 Local Storage, etc.).
Finally, as long as this cookie is present, you are free to read it before the user interacts with your frame; i.e., if it is a permanent cookie, the user will see the interstitial screen only once (depending, of course, on your particular cookie-setting logic).
Caveat: you may also require the P3P header for Safari. I had it in place while I discovered the Safari solution, so I’m not sure if it works without it.
Filed under: Uncategorized, VendAsta | 4 Comments
Dependency Injection in Python
For testability reasons, we’ve begun using a lot of dependency injection in our Python projects. Imagine that I have a function that does some work and then ultimately calls an API over-the-wire:
import simplejson
from google.appengine.api import urlfetch  # App Engine example

def my_method(user_id):
    url = 'http://www.example.com/lookup-user/%d.json' % user_id
    result = urlfetch.fetch(url)
    if result.status_code == 200:
        return simplejson.loads(result.content)
    return {}
This is bad for testability because when my test code executes my_method(), it will go out over the wire (super bad!) unless I monkey-patch urlfetch to be some other class (really inconvenient).
So, instead, we can accept an alternate urlfetch mechanism optionally in the method:
from google.appengine.api import urlfetch as urlfetch_builtin

def my_method(user_id, urlfetch=urlfetch_builtin):
    url = 'http://www.example.com/lookup-user/%d.json' % user_id
    result = urlfetch.fetch(url)
    if result.status_code == 200:
        return simplejson.loads(result.content)
    return {}
So now the test client can pass in an alternate implementation of urlfetch that can return instrumented results and exceptions. This proves to be extremely handy.
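To make that concrete, here is a minimal sketch of what such a test might look like. FakeResult and FakeUrlfetch are hypothetical stand-ins I made up for illustration, and so that the snippet runs outside App Engine, this copy of my_method takes the fetcher as a required argument and uses the standard json module instead of simplejson.

```python
import json


class FakeResult(object):
    """Hypothetical stand-in for the object returned by urlfetch.fetch()."""
    def __init__(self, status_code, content):
        self.status_code = status_code
        self.content = content


class FakeUrlfetch(object):
    """Hypothetical injected fetcher: records URLs, returns a canned result."""
    def __init__(self, result):
        self.result = result
        self.requested_urls = []

    def fetch(self, url):
        self.requested_urls.append(url)
        return self.result


def my_method(user_id, urlfetch):
    # Same logic as in the post, minus the App Engine default.
    url = 'http://www.example.com/lookup-user/%d.json' % user_id
    result = urlfetch.fetch(url)
    if result.status_code == 200:
        return json.loads(result.content)
    return {}


# Happy path: the fake returns a 200 with a JSON body.
fake = FakeUrlfetch(FakeResult(200, '{"name": "Ada"}'))
assert my_method(123, urlfetch=fake) == {"name": "Ada"}
assert fake.requested_urls == ['http://www.example.com/lookup-user/123.json']

# Error path: a non-200 status yields an empty dict.
failing = FakeUrlfetch(FakeResult(500, ''))
assert my_method(123, urlfetch=failing) == {}
```

Nothing goes over the wire, and the test can also assert on which URL was requested.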
One problem with this particular implementation is that it forces the client to do some odd things. Imagine that I have a piece of client code that itself is dependency injected:
def client_function(urlfetch=None):
    # here, I must check for an injection so that I know to provide it to my_method
    if urlfetch:
        return my_method(123, urlfetch=urlfetch)
    else:
        return my_method(123)
This is unfortunate. A better way to implement dependency injection and prevent this client if statement is to adjust my_method:
from google.appengine.api import urlfetch as urlfetch_builtin
def my_method(user_id, urlfetch=None):
    urlfetch = urlfetch or urlfetch_builtin
    # ...
Since urlfetch is required by the method, we can take None to mean “use the default implementation”. Now the client code becomes simply:
def client_function(urlfetch=None):
    return my_method(123, urlfetch=urlfetch)
Make sure you set up your dependency injection params in this way and save your clients some headaches!
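Here is a small runnable sketch of the None-default pattern passing through two layers. DefaultFetcher and FakeFetcher are hypothetical stand-ins (the real default in the post is urlfetch_builtin), and my_method just reports which fetcher it ended up with so the pass-through is visible.

```python
class DefaultFetcher(object):
    """Hypothetical stand-in for the real default implementation."""
    name = "default"


class FakeFetcher(object):
    """Hypothetical test double that a test would inject."""
    name = "fake"


default_fetcher = DefaultFetcher()


def my_method(user_id, fetcher=None):
    # None means "use the default implementation".
    fetcher = fetcher or default_fetcher
    return fetcher.name


def client_function(fetcher=None):
    # No if statement needed: None simply flows through to my_method.
    return my_method(123, fetcher=fetcher)


assert client_function() == "default"
assert client_function(FakeFetcher()) == "fake"
```

One thing to watch with the "or" idiom: it falls back to the default for any falsy value, not just None. If your injected object could legitimately be falsy (say, it defines __len__ and is empty), an explicit "if fetcher is None" check is the safer choice.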
Filed under: Uncategorized | Leave a Comment
Tags: Projects, VendAsta
At VendAsta we develop, among other things, social networking applications deployed to Google AppEngine. One of the aspects of our suite is a feature where users (e.g., Facebook users) can trust experts from our StepRep software.
A desirable, and obvious, feature is to identify the experts that my friends trust, or conversely, which of my friends trust a particular set of experts. From a use case perspective, a lot of this information is jammed on to the screen at one time; that is, we cannot simply query for a single friend or a single expert.
This is a classic graph walk style problem that is relatively easy to scale in a parallel processing environment, but nasty where you can only do serialized queries.
Initially, we felt a bit limited by the AppEngine query model. It is very fast, but only allows for serialized queries.
In reaction, we began investigations into using a MapReduce mechanism to compute these “trust networks” offline. Specifically, we were planning on using MapReduce deployed into Amazon’s Elastic MapReduce engine. The results of these computations were to be stored on Amazon’s S3 (as JSON docs) and simply accessed statically from our AppEngine applications.
This solution, while sound, suffered some drawbacks. We had a difficult time dealing with the recomputation of the trust networks. We could choose to recompute them daily, but this was wasteful of resources: many of the networks would be completely unchanged. Alternatively, we could queue jobs when new trusts are formed and process just those, but then we would have to develop another job to determine which trust networks cascade from the new trust and need recomputation. The absolute nail in the coffin of this approach is that we (typically) use Facebook to define friend networks and their T&C do not allow the storage of friend relationships for any great amount of time, making offline processing difficult and incomplete at best.
We needed another solution that allowed for the real time computation of trust networks.
An architect at our company, Kevin Pierce, also happens to be one of the best reverse engineers I know. He started poking around the gory depths of the AppEngine source and discovered that all of the API calls stub out in MakeSyncCall. This inevitably led to the discovery of the partner MakeCall which yields an RPC object that you can wait on. Of course, if I can wait on one RPC object, I can also wait on many of them. A subsequent release of AppEngine capitalized on this fact allowing the developer to easily make parallel UrlFetch requests. There were some earlier code libraries that also facilitated parallel UrlFetch requests leveraging the same mechanism.
At Google I/O, Ryan Baldwin and I had the opportunity to corner Ryan Barrett (AppEngine’s supersmart Data Store guru) and chat with him about using this approach for other API interaction. Ryan seemed pretty excited about our exploration of this area and indicated that this was something they had intended to expose to developers but, like any successful software project, competing priorities kept getting in the way.
Encouraged, we set about developing code to allow developers to queue up a set of AppEngine API calls and then kick them off in parallel. In particular, the use case we were looking to solve was performing multiple Data Store queries in parallel; this would allow us to query for each friend’s trusts in parallel and collect the results together.
An example invocation might look something like this:
# set up the async queries
runner = MultiTask()
for uid in user_ids:
    query = db.GqlQuery("SELECT __key__ FROM Account WHERE facebook_id = :1", uid)
    runner.append(QueryTask(query, limit=1, client_state=uid))

# kick off the work
runner.run()

# peel out the results
for task in runner:
    task_result = task.get_result()  # will raise any exception that occurred for the given query
    print '%s: %s' % (task.client_state, task_result[0])
Here, we’re making multiple parallel queries, though the mechanism also allows for mixing different API calls in a single run.
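If you want to play with the queue-then-run shape outside App Engine, here is a rough sketch of the same pattern. To be clear, this is not how asynctools works internally: it swaps the App Engine RPC mechanism for a plain thread pool from the standard library, and Task, TaskRunner and lookup are names I made up for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class Task(object):
    """Hypothetical task: a zero-argument callable plus caller state."""
    def __init__(self, func, client_state=None):
        self.func = func
        self.client_state = client_state
        self._future = None

    def get_result(self):
        # Re-raises any exception that was raised inside func.
        return self._future.result()


class TaskRunner(list):
    """Hypothetical runner: queue tasks, then run them all concurrently."""
    def run(self, max_workers=8):
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            for task in self:
                task._future = pool.submit(task.func)
        # Leaving the with-block waits for every submitted task to finish.


def lookup(uid):
    """Stand-in for a datastore query; negative ids simulate a failure."""
    if uid < 0:
        raise ValueError("bad uid %d" % uid)
    return "account-%d" % uid


# set up the work
runner = TaskRunner()
for uid in [1, 2, -3]:
    runner.append(Task(lambda uid=uid: lookup(uid), client_state=uid))

# kick it off
runner.run()

# peel out the results, keyed by the state we attached to each task
results = {}
for task in runner:
    try:
        results[task.client_state] = task.get_result()
    except ValueError as exc:
        results[task.client_state] = "error: %s" % exc
```

The client_state value plays the same role as in the invocation above: it lets you match each result back to the input that produced it, and failures surface only when you ask for that particular result.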
As an aside, this can yield some pretty scary/awesome looking values in the log files:

Three seconds can chew up a lot of API time!
I’m excited to announce that we’ve bundled up this code into a package and released it to the wild on Google Code as asynctools. So far, we’ve wrapped Query (including GqlQuery) and UrlFetch. Others will come as we have time/motivation. Note that you can combine these together and execute arbitrary combinations of them in parallel.
Send us any feedback, comments, encouragement you have.
Filed under: Uncategorized | 8 Comments
Tags: App Engine, Projects, VendAsta
One of the great things about templating systems is the ability to encapsulate repeatable chunks of code. One of the tough things about templating systems is managing a global “state” to ensure that the overall document that is emitted is as efficient as possible, especially in the HTML world where bandwidth is a performance consideration.
An example of this is where the templating chunks require a particular external resource, like JavaScript. You don’t want each invocation of a repeated sub-template to cause a <script> tag to be emitted; you only need it once for the entire document.
ASP.NET has some handling for JavaScript in its UserControl mechanism (RegisterClientScriptBlock()) that uses a server-side registration mechanism to emit an appropriate document.
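The server-side registration idea is easy to sketch in a few lines. This is only an illustration of the general technique, not ASP.NET's implementation or what Dave and Brett built; ScriptRegistry, render_comment_widget and the script path are all invented for the example.

```python
class ScriptRegistry(object):
    """Hypothetical per-document registry of required script URLs."""
    def __init__(self):
        self._urls = []      # preserves first-registration order
        self._seen = set()   # fast duplicate check

    def require(self, url):
        # Registering the same URL twice is a no-op.
        if url not in self._seen:
            self._seen.add(url)
            self._urls.append(url)

    def render(self):
        # One <script> tag per distinct URL (no HTML escaping in this sketch).
        return "\n".join('<script src="%s"></script>' % u for u in self._urls)


def render_comment_widget(registry, text):
    """Hypothetical sub-template: declares its dependency, returns its markup."""
    registry.require("/static/comments.js")
    return "<div class='comment'>%s</div>" % text


registry = ScriptRegistry()
body = "".join(render_comment_widget(registry, t)
               for t in ["first", "second", "third"])
script_tags = registry.render()

# Three widgets were rendered, but the script is emitted only once.
assert script_tags == '<script src="/static/comments.js"></script>'
```

The sub-template is invoked three times, yet the document ends up with a single script tag, which is the whole point of keeping that bit of global state.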
Dave Mosher and Brett Zabos, here at VendAsta, have used a pure JavaScript approach to get the JavaScript resources loaded efficiently, built atop the YUI loader. Make sure you check out their escapades.
Filed under: Uncategorized | Leave a Comment
Tags: Projects, VendAsta
The other day, I noticed that the button image for a bookmarked site on my iPhone home screen (our corporate GMail) looked really nice, and was different than before.
I got to wondering how it came to be.
Scanning the View Source of the GMail page, I didn’t see any tags that were relevant, so I thought it might be handled by iPhone OS, similar to favicon.ico.
So I pointed my iPhone browser at a web site, added the site to the home screen, and then checked out the web logs to see what was going on.
Sure enough, there were requests for /apple-touch-icon-precomposed.png and then /apple-touch-icon.png. It seems that the precomposed icon is used raw, with no processing applied on the iPhone itself, while the second one gets a glass effect applied. Presumably, the latter is only requested if the first one 404s.
A nice finishing touch if you have a mobile version of your site. Or perhaps regardless. Add this to the list of “standard web site items” like favicon.ico, robots.txt, 404, 500, sitemap.xml, etc.
Filed under: Uncategorized | Leave a Comment
Tags: Hijinx, VendAsta
Google App Engine doesn’t have a unique constraint in the classical sense of relational databases. This is a favourite construct of application developers and it’s unfortunate that it’s not present. At the same time, a basic understanding of the underlying datastore points to why it’s tough, or at least inefficient.
There are places where you’re willing to sacrifice some performance in order to guarantee a unique value. Datastore does guarantee uniqueness on its keys, so we can use a secondary helper model to guarantee uniqueness.
from google.appengine.ext import db

class Unique(db.Model):
    @classmethod
    def check(cls, scope, value):
        # Look up and create the marker entity inside one transaction.
        def tx(scope, value):
            key_name = "U%s:%s" % (scope, value,)
            ue = Unique.get_by_key_name(key_name)
            if ue:
                raise UniqueConstraintViolation(scope, value)
            ue = Unique(key_name=key_name)
            ue.put()
        db.run_in_transaction(tx, scope, value)

class UniqueConstraintViolation(Exception):
    def __init__(self, scope, value):
        super(UniqueConstraintViolation, self).__init__("Value '%s' is not unique within scope '%s'." % (value, scope, ))
This class simply leverages the key uniqueness aspect of datastore to ensure that the value doesn’t exist. A call to check() where a matching scope and value already exist will raise a UniqueConstraintViolation.
To use this, you need to build out a common create method on your models. Below, I’ve used a classmethod to achieve this:
class Account(db.Model):
    name = db.StringProperty()
    email = db.StringProperty()  # unique=True - wouldn't that be great?

    @classmethod
    def create(cls, name, email):
        Unique.check("email", email)
        a = Account(name=name, email=email)
        a.put()
        return a
I need to make a call to Unique.check() first to ensure uniqueness (which causes at least a lookup, but likely a lookup and a put – both are a performance hit relative to doing nothing at all), and then I create my own account. Unique.check() will throw if the value is not unique, preventing the Account from being created.
Note that this technique lugs around an additional dictionary in datastore (costing some $$, though realistically not much). Also note that if you jam too many scopes into this class, you’ll get degrading performance (though it’s still just a key lookup which is very efficient).
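If you want to see the idea without a datastore handy, here is a minimal in-memory sketch. It is not the App Engine code above: InMemoryUnique is a made-up class in which a set of key names stands in for the Unique entities and a lock stands in for the transaction.

```python
import threading


class UniqueConstraintViolation(Exception):
    def __init__(self, scope, value):
        super(UniqueConstraintViolation, self).__init__(
            "Value '%s' is not unique within scope '%s'." % (value, scope))


class InMemoryUnique(object):
    """Hypothetical stand-in: a set of key names plays the datastore's role."""
    def __init__(self):
        self._keys = set()
        self._lock = threading.Lock()  # plays the role of the transaction

    def check(self, scope, value):
        key_name = "U%s:%s" % (scope, value)
        with self._lock:
            # Check and insert happen atomically, as in the transaction.
            if key_name in self._keys:
                raise UniqueConstraintViolation(scope, value)
            self._keys.add(key_name)


unique = InMemoryUnique()
unique.check("email", "ada@example.com")       # first claim succeeds
unique.check("username", "ada@example.com")    # different scope, no clash

try:
    unique.check("email", "ada@example.com")   # second claim is rejected
    duplicate_rejected = False
except UniqueConstraintViolation:
    duplicate_rejected = True

assert duplicate_rejected
```

The scope prefix in the key name is what lets one helper guard several different unique fields at once.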
Filed under: Uncategorized | 14 Comments
Tags: Projects, VendAsta
Recently, I discovered something surprising about Google App Engine’s datastore and ReferenceProperty. Imagine I have a class like the following:
from google.appengine.ext import db

class Home(db.Model):
    address = db.StringProperty()
    room = db.ReferenceProperty(Room)
where Room is also a db.Model.
Datastore uses a proxying technique such that db.Model objects are created lightly (i.e., with only their key) and any hit to an attribute causes the entire object to be inflated by looking it up in the datastore.
I thought this was achieved with a lazy load technique, meaning that Room in the above example contained the logic to load itself (or more accurately Home would populate room with a proxy to Room that would know how to inflate). To do this, the proxy class would hold only its key and use that key to look itself up. From this, it follows that I could access that key without inflating the object.
It doesn’t work this way!
Instead, the referencing class (Home in the above example) holds the key (in a protected attribute, which you shouldn’t touch) and is responsible for inflating reference types.
I’ll say that again: the referenced object does not inflate itself lazily, but the referencing object does the inflation.
So, I previously thought code like this would not cause a datastore lookup:
# assume home is an instance of Home
room_key = home.room.key()
but I was very wrong. Simply hitting home.room causes home to look up the entire room.
But what if you only want to get at the key (which the home has, but is protecting)?
This post suggests that the safe way to get this key is to access it via the class attribute, not the instance attribute. Here is the resulting code for my example:
room_key = Home.room.get_value_for_datastore(home)
Note carefully the capitalization (indicating the classes versus objects).
This gives me the room key without causing a room lookup. Over and out.
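For the curious, the behaviour is easy to mimic with a plain Python descriptor. The sketch below is not App Engine's implementation: FakeDatastore and Reference are invented for illustration, the real ReferenceProperty also caches the resolved object, and only the name get_value_for_datastore is borrowed from the real API.

```python
class FakeDatastore(object):
    """Hypothetical datastore that counts how many lookups it serves."""
    def __init__(self):
        self.entities = {}
        self.lookups = 0

    def get(self, key):
        self.lookups += 1
        return self.entities[key]


DATASTORE = FakeDatastore()


class Reference(object):
    """Hypothetical descriptor: the *owning* instance stores the raw key."""
    def __init__(self, name):
        self._attr = "_" + name  # protected attribute on the owner

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class access (Home.room) returns the property itself
        # Instance access (home.room) resolves the key: a datastore hit.
        return DATASTORE.get(getattr(instance, self._attr))

    def __set__(self, instance, key):
        setattr(instance, self._attr, key)

    def get_value_for_datastore(self, instance):
        # Hand back the stored key without resolving it.
        return getattr(instance, self._attr)


class Home(object):
    room = Reference("room")

    def __init__(self, room_key):
        self.room = room_key


DATASTORE.entities["room-1"] = {"name": "Kitchen"}
home = Home("room-1")

# Going through the class gets the key with no lookup...
room_key = Home.room.get_value_for_datastore(home)
assert room_key == "room-1"
assert DATASTORE.lookups == 0

# ...while touching the instance attribute triggers one.
room = home.room
assert room == {"name": "Kitchen"}
assert DATASTORE.lookups == 1
```

Since the descriptor lives on the class and the key lives on the instance, going through Home rather than home is what lets you sidestep the lookup.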
Filed under: Uncategorized | 2 Comments
Tags: Projects, VendAsta