Third-party Cookies in Safari, Internet Explorer
Today, I fought with (and conquered!) getting third-party cookies enabled in Safari and Internet Explorer. There was a bunch of scattered information about this problem on the Internet, so I will either add to the noise or help someone out with this post.
Both Safari and Internet Explorer have default privacy settings that are quite harsh on third-party cookies, and both browsers need some consideration in order to get them working correctly. With the proliferation of widgets on the internet, third-party cookies are quite common, so I’m surprised there isn’t a more thorough treatment of the issue (though don’t expect one here!).
Internet Explorer
Internet Explorer (I tested with IE7) has a default privacy setting of “Medium”, which requires that sites issuing third-party cookies must have a Compact Privacy Policy (P3P) header. The best documents I’ve found on this subject are here and here. Pay particular attention to the “Unsatisfactory Cookies” section in the first document; use the second document to determine your appropriate P3P header. Once you are correctly serving an appropriate P3P header, third-party cookies should start to work in Internet Explorer.
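To make the "serve a P3P header" step concrete, here is a minimal sketch of a WSGI app that sends a compact policy alongside its cookie. The app name, the cookie, and especially the token string `CAO PSA OUR` are placeholders I made up for illustration; your real compact policy has to reflect your actual privacy practices, so work it out from the documents mentioned above rather than copying this one.

```python
# Placeholder compact policy: these tokens are an assumption for illustration
# only. Pick the tokens that match your own privacy policy.
EXAMPLE_COMPACT_POLICY = 'CAO PSA OUR'


def p3p_header(compact_policy):
    """Build the (name, value) pair for a P3P compact policy header."""
    return ('P3P', 'CP="%s"' % compact_policy)


def widget_app(environ, start_response):
    """Hypothetical third-party widget that sets a cookie on every response."""
    headers = [
        ('Content-Type', 'text/html; charset=utf-8'),
        ('Set-Cookie', 'widget_session=abc123; Path=/'),
        p3p_header(EXAMPLE_COMPACT_POLICY),
    ]
    start_response('200 OK', headers)
    return [b'<p>widget</p>']


# Exercise the app without a server by capturing what it hands to WSGI.
captured = {}


def fake_start_response(status, headers):
    captured['status'] = status
    captured['headers'] = dict(headers)


body = widget_app({}, fake_start_response)
```

The point is only that the header travels on the same response that sets the cookie; how you attach it in your own framework will differ.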
Safari
Safari (I tested with Safari 4) has a default third-party cookie setting of “Accept cookies: only from sites I visit.” I took this to mean sites that are the topmost frame (e.g., the URL showing in the address box). A colleague of mine, Dave Mosher, saw me struggling with this and casually mentioned that the user must first interact with the frame in order to set third-party cookies. This is an interesting, Apple-ish approach; I suppose if I interact with the frame, I have explicitly “visited” it. Further, this solution would work out for many third-party cookie scenarios (e.g., a login) because the cookie isn’t set until after some sort of interaction.
In our case, however, we need to set the cookie on the very first load, before the user has a chance to interact. The best I could come up with was this: for Safari, if our cookie isn’t present, show an interstitial screen that asks the user to click a link, which then allows us to set a third-party cookie. Unfortunately, I could not find anything more elegant, but this is better than many of the other proposed solutions I found (including popping new windows, using HTML5 Local Storage, etc.).
Finally, as long as this cookie is present, you are free to read it before the user interacts with your frame; i.e., if it is a permanent cookie, the user will see the interstitial screen only once (depending, of course, on your particular cookie-setting logic).
Caveat: you may also require the P3P header for Safari. I had it in place while I discovered the Safari solution, so I’m not sure if it works without it.
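The interstitial decision described above boils down to a small branch. Here is a rough sketch of it; the function names, the cookie name `widget_session`, and the user-agent sniffing are all my own assumptions for illustration, not the code we actually shipped. The Safari check is a crude heuristic (Chrome's user-agent string also contains "Safari", so it is excluded).

```python
def is_safari(user_agent):
    """Heuristic only: Chrome also advertises 'Safari' in its UA string."""
    return 'Safari' in user_agent and 'Chrome' not in user_agent


def choose_response(user_agent, cookies, cookie_name='widget_session'):
    """Decide what the framed page should do on this request.

    Returns one of three hypothetical action names:
      'content'                - cookie already present, just render
      'interstitial'           - Safari without the cookie, ask for a click
      'set_cookie_and_content' - other browsers, set the cookie immediately
    """
    if cookie_name in cookies:
        return 'content'
    if is_safari(user_agent):
        return 'interstitial'
    return 'set_cookie_and_content'


SAFARI_4 = ('Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-us) '
            'AppleWebKit/531.9 (KHTML, like Gecko) Version/4.0.3 Safari/531.9')
IE_7 = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'

first_visit_safari = choose_response(SAFARI_4, {})
return_visit_safari = choose_response(SAFARI_4, {'widget_session': 'abc123'})
first_visit_ie = choose_response(IE_7, {})
```

Because the cookie check comes first, a returning Safari user with a permanent cookie never sees the interstitial again.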
Filed under: VendAsta
Dependency Injection in Python
For testability reasons, we’ve begun using a lot of dependency injection in our Python projects. Imagine that I have a function that does some work and then ultimately calls an API over-the-wire:
    import simplejson
    from google.appengine.api import urlfetch  # App Engine example

    def my_method(user_id):
        url = 'http://www.example.com/lookup-user/%d.json' % user_id
        result = urlfetch.fetch(url)
        if result.status_code == 200:
            return simplejson.loads(result.content)
        return {}
This is bad for testability because when my test code executes my_method(), it will go out over the wire (super bad!) unless I monkey-patch urlfetch with some stand-in (really inconvenient).
So, instead, we can accept an alternate urlfetch mechanism optionally in the method:
    import simplejson
    from google.appengine.api import urlfetch as urlfetch_builtin

    def my_method(user_id, urlfetch=urlfetch_builtin):
        url = 'http://www.example.com/lookup-user/%d.json' % user_id
        result = urlfetch.fetch(url)
        if result.status_code == 200:
            return simplejson.loads(result.content)
        return {}
So now the test client can pass in an alternate implementation of urlfetch that can return instrumented results and exceptions. This proves to be extremely handy.
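Here is roughly what that looks like from the test side. This is a self-contained sketch: FakeResult and FakeUrlFetch are hypothetical test doubles I invented for the example, the standard library json module stands in for simplejson, and my_method takes the fetcher as a required argument so the snippet runs without App Engine installed.

```python
import json


class FakeResult(object):
    """Mimics the two attributes my_method reads from a urlfetch result."""

    def __init__(self, status_code, content):
        self.status_code = status_code
        self.content = content


class FakeUrlFetch(object):
    """Test double: records requested URLs and returns a canned result."""

    def __init__(self, result):
        self.result = result
        self.requested = []

    def fetch(self, url):
        self.requested.append(url)
        return self.result


def my_method(user_id, urlfetch):
    url = 'http://www.example.com/lookup-user/%d.json' % user_id
    result = urlfetch.fetch(url)
    if result.status_code == 200:
        return json.loads(result.content)
    return {}


# Instrumented success and failure cases, no network involved.
fake_ok = FakeUrlFetch(FakeResult(200, '{"name": "Ada"}'))
user = my_method(123, urlfetch=fake_ok)

fake_404 = FakeUrlFetch(FakeResult(404, ''))
missing = my_method(456, urlfetch=fake_404)
```

The fake also lets the test assert on the URL that was requested, which is hard to get at with the real thing.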
One problem with this particular implementation is that it forces the client to do some odd things. Imagine that I have a piece of client code that itself is dependency injected:
    def client_function(urlfetch=None):
        # here, I must check for an injection so that I know to provide it to my_method
        if urlfetch:
            return my_method(123, urlfetch=urlfetch)
        else:
            return my_method(123)
This is unfortunate. A better way to implement dependency injection and prevent this client if statement is to adjust my_method:
    from google.appengine.api import urlfetch as urlfetch_builtin

    def my_method(user_id, urlfetch=None):
        urlfetch = urlfetch or urlfetch_builtin
        # ...
Since urlfetch is required by the method, we can take None to mean “use the default implementation”. Now the client code becomes simply:
    def client_function(urlfetch=None):
        return my_method(123, urlfetch=urlfetch)
Make sure you set up your dependency injection params in this way and save your clients some headaches!
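To see the None-means-default convention end to end, here is a runnable sketch. StubFetcher is a hypothetical stand-in (App Engine's urlfetch is not importable outside the SDK), and the module-level urlfetch_builtin here is just one of those stubs playing the part of the real default.

```python
import json


class StubResult(object):
    def __init__(self, status_code, content):
        self.status_code = status_code
        self.content = content


class StubFetcher(object):
    """Hypothetical fetcher that reports its own name in the response body."""

    def __init__(self, name):
        self.name = name

    def fetch(self, url):
        return StubResult(200, json.dumps({"source": self.name}))


# Stand-in for `from google.appengine.api import urlfetch as urlfetch_builtin`.
urlfetch_builtin = StubFetcher('builtin')


def my_method(user_id, urlfetch=None):
    urlfetch = urlfetch or urlfetch_builtin  # None means "use the default"
    url = 'http://www.example.com/lookup-user/%d.json' % user_id
    result = urlfetch.fetch(url)
    if result.status_code == 200:
        return json.loads(result.content)
    return {}


def client_function(urlfetch=None):
    # No if statement needed: None simply passes through.
    return my_method(123, urlfetch=urlfetch)


default_result = client_function()
injected_result = client_function(urlfetch=StubFetcher('injected'))
```

The same pass-through works no matter how many layers of injected callers sit above my_method.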
Filed under: Uncategorized
Tags: Projects, VendAsta
asynctools – A Tale of Two Queries
At VendAsta we develop, among other things, social networking applications deployed to Google AppEngine. One of the aspects of our suite is a feature where users (e.g., Facebook users) can trust experts from our StepRep software.
A desirable, and obvious, feature is to identify the experts that my friends trust, or conversely, which of my friends trust a particular set of experts. From a use case perspective, a lot of this information is jammed on to the screen at one time; that is, we cannot simply query for a single friend or a single expert.
This is a classic graph walk style problem that is relatively easy to scale in a parallel processing environment, but nasty where you can only do serialized queries.
Initially, we felt a bit limited by the AppEngine query model. It is very fast, but only allows for serialized queries.
In reaction, we began investigations into using a MapReduce mechanism to compute these “trust networks” offline. Specifically, we were planning on using MapReduce deployed into Amazon’s Elastic MapReduce engine. The results of these computations were to be stored on Amazon’s S3 (as JSON docs) and simply accessed statically from our AppEngine applications.
This solution, while sound, suffered some drawbacks. We had a difficult time dealing with the recomputation of the trust networks. We could choose to recompute them daily, but this was wasteful of resources: many of the networks would be completely unchanged. Alternatively, we could queue jobs when new trusts are formed and process just those, but then we would have to develop another job to determine which trust networks cascade from the new trust and need recomputation. The absolute nail in the coffin of this approach is that we (typically) use Facebook to define friend networks and their T&C do not allow the storage of friend relationships for any great amount of time, making offline processing difficult and incomplete at best.
We needed another solution that allowed for the real time computation of trust networks.
An architect at our company, Kevin Pierce, also happens to be one of the best reverse engineers I know. He started poking around the gory depths of the AppEngine source and discovered that all of the API calls stub out in MakeSyncCall. This inevitably led to the discovery of the partner MakeCall which yields an RPC object that you can wait on. Of course, if I can wait on one RPC object, I can also wait on many of them. A subsequent release of AppEngine capitalized on this fact allowing the developer to easily make parallel UrlFetch requests. There were some earlier code libraries that also facilitated parallel UrlFetch requests leveraging the same mechanism.
At Google I/O, Ryan Baldwin and I had the opportunity to corner Ryan Barrett (AppEngine’s supersmart Data Store guru) and chat with him about using this approach for other API interaction. Ryan seemed pretty excited about our exploration of this area and indicated that this was something they had intended to expose to developers but, like any successful software project, competing priorities kept getting in the way.
Encouraged, we set about developing code to allow developers to queue up a set of AppEngine API calls and then kick them off in parallel. In particular, the use case we were looking to solve was performing multiple Data Store queries in parallel; this would allow us to query for each friend’s trusts in parallel and collect the results together.
An example invocation might look something like this:
    # set up the async queries
    runner = MultiTask()
    for uid in user_ids:
        query = db.GqlQuery("SELECT __key__ FROM Account WHERE facebook_id = :1", uid)
        runner.append(QueryTask(query, limit=1, client_state=uid))

    # kick off the work
    runner.run()

    # peel out the results
    for task in runner:
        task_result = task.get_result()  # will raise any exception that occurred for the given query
        print('%s: %s' % (task.client_state, task_result[0]))
Here, we’re making multiple parallel queries, though the mechanism allows for mixing around different API calls.
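If you want to play with the queue-then-run-then-collect shape without App Engine, here is a sketch of the same pattern. To be clear, this is not how asynctools works: it swaps in a thread pool from the standard library where asynctools uses App Engine's RPC objects, and SketchTask, SketchMultiTask and fake_lookup are names I invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


class SketchTask(object):
    """Hypothetical task: wraps a zero-argument callable plus caller state."""

    def __init__(self, func, client_state=None):
        self.func = func
        self.client_state = client_state
        self._future = None

    def get_result(self):
        # Future.result() re-raises any exception the call raised.
        return self._future.result()


class SketchMultiTask(object):
    """Hypothetical runner: queue tasks, then run them all concurrently."""

    def __init__(self):
        self._tasks = []

    def append(self, task):
        self._tasks.append(task)

    def run(self):
        if not self._tasks:
            return
        # Leaving the `with` block waits for every submitted call to finish.
        with ThreadPoolExecutor(max_workers=len(self._tasks)) as pool:
            for task in self._tasks:
                task._future = pool.submit(task.func)

    def __iter__(self):
        return iter(self._tasks)


def fake_lookup(uid):
    """Stand-in for a datastore query; negative ids simulate a failure."""
    if uid < 0:
        raise ValueError('bad uid: %d' % uid)
    return ['account-%d' % uid]


runner = SketchMultiTask()
for uid in [1, 2, -3]:
    runner.append(SketchTask(lambda uid=uid: fake_lookup(uid), client_state=uid))
runner.run()

results = {}
for task in runner:
    try:
        results[task.client_state] = task.get_result()[0]
    except ValueError as exc:
        results[task.client_state] = 'error: %s' % exc
```

As in the real invocation above, a failure in one task only surfaces when you ask that task for its result, so one bad query does not sink the batch.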
As an aside, this can yield some pretty scary/awesome looking values in the log files: three seconds of request time can chew up a lot of API time!
I’m excited to announce that we’ve bundled up this code into a package and released it to the wild on Google Code as asynctools. So far, we’ve wrapped Query (including GqlQuery) and UrlFetch. Others will come as we have time/motivation. Note that you can combine these together and execute arbitrary combinations of them in parallel.
Send us any feedback, comments, encouragement you have.
Filed under: Uncategorized
Tags: App Engine, Projects, VendAsta
Getting a templating mechanism to render the HTML you want.
One of the great things about templating systems is the ability to encapsulate repeatable chunks of code. One of the tough things about templating systems is management of a global “state” to ensure that the overall document that is emitted is as efficient as possible – especially in the HTML world where bandwidth is a performance consideration.
An example of this is where the templating chunks require a particular external resource, like JavaScript. You don’t want each invocation of a repeated sub-template to cause a <script> tag to be emitted; you only need it once for the entire document.
ASP.NET has some handling for JavaScript in its UserControl mechanism (RegisterClientScriptBlock()) that uses a server-side registration mechanism to emit an appropriate document.
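The server-side registration idea is simple enough to sketch in a few lines. This is a generic illustration and not how ASP.NET implements it; ScriptRegistry and its methods are hypothetical names. Sub-templates register the scripts they need as often as they like, and the page emits each one exactly once.

```python
from html import escape


class ScriptRegistry(object):
    """Hypothetical per-document registry of required script URLs."""

    def __init__(self):
        self._sources = []

    def register(self, src):
        # Ignore repeats so that each script is emitted once per document.
        if src not in self._sources:
            self._sources.append(src)

    def render(self):
        # Emit tags in first-registration order.
        return '\n'.join(
            '<script src="%s"></script>' % escape(src, quote=True)
            for src in self._sources
        )


registry = ScriptRegistry()
for _ in range(3):
    registry.register('/js/widget.js')  # the same sub-template rendered 3 times
registry.register('/js/other.js')

script_tags = registry.render()
```

The catch, and the reason this is a "global state" problem, is that the registry has to live for the whole document render and be reachable from every sub-template.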
Dave Mosher and Brett Zabos, here at VendAsta, have used a pure JavaScript approach to get the JavaScript resources loaded efficiently, built atop the YUI loader. Make sure you check out their escapades.
Filed under: Uncategorized
Tags: Projects, VendAsta
Fancy iPhone Home Screen Buttons for Bookmarked Sites
The other day, I noticed that the button image for a bookmarked site on my iPhone home screen (our corporate GMail) looked really nice, and was different than before.
I got to wondering how it came to be.
Scanning the View Source of the GMail page, I didn’t see any tags that were relevant, so I thought it might be handled by iPhone OS, similar to favicon.ico.
So I pointed my iPhone browser at a web site, added the site to the home screen, and then checked out the web logs to see what was going on.
Sure enough, there were requests for /apple-touch-icon-precomposed.png and then /apple-touch-icon.png. It seems that the precomposed icon is used raw, with no processing applied on the iPhone itself, while the second one gets a glass effect applied. Presumably, the latter is only requested if the first one 404s.
A nice finishing touch if you have a mobile version of your site. Or perhaps regardless. Add this to the list of “standard web site items” like favicon.ico, robots.txt, 404, 500, sitemap.xml, etc.
Filed under: Uncategorized
Tags: Hijinx, VendAsta
Add a Unique Constraint to Google App Engine
Google App Engine doesn’t have a unique constraint in the classical sense of relational databases. This is a favourite construct of application developers and it’s unfortunate that it’s not present. At the same time, a basic understanding of the underlying datastore points to why it’s tough, or at least inefficient.
There are places where you’re willing to sacrifice some performance in order to guarantee a unique value. Datastore does guarantee uniqueness on its keys, so we can use a secondary helper model to guarantee uniqueness.
    from google.appengine.ext import db

    class Unique(db.Model):
        @classmethod
        def check(cls, scope, value):
            def tx(scope, value):
                # the key name encodes both the scope and the value
                key_name = "U%s:%s" % (scope, value,)
                ue = Unique.get_by_key_name(key_name)
                if ue:
                    raise UniqueConstraintViolation(scope, value)
                ue = Unique(key_name=key_name)
                ue.put()
            # the lookup and the put must happen atomically
            db.run_in_transaction(tx, scope, value)

    class UniqueConstraintViolation(Exception):
        def __init__(self, scope, value):
            super(UniqueConstraintViolation, self).__init__("Value '%s' is not unique within scope '%s'." % (value, scope, ))
This class simply leverages the key uniqueness aspect of datastore to ensure that the value doesn’t exist. A call to check() where a matching scope and value already exist will raise a UniqueConstraintViolation.
To use this, you need to build out a common create method on your models. Below, I’ve used a classmethod to achieve this:
    class Account(db.Model):
        name = db.StringProperty()
        email = db.StringProperty()  # unique=True - wouldn't that be great?

        @classmethod
        def create(cls, name, email):
            Unique.check("email", email)
            a = Account(name=name, email=email)
            a.put()
            return a
I need to make a call to Unique.check() first to ensure uniqueness (which causes at least a lookup, but likely a lookup and a put – both are a performance hit relative to doing nothing at all), and then I create my own account. Unique.check() will raise if the value is not unique, preventing the Account from being created.
Note that this technique lugs around an additional dictionary in datastore (costing some $$, though realistically not much). Also note that if you jam too many scopes into this class, you’ll get degrading performance (though it’s still just a key lookup which is very efficient).
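If you want to poke at the check-then-claim logic without a datastore, here is an in-memory sketch of the same idea. InMemoryUnique and create_account are hypothetical stand-ins: a set plays the part of the datastore keys and a lock plays the part of the transaction, so this only demonstrates the control flow, not datastore behaviour.

```python
import threading


class UniqueConstraintViolation(Exception):
    def __init__(self, scope, value):
        super(UniqueConstraintViolation, self).__init__(
            "Value '%s' is not unique within scope '%s'." % (value, scope))


class InMemoryUnique(object):
    """Hypothetical stand-in for the datastore-backed Unique model."""

    _keys = set()
    _lock = threading.Lock()  # plays the role of the transaction

    @classmethod
    def check(cls, scope, value):
        key_name = "U%s:%s" % (scope, value)
        with cls._lock:
            if key_name in cls._keys:
                raise UniqueConstraintViolation(scope, value)
            cls._keys.add(key_name)  # claim the value for this scope


def create_account(name, email):
    InMemoryUnique.check("email", email)  # raises before anything is created
    return {"name": name, "email": email}


first = create_account("Ada", "ada@example.com")
try:
    create_account("Imposter", "ada@example.com")
    duplicate_error = None
except UniqueConstraintViolation as exc:
    duplicate_error = str(exc)
```

Note that the scope is part of the key, so the same string can be claimed once per scope (say, once as an email and once as a username).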
Filed under: Uncategorized
Tags: Projects, VendAsta
Retrieving the key from a ReferenceProperty in Google App Engine
Recently, I discovered something surprising about Google App Engine’s datastore and ReferenceProperty. Imagine I have a class like the following:
    from google.appengine.ext import db

    class Home(db.Model):
        address = db.StringProperty()
        room = db.ReferenceProperty(Room)
where Room is also a db.Model.
Datastore uses a proxying technique such that db.Model objects are created lightly (i.e., with only their key) and any hit to an attribute causes the entire object to be inflated by looking it up in the datastore.
I thought this was achieved with a lazy load technique, meaning that Room in the above example contained the logic to load itself (or more accurately Home would populate room with a proxy to Room that would know how to inflate). To do this, the proxy class would hold only its key and use that key to look itself up. From this, it follows that I could access that key without inflating the object.
It doesn’t work this way!
Instead, the referencing class (Home in the above example) holds the key (in a protected attribute, which you shouldn’t touch) and is responsible for inflating reference types.
I’ll say that again: the referenced object does not inflate itself lazily, but the referencing object does the inflation.
So, I previously thought code like this would not cause a datastore lookup:
    # assume home is an instance of Home
    room_key = home.room.key()
but I was very wrong. Simply hitting home.room causes home to look up the entire room.
But what if you only want to get at the key (which the home has, but is protecting)?
This post suggests that the safe way to get this key is to access it via the class attribute, not the instance attribute. Here is the resulting code for my example:
    room_key = Home.room.get_value_for_datastore(home)
Note carefully the capitalization (indicating the classes versus objects).
This gives me the room key without causing a room lookup. Over and out.
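The class-versus-instance distinction is easier to see in a toy model. The sketch below is not App Engine's implementation; ToyReferenceProperty, load_room and ROOMS are names I made up, and a dict stands in for the datastore. It only mimics the shape: the property lives on the class, the raw key is stashed on the referencing instance, and instance access triggers the load.

```python
class ToyReferenceProperty(object):
    """Toy descriptor; NOT App Engine's implementation."""

    def __init__(self, name, loader):
        self._attr = '_' + name  # where the raw key is stashed on the instance
        self._loader = loader

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class access (Home.room) returns the property itself
        # instance access (home.room) inflates the referenced object
        return self._loader(getattr(instance, self._attr))

    def __set__(self, instance, key):
        setattr(instance, self._attr, key)

    def get_value_for_datastore(self, instance):
        # hand back the stored key without loading anything
        return getattr(instance, self._attr)


lookups = []  # records every simulated datastore hit
ROOMS = {'room-1': {'name': 'Kitchen'}}


def load_room(key):
    lookups.append(key)
    return ROOMS[key]


class Home(object):
    room = ToyReferenceProperty('room', load_room)

    def __init__(self, room_key):
        self.room = room_key


home = Home('room-1')

key_only = Home.room.get_value_for_datastore(home)
lookups_after_key = list(lookups)  # still empty: no load happened

room = home.room  # this is the access that costs a lookup
lookups_after_attr = list(lookups)
```

Going through the class gets you the property object, which can read the stashed key directly; going through the instance always runs the loader.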
Filed under: Uncategorized
Tags: Projects, VendAsta
Django templatetag, RequestContext, and inclusion_tag
We have a templatetag, called scurl, that needs to look at the HttpRequest object. Django’s templating system provides a straightforward, yet wordy, mechanism to pass the request object in to the template:
    from django.shortcuts import render_to_response
    from django.template import RequestContext

    def my_view(request):
        return render_to_response("myview.html", context_instance=RequestContext(request))
The render method of our scurl templatetag gets access to this context and thus access to the HttpRequest.
So far, so good.
In our project, we also use inclusion_tags to include common chunks of HTML into pages. This sort of tag looks something like this:
    from django import template
    register = template.Library()

    @register.inclusion_tag("html-to-include.html")
    def my_include_tag(myparam):
        return {"inclusion_param": myparam}
inclusion_tag is a nice time saving decorator that automatically pulls up the appropriate template and renders it. However, when we included our scurl tag (remember, this tag depends on the HttpRequest) inside the html-to-include.html template, everything blew up. Looking at the Django source, I was surprised to see that when processing the inclusion_tag, a new context is created and the parent context is not used. At first, this seemed crazy, but in thinking about it more, it’s the right thing to do: we don’t necessarily want our parent context colliding with expectations of the inclusion_tag. That is, because of this separate context, there is a way to formally adapt our parent context to the context of the inclusion_tag.
So now, the challenge is to adapt the context. The inclusion_tag decorator provides a handy flag takes_context that tells Django to provide the context to the template tag. To do this, we need to alter our signature slightly; the context must be the first parameter:
    @register.inclusion_tag("html-to-include.html", takes_context=True)
    def my_include_tag(context, myparam):
        return {"inclusion_param": myparam}
The parameter context is now passed the parent RequestContext.
Now, misunderstanding #2 came along: we thought that this context would automatically be propagated when rendering the template. It was not; a new Context instance was still being created. Diving into the source, inclusion_tag has an undocumented keyword parameter context_class that allows you to specify which context class Django instantiates. So this led to this trial:
    @register.inclusion_tag("html-to-include.html", takes_context=True, context_class=RequestContext)
    def my_include_tag(context, myparam):
        return {"inclusion_param": myparam}
This failed because the __init__ signature for RequestContext looks nothing like the Context __init__ signature. And in hindsight, it’s a naive trial because, unless there’s some serious magic under the hood, how would the actual request get passed through to the RequestContext that was being instantiated?
But wait! The intent of the inclusion_tag is to provide an adapter between contexts, and Python is dynamically typed. The latter is relevant because my initial tag scurl doesn’t actually care that it has a RequestContext, only that context['request'] returns what it needs.
So, all we had to do was implement the adapter (dumping the erroneous context_class bit):
    @register.inclusion_tag("html-to-include.html", takes_context=True)
    def my_include_tag(context, myparam):
        return {"inclusion_param": myparam, "request": context['request']}
Pretty cool, but it takes a pretty deep understanding of Django internals, which is vital for any templatetag author. I would never have sorted this out without access to the Django source…
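For anyone who wants to see the adapter idea without a Django install, here is a toy model of it. Nothing below is Django code: plain dicts stand in for Context and HttpRequest, and render_inclusion is a hypothetical function that mimics only the one behaviour that bit us, namely that the included template sees the dict the tag returns and nothing else.

```python
def render_inclusion(tag_func, parent_context, *args):
    """Toy stand-in for inclusion_tag(takes_context=True).

    The tag function receives the parent context, but the included template
    is rendered against ONLY the dict that the tag function returns.
    """
    child_context = tag_func(parent_context, *args)
    return child_context


def scurl(context):
    """Stand-in for a nested tag that needs context['request']."""
    return context['request']['path']


def tag_without_request(context, myparam):
    return {"inclusion_param": myparam}


def tag_with_request(context, myparam):
    # The adapter: explicitly carry the request across to the child context.
    return {"inclusion_param": myparam, "request": context['request']}


parent = {"request": {"path": "/widgets/"}, "unrelated": 1}

child_ok = render_inclusion(tag_with_request, parent, "x")
path_seen = scurl(child_ok)

child_missing = render_inclusion(tag_without_request, parent, "x")
try:
    scurl(child_missing)
    blew_up = False
except KeyError:
    blew_up = True
```

The "unrelated" key never reaches the child in either case, which is exactly the isolation that makes the separate context worthwhile.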
Filed under: Uncategorized
Tags: Projects, VendAsta
Announcing VendAsta Friday Afternoon Jam Sessions
So, here we sit, uncertain of our first steps….
Here at VendAsta, we’ve just kicked off our Friday afternoon jam sessions. Starting at noon, we bring in lunch and all get together in ad hoc groups and work on projects of interest. Anything. Well, anything software.
It’s an odd feeling though to have all constraints lifted and actually believe that we can just go out and play, experiment, learn, research, anything.
But now the group is starting to self-organize and some ideas are emerging. I’m confident that these sessions will rock as people get going on things.
It’s gonna be great!
Filed under: Uncategorized
Tags: Projects, VendAsta
Code review is essential.
Code review is a vital mechanism for ensuring conventions and patterns, as well as spreading knowledge of solutions and technologies throughout a team.
We have been evaluating Crucible as a formal tool for tracking comments and laying the comments alongside the actual code. The cool thing is that it integrates (somewhat) with Fisheye, Jira, and Subversion – all tools that we use. However, it comes with a steep price tag, and its baked-in process is too heavyweight for our small Scrum teams.
So I recently went on a search for an alternative.
In short order, I came across rietveld, which has come out of Guido van Rossum’s 20% Google project (originally Mondrian). This app is built on Google App Engine and integrates with Subversion (along with Perforce from the original Mondrian days).
We’ll try it out and see how it goes.
The best side-effect: in rietveld, we have Guido showing us the best practices in building a Django app on App Engine! Best practice information in this space is hard to come by.
Filed under: Uncategorized
Tags: Projects, VendAsta