At the opening panel of XML 2007, Doug Crockford waxed eloquent on the weak security foundations of the Web (and managed, in a truly mystifying rhetorical move, to blame XML for them; apparently if XML had not been developed, people’s attention would not have been distracted from what he regards as core HTML development topics).
So during the discussion period I asked him “If you are concerned about security (and right you are to be so), then what on earth can have possessed you to promote a notation like JSON, which is most conveniently parsed using a call to eval?”
Digression: before I go any further I should point out that JSON is easy to parse, and people have indeed provided parsers for it, so that the developer doesn’t have to use eval. And last April, Doug Crockford argued in a piece on JSON and browser security that JSON is security neutral (as near as I can make out, because the security problem is in the code that calls eval when it shouldn’t, so it’s really not JSON’s fault, JSON is just an innocent bystander). So there is no necessary relation between JSON and eval and code injection attacks.
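For readers who want the contrast spelled out, here is a minimal sketch (in TypeScript; the hostile payload and the stealCookies name are invented for illustration, and JSON.parse is the now-standard built-in parser, not something every 2007 browser shipped):

```typescript
// The "convenient" idiom: wrap the text in parentheses and hand it to eval.
// If the text is exactly the JSON you expected, this works; if an attacker
// supplied the text, eval runs whatever JavaScript it contains. For example
// (hypothetical payload, invented function name):
//
//   const untrustedText = '{"user": "bob"}, stealCookies()';
//   const data = eval('(' + untrustedText + ')'); // executes stealCookies()
//
// A parser, by contrast, only ever builds data, and rejects everything else.
const data = JSON.parse('{"user": "bob", "roles": ["reader"]}');
console.log(data.user); // "bob"

try {
  JSON.parse('{"user": "bob"}, stealCookies()');
} catch (e) {
  console.log("rejected:", (e as Error).message); // a SyntaxError, not a theft
}
```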
Those of sufficient age will well remember the GML systems shipped by IBM (DCF GML) and the University of Waterloo, and lots of people are still using LaTeX (well, some anyway, lots of them in computer science departments). These systems still exist, surely, on some machines, but I will describe them, as I think of them, in the past tense; apologies to those for whom they are still living systems. LaTeX and GML both supported descriptive markup; both provided extensible vocabularies for document structure that you could use to make reusable documents. And both were built on top of a lower-level formatting system, so in both systems it was possible, whenever it turned out to seem necessary, to drop down into the lower-level system (TeX in the case of LaTeX, Script in the case of GML).
Now, in both systems dropping down into the lower-level notation was considered a little doubtful, a slightly bad practice that was tolerated because it was often so useful. It was better to avoid it if you could. And if you were disciplined, you could write a LaTeX or GML document without ever lapsing into the lower-level procedural notation. But the quality of your results depended very directly on the level of self-discipline you were able to maintain.
The end result turned out, in both cases, to be: almost no GML or LaTeX documents of any size are actually pure descriptive markup. At least, not the ones I have seen; and I have seen a few. Almost all documents end up a mixture of high- and low-level markup that cannot be processed in a purely declarative way. Why? Because there was no short-term penalty for violating the declarativity of the markup, and there was often a short-term gain that, at the crucial moment, masked the long-term cost. In this respect, JSON seems to be re-inventing the flaws of notations first developed a few decades ago.
To keep systems clean, the notation needs to drive the right behavior.
To say “You don’t have to use eval; JSON has a very simple syntax and you can parse it yourself, or use an off-the-shelf parser, and in so doing protect yourself against the security issue,” seems to ignore an important fact about notations: they make some things easier and (necessarily) some things harder. They don’t force you to do things the easy way; they don’t prevent you from doing them the hard way. They don’t have to. The gentle pressure of the notation can be enough. It’s like gravity: it never lets up.
If the notation makes a dangerous or dirty practice easy, then the systems built with it will be spotlessly clean only if the users have the self-discipline to keep them clean. For most of us, that means: not very clean.
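To be fair to the disciplined route, the discipline being asked for is not enormous. What follows is a rough sketch, reconstructed from memory rather than quoted, of the kind of pre-check careful developers used in the eval era (the idea is essentially, as I recall, the one in the security considerations of the JSON RFC); the function name is mine:

```typescript
// Sketch of the old "check before you eval" discipline: strip out the string
// literals, then insist that the remaining characters are ones that can occur
// in JSON structure (brackets, braces, commas, colons, digits, signs,
// exponents, whitespace, and the letters of true/false/null). Loosely
// reconstructed for illustration; today you would just call JSON.parse or an
// off-the-shelf parser.
function looksLikeJson(text: string): boolean {
  const withoutStrings = text.replace(/"(\\.|[^"\\])*"/g, "");
  return !/[^,:{}\[\]0-9.\-+Eaeflnr-u\s]/.test(withoutStrings);
}

console.log(looksLikeJson('{"user": "bob", "roles": ["reader"]}')); // true
console.log(looksLikeJson('{"user": "bob"}, stealCookies()'));      // false
```

Even this modest effort is more than the one-line call to eval, and that asymmetry is the gravity in question.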
OK, end of digression.
Be careful about slopes, slippery or otherwise. Gravity never sleeps.