The OOXML debates (non-combatant’s perspective)

[21-22 July 2008]

So far, I have managed to avoid participating in the debates over standardizing OOXML, and I don’t plan for that to change. But my evil twin Enrique and I spent some wickedly enjoyable time this afternoon reading a lot of postings in that debate, from a variety of sources, when I should have been working on other things. (“Log it as ‘Professional – continuing education’,” suggested Enrique. I may do that.)

It’s interesting to be able to observe a hard-fought technical battle in which (other people’s) feelings run high but in which one does not have a large personal stake. So many of the rhetorical maneuvers are familiar, and the deterioration of the quality of the argument brings back so many memories of other technical arguments in which (distracted by caring about the outcome) the observer may not have been able to appreciate the rhetorical ingenuity of some of the contributions.

What strikes both Enrique and me is how distinct the styles of argumentation on the various sides of the debate are. We counted three, not two, in this battle, but we could be undercounting.

On one side, there is a class of contributions carefully kept as thoroughly emotionless as possible, focusing exclusively on technical (or at least substantive) issues — even when the contribution was intended to persuade others of a course of action. This seems, at first, an unusual rhetorical choice: I think most advertisers tend to prefer enthusiasm to a studied lack of emotion in trying to sell things. Still, this class includes some of the people whose judgement I have the most reason to respect, and in an over-heated environment a strict objectivity can be immensely attractive.

There is a second class of contributions, which exhibit a more emotional, excitable, even passionate style of argumentation, one which is however almost always tethered to concrete, verifiable (or falsifiable) propositions about technical properties of OOXML (and ODF), about process issues, and so on. The contributions of this class are by no means always well reasoned or insightful, but they are all recognizably arguments which can be refuted.

And there is a third class, which contains some of the most inventive ad hominem attacks, imaginative name-calling, and insidious smears I have ever seen outside of recent U.S. national electoral politics.

What is striking and puzzling to me is how cleanly the three different rhetorical styles seem to me to map to different positions (let me call them left and right, without mapping left/right into pro/con) on OOXML. If you see a statement that could in principle be verified or falsified by an impartial third party, there is a much better than even chance that it’s from a contribution arguing, let us call it, the left-hand position. And if you see an infuriatingly smug piece which avoids addressing actual technical issues and confines itself to name-calling, slander, and innuendo, there is a very strong chance that it’s taking a right-hand position. (I’m speaking here mostly of bloggers and essayists, not of those who have commented on various blog posts — the blog comments are uniformly smug and infuriating regardless of handedness.)

I have tried not to say explicitly which position each of these styles is associated with, because if Enrique and I are right then all you have to do is (re)read some of the rhetorical barrages of the last year or two to see which is which. (Those of my readers who care about the outcome, or about the health and reputation of the institutions involved, may find this too painful to contemplate. I’m sorry; you don’t have to if you don’t want to.) And if we’re wrong (and we may be — we only had stomach for an afternoon’s worth of the stuff, not more), then there’s no fairness in pointing the finger of blame at just one side for the incivility that can be seen in the discussion of OOXML.

And in any case, as Enrique points out with a certain malicious glee, “Most people who don’t look into it will assume that the merits of the technical arguments must be with the first or second groups, because they don’t descend (or more correctly they descend less often) to slander and name-calling. But there is no rule that says that just because those on one side of an argument argue unfairly or irrelevantly, or act with infuriating disregard of basic rules of courteous technical discussion, then it’s safe to conclude that they have the wrong end of the technical stick, any more than it’s safe to conclude that an invalid argument has reached a false conclusion. Unfairness and low behavior don’t mean people aren’t right in the end.”

Enrique may be right. But watching the OOXML debates serves as a salutary reminder that when some in a technical discussion descend to name-calling and slander (and what better to spice up a blog with?), the animosities created during the process will hover over the result of the decision for a long time.


Memo to self: in future, try to be calmer and more fair in discussions.

(“Yeah,” I hear Enrique mutter. “Leave the dirty work to me.”)

XSD 1.1 is in Last Call

Yesterday the World Wide Web Consortium published new drafts of its XML Schema Definition Language (XSD) 1.1, as ‘last-call’ drafts.

The idiom has an obscure history, but is clearly related to the last call for orders in pubs which must close by a certain hour. The working group responsible for a specification labels it ‘last call’, as in ‘last call for comments’, to indicate that the working group believes the spec is finished and ready to move forward. If other working groups or external readers have been waiting to review the document, thinking “there’s no point reviewing it now because they are still changing things”, the last call is a signal that the responsible working group has stopped changing things, so if you want to review it, it’s now or never.

The effect, of course, can be to evoke a lot of comments that require significant rework of the spec, so that in fact it would be foolish for a working group to believe they are essentially done when they reach last call. (Not that it matters what the WG thinks: a working group that believes last call is the end of the real work will soon be taught better.)

In the case of XSD 1.1, this is the second last call publication both for the Datatypes spec and for the Structures spec (published previously as last-call working drafts in February 2006 and in August 2007, respectively). Each elicited scores of comments: by my count there are 126 Bugzilla issues on Datatypes opened since 17 February 2006, and 96 issues opened against Structures since 31 August 2007. We have closed all of the substantive comments, most by fixing the problem and a few (sigh) by discovering either that we could not reach consensus on what to do about the problem (or in some cases could not reach consensus about whether there was really a problem before us) or that we could not make the requested change without more delay than seemed warrantable. There are still a number of ‘editorial’ issues open, which are expected not to affect the conformance requirements for the spec or to change the results of anyone’s review of the spec, and which we therefore hope to be able to close after going to last call.

XSD 1.1 is, I think, somewhat improved over XSD 1.0 in a number of ways, ranging from the very small but symbolically very significant to much larger changes. On the small but significant side: the spec has a name now (XSD) that is distinct from the generic noun phrase used to describe the subject matter of the spec (XML schemas), which should make it easier for people to talk about XML schema languages other than XSD without confusing some listeners. On the larger side:

  • XSD 1.1 supports XPath 2.0 assertions on complex and simple types. The subset of XPath 2.0 defined for assertions in earlier drafts of XSD 1.1 has been dropped; processors are expected to support all of XPath 2.0 for assertions. (There is, however, a subset defined for conditional type assignment, although here too schema authors are allowed to use, and processors are allowed to support, full XPath.)
  • ‘Negative’ wildcards are allowed, that is wildcards which match all names except some specified set. The excluded names can be listed explicitly, or can be “all the elements defined in the schema” or “all the elements present in the content model”.
  • The xs:redefine element has been deprecated, and a new xs:override element has been defined which has clearer semantics and is easier to use.

Some changes vis-à-vis 1.0 were already visible in earlier drafts of 1.1:

  • The rules requiring deterministic content models have been relaxed to allow wildcards to compete with elements (although the determinism rule has not been eliminated completely, as some would prefer).
  • XSD 1.1 supports both XML 1.0 and XML 1.1.
  • A conditional inclusion mechanism is defined for schema documents, which allows schema authors to write schema documents that will work with multiple versions of XSD. (This conditional inclusion mechanism is not part of XSD 1.0, and cannot be added to it by an erratum, but there is no reason a conforming XSD 1.0 processor cannot support it, and I encourage makers of 1.0 processors to add support for it.)
  • Schema authors can specify various kinds of ‘open content’ for content models; this can make it easier to produce new versions of a vocabulary with the property that any document valid against the new vocabulary will also be valid against the old.
  • The Datatypes spec includes a precisionDecimal datatype intended to support the IEEE 754R floating-point decimal specification recently approved by IEEE.
  • Processors are allowed to support primitive datatypes, and datatype facets, additional to those defined in the specification.
  • We have revised many, many passages in the spec to try to make them clearer. It has not been easy to rewrite for clarity while retaining the kind of close correspondence to 1.0 that allows the working group and implementors to be confident that the rewrite has not inadvertently changed the conformance criteria. Some readers will doubtless wish that the working group had done more in this regard. But I venture to hope that many readers will be glad for the improvements in wording. The spec is still complex and some parts of it still make for hard going, but I think the changes are trending in the right direction.

If you have any interest in XSD, or in XML schema languages in general, I hope you will take the time to read and comment on XSD 1.1. The comment period runs through 12 September 2008. The specs may be found on the W3C Technical Reports index page.

Thinking about test cases for grammars

[19 June 2008]

I’ve been thinking about testing and test cases a lot recently.

I don’t have time to write it all up, and it wouldn’t fit comfortably in a single post anyway. But several questions have turned out to provide a lot of food for thought.

The topic first offered itself in connection with several bug reports against the grammar for regular expressions in XSD Part 2: Datatypes, and with the prospect of revising the grammar to resolve the issues. When revising a grammar, it would be really useful to be confident that the changes one is making change the parts of the language one wants to change, and leave the rest of the language untouched. In the extreme case, perhaps we don’t want to change the language at all, just to reformulate the grammar to be easier to follow or to provide semantics for. How can we be confident that the old grammar and the new one describe the same language?

For regular languages, I believe the problem is fairly straightforward. You can calculate the minimal FSA for each language and check whether the two minimal FSAs are isomorphic. Or you can calculate both set differences (L1 – L2 and L2 – L1) and check that both of them are the empty set. And there are tools like Grail that can help you perform the check, although the notation Grail uses is just different enough from the notation XSD uses to make confusion and translation errors possible (but similar enough that you think it would be over-engineering to try to automate the translation).
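Short of computing minimal FSAs or set differences with a tool like Grail, there is an even cruder sanity check: exhaustively compare two candidate expressions over every string up to some bounded length. A sketch in Python follows (the two example patterns are mine, not taken from the XSD grammar); agreement up to the bound is of course evidence, not proof, of equivalence:

```python
import re
from itertools import product

def languages_agree(pattern_a, pattern_b, alphabet, max_len):
    """Check whether two regexes accept exactly the same strings
    over `alphabet`, for all strings of length <= max_len.
    Returns (True, None) on agreement, else (False, witness)."""
    ra, rb = re.compile(pattern_a), re.compile(pattern_b)
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if bool(ra.fullmatch(s)) != bool(rb.fullmatch(s)):
                return False, s  # in one language but not the other
    return True, None

# Two ways of writing "zero or more a's, then one b" agree;
# "one or more a's, then one b" differs, with witness "b".
print(languages_agree("a*b", "(a|aa)*b", "ab", 6))
print(languages_agree("a*b", "a+b", "ab", 6))
```

The cost is exponential in the bound, so this only works for small alphabets and short strings; but when it finds a difference, it hands you a concrete witness string to add to the test suite.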

But for context-free languages, the situation is not so good. The equivalence of context-free grammars is undecidable in general (it is decidable for deterministic context-free languages, but I would have to spend time rereading Hopcroft and Ullman, or Grune and Jacobs, to figure out how far that helps here). And I don’t know of any good grammar-analysis tools. (When I ask people, they say the closest thing they know of to a grammar-analysis tool is the error messages from yacc and its ilk.) So even if one did buckle down and try to prove the original form of the grammar and the new form equivalent, the possibility of errors in the proof is quite real, and it would be nice to have a convenient way of generating a thorough set of test cases.

I can think of two ways to generate test cases:

  • Generation of random or pseudo-random strings; let’s call this Monte Carlo testing.
  • Careful systematic creation of test cases. I.e., hard work, either in the manual construction of tests or in setting things up for automatic test generation.
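The Monte Carlo approach is trivial to implement, which is much of its appeal; a Python sketch (the alphabet and parameters are arbitrary choices of mine). Drawing candidates over a regex-flavored alphabet also makes the difficulty vivid: most of what comes out is not even syntactically well-formed, let alone interesting:

```python
import random
import re

def monte_carlo_cases(alphabet, max_len, count, seed=0):
    """Draw `count` pseudo-random strings over `alphabet`,
    each of random length between 0 and `max_len`."""
    rng = random.Random(seed)
    return ["".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))
            for _ in range(count)]

def compiles(candidate):
    """Does the candidate parse as a (Python) regex at all?"""
    try:
        re.compile(candidate)
        return True
    except re.error:
        return False

cases = monte_carlo_cases("ab|*+?()[]", 8, 2000)
well_formed = [s for s in cases if compiles(s)]
print(len(well_formed), "of", len(cases), "candidates even compile")
```

(I use Python’s regex dialect here purely as a stand-in oracle for “well-formed”; the XSD regex grammar differs in detail.)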

Naturally my first thought was how to avoid hard work by generating useful test cases with minimal labor.

The bad news is that this only led to other questions, like “what do you mean by useful test cases?”

The obvious answer is that in the grammar comparison case, one wants to generate test cases which will expose differences in the languages defined by the two grammars, just as in the case of software one wants test cases which will expose errors in the program. The parallel suggests that one might learn something useful by attempting to apply general testing principles to grammars and to specifications based on grammars.

So I’ve been thinking about some questions which arise from that idea. In much of this I am guided by Glenford J. Myers, The art of software testing (New York: Wiley, 1979). If I had no other reasons for being grateful to Paul Cotton, his recommending Myers to me would still put me in his debt.

  • For the various measures of test ‘coverage’ defined by Myers (statement coverage, decision coverage, condition coverage, decision/condition coverage, multiple condition coverage), what are the corresponding measures for grammars?
  • If one generates random strings to use as test cases, how long does the string need to be in order to be useful? (For some meaning of “useful” — for example, in order to ensure that all parts of the grammar can be exercised.)
  • How long can the strings get before they are clearly not testing anything that shorter strings haven’t already tested adequately (for some meaning of “adequate”)?
  • From earlier experience generating random strings as test cases, I know that for pretty much any definition of “interesting test case”, the large majority of random test cases are not “interesting”. Is there a way to increase the likelihood of a test case being interesting? A way that doesn’t involve hard work, I mean.
  • How good a job can we do at generating interesting test cases with only minimal understanding of the language, or with only minimal analysis of its grammar or FSA?
  • What kinds of analysis offer the best bang for the buck in terms of improving our ability to generate test cases automatically?
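One low-effort way to raise the proportion of interesting cases is to drive the randomness from the grammar’s own productions rather than from the raw alphabet, so that every generated string is at least derivable. A toy sketch in Python; the mini-grammar is a hypothetical regex-flavored example of mine, not the actual grammar from XSD Part 2:

```python
import random

# A toy grammar for a small regex sublanguage (hypothetical).
# Keys are nonterminals; each alternative is a tuple of symbols;
# a symbol not present as a key is a terminal.
GRAMMAR = {
    "regex":  [("branch",), ("branch", "|", "regex")],
    "branch": [("piece",), ("piece", "branch")],
    "piece":  [("atom",), ("atom", "*"), ("atom", "+"), ("atom", "?")],
    "atom":   [("a",), ("b",), ("(", "regex", ")")],
}

def generate(symbol, rng, depth=0, max_depth=8):
    """Randomly derive a terminal string from `symbol`.  Past
    max_depth, always take the first alternative, which in this
    grammar is the shortest, so every derivation terminates."""
    if symbol not in GRAMMAR:
        return symbol  # terminal
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        alternatives = alternatives[:1]
    rhs = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in rhs)

rng = random.Random(42)
samples = [generate("regex", rng) for _ in range(5)]
print(samples)
```

Because every string comes from a derivation, each one exercises some path through the grammar; skewing the choice among alternatives (here, crudely, by depth) is one cheap knob for trading string length against coverage of the productions.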

Using our own tools / means and ends

[20 April 2008]

One often hears, in information technology and in spec development, the injunction to eat one’s own dog food, meaning to use, oneself, the technologies one is developing. By confronting the technology as a user, the designer will become aware sooner of flaws in the implementation, gaps in the functionality, or other usability problems. I hate the metaphor, because I don’t think of the technology I work on as dog food, but it’s good advice.

I’ve been thinking lately that I should be making a lot more use, in my work, of the technologies I’m responsible for. There is a radical difference between the attitude of someone who has designed a system, or engaged with it as an object of study, but never much used it, and someone who has actually used the system to do things they wanted to do, treating the system as a means not an end in itself.

I once sat in working group meetings listening to brilliant computer scientists telling us how the working group should approach various problems, and being struck by the fact that if they showed five XML examples of two to ten lines each, at least two of them would be ill-formed. They had a pretty good theoretical grasp of XML, although they didn’t grok its essence. But their mistakes showed clearly that they had never spent as much as thirty minutes writing XML and running it through a parser. It was no wonder that their view of markup was so different from the view of those of us who worked with markup during most of the hours of our working lives. To them, XML was an object of study. To those of us who actively used it, it was no less an object of study, but it was also a means to achieve other ends. I mentioned to one of the computer scientists that the well-formedness errors in his examples made it hard for me to take his proposals seriously, and to his great credit he did address the problem. He never showed an example in XML notation again; instead he wrote everything down in Haskell. (And since he used Haskell more or less constantly, he didn’t make dumb syntax errors in it.)

In practice, I think the ‘use your own tools’ principle means I should spend some time trying to upgrade my personal tool chain to make use of technologies like XProc and SML. (There are some other technologies I should make more use of, too, but I’m not prepared to identify them in public.)

At the same time, I’m also acutely conscious of the difference between experimental systems and production systems. Experimental systems need to be able to change rapidly and radically as we learn. Production systems need to be stable and reliable, and cannot usually change more quickly than the slowest user who relies on them. The maintainers of a production system seldom have the ability to force their users to upgrade or make other changes. (And if they succeed in acquiring it, they will be regarded by their users as tyrannical oppressors to be deceived and evaded whenever possible.)

Specs in development sometimes need to be experimental systems.

Some people will say no, never standardize anything that requires experimentation: only standardize what is already existing practice. The example that sticks in my head from long-ago reading on the theory of standardization is machine-tool tolerances: if no one sells or buys tools with a given tolerance, it’s because either there’s no market for them (so no need to standardize) or no understanding of how to achieve that tolerance (so a standard would be pointless and might get crucial bits wrong out of ignorance); standardize the tolerances people are actually using and you’ll produce a standard that is useful and based on good knowledge of the domain. This principle may well work for machine tools; I am not a mechanical engineer. But if you wait until a given piece of information technology is already common practice, then by and large you will be looking at a marketplace in the form of a monopoly with one player. If you’re looking for a non-proprietary standard to provide a level playing field for competing implementations, you’re too late.

In information technology, a standard that goes out in front of existing practice appears to be the only way to define a non-proprietary technology. Ideally, you don’t want to be too far out in front of existing practice, but you can’t really be behind it, either.

If you’re out in front of well established practice, the spec needs in some sense to be experimental. If the responsible working group finds a new and better way to say or do something, after issuing their second public draft but before the spec is finished, they need to be free to adopt it in the third working draft.

If I rebuild the tool chain for the XSD 1.1 spec to use XProc, for example, that would be interesting, the re-engineering would probably give us a cleaner tool chain, and it might provide useful feedback for the XProc working group. But when the XProc working group changes its mind about something, and the implementation I’m using changes with it, then my tool chain breaks, and not necessarily at a time when it’s convenient to spend time rebuilding it. (Long ago, my brother was trying to persuade me I should be interested in personal computers, which I regarded as toys compared to the mainframe I did my work on. Nothing he said made any dent until he said “Having a personal computer means you get to upgrade your software when you want to, not when the computer center finds it convenient.” That sold me; we bought a personal computer as soon after that as we could afford one.)

Is there a way to manage the tension between a desire to use the tools one is building and the need for the tool chains one uses in production work to be stable?

I don’t know; but I hope, in the coming months, to find out.

Caspar

[8 April 2008]

I had another bad dream last night. Enrique came back.

“I don’t think you liked Cheney very much. Or Eric van der Vlist, either, judging from his comment. So I’ve written another conforming XSD 1.0 processor, in a different vein.” He handed me a piece of paper which read:

#!/bin/sh
echo "Input not accepted; unknown format."

“This is Caspar,” he said.

“Caspar as in Weinberger?”

“Hauser. As you can see, Caspar doesn’t have the security features of Cheney, but it’s also conforming.”

I should point out for readers who have not encountered the figure of Kaspar Hauser that he was a mysterious young man found in Nuremberg in the 1820s, raised allegedly without language or human contact, who never quite found a way to fit into society, and is thus a convenient focal point for meditations on language and civilization by artists as diverse as Werner Herzog, Paul Verlaine, and Elizabeth Swados.

“Conforming? I don’t think I want to know why you think so.”

“Oh, sure you do. Don’t be such a wet blanket.”

“I gather you’re going to tell me anyway. OK, if there’s no way out of it, let’s get this over with. Why do you think Caspar is a conforming XSD processor?”

“I’m not required to accept XML, right? Because that, in the logic of the XSD 1.0 spec, would impede my freedom to accept input from a DOM or from a sequence of SAX events. And I’m not required to accept any other specific representation of the infoset. And I’m also not required to document the formats I do accept.”

“There don’t seem to be any conditionals here: it looks at first glance as if Caspar doesn’t support any input formats.” (Enrique, characteristically sloppy, seems to have thought that Hauser was completely alingual and thus could not understand anything at all. That’s not so. But I didn’t think it was worth my time, even during a bad dream, to argue with Enrique over the name of his program.)

“Right. Caspar understands no input formats at all. I love XSD 1.0; I could write conforming XSD 1.0 processors all day; I don’t understand why some people find it difficult!”