Now THAT’s a birthday present!

Eve Maler has marked the tenth anniversary of the XML spec by posting an online copy of the book she and Jeanne El Andaloussi wrote on vocabulary development: Developing SGML DTDs: From Text to Model to Markup.

There is a huge body of knowledge, craft, and/or art about document analysis, vocabulary design, and the use of markup in systems that went into the design of SGML and XML. (Some call it “the SGML methodology” as opposed to “SGML” or “the spec”.) Almost all of it circulates largely in oral tradition; Maler/El Andaloussi was for a long time the only, and is still one of the best, attempts to write it down.

Thank you for the birthday present, Eve!

Tim Bray on XML People

To mark the tenth anniversary of the XML Recommendation, Tim Bray has resurrected an account he wrote ten years ago of various people involved in the pre-history and creation of XML.

Well worth reading, whether you were there and are looking for an excuse to spend half an hour on nostalgia, or you weren’t there and wonder what it was like. Of course, there is no single “what it was like”: it was like different things from different vantage points. My memories of the initial development of XML are a lot longer on technical discussions and a lot shorter on memorable dinners with movers and shakers.

Another plug for XML Catalogs (and caching)

The W3C systems group posted a blog entry the other day about the caching of DTDs and schemas. The failure of some XML software to use caches wisely is causing unbelievable amounts of traffic on the W3C site: in some cases, the same IP address is requesting the same DTD file hundreds and thousands of times in the space of a few hours.

The blog has good pointers to resources about using HTTP caching well, and about XML Catalogs.

I’ve said it before, and I’ll say it again: every piece of software that works with XML ought to use XML Catalogs. By all means allow the user to turn it off, but support it, and turn it on by default. The main reason is: it makes the life of your users easier. And the kind of problem discussed by the systeam blog post is one more reason.
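
As a rough illustration of what catalog support buys the user, here is a minimal Python sketch that reads an OASIS XML Catalog and maps a public or system identifier to a local copy, so the DTD need not be fetched from the W3C site on every parse. The catalog contents, the local paths, and the function names (load_catalog, resolve) are assumptions made up for this example, not part of any library. It also simplifies the real resolution rules: it handles only public and system entries and ignores the prefer setting, rewrite rules, delegation, and nested catalogs.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog: the local paths are assumptions for illustration.
SAMPLE_CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
          uri="dtd/xhtml1-strict.dtd"/>
  <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
          uri="dtd/xhtml1-transitional.dtd"/>
</catalog>
"""


def load_catalog(catalog_xml):
    """Build (public_map, system_map) from the <public> and <system>
    entries of a catalog; all other entry types are ignored here."""
    root = ET.fromstring(catalog_xml)
    public_map = {}
    system_map = {}
    for entry in root.iter("{%s}public" % CATALOG_NS):
        public_map[entry.get("publicId")] = entry.get("uri")
    for entry in root.iter("{%s}system" % CATALOG_NS):
        system_map[entry.get("systemId")] = entry.get("uri")
    return public_map, system_map


def resolve(public_id, system_id, public_map, system_map):
    """Try a system-identifier match first, then a public-identifier
    match; otherwise fall back to the original system identifier,
    which in practice means going out to the network."""
    if system_id in system_map:
        return system_map[system_id]
    if public_id in public_map:
        return public_map[public_id]
    return system_id
```

A parser wired up this way would call resolve for each external identifier it meets and open the returned location; only identifiers missing from the catalog would ever produce network traffic.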

Devil ain’t got no place to play ’round here

[8 February 2008]

Idle hands, they say, are the Devil’s playground.

Poor Devil. No playground around here these days.

After the XML Schema working group buckled down at our face-to-face meeting the week before last, and worked through all of the Last-Call issues raised against the Structures spec, the WG took a week without a meeting, to allow the editors time to get some proposals done. My co-editor Sandy Gao, whose name will someday be in reference books as a cross reference from “Stakhanovite”, produced wording proposals to close 42 open issues against Structures. In the same time, I managed a proposal to resolve one Structures issue. (Or rather, to resolve it in part.) On the Datatypes side, we didn’t manage quite such a glorious record, but we did manage proposals for eleven issues.

OK, yes, sure, many of the issues we had proposals for were simple to fix: typos, small tweaks to wording, and so on. That’s one of the reasons for doing triage: to have a list of easy items and low-hanging fruit. But still, that’s a lotta issues to close. Drafting and reviewing other editors’ drafts occupied most of my time last week, and this week most of my Schema time went to the mechanical task of just getting the proposals shipped to the working group.

I could have made lots of klog entries, but they would all have read

  • 2008-01-30: finished draft for bug 2947.
  • 2008-01-30: finished draft for bug 3265.
  • 2008-01-30: finished draft for bug 3256.
  • 2008-01-30: finished draft for bug 4839.
  • 2008-01-30: finished draft for bug 4089.

Today the working group adopted almost all of the proposals; they bounced one back and accepted the wording for one issue as a “partial” resolution, with the request that the editors try to revise it one more time, to get it a little better before we close the issue entirely. But we closed a lot of them.

So since this morning’s WG call, I have been generating new copies of the status-quo documents and updating Bugzilla records.

This is not purely mechanical work, but it’s mechanical enough that I’ve had time to think, now and then, while waiting for the server, about the significance of the two tasks.

Keeping a document that shows the current status quo

One of the best decisions I made when I began to work as an editor on XSD 1.1 was the decision to attempt, always, to ensure that decisions made by the WG during a telcon were reflected before the next WG meeting — and preferably before close of business that same day — in a copy of the spec kept in a stable location. This has turned out, I think, to be a useful technique.

I didn’t always feel this way. In 1997 it irritated me profoundly that Dan Connolly used to complain whenever there were decisions the XML working group had made that he didn’t see reflected in the most recent draft of the XML spec he could find. But while I’m still not sure I think his reasons were sound, I have come to believe that his conclusion was correct. (Maybe I didn’t understand his reasons properly.)

  • It makes it easier for WG members to see where we are.

It makes it easier for other WGs, too, although other WGs seem unaccountably leery of looking at the current status quo document instead of at public working drafts on the /TR page. I suppose they feel justifiably that there can be such a thing as Too Much Information.

  • It means that WG decisions have visible effects immediately (or, as immediately as is compatible with my having to regenerate the spec and check in the new copies, which tends to take a few hours), which is good for morale.

I at least find it alienating and depressing to work hard making decisions in a WG and find, months later, that the editors have still not gotten around to making the text of the spec reflect those decisions. Experience also shows that if much time elapses between decision and revision of the spec, the editors end up forgetting what the WG decided, or conveniently rewriting it in their memory.

  • Constant re-publication (even if it’s only in the member-only portion of the site) also helps keep the document production system in good trim. Problems get found and fixed sooner, and debugging is simpler: fewer problems around means it’s less likely that one problem will be interacting with or masking another.
  • Constant re-publication (even if it’s only in the member-only portion of the site) also helps keep the document in good trim. If I link-check and validate the status quo regularly, publishing a working draft becomes a simple task instead of requiring herculean efforts to fix HTML- and link-validity errors.
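
To give a sense of how small a regular link check can be, here is a Python sketch that looks for one common kind of link-validity error: same-document links (href="#something") whose target does not exist in the generated HTML. It is only an illustration under my own assumptions, not the tooling actually used for the spec; the class and function names are invented, and it does not check external links or validate the HTML.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect possible link targets (id on any element, name on <a>)
    and the same-document fragment links that point at them."""

    def __init__(self):
        super().__init__()
        self.targets = set()
        self.fragment_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.targets.add(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.targets.add(attrs["name"])
        href = attrs.get("href")
        # Only links of the form "#fragment" are checked here.
        if tag == "a" and href and href.startswith("#") and len(href) > 1:
            self.fragment_links.append(href[1:])


def broken_fragment_links(html_text):
    """Return the sorted list of fragments that are linked to but
    never defined in the same document."""
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return sorted(
        {frag for frag in collector.fragment_links
         if frag not in collector.targets}
    )
```

Run after each regeneration of the status-quo document, a check like this reports dangling cross-references while the change that caused them is still fresh.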

We haven’t always managed to achieve the same-day goal; sometimes the spec gets a week, or two weeks, out of date. And on a couple of occasions when outside circumstances were unfavorable, the Structures spec got as much as four months out of date. Bringing the status-quo documents up to date after that kind of interval was a nightmare — a suitable punishment, perhaps, for letting them get so far out of date in the first place.

Bugzilla as issue tracking system

Keeping the issues list up to date has many of the same advantages, although as a WG we have not been nearly so good about that as about keeping the ‘current editors’ copy’ of the spec up to date. And of course, like many working groups we have trouble providing quick response. Most of the bugs we fixed today were reported months ago, some, well, longer ago than that. We need to do better about that; it would help keep the number of issues from climbing so high in the first place.

And if being slow to get back to the reader has a bad effect on the WG’s relation to the wider community, being slow to update the status of issues, once the WG acts on them, has an even worse effect on the WG’s relation with itself. When the issues list is out of date, you can’t look at any issue without the risk that you’re wasting time re-considering something the WG actually already decided once. Or, just as likely, you risk the WG saying “Wait, we don’t need to look at this one, we did it a few months ago” when what they remember is that it was discussed a few months ago, without remembering that it was not resolved a few months ago.

So I have come to believe it’s very helpful to get the issues list updated right away after a meeting. Or preferably during the meeting, but the XML Schema WG has never quite assimilated that idea.

For a long time, we maintained our issues list in XML, using a series of ad hoc vocabularies. Because they were in XML, it was possible to do a lot of nifty things with them; we had stylesheets to indicate status via foreground and background color, we had a good fit between the data and the information we want to maintain about issues, and so on.
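
To suggest the kind of thing XML made easy, here is a small Python sketch that reads an issues list and reports on status. The vocabulary (issues, issue, the id and status attributes, title) and the sample entries are invented for illustration; they are not the working group's actual ad hoc vocabularies, and our own processing was done with stylesheets rather than with code like this.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical issues list: element names, attributes, and entries
# are all assumptions made up for this example.
SAMPLE_ISSUES = """\
<issues>
  <issue id="ex-1" status="open"><title>Example typo</title></issue>
  <issue id="ex-2" status="resolved"><title>Example wording tweak</title></issue>
  <issue id="ex-3" status="open"><title>Example design question</title></issue>
</issues>
"""


def status_counts(issues_xml):
    """Count issues by the value of their status attribute."""
    root = ET.fromstring(issues_xml)
    return Counter(
        issue.get("status", "unknown") for issue in root.iter("issue")
    )


def open_issue_ids(issues_xml):
    """List the ids of issues still marked open, in document order."""
    root = ET.fromstring(issues_xml)
    return [
        issue.get("id")
        for issue in root.iter("issue")
        if issue.get("status") == "open"
    ]
```

The same status attribute that drives a count here is what a stylesheet would key on to choose foreground and background colors.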

But maintaining the issues list in an XML document on the server meant that only WG members with CVS access could update it. And keeping the list in a single XML document meant, in practice, that only one person at a time could (or maybe I mean would) do so. And that person invariably became a bottleneck.

Bugzilla is not well suited to the task of issue tracking for a working group developing a spec. It’s designed for tracking software bugs, not design issues or spec defect reports; its workflow doesn’t match what any of my WGs does; its terminology is sub-optimal; its notion of text has all the power and charm of ASCII email (which is particularly grating for WGs working with XML technology: we know how much better things can be with good markup!).

But it’s got a convenient Web interface and more than one WG member can be updating things at a time. And I have come to believe that those two facts trump all others. Someday I’ll get around to designing and implementing an XML-based issue tracking system for working groups; it will have a Web interface and suitable markup, and it will be fun to develop.

In the meantime, we use Bugzilla.

So it was with Bugzilla that I spent most of my afternoon, while make and XSLT and CVS checkins went on in the background.

Marking an issue resolved can take time. Since the status-quo version of the spec is not publicly accessible, whenever the comment came from someone outside the WG who doesn’t have member access to the W3C site it’s necessary to transcribe the wording we finally adopted into the bug record, so the originator of the comment actually has a way to tell whether we got it fixed or not. And if the bug applies both to XSD 1.0 and to XSD 1.1, our decision today resolves it only for the latter, so a new issue needs to be raised for 1.0, otherwise we’ll lose track of it.

Two pieces of advice, then, for those using Bugzilla to maintain issues lists:

  • Learn to use the “Change several bugs at once” feature.
  • Learn to use the “Clone this bug” feature.

Enough said.

There’s a final reason that marking issues resolved can be slow going. Because these are Last-Call comments, the working group is responsible for keeping an audit trail to show that all comments have been dealt with, and to document that each person who raised an issue has been asked explicitly whether the WG’s action on the issue has satisfied their concerns.

If they’re not satisfied, the director of the consortium reviews the question when the time comes to move the spec forward. Woe then to the working group who didn’t make an effort to satisfy those who commented on its drafts. If the chair, or the staff contact, can explain plausibly what point is at issue and what the WG has done to try to accommodate the reader who raised the issue and why it’s not possible to do anything more to resolve the comment without breaking something more important, then the director may well sustain the WG in its decision. I’ve seen that happen.

But when the working group did not in fact try very hard to resolve the comment, and all you have to say is “well, we didn’t want to do that”, or “no, we didn’t want to consider that”, then you’re in for a long afternoon of sharp, skeptical questions and there’s a real possibility that the spec will be sent back to the working group so that more time can be spent resolving the open points of dispute. I’ve seen that happen, too.

It’s hard for some WGs to remember, but the goal is not to achieve consensus within the working group. The goal is to achieve consensus within the Web community as a whole.

And that, in the end, is what the whole exercise is about.

Applescript, so close and yet so far

[2 February 2008]

There are lots of big things on my mind lately: papers due and overdue and long overdue, submissions deadlines coming up, and a long long list of things to fix in the XSD 1.1 spec.

But there are some little things that refuse to stop taking up time and energy.

Years ago, tired of the hassles of trying to synchronize desktop and laptop, I followed the example of my friend Willard McCarty and started using my laptop as my only machine. This has worked pretty well on the whole, though it has saddled me with heavier laptops than some of my friends carry and given me less disk space than I could have had on desktop machines bought for the same price.

But a key part of making this work is having an external keyboard to use at my desk. I use a wave-shaped keyboard from Logitech at my desk, and to make things work as I expect, I use the Mac System Preferences interface to switch the Option and Command keys when I’m using the external keyboard.

Unfortunately, when I’m using the Powerbook’s own keyboard, this system preference must be undone. And then when I return to my desk, I have to switch the keys again.

Changing the relevant keyboard settings takes seven or eight mouse clicks. That gets old. I’d like to automate it; can Applescript help? Yes, it can: the samples include at least one example of scripting a change to the system preferences.

So I spent some time the other day trying to script my task: one script to launch System Preferences, choose Keyboard and Mouse, choose Modifier Keys, switch Command and Option, choose OK, and quit; another to go the other way.

The documentation makes fairly clear that I need to know the names for buttons and subpanes and so on provided by the application, so I can tell Applescript which things to activate. But I seem to be missing a step; I can’t find anything that tells me what names System Preferences gives to its panes. There’s an Open Dictionary option in the Script editor, but the dictionary for System Preferences only tells me that it defines things called panes. It doesn’t tell me — or am I just missing something here? — what IDs those panes have, or how to find out.

At the moment, this task is out of time and is going back to the bottom of the to-do list. But every time I take my machine away from my desk, or bring it back, I’m reminded that I haven’t solved this one yet.