Why …

Why do so many proponents of new technologies spend so much time misrepresenting existing technologies and spreading misinformation about them?

Do they misrepresent the existing technologies in order to make the new technology they are selling look better?

Or did they get involved in the new technology only because they failed to understand the existing technology and how to use it well?

Daniel Boone meets the consistent Web

[22 July 2008]

My colleague Thomas Roessler writes:

[The monotonic semantics of RDF] guarantee that you won’t run into a world of inconsistency when you discover additional information, and they also guarantee that you can learn things about the world piece by piece.

My evil twin Enrique responds: So let us start with the information that “The individual denoted by http://www.w3.org/People/cmsmcq/2008/ns1#joe is identical to the individual named http://www.w3.org/People/cmsmcq/2008/ns2#Josephus”, which I assume I can express using some predicate like the OWL sameAs.

And now let us discover additional information in another triple store, which contains the information that “The individual denoted by http://www.w3.org/People/cmsmcq/2008/ns1#joe is distinct from the individual named http://www.w3.org/People/cmsmcq/2008/ns2#Josephus”, which it expresses using some predicate like the OWL differentFrom.

I’m having trouble understanding (concludes Enrique) how we can do this without either running into a world of inconsistency (a small world, perhaps, bounded in a nutshell, but still a world big enough for joe and Josephus to be both the same and different), or else running into a world in which we find that “inconsistency” has been defined to have a highly technical meaning under which the two triples just described are not actually inconsistent in the technical sense (why do I expect someone to start lecturing me about Herbrand models any moment now?), even though any application relying on the usual notions of identity and difference may find itself at a loss as to what to make of seeing them both in the same graph.
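Enrique's scenario can be made concrete with a small sketch. It assumes triples are plain Python tuples and uses a hypothetical helper, `identity_clashes`, that only looks for pairs directly asserted to be both the same and different. It ignores the transitivity of sameAs and everything else a real OWL reasoner would do, so it illustrates the complaint rather than implementing RDF or OWL semantics.

```python
# Hypothetical, minimal triple representation: (subject, predicate, object).
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DIFFERENT_FROM = "http://www.w3.org/2002/07/owl#differentFrom"

JOE = "http://www.w3.org/People/cmsmcq/2008/ns1#joe"
JOSEPHUS = "http://www.w3.org/People/cmsmcq/2008/ns2#Josephus"


def identity_clashes(triples):
    """Return the unordered pairs asserted to be both the same and different.

    Only direct assertions are compared; no inference is attempted.
    """
    same = {frozenset((s, o)) for s, p, o in triples if p == SAME_AS}
    different = {frozenset((s, o)) for s, p, o in triples if p == DIFFERENT_FROM}
    return same & different


# Two stores, each unremarkable on its own.
store_a = {(JOE, SAME_AS, JOSEPHUS)}
store_b = {(JOE, DIFFERENT_FROM, JOSEPHUS)}

# Merging is just set union: nothing is retracted, information only grows.
merged = store_a | store_b
clashes = identity_clashes(merged)
```

Neither store contains a clash by itself; the merged graph contains exactly one, for the pair joe and Josephus.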

I reminded Enrique of the American pioneer Daniel Boone, who proudly claimed that he had never been lost in his life. Never? Never. [Pause.] “But I was a mite bewildered once for three days.” [Rimshot.]

Balisage offers hope for the deadline-challenged

In my never-ending quest to help those who, like myself, never get around to things until the deadline is breathing down their necks, I have until now avoided mentioning that Balisage, the conference on markup theory and practice, has issued a call for late-breaking news.

The deadline for late-breaking submissions is 13 June 2008. It is now officially breathing down your neck.

There, will that do the job?

The 13th of June is this Friday. Just enough time to write up that great piece of work you just did, but not long enough to make a huge big thing of it and get all worked up in knots.

Balisage is an annual conference on markup theory and practice, held in early August each year in Montréal. Well, I say annual, but strictly speaking this is Balisage’s first year. The organizers have in the past been involved in other conferences in Montréal in August (most recently Extreme Markup Languages), and we regard Balisage as the natural continuation. So if you have always wanted to go to the Extreme Markup Languages conference, and are disappointed to see no announcements this year for that conference, come to Balisage. I think you’ll find what you’re looking for.

The full call for late-breaking news, and details of how to make a submission, are at http://www.balisage.net/latebreaking-call.html.

XML catalogs vs local caching proxy

[21 May 2008]

I have a senior colleague who has maintained, for several years, that SGML and XML catalogs are a deplorable special-case hack for a problem that should be solved by the more general means of HTTP caches. (Most recently, he was arguing against a proposal that the W3C distribute convenient packages of our most frequently used DTDs and schemas, with a catalog to make them easy to use. How someone so smart can be so deeply wrong-headed, I’m not sure.)

So when I had a network outage the other day that made it hard to get any work done, I thought about setting up a local caching proxy. Why did the outage make it hard to get anything done? Because I do use some software that doesn’t support catalogs, and which reacts to network outages by imposing a thirty-second delay for each DTD fetch (while its network request times out) and then proceeds anyway, without the DTD. Since it does proceed eventually, I can in fact build a new HTML version of the XSD spec (for example); it’s only that the process becomes painfully slow (or rather, even slower and more painful than usual).

But, I thought, the systems guys assure me that it’s not really hard for a user (not the system administrator, just a user) to set up a local caching proxy. So I’ll give it a try.

The upshot so far is: yes, it’s possible, though I wouldn’t call it easy. And managing catalogs still seems an order of magnitude easier and more straightforward. Here’s what I’ve done so far:

1 Apache ships with Mac OS X, it’s already running on my system (I use a local CGI script to log where my time goes), and mod_proxy enables it to serve as a local caching proxy. So I decided to try that, instead of installing squid or something similar. I found instructions for configuring Apache as a local caching proxy on a Mac OS X site; they worked (although they suggest commenting out the line “Deny all”, in the mistaken belief that otherwise nothing works). I followed the author’s advice and blocked a couple of random sites I can live without, in order to be able to request them and tell, from the resulting failure message, that the proxy service was working.

2 In System Preferences / Network / Airport / Proxies, I told the system to use http://localhost:80 as a proxy for HTTP requests.

I had illusions that this was it. At the system level (I fantasized), outgoing HTTP requests would be re-routed to the local Apache.

Ha.

This does suffice for Safari, and possibly for other Apple software (I don’t know, haven’t looked, don’t much care right now). But Firefox must be told separately about the proxy server. And Opera.

And the command-line tools that were the main reason I wanted a caching proxy in the first place? RXP, libxml, Saxon, and so on? Nope, not using the proxy.

3 After some disappointing experiences with the documentation for the tools I’m using (none of the documentation I found says anything at all about how to tell the software to use a proxy server), I learned from oblique references somewhere that setting the environment variable http_proxy works for some Unix tools.

So I tried export http_proxy=http://localhost:80 and curl, at least, started using the proxy server. libxml (and thus xmllint and xsltproc) also started using it, I think, or trying to, but the main symptom of this success was that they started emitting error messages informing me helpfully that

error : Operation in progress

When I stopped Apache, that message went away. When I unset the http_proxy environment variable, it also went away (whether Apache was running or not).

4 Along about this time I decided just to make libxml use my local catalogs. This turned out to be harder than I thought: setting XML_CATALOG_FILES=/Library/SGML/Public/catalog.xml elicited only the laconic message from xsltproc, xmllint, xmlcatalog, and anything else that uses libxml: /Library/SGML/Public/Misc/catalog.xml:0: Catalog error : File /Library/SGML/Public/Misc/catalog.xml is not an XML Catalog.

But of course it is an XML catalog. I can see that.

I validated it, just to make sure, using both xmllint and rxp. No problems.

5 Eventually, it became clear that libxml wanted an explicit namespace declaration in the root element. (I had been relying on the default value given in the DTD.) So <catalog> had to become <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"> in all my XML catalogs. (DV, are you listening? The Namespaces Rec is quite explicit that namespace declarations may be defaulted by the DTD. Otherwise I never would have voted for it. RXP gets it right; thank you, Richard!)

6 Eventually, the sick minds of Liam Quin and John Snelson suggested that perhaps I should try a different value for http_proxy: instead of http://localhost:80 I should try export http_proxy=http://127.0.0.1:80/. This eliminated the “error : Operation in progress” messages.
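The namespace problem in step 5 can be checked without libxml. The following is a minimal sketch in Python: the helper name `is_xml_catalog` and the two sample catalogs are assumptions made for illustration, and the check looks only at the namespace written in the document itself, which is all that a parser sees when it does not read the DTD.

```python
import xml.etree.ElementTree as ET

# Namespace name that the catalog root element must carry (from step 5).
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def is_xml_catalog(xml_text):
    """True if the root element is <catalog> in the OASIS catalog namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced names as "{namespace}localname".
    return root.tag == "{%s}catalog" % CATALOG_NS


# Hypothetical catalog contents, for illustration only.
implicit = (
    '<catalog>'
    '<system systemId="http://example.org/x.dtd" uri="x.dtd"/>'
    '</catalog>'
)
explicit = (
    '<catalog xmlns="%s">'
    '<system systemId="http://example.org/x.dtd" uri="x.dtd"/>'
    '</catalog>' % CATALOG_NS
)
```

Here `is_xml_catalog(implicit)` is false and `is_xml_catalog(explicit)` is true, which matches the symptom described: the same catalog is rejected until the declaration appears on the root element.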

So I now have a local caching proxy working, and some of my tools, at least, are using it when they don’t find what they need using catalogs. I’ll assume that this is a Good Thing. But nothing I’ve seen so far tells me how to configure Apache (or squid, or any other proxy) the way I want to. I want a convenient list of the resources in the local cache, and I want to be able to mark some of them (e.g. the DTDs and the W3C stylesheets I use most often) as “Never ever delete this; ALWAYS have a copy handy; check every few months to see if it needs updating.” From the documentation of Apache and of Squid, I am inclined to believe this is not actually possible. At the very least, it’s not obvious. By default, Apache’s mod_proxy appears to plan to delete everything after 24 hours regardless of its expiration date. And the default size of the cache appears (can this possibly be?!) to be 5 KB.

So far, then, the caching proxy does not give me the guarantees I want, about always having the resources I care about available, network or no network.

For catalogs, on the other hand, it would be nice to have some software that would augment the catalog with information about when a particular copy of the resource was fetched, when it was last modified, what its expiration date is (if the server provides one; surprising how few Web servers actually provide useful expiration information), and would check the Web periodically (say, once a month or so) to see whether any of my local copies of Web resources should be updated.
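The bookkeeping such a tool would need is small. What follows is a sketch under stated assumptions, not a description of any existing software: the record type `CatalogEntry`, the function `needs_recheck`, the local path, and the thirty-day default are all hypothetical. It records when a copy was fetched and what the server said about it, and decides whether it is time to look at the Web again.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CatalogEntry:
    """Hypothetical bookkeeping record for one locally cached Web resource."""
    uri: str                                  # original Web address
    local_path: str                           # where the local copy lives
    fetched: datetime                         # when this copy was retrieved
    last_modified: Optional[datetime] = None  # as reported by the server, if at all
    expires: Optional[datetime] = None        # as reported by the server, if at all


def needs_recheck(entry, now, interval=timedelta(days=30)):
    """True if the copy has expired or was fetched at least `interval` ago."""
    if entry.expires is not None and now >= entry.expires:
        return True
    return now - entry.fetched >= interval


# Example record; the local path is made up for illustration.
entry = CatalogEntry(
    uri="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    local_path="/Library/SGML/Public/W3C/xhtml1-strict.dtd",
    fetched=datetime(2008, 5, 1),
)
```

Unlike a cache, nothing here ever deletes the local copy; a stale entry is only a prompt to go and compare it with the original the next time the network is available.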

My interim conclusion is: both catalogs and HTTP caches could use improvement. As a way to ensure that the work I want to do can proceed without the network, however, catalogs are a lot more convenient, straightforward, and functional.

A kind of hommage

[12 March 2008]

I’m traveling to Germany, so I had an overnight flight last night. A long time ago I swore off red-eye flights from California to the East Coast, but for travel to Europe there seem to be no alternatives. Arriving in Amsterdam and then taking the train further seemed like a civilized way to organize things: it doesn’t make that much difference in the final arrival time, and trains are almost always more comfortable to travel in than airplanes.

I forgot to ask whether there were direct connections. There aren’t.

I had dimly imagined climbing into a train, finding a seat, and traveling further in a half sleep for a few hours before being deposited at my destination. The wait in Amsterdam seemed normal enough; I like Schiphol, which feels like one of the more civilized of the European airports I travel through. But the change in Utrecht was a bit of a strain, and by noon and my second change of trains in Duisburg I had begun to feel as if I were in a surrealist movie.

Somewhere along the way, by a process that seems to have resembled automatic writing, the following account of an encounter with my evil twin appeared in my notebook.

My evil twin dropped by again the other day. He was not happy that in a couple of recent posts I had given him the pseudonym “Skippy”.

“Skippy!? What kind of name is that for an evil twin? I want a new name.”

“Look, I’m sorry if you didn’t like it. I was in kind of a hurry and it was the name that came to mind.”

“‘Skippy!’” He sulked a little more. “It sounds like the kind of evil twin George H. W. Bush would have.”

“Well, exactly,” I said. “It’s a reference to Garry Trudeau’s strips about Bush 41. Think of it as a kind of hommage.”

“Hommage, hell, you just got lazy. I don’t want to be Skippy, even as a cover name. Call me … Enrique.”

“Enrique.”

“Yes. Don’t you recognize it? It’s a reference to the Incredible Zambini Brothers.”

“Who? Sounds like an obscure San Francisco band.”

“No, you’re probably thinking of the Sons of Champlin. But it’s true, the Zambini brothers did have a cult classic once — The Incredible Zambini Brothers, All-Stars Again. It was a kind of combination jam session, cookbook, literary anthology, and football playbook. Riveting, really, if you have the right kind of chemical enhancements. So yeah, think of it as a kind of hommage.”

I have got to start getting more sleep when I do trans-Atlantic flights.