Team work, specialization, fact-checking

[26 January 2012]

This week’s New Yorker has an interesting essay on brainstorming (doesn’t work, it says). It brought my evil twin Enrique running, waving his copy in the air. “Look at this. Look at this!” he shouted.

I looked at the passage he pointed out. Pursuing the observation that “like it or not, human creativity has increasingly become a group process”, the author quotes one Ben Jones, a professor at the Kellogg School of Management at Northwestern University, who has quantified the trend away from solo work and towards work in teams.

“‘A hundred years ago, the Wright brothers could build an airplane all by themselves,’ Jones says. ‘Now Boeing needs hundreds of engineers just to design and produce the engines.’”

“Well,” I said to Enrique, “no question that teams are bigger today.” “But …” he spluttered. “But what?” I said. “But Boeing doesn’t make engines.” “They don’t?” (I love to play dumb; it drives Enrique speechless with frustration. But he seems to be right. If I’m reading their Web site correctly, Boeing hasn’t manufactured an engine since 1968, and those weren’t aircraft engines in any case.) “But what makes the airplane go, then?” “GE makes engines,” Enrique snarled. “Rolls-Royce makes engines. Pratt and Whitney makes engines. Boeing makes airframes” (along with many other things, I hasten to add, none of them engines). “How can someone be interested in specialization and not know that?”

Didn’t the New Yorker use to have a fact-checking department?

Copyright and other unwelcome issues

[10 November 2011]

One of the unwelcome side effects of recent trends in copyright (I mean the gradual shift, over the last fifty years, towards more and more protection for commercial interests and less and less protection of the public benefit) is that while it used to be easy to make one’s own work readily available for reuse by others, it now requires more careful planning. It used to be, for example, that if you didn’t care to claim or protect copyright in something you wrote, all you had to do was nothing: if you didn’t claim copyright, and the work was public, then it was in the public domain.

[“Hmm. How sure of that are you?” asked my evil twin Enrique, with a suspicious look. “Well, kind of sort of sure, I think.” “Better add a disclaimer, then, don’t you think?” “OK, right you are.”]

At least, that’s how I understand it, at a first approximation. (I am not a lawyer and have never much wanted to be, though a friend of mine who did go to law school once told me I’d enjoy the mysteries and mystique of tax law. So no reader should take anything I write as providing guidance about the law of the U.S. or any other country.) If you want good information about copyright, go find something by Pamela Samuelson.

[“OK, that’ll do, I guess. Is Pamela Samuelson really a good source?” “The best. Thank heavens she’s writing that column for Communications of the ACM again.”]

Nowadays, of course, in the U.S. this is no longer true: from the moment you write anything down, you own copyright in it, whether you want it or not, unless you do something to avoid it.

Of course, many people ignore this and behave as if the old legal regime were still in place. I’ve had representatives of U.S. universities say “Oh, feel free to reuse that stylesheet we wrote” — as if, because it carried no copyright statement, it were available for reuse by anyone interested. On the contrary! Since the stylesheet didn’t carry any licensing information or dedication to the public domain, it was certainly copyright either by the individual who wrote it or by the institution for which it was written. And since it didn’t carry any copyright information, it was impossible to know with any confidence who actually did (or does) own the copyright, and whom to contact for permission.

[“And these people who were trying to give you permission to reuse that stylesheet, were they legally empowered to enter into binding agreements on behalf of their institutions?” Enrique asked. “Dunno. I doubt it.” “So, tell me, do you have a barge pole handy?” “A barge pole? No, why?” “Because I want to warn you not to touch that code with a barge pole, that’s why. And if you don’t have a barge pole, it just feels kind of pointless. Do you have an eleven-foot pole, maybe?” “Oh, hush.”]

End result: I politely ignored (at least, I hope my silence was polite) their invitation to reuse that code, and I wrote new code from scratch.

[“Oh, come on,” Enrique hissed. “You know perfectly well you wouldn’t have reused that code anyway! It was full of xsl:for-each elements.” (New readers may need to be informed that I seem to have an issue with xsl:for-each elements; I’m sure it’s a perfectly fine construct and there’s nothing wrong with it. I only know that when I have a stylesheet with a bug and discover that it has for-each constructs, rewriting the for-each as an apply-templates always seems to make the bug disappear. Go figure.) “Well, yeah. But even if I had loved the code, I would not have felt able to reuse it.”]

Of course, there are plenty of open-source and Creative Commons licenses to choose from, if you want to ensure that work you do can be re-used.

But who, in a collaborative project, is “you”?

If you write code or prose as an individual, outside the course and scope of your normal employment duties, then it’s straightforward to assert copyright in your own name. But if you are collaborating with others in a project, and you want to apply an appropriate license, in whose name should copyright be claimed? If only one person works on a given item (a program or a document) it’s easy to say that person should assert copyright and grant the license. But if more than one person works on it?

Some people incline to claim copyright in the name of the project, which feels plausible at some level: project is a name we sometimes give to the intentional collaboration of individuals to achieve some goal, and work done in furtherance of that goal can plausibly be said to be done for “the project”.

But can a project which is not a legal entity actually be the owner of a copyright? If there’s a legal entity involved, it’s possible in principle to figure out, in case of disputes, who speaks for the entity and who makes decisions. But if there’s no legal entity?

Can copyright usefully be claimed by a research project, in the name of the research project?

[“Well, wouldn’t a research project be legally a form of partnership?” asked Enrique. “A partnership doesn’t have to be incorporated to be a legal person, right?” “Maybe,” I equivocated. “But remember, I am not a lawyer. And a fortiori you, as a figment of my imagination, are also not a lawyer.” “Oh, go soak your head. Whom are you calling a figment … ?”]

I notice that W3C, for example, which is not a legal entity, claims copyright in the name of W3C, but immediately after adds, in parentheses, the names of the three host institutions of W3C, which are legal entities.

It would be nice, wouldn’t it, if intellectual property rights served to promote the useful arts and sciences, instead of being an unproductive drain on the time and effort of creative people and a barrier to normal intellectual work? Oh, well, maybe someday.

Day of the dead 2011

[7 November 2011]

Last week’s celebration of the Day of the Dead (aka All Souls’ Day, 2 November) was a little more thoughtful for me than it is in some years. Partly this was because John McCarthy had just died, and partly because this year seems to have taken an unusually high toll in people whose work I have had occasion to value.

News of McCarthy’s death came through when I was on the phone with John Cowan and my brother Roger Sperberg. We paused for a few moments, and then we spent half an hour thinking about technical topics, which seemed like a good way to mark the occasion. (For example: if the original plan was for Lisp programs to be written not in S-expressions but in an Algol-like syntax called M-expressions, is that a sign that McCarthy was less far-sighted than he might have been? How could he not have seen the importance of the idea that Lisp data and Lisp programs should use the same primitive data structures? Perhaps he had feet of clay, so to speak? Or on the contrary should we infer, from the fact that the plan for M-expressions was abandoned and that Lisp became what it became, that McCarthy was astute enough to recognize great ideas when he saw them, and nimble enough to change his plans to capture them? On the whole, I guess I lean toward the latter view.)

This year, Father Roberto Busa also died. Many people (including me) regard him as the founder of the field of digital humanities, because of his work, beginning in 1948, on a machine-readable text of the work of Thomas Aquinas. The Index Thomisticus was completed in 1978, several IT revolutions later. Busa, too, was astute enough to adjust his plans in mid-project: his initial plans involved clever use of punched cards and sorters, and it was only after the project had been going for some years that it began to use computers instead of unit-record equipment. I met Busa only briefly, once as a young man at my first job in humanities computing, and once years later when I chaired the committee which voted to award him what became the Busa Award for contributions to the application of information technology to humanistic scholarship. But he made a strong impression on me with his sweetness of temper and his intelligence. He made an even stronger impression on me indirectly: Antonio Zampolli worked with Busa as a student. And without Antonio, I think my life would have had a rather different shape.

Oh, well. Nobody gets out of here alive, anyway.

Another reason to use the microphone

[Hamburg, 29 September 2011]

Every now and then conference speakers want to avoid using a microphone; they dislike the introduction of technology into the speaker/audience relation, perhaps, and sometimes they are so confident of their ability to be heard in the room that any suggestion that they might use a mike is almost an affront to their lung power. (Is this last class of speaker always male? Well, usually, I think.)

I have been told on good authority that users of hearing aids benefit a good deal from amplification of the speaker’s voice; that’s a good reason to use the microphone.

But sitting here listening to a very interesting speaker who is completely ignoring the microphone, I am reminded of a different reason: for purposes of speaker amplification, non-native speakers are effectively hard of hearing. When the speaker strays into range of the podium’s microphone and happens to be facing the audience, I can understand every word he says; when he faces away from the audience or wanders over to the side of the room, I am missing at least every fifth word, which makes the talk into a kind of aural cloze test. That’s OK for me (I pass the test, more or less, though I missed that nice joke everyone else laughed at). But for my neighbor (for whom German is not a second but a fourth or fifth language), the experience is clearly a real trial.

If you are attending an international conference and want to be understood by people who are not native speakers of your language, then there is a simple piece of advice:

Use the microphone.

Enough said.

XQuery in the cloud

[10 August 2011]

Recently I had occasion to build a small web application (feedback forms for the Balisage conference) using XForms. I used XForms since XForms delivers the information from the user in an XML document, which makes it easier for me to work with the data later. As an experiment, I developed the app using Sausalito, the XQuery engine in the cloud developed by 28msec. Quick summary: Cool! Thumbs UP!

[Obligatory hand-waving and disclaimer: Sausalito is not the only way to deploy XQuery in the cloud: MarkLogic has defined Amazon machine instances with MarkLogic Server pre-installed, and I’m sure there are, or will be, other options as well. I will continue to make a point of working with as many different XQuery implementations as I can, just to know what’s out there. But I had a lot of fun with Sausalito, and if you have a use for a Web-based XML application, Sausalito is definitely worth a look.]

The basic structure of a Sausalito project is fairly straightforward, and well documented on their site: the URIs you want to serve are matched either against static resources in a public subdirectory of the project, or against a directory of XQuery modules containing handlers for requests. For example, in the Balisage feedback application, the URI /reviews/single is handled by the single() function in the module reviews.xq; it can call library functions defined elsewhere. Sausalito has all the functions usual in XQuery, and also some fairly extensive libraries of things you may want for web applications (to query aspects of the incoming HTTP request, for example, and to set properties in the response). They have an Eclipse-based IDE that’s reasonably nice (though I still missed Emacs from time to time), and also a command-line interface (so I can shift to that and use Emacs, if I want to).
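To make that mapping concrete, here is a minimal sketch of the routing convention as I understand it. It is written in Python so that it can be run without Sausalito, and it is an illustration, not Sausalito's actual implementation. The names resolve and Route are hypothetical, and so are two assumptions baked into the code: that static resources are checked before handlers, and that a handler URI always has exactly two segments (module, then function).

```python
from typing import NamedTuple, Set, Union


class Route(NamedTuple):
    """Hypothetical record naming the XQuery handler for a request."""
    module: str    # file name of the XQuery module, e.g. "reviews.xq"
    function: str  # local name of the handler function, e.g. "single"


def resolve(uri_path: str, static_files: Set[str]) -> Union[str, Route]:
    """Return the static file to serve, or the handler for the request.

    Assumption: static resources win over handlers. The source text only
    says a URI is matched against one or the other.
    """
    if uri_path in static_files:
        return uri_path
    # Drop empty segments produced by the leading slash.
    parts = [p for p in uri_path.split("/") if p]
    # Assumption: handler URIs look like /<module>/<function>.
    if len(parts) != 2:
        raise LookupError(f"no handler for {uri_path!r}")
    module, function = parts
    return Route(f"{module}.xq", function)


if __name__ == "__main__":
    print(resolve("/reviews/single", {"/feedback.css"}))
    # prints: Route(module='reviews.xq', function='single')
```

With this convention, adding a new page to the application means adding a function to a module; there is no separate routing table to keep in sync.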

Unsurprisingly, I found it very pleasant to be able to write the core of the application in XQuery, with no JavaScript, returning XML to the browser and using XSLT and CSS to render it there. What did surprise me a little, because I had not expected it, was the exhilarating speed with which I was able to move from idea to deployed application. I’ve deployed XForms applications on the Web before, and I have an eight-point checklist for setting up a WebDAV server using Subversion and Apache. It’s not particularly difficult or strenuous, but it’s tedious and takes a few hours each time I have to do it. And developing the checklist was very painful; it took a long time to find configurations that worked for me, in the environment provided by my service providers.

The developer configures a collection of documents in Sausalito by declaring the collection:

declare ordered collection my:docs as node()*;

Then they deploy the application. And it’s … just … there. Instant gratification, or as close to instant as your network latency and bandwidth will allow.

As I wrote to the developers at 28msec:

I’m … very taken with the convenience of deploying to the cloud; having an XML database on demand is a lot like having running water on demand — those who have never had it may think it’s a luxury anyone should be able to live without, but once you’ve had it, it can be hard to go back.