[7 April 2009]
Bob Sutor asks, in a blog post this morning, some questions about government funding and open source software. Since some of them, at least, are questions I have thought about for a while as a reviewer for the National Endowment for the Humanities and other funding agencies, I think I’ll take a shot at answering them. To increase the opportunity for independent thought to occur, I’ll answer them before I read Bob Sutor’s take on them; if we turn out to agree or disagree in ways that require comment, that can be separate.
- When a government provides funding to a research project, should any software created in the project be released under an open source license?
In practice, I think it almost always should, but in theory I recognize the possibility of cases in which it needn’t.
When I review a funding proposal, I ask (among other things): what is the quid pro quo? The people of the country fund this proposal; what are they buying with that money? A reliable survey of the work of Ramon Llull and its relevance to today? Sounds good (assuming I think the applicant is actually in a position to produce a reliable survey, and the cost is not exorbitant). A better tool for finding and studying emblem books? Insight into methods of performing some important task? (Digitizing cultural artefacts, archiving digital research results for posterity, creating reliable language corpora, handling standoff annotation, … there are a whole lot of tasks it would be good to know better how to do.) How interesting is what we would be learning about? How much are we likely to learn?
My emphasis on what we get for the money sometimes leads other reviewers or panelists to regard me as cold and mean-hearted, insufficiently concerned with encouraging a movement here, nurturing a career there. But I have noticed that the smartest and most attractive members of panels I’ve been on are almost always even tougher graders than I am. When funds are as tight as they typically are, you really do need to put them where they will do the most good.
If the value proposition of the funding proposal is “we’ll develop this cool software”, then as a reviewer I want the public to own that software. Otherwise, what did we buy with that money?
If the value proposition is “we’ll develop these cool algorithms and techniques, and write them up, so the community now has better knowledge of how to do XYZ — oh, and along the way we will write this software, it’s necessary to the plan but it’s not what this grant is buying”, then I don’t think I want to insist on open-sourcing the software. But it does make it harder for the applicant to make the case that the results will be worth the money.
Stipulating that software produced in a project will be open-source does usually help persuade me that its benefit will be greater and more permanent. If the primary deliverable I care about is insight, or an algorithm, open-sourcing the software may not be essential. But it helps guarantee that there will be a mercilessly complete account of the algorithm with all details. (It does have the potential danger, though, that it may allow other reviewers or the applicants to believe that the source code provides an adequate account of the algorithm and there is no need for a well-written paper or series of papers on the technical problem. I am told that some programmers write source code so clear and beautiful that it might suffice as a description of the algorithm. I say, if writing documentation as well as source code is good enough for Donald Knuth, it’s good enough for the rest of us.)
On the other hand, I don’t think deciding not to open-source the software is necessarily an insuperable barrier. The question is: what value is the nation or the world going to get from this funding? Sometimes the value clearly lies with the software people are proposing to develop, sometimes it clearly lies elsewhere and the software plays a purely subordinate, if essential, role. (But although I admit this in principle, I am not sure that in practice I have ever liked a proposal that proposed to spend a lot of effort on software but not to make it generally available. So maybe my generosity toward non-open-source projects is a purely theoretical quantity, not observable in practice.)
If software is involved, you also have to ask yourself as a reviewer how well it is likely to be engineered and whether the release of the software will serve the greater good, or whether it will act like a laboratory escape, not providing good value but inhibiting the devotion of resources to creating better software.
The chances and consequences of suboptimal engineering vary, of course, with whether the research in question is focused specifically on computer science and software engineering, or on an application domain, in which case there is a long and often proud history of good science being performed with software that would make any self-respecting software engineer gag. (A long time ago, I worked at a well-known university where the music department burned more CPU cycles on the university mainframe than any other department. Partly this was because Physics had its own machines, of course, and partly it was because the music people were doing some really, really cool and interesting stuff. But was it also partly because they were lousy programmers who ran the worst-optimized code east of the Mississippi? I never found out.)
- Does this change if commercial companies are involved? How?
If the work is being done by a commercial company, they are historically perhaps less likely to want to make the software they develop open-source. That’s one way the process is affected.
But also, if a government agency is contracting with a commercial organization to develop some software, there may be a higher chance that the agency wants some software for particular parties to use, and the main benefit to be gained is the availability to those parties of the software involved. In some cases, the benefit may be the existence of commercially viable organizations willing and able to support software of a particular class and develop it further.
There are plenty of examples of commercial codebases developed in close consultation with an initial client or with a small group of initial clients. The developer gets money with which to do the development; the initial clients get to help shape the product and ensure that at least one commercial product on the market meets their needs. In the cases I have heard of, the clients don’t typically turn around and demand that the codebase be open-source.
It’s not clear to me that government funding agencies should be barred from acting as clients in scenarios like this. This kind of arrangement isn’t precisely what I tend to think of as “research”, but whether it’s appropriate or not in a given research program really depends on the terms of reference of that program, and not on what counts as research in the institutions that trained me.
I have been told on what I think is good authority that if it had not been for contracts let by various defence agencies, the original crop of SGML software might never have been commercially viable. (And since it was that crop of software that demonstrated the utility of descriptive markup and made XML possible, I wouldn’t like to try to outlaw whatever practices led to that software being developed.)
- Does this change if academic institutions are involved? How?
I don’t think so.
- How should the open source license be chosen? Who gets to decide?
Two umbrellas and a prime number.
I think I mean “Huh? Is this a trick question?”
To the extent that we think of funded research as the purchase (on spec) of certain research products we hope the funding will produce, the funding agency can certainly say “We want the … license”. And then the Golden Rule of Arts and Sciences applies. Or the people writing the proposal can say “We want to use the … license; take it or leave it.” And the funding agency, guided by its reviewers and panelists and staff and the native wit of those responsible for the final decision, will either leave it or take it.
The only thing that would make me more suspicious and worried than this chaotic back and forth would be an attempt to make an iron-clad rule to cover all cases, for all projects, for all governmental funding agencies.