[18 August 2009]
Here’s a concrete example of the difference between the metadata-aware search we would like to have and the metadata-oblivious full-text search we mostly have today. I ran into it the other day at the Balisage 2009 conference in Montréal.
Try to find a video of the song “I don’t want to go to Toronto”, by a group called Radio Free Vestibule.
When I search video.google.com for “I don’t want to go to Toronto”, I get, in first place, a song called “I don’t want to go”, performed live in Toronto. When I put quotation marks around the title, it tells me nothing matches and shows me a video of Elvis Costello singing “I don’t want to go to Chelsea”.
It’s always good to have concrete examples, and I always like real ones better than made-up examples. (Real examples do often have a disconcerting habit of bringing in one complication after another and involving more than one problem, which is why good ones are so hard to find. But I don’t see many extraneous complications in this one.)
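For readers who want the distinction spelled out mechanically, here is a toy sketch in Python. It is not how any real video search engine works; the three records, the field names `title` and `description`, and both search functions are made up for illustration. It only shows why a search that treats every word alike cannot tell the song title apart from a different song that happens to have been performed in Toronto, while a search that knows which words belong to the title can.

```python
import re

# Hypothetical records; the field names and values are illustrative only.
videos = [
    {"title": "I Don't Want to Go", "description": "performed live in Toronto"},
    {"title": "I Don't Want to Go to Chelsea", "description": "Elvis Costello"},
    {"title": "I Don't Want to Go to Toronto", "description": "Radio Free Vestibule"},
]


def tokens(text):
    """Lower-case the text and split it into words (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def full_text_search(query, records):
    """Metadata-oblivious: score each record by how many distinct query
    words occur anywhere in it, ignoring which field they came from."""
    wanted = set(tokens(query))
    scored = []
    for record in records:
        words = set(tokens(record["title"] + " " + record["description"]))
        scored.append((len(wanted & words), record))
    # Highest score first; ties keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def title_search(query, records):
    """Metadata-aware: the query must match the title field, word for word."""
    return [r for r in records if tokens(r["title"]) == tokens(query)]


query = "I don't want to go to Toronto"
for score, record in full_text_search(query, videos):
    print(score, record["title"], "/", record["description"])
print([r["description"] for r in title_search(query, videos)])
```

In this toy data the full-text scorer gives the live-in-Toronto recording and the Radio Free Vestibule song the same score, because it has no way of knowing that “Toronto” is part of one title and merely a venue in the other. The title search returns only the song being asked for.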
[25 August 2009]
Data persistence is a crapshoot. Load the dice.
-Dorothea Salo, Equipment and data curation, 7 August 2009 (on preferring widely supported open formats to niche formats and closed formats).
[30 September 2009]
Last week I participated in the XML Summer School organized by Eleven Informatics at St. Edmund Hall in Oxford. I hope the participants enjoyed it as much as the speakers did. The weather certainly cooperated, although it felt more autumnal than summery by the end of the week.
One of my responsibilities during the week was to give a survey of open-source software for XML applications; this turns out to be harder than it might look because there are so many, with such varying degrees of polish, reliability, and completeness. There are several lists of XML software, and open-source software, and open-source XML software (general, or in some specific categories) on the Web, but many of them appear not to have been maintained or updated in several years. (Honorable exceptions include the lists maintained by Ron Bourret on databases and XML, Lars Marius Garshol on XML tools and Topic-Map tools, and Tony Graham on XSLT testing tools.) So the lists I made, arbitrary and capricious though some aspects of them are, may be helpful.
Eventually I plan to turn the information gathered into a more convenient form, and set up some infrastructure to make it easier to maintain, but in the meantime the slides I prepared for the session may be helpful; they provide a coarsely categorized and tersely annotated list of some open-source XML software that readers of this klog may find interesting.
It’s kind of misleading to call XOM a parser; it’s a pure object model. It can use a parser to build a model, but you can also construct models by direct invocation of the API.
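The same distinction can be seen in other tree libraries. The sketch below uses Python's standard `xml.etree.ElementTree` rather than XOM, so it is an analogy and not XOM's API; the element names `a` and `b` are invented for the example. It builds the same small tree twice, once by handing text to a parser and once by calling the object-model API directly.

```python
import xml.etree.ElementTree as ET

# Route 1: let a parser build the model from serialized XML.
parsed = ET.fromstring("<a><b>x</b></a>")

# Route 2: construct the same model by direct API calls, no parser involved.
built = ET.Element("a")
child = ET.SubElement(built, "b")
child.text = "x"

# Both routes yield equivalent trees, so they serialize identically.
print(ET.tostring(parsed) == ET.tostring(built))
```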
This wiki of tools is fairly well-maintained by the community:
http://wiki.tei-c.org/index.php/Category:Tools