[7 April 2010]

The old Paul Simon song “There must be fifty ways to leave your lover” keeps running through my head. I can see close to fifty ways to define the XPath 1.0 data model in terms of (a) a set of nodes and (b) two relations defined on that set, visit this site which are taken as primitive; all other relations (i.e. all the other axes of XPath) are defined in terms of those two primitive relations.

Strictly speaking, I make it forty-eight ways. First, pick any single relation from any of the following four groups:

- parent, child, ancestor, descendant
- prevsib, nextsib, preceding-sibling, following-sibling
- prevnode, nextnode, document-order preceding (>>), document-order-following (<<)

That’s twelve possibilities.

Second, pick any single relation from either of the other two groups; that makes eight possible choices, times twelve first choices, for ninety-six ordered pairs of relations. But the order doesn’t matter, so we have forty-eight distinct pairs.

In recent days, taking some pairs not quite at random, defining the constraints they must satisfy in order to be a suitable basis for defining an XPath 1.0 tree, and defining all the other relations in terms of the chosen primitives, I have learned a couple of mildly interesting things.

- It’s more convenient to take parent as a primitive, than child.
- It’s more convenient to take one of the single-step relations (parent, child, nextsib, prevsib, nextnode, prevnode) than one of their transitive closures (ancestor, descendant, etc.).

The nextsib relation, for example, needs to be acyclic, functional, injective, and not transitive. If it is, then its transitive closure following-sibling will automatically be suitable. But if you start with following-sibling and specify (as you will need to) that it is transitive and acyclic, that does not suffice to guarantee that its transitive reduction nextsib is functional and injective. You can of course simply say that a following-sibling relation is suitable if and only if (a) it’s transitive, (b) it’s acyclic, and (c) its transitive reduction is functional and injective, but now you’re forcing the reader to work with two relations, not just one: both following-sibling and its transitive reduction. It would be interesting either to find a way to constrain the closure directly to ensure the necessary properties in the reduction, or else to find a proof that there is no way to constrain a closure to ensure that its reduction is functional and injective without explicitly referring to the reduction.

- From any relation in any group, the other relations in that group are (relatively) easy to derive in terms of inversion, transitive closure, or transitive reduction. Defining a relation in the third group typically proves more interesting. And while it’s more convenient to choose the primitive relations from among the reductions, it turns out that at least in some cases it’s easiest to define the third group in terrms of one of the closures. For example, given the parent and next-sibling relations, it proves easier to define document-order-following in terms of the primitives than to define next-node.

It occurs to me to wonder whether there are ways to define XPath 1.0 trees that don’t reduce to or include one of these forty-eight.