Many readers associate the use of variables with mathematics and feel threatened by paragraphs that begin “Let E be … and F be …. Then …” And similarly with technical terms: when a text defines and uses a lot of technical terms, it can be very daunting to the first-time reader (and many others).
So it’s understandable that sometimes, in trying to keep a text accessible to the reader, one works hard to avoid having to introduce variables to refer to things, and to avoid relying on technical terms with special meanings.
But sometimes such efforts backfire. In the XSD (XML Schema Definition Language) 1.0 spec, you end up with rules that read like this:
Validation Rule: Element Locally Valid (Type)
For an element information item to be locally ·valid· with respect to a type definition all of the following must be true:
1 The type definition must not be ·absent·;
2 It must not have {abstract} with value true.
3 The appropriate case among the following must be true:
3.1 If the type definition is a simple type definition, then all of the following must be true:
3.1.1 The element information item’s [attributes] must be empty, excepting those whose [namespace name] is identical to http://www.w3.org/2001/XMLSchema-instance and whose [local name] is one of type, nil, schemaLocation or noNamespaceSchemaLocation.
3.1.2 The element information item must have no element information item [children].
3.1.3 If clause 3.2 of Element Locally Valid (Element) (§3.3.4) did not apply, then the ·normalized value· must be ·valid· with respect to the type definition as defined by String Valid (§3.14.4).
3.2 If the type definition is a complex type definition, then the element information item must be ·valid· with respect to the type definition as per Element Locally Valid (Complex Type) (§3.4.4);
I would say “Maybe it’s just me, but I find that kind of hard to read,” but that would be disingenuous. There is ample evidence from the last eight or nine years that I am not the only reader of the XSD 1.0 spec who finds parts of it hard to read. This is a relatively mild example, as the XSD spec goes. But if we can overcome our fear of formality, the text can become a bit simpler. Two changes in particular seem useful here.
- Introduce the names E for the element and T for the type, and use them.
- Follow the example of most specs that define and use namespaces: specify and use a conventional prefix to represent a given namespace, and say once and for all, when that prefix is identified, that in practice the user can use any prefix they wish (or none). Then just use the QNames, rather than writing out the namespace in full each time you have to talk about names in that namespace.
Applying these rules to the fragment just given, we get something a bit easier to read.
Validation Rule: Element Locally Valid (Type)
For an element information item E to be locally ·valid· with respect to a type definition T all of the following must be true:
1 T is not ·absent·;
2 T does not have {abstract} with value true.
3 The appropriate case among the following is true:
3.1 If T is a simple type definition, then all of the following are true:
3.1.1 E’s [attributes] are empty, excepting those named xsi:type, xsi:nil, xsi:schemaLocation, or xsi:noNamespaceSchemaLocation.
3.1.2 E has no element information item [children].
3.1.3 If clause 3.2 of Element Locally Valid (Element) (§3.3.4.3) did not apply, then the ·normalized value· is ·valid· with respect to T as defined by String Valid (§3.16.4).
3.2 If T is a complex type definition, then E is ·valid· with respect to T as per Element Locally Valid (Complex Type) (§3.4.4.2);
4 If E has an xsi:type [attribute] and does not have a ·governing element declaration·, then the ·actual value· of xsi:type ·resolves· to T.
I won’t claim that the text has become easy to read and follow, but I think there is one salient difference: in the first text above, my first difficulty as a reader is understanding what the text is trying to say, and once I have figured that out, I may or may not have energy left to try to understand why it’s saying that. In the second text, it’s easier (I think) to understand what the individual clauses are saying. The reader still has the task of understanding why, but at least the difficulties of comprehension are now those related to the intrinsic difficulty of the topic, without the additional barrier of complex syntax.
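One informal way to test that claim is to notice how directly the rewritten clauses translate into code once E and T have names. What follows is only a rough sketch in Python, not anything taken from the spec. The classes Element and TypeDef and the function check_clauses_1_to_3_1_2 are hypothetical stand-ins invented for illustration, and only clauses 1, 2, 3.1.1 and 3.1.2 are covered; clauses 3.1.3, 3.2 and 4 depend on other validation rules and are left out.

```python
from dataclasses import dataclass, field
from typing import Optional

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# The four attributes that clause 3.1.1 tolerates, as (namespace name, local name) pairs.
ALLOWED_XSI = {
    (XSI, local)
    for local in ("type", "nil", "schemaLocation", "noNamespaceSchemaLocation")
}


@dataclass
class TypeDef:
    """Hypothetical stand-in for a type definition T."""
    is_simple: bool
    abstract: bool = False


@dataclass
class Element:
    """Hypothetical stand-in for an element information item E."""
    attributes: set = field(default_factory=set)  # (namespace name, local name) pairs
    has_element_children: bool = False


def check_clauses_1_to_3_1_2(E: Element, T: Optional[TypeDef]) -> bool:
    """Check only the clauses sketched here; None models an absent T."""
    if T is None:                       # clause 1: T is not absent
        return False
    if T.abstract:                      # clause 2: T is not abstract
        return False
    if T.is_simple:                     # clause 3.1
        if E.attributes - ALLOWED_XSI:  # clause 3.1.1: only the xsi: attributes
            return False
        if E.has_element_children:      # clause 3.1.2: no element children
            return False
        return True                     # clause 3.1.3 (String Valid) is not modelled
    # Clause 3.2 defers to Element Locally Valid (Complex Type), not modelled here.
    raise NotImplementedError("complex types are outside this sketch")
```

Each branch corresponds to one numbered clause, which is much harder to arrange when the subject of every clause is a noun phrase such as “the element information item” rather than a name.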
Another tactic adopted by some in trying to make difficult material easier to read is to avoid defining technical terms. The XSD 1.0 spec raises this to a fine art; often, the easiest way to understand how a given rule came to be formulated as it is, is to imagine that it was first written in a simple, straightforward clause using technical terms, and then the technical terms were eliminated and their definitions inserted inline. And then the process was repeated once, or twice, or more. The result is mostly devoid of difficult or obscure technical usages, but it’s often also a sentence only an eighth-grade English teacher teaching the unit on sentence diagramming could love.
If we re-introduce appropriate technical terms, this process can be reversed. Sometimes the introduction of even a single technical term can do a surprising amount of good.
Take the following example from the XSD spec:
2.3.1 The element declaration is local (i.e. its {scope} must not be global), its {abstract} is false, the element information item’s [namespace name] is identical to the element declaration’s {target namespace} (where an ·absent· {target namespace} is taken to be identical to a [namespace name] with no value) and the element information item’s [local name] matches the element declaration’s {name}. In this case the element declaration is the ·context-determined declaration· for the element information item with respect to Schema-Validity Assessment (Element) (§3.3.4) and Assessment Outcome (Element) (§3.3.5).
This is followed by another clause with almost identical wording, covering global elements.
Suppose we make use of the term expanded name, defined by the Namespaces in XML recommendation, and refer to the expanded names of the declaration and the element instead of inlining the definition as a pair of namespace name and local name. (This entails defining the term expanded name as it applies to schema components.) Suppose we also supply the obvious variable names, E for the element and D for the declaration. Then it’s easier to see that this rule for local element declarations can be merged with the following rule for global element declarations, since the two do exactly the same thing. So we can replace both the rule above and the rule that follows it in the spec with:
2.3.1 D has the same expanded name as E. In this case D is the ·context-determined declaration· for E with respect to Schema-Validity Assessment (Element) (§3.3.4.6) and Assessment Outcome (Element) (§3.3.5.1).
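The work a single technical term does here can be illustrated with a small sketch. In the Python below, ExpandedName and same_expanded_name are hypothetical names invented for illustration, and using None for both an ·absent· {target namespace} and a [namespace name] with no value is an assumption of the sketch, not wording from the spec. Once the pair is a single value, the parenthetical about absent namespaces is absorbed into the definition and the rule itself shrinks to one comparison.

```python
from typing import NamedTuple, Optional


class ExpandedName(NamedTuple):
    """A (namespace name, local name) pair treated as one value.

    Assumption of this sketch: None stands both for an absent
    {target namespace} on a declaration and for an element's
    [namespace name] with no value, so the two compare equal.
    """
    namespace: Optional[str]
    local: str


def same_expanded_name(D: ExpandedName, E: ExpandedName) -> bool:
    # Tuple equality compares the namespace and the local name together.
    return D == E


# Expanded name of a declaration D with no target namespace ...
decl = ExpandedName(None, "title")
# ... and of an unqualified element E in an instance document.
elem = ExpandedName(None, "title")
print(same_expanded_name(decl, elem))
```

The same function serves for local and global declarations alike, which is the point of the merge: nothing in the comparison depends on {scope}.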
If I’m smiling this evening, it’s because this morning the XML Schema working group agreed to these changes, and scores of other similar changes, to the text of the XSD 1.1 spec. The design of the language, I admit, is still very complex. The exposition, I concede, still has a sub-optimal structure. But the third source of difficulty, namely the complexity of individual sentences in the validation rules and constraints on schema components, is somewhat diminished by this change.
Variable names as a short-hand for complex noun phrases; technical terms to capture frequently needed concepts; conventions to allow things to be said simply instead of in convoluted clauses: it’s almost enough to make you think that mathematical writing is the way it is, in order to make things easier to read, instead of harder to read. Food for thought.