What is XSDL 1.1, and why should you care?
C. M. Sperberg-McQueen
Overview
- What makes the Web special?
- What is W3C?
- What is XSDL?
- What's new in XSDL 1.1?
What makes the Web special?
Why the World Wide Web, and not (e.g.) Guide or InterMedia or System-G?
- first off the block? NO.
- improved reliability? NO.
- innovation? NO.
- simplicity.
The Web succeeded because it was simple.
Why was it made simple? In order to scale.
What is W3C?
What is W3C and how is it special?
The World Wide Web Consortium
Our mission: to lead the Web to its full potential.
Our goals:
- universal access
- semantic Web
- trust
- interoperability
- evolvability (through simplicity, modularity, compatibility, extensibility)
- decentralization
- cooler multimedia
Some consequences
Because of our goals, we care about
- scalability
- consensus-based standardization
- vendor neutrality
- openness (both technical and social)
- freedom to implement W3C specifications
- clarity concerning intellectual property rights
W3C organization
A world-wide organization.
- host organizations
  - ERCIM (European Research Consortium for Informatics and Mathematics), Sophia-Antipolis
  - Keio University, Tokyo
  - Massachusetts Institute of Technology, Cambridge, Mass.
- offices in many regions: Australia, Brazil, China, Benelux, Finland, Germany/Austria, Greece, Hungary, India, Israel, Italy, Korea, Morocco, Southern Africa, Spain, Sweden, United Kingdom and Ireland.
  - outreach, publicity, workshops, etc.
  - translations
- members from Europe, North America, and Asia
We need your help
As vendors, support open standards and grow the marketplace!
As customers, demand that your vendors support open standards!
As users, read and comment on our draft specifications!
As institutions, join the W3C to give users a stronger voice within the organization!
What is XSDL?
Concretely, a mechanism (an XML vocabulary) for defining XML vocabularies.
More generally,
- The Iron Law of software
- How to draw a line in the sand
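To make "an XML vocabulary for defining XML vocabularies" concrete, here is a minimal sketch of a schema document. The vocabulary it defines (the element names `note`, `to`, and `body`) is hypothetical and chosen only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical vocabulary: a note with one recipient and one body. -->
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A document such as `<note><to>Ann</to><body>Hello</body></note>` conforms to this schema; a `note` without a `body` does not.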
The Iron Law
The fundamental rule of all information technology:
Garbage in, garbage out.
What are the consequences?
Perfect software
- accepts any input
- calculates the correct answer for any input
- never fails
- never loops
- does not exist
Programs and sets
Programs define many sets:
- inputs accepted
- inputs rejected
- inputs processed correctly
- inputs processed incorrectly
- inputs on which program terminates normally
- inputs on which program blows up
- inputs for which program never terminates
Program input as set
The inputs for which a program terminates form a set.
Good and bad terminations
But not all terminations are happy ones.
Managing bad terminations
Eliminate undesirable sets.
The usual goal
Accepting only what we process correctly, rejecting all other input.
To achieve this, we need a cheap way of testing set membership.
How to draw a line in the sand
- validation as filter
- filters define sets
Validation as filter
Separating the valid from invalid.
A schema defines a set
Schemas define sets.
Judgment day
Sheep on the right, goats on the left.
Schemas can be used to draw a line in the sand:
- valid on one side (will be processed)
- invalid on the other (will be bounced)
Validity and program correctness
Relation of validity to acceptance
- Accept only valid input?
- Accept all valid inputs?
- Accept some valid inputs (but not all, and not only)?
- Accept all and only valid input?
A situation to avoid
The worst of both worlds.
The usual approach
Accept all and only valid messages.
Validity and program input
The usual approach:
- accept all valid documents
- reject all invalid documents
(or at least most)
Unfortunately, this is not always the right way to do it.
Partial validity
You do not have to reject all invalid documents!
You can accept partially valid documents.
Drawback: complexity.
Drawback: may encourage dirty data. (The dark side of Postel's Law.)
Accepting all valid inputs
One schema defines one set (= one circle).
The diagram has two circles.
What now?
What's new in XSDL 1.1?
- more powerful all-groups
- assertions
- conditional type assignment
- changes to wildcards
- open content
- versioning of XSDL
More powerful all-groups
- wildcards now allowed
- maxOccurs may be > 1
- may be extended
- extension of all-group is all-group
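The following is a minimal sketch of what these changes permit, assuming an XSDL 1.1 processor. The type and element names (`recordType`, `title`, `author`, `year`) are hypothetical. The first type uses `maxOccurs` greater than 1 and a wildcard inside an all-group; the second extends it with another all-group.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Children may appear in any order. -->
  <xs:complexType name="recordType">
    <xs:all>
      <xs:element name="title" type="xs:string"/>
      <!-- maxOccurs may now exceed 1 inside xs:all -->
      <xs:element name="author" type="xs:string" maxOccurs="unbounded"/>
      <!-- wildcards are now allowed inside xs:all -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:all>
  </xs:complexType>

  <!-- Extending an all-group with an all-group yields an all-group. -->
  <xs:complexType name="extendedRecordType">
    <xs:complexContent>
      <xs:extension base="recordType">
        <xs:all>
          <xs:element name="year" type="xs:gYear" minOccurs="0"/>
        </xs:all>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

An XSDL 1.0 processor would be expected to reject both definitions, since 1.0 allows neither wildcards nor repeated elements in all-groups, nor their extension by another all-group.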
Assertions
- must be true, for the instance to be valid
- expressed using XPath 2.0
- can point down, but not up
- like assert rules in Schematron
- like CHECK clauses in SQL
E.g. “Every element of type haystack must contain (somewhere) at most one needle element.”
E.g. “The value of the total element must be the sum of the values of the item elements.” (A check on intentional redundancy.)
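The two examples above might be written as follows. This is a sketch under stated assumptions: the content models are invented for illustration (the slides give only the rules), and `orderType` is a hypothetical name for the type that holds `item` and `total`.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- At most one needle anywhere below an element of type haystack. -->
  <xs:complexType name="haystack">
    <xs:sequence>
      <xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- The XPath points down from the element being validated. -->
    <xs:assert test="count(.//needle) le 1"/>
  </xs:complexType>

  <!-- Hypothetical type: total must equal the sum of the items. -->
  <xs:complexType name="orderType">
    <xs:sequence>
      <xs:element name="item" type="xs:decimal" maxOccurs="unbounded"/>
      <xs:element name="total" type="xs:decimal"/>
    </xs:sequence>
    <xs:assert test="total = sum(item)"/>
  </xs:complexType>

</xs:schema>
```

Both tests are evaluated relative to the element whose type carries the assertion, which is why they can inspect descendants but not ancestors.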
Conditional type assignment
- decides which of several possible types to assign
- expressed using XPath 2.0
- can point at attributes, but not down, and not up
E.g.
“If the attribute @message-type is ‘text’, then the element gets the type my:text-type;
else if the attribute @message-type is ‘html’, then the element gets the type my:html-type;
else if the attribute @message-type is ‘xhtml’, then the element gets the type my:xhtml-type;
else the element gets the type my:text-type.”
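In schema syntax, the rule above could look roughly like the sketch below. The namespace URI `http://example.org/my`, the element name `message`, and the bodies of the three types are assumptions made for illustration; only the type names and the `message-type` attribute come from the example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:my="http://example.org/my"
           targetNamespace="http://example.org/my"
           elementFormDefault="qualified">

  <!-- Alternatives are tried in order; the last one (no test) is the default. -->
  <xs:element name="message">
    <xs:alternative test="@message-type = 'text'" type="my:text-type"/>
    <xs:alternative test="@message-type = 'html'" type="my:html-type"/>
    <xs:alternative test="@message-type = 'xhtml'" type="my:xhtml-type"/>
    <xs:alternative type="my:text-type"/>
  </xs:element>

  <!-- Hypothetical type definitions, kept deliberately small. -->
  <xs:complexType name="text-type">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="message-type" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:complexType name="html-type" mixed="true">
    <xs:sequence>
      <xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="message-type" type="xs:string"/>
  </xs:complexType>

  <xs:complexType name="xhtml-type" mixed="true">
    <xs:sequence>
      <xs:any namespace="http://www.w3.org/1999/xhtml" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="message-type" type="xs:string"/>
  </xs:complexType>

</xs:schema>
```

Each `test` looks only at attributes of the element itself, in line with the restriction that the expression can point neither down nor up.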
Wildcard changes
- easier to use wildcards
- fewer conflicts with elements (‘weakened wildcards’)
- automatic wildcards
- negative wildcards
- not-in-schema wildcards
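A sketch of three of these changes follows, assuming an XSDL 1.1 processor. The namespace URI and all type and element names are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:my="http://example.org/my"
           targetNamespace="http://example.org/my"
           elementFormDefault="qualified">

  <xs:element name="known" type="xs:string"/>

  <!-- Weakened wildcard: the wildcard could also match my:known, but the
       element declaration takes precedence, so there is no conflict. -->
  <xs:complexType name="containerType">
    <xs:sequence>
      <xs:element ref="my:known" minOccurs="0"/>
      <xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Negative wildcard: anything except this namespace and no namespace. -->
  <xs:complexType name="foreignOnlyType">
    <xs:sequence>
      <xs:any notNamespace="##targetNamespace ##local" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Not-in-schema wildcard: only elements with no global declaration
       in this schema. -->
  <xs:complexType name="undeclaredOnlyType">
    <xs:sequence>
      <xs:any notQName="##defined" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

Under XSDL 1.0, `containerType` would violate the Unique Particle Attribution rule, because an incoming `my:known` element could match either particle.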
Versions
Versions of a vocabulary will differ
- just as different models may
- because we learn things
What happens with V2?
V2 includes V1 (backward compatibility).
Room to grow
If a V1 program could only be induced to accept V2 data ...
Solution: induce V1 programs to accept V2 data?
Problem: how?
One way: Distinguish
- Messages fully understood by V1 processors
- Messages tolerated by V1 processors
Cf. HTML rule: “Ignore what you don't understand.”
This works for well-defined processing semantics.
Open content makes it easy.
Open content
Automagic wildcards inserted into content model
Details of the wildcard under user control
- elements in this namespace
- elements in other namespaces
- elements not declared in this schema
- ...
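As a sketch of how this looks in a schema, the type below lets elements from other namespaces appear anywhere among its declared children. The names `orderType`, `item`, and `total` are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="orderType">
    <!-- mode="interleave" (the default) lets matching elements appear
         anywhere; mode="suffix" would allow them only at the end. -->
    <xs:openContent mode="interleave">
      <xs:any namespace="##other" processContents="lax"/>
    </xs:openContent>
    <xs:sequence>
      <xs:element name="item" type="xs:decimal" maxOccurs="unbounded"/>
      <xs:element name="total" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

The wildcard inside `xs:openContent` carries the user-controlled details; swapping `namespace="##other"` for `notQName="##defined"` would instead admit only elements not declared in the schema. A schema-wide default can be set with `xs:defaultOpenContent`.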
Multiple versions of XSDL 1.1
Conditional inclusion allows schemas to:
- use new constructs if the processor supports them
- fall back to other definitions if the processor does not
“If the validator understands version 1.1,
then [define using open content and assertions],
else [define using XSDL 1.0].”
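The if/else above might be expressed with the version-control attributes `vc:minVersion` and `vc:maxVersion`, as in this sketch. The type `orderType` and its content are hypothetical. Note that `vc:maxVersion` is an exclusive bound, and that the fallback only works with processors that honor these attributes.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

  <!-- Used when the processor supports version 1.1 or later. -->
  <xs:complexType name="orderType" vc:minVersion="1.1">
    <xs:openContent>
      <xs:any namespace="##other" processContents="lax"/>
    </xs:openContent>
    <xs:sequence>
      <xs:element name="item" type="xs:decimal" maxOccurs="unbounded"/>
      <xs:element name="total" type="xs:decimal"/>
    </xs:sequence>
    <xs:assert test="total = sum(item)"/>
  </xs:complexType>

  <!-- Fallback for processors below version 1.1: an explicit trailing
       wildcard, and no assertion. -->
  <xs:complexType name="orderType" vc:maxVersion="1.1">
    <xs:sequence>
      <xs:element name="item" type="xs:decimal" maxOccurs="unbounded"/>
      <xs:element name="total" type="xs:decimal"/>
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

A conforming processor keeps exactly one of the two definitions, so the duplicate name does not cause a conflict.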
Conclusion
XSDL 1.1 makes it easier to define vocabularies that are easy to version.
Open content comes closer to redeeming the hopes many had for XML in the first place.
Thank you. Questions?