
What is XSDL 1.1, and why should you care?

C. M. Sperberg-McQueen

Overview

What makes the Web special?

Why the World Wide Web, and not (e.g.) Guide or InterMedia or System-G?
The Web succeeded because it was simple.
Why was it made simple? In order to scale.

What is W3C?

What is W3C and how is it special?

The World Wide Web Consortium

Our mission: to lead the Web to its full potential.
Our goals:
  • universal access
  • semantic Web
  • trust
  • interoperability
  • evolvability (through simplicity, modularity, compatibility, extensibility)
  • decentralization
  • cooler multimedia

Some consequences

Because of our goals, we care about

W3C organization

A world-wide organization.
  • host organizations
    • ERCIM (European Research Consortium for Informatics and Mathematics), Sophia-Antipolis
    • Keio University, Tokyo
    • Massachusetts Institute of Technology, Cambridge, Mass.
  • offices in many regions: Australia, Brazil, China, Benelux, Finland, Germany/Austria, Greece, Hungary, India, Israel, Italy, Korea, Morocco, Southern Africa, Spain, Sweden, United Kingdom and Ireland.
    • outreach, publicity, workshops, etc.
    • translations
  • members from Europe, North America, and Asia

We need your help

As vendors, support open standards and grow the marketplace!
As customers, demand that your vendors support open standards!
As users, read and comment on our draft specifications!
As institutions, join the W3C to give users a stronger voice within the organization!

What is XSDL?

Concretely, a mechanism (an XML vocabulary)
for defining XML vocabularies.
More generally,

The Iron Law

The fundamental rule of all information technology:
Garbage in, garbage out.

What are the consequences?

Perfect software

Perfect software
  • accepts any input
  • calculates the correct answer for any input
  • never fails
  • never loops
  • does not exist

Programs and sets

Programs define many sets:
  • inputs accepted
  • inputs rejected
  • inputs processed correctly
  • inputs processed incorrectly
  • inputs on which program terminates normally
  • inputs on which program blows up
  • inputs for which program never terminates

Program input as set

The set of inputs for which a program terminates is a set.

Good and bad terminations

But not all terminations are happy ones.

Managing bad terminations

Eliminate undesirable sets.

The usual goal

Accepting only what we process correctly, rejecting all other input.
To achieve this, we need a cheap* way of testing set membership.

How to draw a line in the sand

Validation as filter

Separating the valid from the invalid.
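
The filter idea can be sketched in a few lines of Python. This is only an illustration: the membership test `is_valid` and the `order`/`item` vocabulary are hypothetical stand-ins for a real schema validator, which would do far more.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, Tuple


def is_valid(text: str) -> bool:
    """Hypothetical membership test: the input must be well-formed XML
    whose root is <order> and which has at least one <item> child."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False  # not even well-formed, so outside the set
    return root.tag == "order" and root.find("item") is not None


def partition(inputs: Iterable[str],
              test: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Split the inputs into (valid, invalid) using the membership test."""
    valid: List[str] = []
    invalid: List[str] = []
    for doc in inputs:
        (valid if test(doc) else invalid).append(doc)
    return valid, invalid
```

The program then only ever sees the first list; everything in the second list is rejected before processing starts.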

A schema defines a set

Schemas define sets.

Judgment day

Sheep on the right, goats on the left.

A schema defines a set

Schemas define sets.
They can be used to draw a line in the sand:

Validity and program correctness

Relation of validity to acceptance

Accept only valid input?

A situation to avoid

The worst of both worlds.

The usual approach

Accept all and only valid messages.

Validity and program input

The usual approach:
  • accept all valid documents
  • reject all invalid documents (or at least most)
Unfortunately, this is not always the right way to do it.

Partial validity

You do not have to reject all invalid documents!
You can accept partially valid documents.
Drawback: complexity.
Drawback: may encourage dirty data. (The dark side of Postel's Law.)

Accept all valid inputs?

Accepting all valid inputs

One schema defines one set (= one circle).

The diagram has two circles.

What now?

What's new in XSDL 1.1?

More powerful all-groups

Assertions

E.g. “Every element of type haystack must contain (somewhere) at most one needle element.”
E.g. “The value of the total element must be the sum of the values of the item elements.” (A check on intentional redundancy.)
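
What the two example assertions check can be sketched in Python. This is not XSDL syntax, only an illustration of the conditions themselves; the function names are invented for the sketch, and the element names `needle`, `item`, and `total` are taken from the examples above.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal


def at_most_one_needle(haystack: ET.Element) -> bool:
    """True if the haystack contains, at any depth, at most one <needle>."""
    # iter() walks the whole subtree in document order.
    return sum(1 for _ in haystack.iter("needle")) <= 1


def total_matches_items(parent: ET.Element) -> bool:
    """True if <total> equals the sum of the <item> children."""
    total = parent.findtext("total")
    if total is None:
        return False  # assumption of this sketch: a missing total fails
    # Decimal avoids binary floating-point rounding in the comparison.
    item_sum = sum((Decimal(i.text) for i in parent.findall("item")),
                   Decimal(0))
    return Decimal(total) == item_sum
```

Both checks look at more than one element at a time, which is exactly what a content model alone cannot express.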

Conditional type assignment

E.g. “If the attribute @message-type is ‘text’,
then the element gets the type my:text-type;
else if the attribute @message-type is ‘html’,
then the element gets the type my:html-type;
else if the attribute @message-type is ‘xhtml’,
then the element gets the type my:xhtml-type;
else the element gets the type my:text-type.”
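
The selection logic of this example can be sketched in Python. This is not XSDL syntax; the function name `assigned_type` and the lookup table are invented for the illustration, while the attribute and type names come from the example above.

```python
import xml.etree.ElementTree as ET

# One entry per "if" branch of the example.
TYPE_BY_MESSAGE_TYPE = {
    "text": "my:text-type",
    "html": "my:html-type",
    "xhtml": "my:xhtml-type",
}


def assigned_type(element: ET.Element) -> str:
    """Return the name of the type the element is validated against."""
    # A missing or unrecognized attribute value falls through to the
    # final "else" branch, which assigns my:text-type.
    return TYPE_BY_MESSAGE_TYPE.get(element.get("message-type"),
                                    "my:text-type")
```

The point of the feature is that the type used for validation depends on the instance, not only on the element name.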

Wildcard changes

Versions

Versions of a vocabulary will differ
  • just as different models may
  • because we learn things

What happens with V2?

V2 includes V1 (backward compatibility).

Room to grow

If a V1 program could only be induced to accept V2 data ...

Room to grow

Solution: induce V1 programs to accept V2 data?
Problem: how?
One way: Distinguish
  • Messages fully understood by V1 processors
  • Messages tolerated by V1 processors
Cf. HTML rule: “Ignore what you don't understand.”
This works for well-defined processing semantics.
Open content makes it easy.
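
A minimal Python sketch of the distinction, assuming a hypothetical V1 vocabulary of `title` and `body` (both names, and the two functions, are invented for the illustration):

```python
import xml.etree.ElementTree as ET

V1_KNOWN = {"title", "body"}  # hypothetical V1 vocabulary


def classify(message: ET.Element) -> str:
    """Say whether a V1 processor fully understands the message
    or merely tolerates it."""
    unknown = [child.tag for child in message if child.tag not in V1_KNOWN]
    return "tolerated" if unknown else "fully understood"


def process_v1(message: ET.Element) -> dict:
    """V1 processing: use what is understood, ignore everything else."""
    return {child.tag: (child.text or "")
            for child in message if child.tag in V1_KNOWN}
```

A V2 message with an extra `priority` element is classified as tolerated, and V1 processing still extracts its `title` and `body`.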

Open content

Automagic wildcards inserted into content model
  • at the end
  • everywhere
Details of the wildcard under user control
  • elements in this namespace
  • elements in other namespaces
  • elements not declared in this schema
  • ...
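
The difference between the two placements can be sketched in Python. This is a deliberately simplified model, not the XSDL algorithm: the content model is a fixed sequence of required element names, the wildcard is assumed to match only names not declared in that sequence, and the mode labels `"end"` and `"everywhere"` are chosen for this sketch.

```python
from typing import Sequence


def matches(children: Sequence[str], model: Sequence[str], mode: str) -> bool:
    """Check child element names (in document order) against a model.

    mode "none":       no wildcard, children must equal the model
    mode "end":        extra elements allowed only after the declared content
    mode "everywhere": extra elements allowed at any position
    """
    declared = set(model)
    if mode == "none":
        return list(children) == list(model)
    if mode == "end":
        head = list(children[:len(model)])
        tail = children[len(model):]
        # Assumption: the wildcard does not match declared names.
        return head == list(model) and all(c not in declared for c in tail)
    if mode == "everywhere":
        # Drop the extras; what remains must be the declared sequence.
        return [c for c in children if c in declared] == list(model)
    raise ValueError(f"unknown mode: {mode}")
```

With the model `["title", "body"]`, the children `["title", "note", "body"]` are rejected in mode `"end"` but accepted in mode `"everywhere"`.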

Multiple versions of XSDL 1.1

Conditional inclusion allows schemas to say:
“If the validator understands version 1.1,
then [define using open content and assertions],
else [define using XSDL 1.0].”
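
The selection step can be sketched in Python. This is an illustration only: `Component`, its version bounds, and `included` are hypothetical names, and treating the upper bound as exclusive is an assumption of the sketch.

```python
from typing import List, NamedTuple, Optional, Tuple

Version = Tuple[int, int]  # e.g. (1, 1) for version 1.1


class Component(NamedTuple):
    """A schema component tagged with the versions it is meant for."""
    name: str
    min_version: Optional[Version] = None  # keep if processor >= this
    max_version: Optional[Version] = None  # keep if processor < this


def included(components: List[Component], processor: Version) -> List[str]:
    """Names of the components a processor of the given version keeps."""
    return [c.name for c in components
            if (c.min_version is None or processor >= c.min_version)
            and (c.max_version is None or processor < c.max_version)]
```

Two alternative declarations of the same element, one tagged for 1.1 and later and one tagged for earlier versions, then give every processor exactly one definition to work with.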

Conclusion

XSDL 1.1 makes it easier to design vocabularies that are easy to version.
Open content comes closer to redeeming the hopes many had for XML in the first place.
Thank you. Questions?