Overview |
Definition |
In words, the set of strings t which can follow s in sentences of R.Given a set R of sequences and some finite sequence s, the derivative of R with respect to s is denoted by DsR and is DsR = {t | st ∈ R}.
Brzozowski's contributions |
What use? |
Related work |
Preliminaries |
Content model notation |
XML Schema equivalents |
| RE | XSD 1.0 | |
|---|---|---|
| empty sequence | ε | <xsd:sequence/> |
| empty set | ∅ | <xsd:choice/> |
| child element | expanded name | <xsd:element ref = "QName"/>
<xsd:element name = "NCName"> ... </xsd:element> |
| wildcard | wc(KW, NS) | <xsd:any namespace = "NS" processContents = "KW"/> |
| neg. wildcard | wc(KW, not(NS)) | <xsd:any namespace = "##other" processContents = "KW"/> |
XML Schema equivalents (2) |
| RE | XSD 1.0 | |
|---|---|---|
| disjunction | F | G | <xsd:choice> F G </xsd:choice> |
| concatenation | F, G | <xsd:sequence> F G </xsd:sequence> |
| all-group | F & G | <xsd:all> F G </xsd:all> |
| repetition | F{n, m} | <F minOccurs="n" maxOccurs="m" /> |
Nullability |
| E | ν(E) |
|---|---|
| ε | true |
| ∅ | false |
| expanded name / QName | false |
| wc(KW, NS), wc(KW, not(NS)) | false |
| F | G | ν(F) ∨ ν(G) |
| F, G | ν(F) ∧ ν(G) |
| F & G | ν(F) ∧ ν(G) |
| F{n, m} | n = 0 ∨ ν(F) |
Constructing derivatives |
Derivatives of ε and ∅ |
| E | DxE |
|---|---|
| ε | ∅ |
| E | DxE |
|---|---|
| ∅ | ∅ |
Derivatives of QNames |
| E | DxE |
|---|---|
| expanded name / QName e | if x = e, then ε, else ∅ |
Derivatives of wildcards |
| E | DxE |
|---|---|
| wc(KW, NS) | if x ∈ namespace N ∈ NS, then ε, else ∅ |
| wc(KW, not(NS)) | if x ∈ namespace N ∈ NS, then ∅, else ε |
Derivatives of choices |
| E | DxE |
|---|---|
| F | G | (DxF) | (DxG) |
Derivatives of sequences |
| E | DxE |
|---|---|
| F, G | if ν(F)
then ((DxF), G) | (DxG), else ((DxF), G) |
Sequence examples |
Derivatives of all-groups |
| E | DxE |
|---|---|
| F & G | ((DxF) & G) | (F & (DxG)) |
Derivatives of repetitions |
| E | DxE |
|---|---|
| F{n, m} | if n = 0 ∧ m = unbounded, then
(DxF), F{0, unbounded}
(DxF), F{--n, --m}
(DxF), F{--n, --m}
| DxF{--n, --m}
|
Repetition examples |
A nullable repetition example |
Nullable repetition (conclusion) |
Simplifying expressions |
Characteristic derivatives |
Finding all derivatives |
Regex to FSA |
FSA example |
Schema processing |
Validation |
Validation example |
EDI example |
Checking determinism |
Initial determinism |
| E | c(x,E) |
|---|---|
| ε | 0 |
| ∅ | 0 |
| expanded name e | if x = e then 1 else 0 |
| wc(KW, NS) | if x ∈ N ∈ NS then 1 else 0 |
| wc(KW, not(NS)) | if x ∈ N ∈ NS then 0 else 1 |
| F | G | c(x, F) + c(x, G) |
| F, G | if ν(F) then c(x, F) + c(x, G), else c(x, F) |
| F & G | c(x, F) + c(x, G) |
| F{n, m} | c(x, F) |
Determinism reformulated |
For all symbols x ∈ Σ,
and for all characteristic derivatives F of the original content model,
c(x, F) ≤ 1.
Simplified determinism check |
Determinism example |
Cost of determinism check |
Weakened wildcards |
For all symbols x ∈ Σ, and for all characteristic derivatives F of the original content model,and
- c(wildcards, x, F) ≤ 1
- c(elements, x, F) ≤ 1
all-groups |
all-group examples |
Content model subsumption |
Checking subsumption |
The subsumption problem |
Solution |