This document is a working draft.
If you are not actively collaborating with me on the project described here, then you probably stumbled across this by accident. Please do not quote publicly or point other people to this document. The URI at which this document now exists is very unlikely to remain permanently available.
(n : node)(d : document)(s : string)(P(n,d,s))instead of (the equivalent):
(n)(d)(s) (node(n) & document(d) & string(s) → P(n,d,s))In other words, we allow ourselves to assume a relatively simple type system and suitable type predicates.
... f(x) ...but
... f(x,y) ... y ...
<text/ <doc/ <code-version/1.01> <catno/171> <ti/Farbenlehre> <crncopy/70> ... /doc> ... /text>(For now, we ignore the fact that MECS documents aren't necessarily trees. This one is a tree, in the bits we care about for now.)
node(n00). node(n01). node(n02). node(n03). node(n04). node(n05). node(n06). node(n07). node(n08). node(n09).
gi(n00,text). gi(n01,doc). gi(n02,code-version). gi(n03,catno). gi(n04,ti). gi(n05,crncopy). gi(n06,'#pcdata'). gi(n07,'#pcdata'). gi(n08,'#pcdata'). gi(n09,'#pcdata').
parent(n01,n00). parent(n02,n01). parent(n03,n01). parent(n04,n01). parent(n05,n01). parent(n06,n02). parent(n07,n03). parent(n08,n04). parent(n09,n05).
first-child(n00,n01). first-child(n01,n02). first-child(n02,n06). first-child(n03,n07). first-child(n04,n08). first-child(n05,n09).
nsib(n02,n03). nsib(n03,n04). nsib(n04,n05).
content(n06,'1.01'). content(n07,'171'). content(n08,'Farbenlehre'). content(n09,'70').
?- content(n00,X). No
?- parent(X,n00). X = n01 ; Noor
?- findall(X,parent(X,n00),List). X = _G157 List = [n01] Yes
Documentation(n0,n1)so the axiom we are looking for needs to have that clause as the consequent of some implication.
(n : node)(m : node) (gi(n,doc) & parent(n,m) → Documentation(m,n))CH's form was fully bound (in order to evade having to specify quantifiers):
gi(n1,doc) & parent(n1,n0) → Documentation(n0,n1)
if (validate(doc) == VALID) { call_blithe_engine(doc); } else { printf("Huh?\n"); }
(for all s : mecswit-stream) (well-behaved(s) → (∃ n : node)(in-stream(n,s) & gi(n,doc) & (for all p : node)(gi(p,doc) & in-stream(p,s) → p = n) & parent(n,s)))
(for all p) doc(p) → (∃ q)(ancestor(q,p) & Documentation(p,q) & Stream(q))In English: for any p, if p is a doc element, then there exists a q which is an ancestor of p, and q is a data stream, and p is the documentation for q. (Or, less formally: p is the documentation for the data stream p is in.)
(for all n : node)(for all m : node) (gi(n,doc) & gi(m,text) & parent(n,m) & (for all p : node)(gi(p,doc) & descend_anc(p,m) → p = n) → Documentation(n,m))In English: for any nodes n and m, if n is a doc element which is a child of the text element m, and m has no other doc elements anywhere, then n is the documentation for m (less formally: n is the documentation for the document it's in).
(for all n : node) (for all s : string) (for all s2 : stream) (gi(n,catno) & content(n,s) & in(n,s2) → (∃ i : item) (has-vw(i,s) & transcribes(s2,i)))
(for all n : node) (for all m : node) (for all s : string) (gi(n,catno) & content(n,s1) & root(n,s) → (∃ i : item) (has-vw(i,s) & transcribes(m,i)))
doc(n,catno) & content(n,'171') & root(n,m) → transcription(o,n) & item(o) & vwno(o,'171')
hasvwno(i : item, s : string) iff (∃ e : entry) (e in vwcatalog & entry_id(e, s) & entry_item(e,i) )N.B. CH doesn't find this particularly compelling. But we don't actually need to do this.
(for all n, m : node)(for all s: string) (gi(n,catno) & root(n, m) & content(n, s) → (∃ i : item)(vwno(I,s) & transcribes(m, i) & (for all j : item)(vwno(j,s) → i = j) & (for all j : item)(transcribes(m,j) → i = j)))Claus offers an alternative formfor the first j clause:
& ¬(∃ j : item)(vwno(j,s) & ¬(i = j))
(for all n : node) (for all i : item) (for all s : string) (gi(n, ti) & content(n, s) & elem_item(n, i) → item_title(i, s))
element_item(n, i) iff (∃ m : node)(root(n, m) & transcribes(m, i))CH frowns a little over the auxiliary relation.
(for all n : node) (for all s : string) (for all i : item) (gi(n, crncopy) & content(n, s) & element_item(n, i) → has_crn_no(i, s)) has_crn_no(i, s) iff (∃ c)(cornellvol(c) & reproduces(c, i) & volume_no(c, s))
gi(n,'crncopy') & element_item(n, i) & content(n, s) → (∃ j)(copy(j,i) & crnno(j, n))
(for all n : node) (for all i : item) (for all s : string) (gi(n, wabcopy) & element_item(n, i) & content(n, s) → (∃ j : copy)(copies(j, i) & nldescribes(s, j)))
(for all n : node) (for all i : item) (for all s : string) (gi(n, datfrom) & element_item(n, i) & contents(n, s) → est_terminus_a_quo(i, s)) (for all n : node) (for all i : item) (for all s : string) (gi(n, datto) & element_item(n, i) & contents(n, s) → est_terminus_ad_quem(i, s))
... → (for all de, di : date)(written_date(i, di) & string_date(s, de) → di ≤ de)i.e. ‘the actual date the item was written is less than or equal to (comes on or before the date of) the estimate’
(for all n : node) (for all i : item) (for all s : string) (for all di,de : date) (gi(n, datfrom) & element_item(n,i) & contents(n,s) & date_lex_val(s,de) & item_date(i, di) → di ≥ de) (for all n : node) (for all i : item) (for all s : string) (for all di,de : date) (gi(n, datto) & element_item(n,i) & contents(n,s) & date_lex_val(s,de) & item_date(i, di) → di ≤ de)MSM's first version of the preceding used a generic predicate for mapping a lexical form to a value given a type name, so instead of date_lex_val(s, de) he had type_lex_val(date,s,de). That felt uncomfortable to Claus (a bit too much like meta-logic?) so we changed it. The name of a type appearing in an argument position made Claus wary.
(for all n : node) (for all s : string) ¬ gi(n, '#pcdata') → (contents(n, s) iff (∃ m : node) (parent(m, n) & first-child(n, m) & ¬(∃ l : node)(nsib(m,l)) & gi(m,'#pcdata') & contents(m,s)))
(for all n, m : node) (for all i : item) (for all s : string) (gi(n,'transcribers') & contents(n,s) & element_item(n, i) & root(n, m) → (∃ x : persons)(transcribes(x, i, m) & nldescribes(s, x)))One might think first of a two-argument transcribes relation.
g(n, transcribers) & contents(n, s) & root(n, m) → (∃ p)(designates(s, p) & transcribers(p, m) & person(p))
(for all n : node) (for all i : item) (for all s : string) (gi(n, sections) & content(n, s) & element_item(n, i) → section_description(i,s))
gi(n, sections) & content(n, s) & root(n, m) → (for all x : node) (gi(x, sec) & contents(x, t) & root(x,m) → describes_recognition_criteria(t,x))
(for all n : node) (for all i : item) (for all s : string) (gi(n, sections) & content(n, s) & element_item(n, i) → (for all x : node) (gi(x, sec) & root(x,m) → describes_recognition_criteria(s,x))
(for all n : node) (for all m : node) (for all s : string) (gi(n, sec) & element_sectionselement(n, m) & gi(m, sections) & contents(m, s) → describes_recognition_criteria(s, n))
(for all n, m, r : node) (for all s : string) (gi(n, sec) & gi(m, sections) & root(n, r) & root(m, r) & contents(m, s) → describes_recognition_criteria(s, n))
suffix ::= resp cancel* | cancel+ resp ::= '_' handid cancel ::= '_c' handid? handid ::= known | unknown known ::= h | S[1-9] unknown ::= H[1-9] | XA suffix or part of a suffix matching the resp production indicates who is responsible for whatever the generic identifier is identifying: an addition, a deletion, a line in the margin, etc. A suffix or part of a suffix matching the cancel production indicates who is responsible for canceling (a) whatever was done by the hand named in the immediately preceding suffix segment, if there is one, or otherwise (b) whatever is indicated by the generic identifier.
(for all n : node) (for all i : item) (for all s : string) (for all h : node) (for all sh : string) (gi(n,added_S1_cS2) & contents(n,s) & element_item(n,i) & xpath(n,'//hands',h) & contents(h,sh) → (for all c : ucschar) (c in s → (∃ g : graph) (g in i & /* location of g ... */ & added(g) & canceled(g) & (∃ p1 : person) (added_by(g, p1) & nldescribes(hs,p1)) & (∃ p2 : person) (canceled_by(g,p2) & nldescribes(hs,p2)))))where xpath/3 takes a node set as a first argument, an XPath expression as a second, and returns a node set as its third argument (we're here assuming that singleton node sets can be dealt with in the same way as single nodes).
<s>John <us1>loves</us1> Mary.</s>
node(n0001). node(n0001). node(n0003). node(n0004). node(n0005). travord(n0001,1). travord(n0002,2). travord(n0003,3). travord(n0004,4). travord(n0005,5). gi(n0001,s). gi(n0003,us1). gi(n0002,'#pcdata'). gi(n0004,'#pcdata'). gi(n0005,'#pcdata'). first_child(n0001,n0002). first_child(n0003,n0004). nsib(n0002,n0003). nsib(n0003,n0005). parent(n0002,n0001). parent(n0003,n0001). parent(n0004,n0003). parent(n0005,n0001). content(n0002,"John "). content(n0004,"loves"). content(n0005," Mary.").
graph(g1). graph(g2). ... graph(g16).Also, information on what grapheme each graph is assigned to:
grapheme(g1,'J'). grapheme(g2,'o'). ... grapheme(g16'.').And finally we want to capture the order relation within the PCDATA chunks:
gnext(g1,g2). gnext(g2,g3). gnext(g3,g4). gnext(g4,g5). gnext(g6,g7). gnext(g7,g8). gnext(g8,g9). gnext(g9,g10). gnext(g11,g12). gnext(g12,g13). gnext(g13,g14). gnext(g14,g15). gnext(g15,g16).
(for all n, p : node) (for all i : item) (for all s : string) (for all gid : NAME) (for all c : ucs-character) (for all j : positive-integer) (for all eme : grapheme) (gi(n,gid) & transcription_element(gid) & element_item(n, i) & parent(p, n) & gi(p, '#pcdata') & contents(p, s) & position(s, j, c) & ucschar_grapheme(c, eme) → (∃ g : graph) (graph(g) & graph_in_item(g,i) & grapheme(g, eme) & models(c, g) & (for all g2 : graph) (models(c, g2) → g = g2)))In English:
(for all p : node) (for all s : string) (for all c1, c2 : ucs-character) (for all j, k : positive-integer) (for all g1, g2 : graph) (gi(p, '#pcdata') & contents(p, s) & position(s, j, c1) & position(s, k, c2) & models(c1, g1) & models(c2, g2) & j + 1 = k → gnext(g1,g2)In English: