Skeletons in the closet

Saying what markup means

C. M. Sperberg-McQueen

Allen Renear

Claus Huitfeldt

David Dubin

20 June 2002



This document contains informal, incomplete, and half-digested notes on TEI Lite and other DTDs, made as I was trying to formulate some skeleton sentences for them.

1. System overview

2. An XML document

2.1. The document instance

<!DOCTYPE minutes SYSTEM "minutes.dtd" >
<minutes lang="en">
<revisionHistory>
<rev>
<date>2002-03-21</date>
<person>Paul</person>
<what>drafted minutes during meeting</what>
</rev>
<rev>
<date isoform="2002-03-31">31 Mar</date>
<person>Paul</person>
<what>put draft minutes on server</what>
</rev>
<rev>
<date isoform="2002-03-21" lang="la">idem</date>
<person>Paul</person>
<what>found and fixed three typos, supply
isoform attributes for some dates, put fresh copy on server</what>
</rev>
</revisionHistory>
<present>
<person>John</person>
<person>Paul</person>
</present>
<absent>
<person>Pius</person>
</absent>
<date isoform="2002-03-21">21 March 2002</date>
<body>
<p><person>John</person> expressed concern about <person>Pius</person>'s
attendance record.
<resolved>to increase the fine for missed meetings to a nickel.</resolved></p>
<p>We discussed the date and place of the next meeting.</p>
<p>Agreed: <b>Next meeting is <date>28 March</date>, usual place</b>.
<action>
<person>Paul</person>
<what>to order refreshments</what>
<when>26 March</when>
</action>
(N.B. The kitchen is, <i lang="la">mirabile dictu</i>,
demanding two days' notice for refreshments now.)</p>
</body>
</minutes>

2.2. The DTD

Note that some element types occur in different contexts with different functions.
<!ENTITY % phrases "(#PCDATA |date | i | b | person | action | resolved)*" >
<!ENTITY % a.global "lang NMTOKEN #IMPLIED">

<!ELEMENT minutes (revisionHistory?, present, absent, date, body) >
<!ATTLIST minutes  %a.global; >

<!ELEMENT revisionHistory (rev*) >
<!ATTLIST revisionHistory  %a.global; >
<!ELEMENT rev (date, person, what) >
<!ATTLIST rev  %a.global; >

<!ELEMENT present (person*) >
<!ATTLIST present  %a.global; >
<!ELEMENT absent (person*) >
<!ATTLIST absent %a.global; >
<!ELEMENT person (#PCDATA) >
<!ATTLIST person %a.global; >
<!ELEMENT date (#PCDATA) >
<!ATTLIST date %a.global;
          isoform CDATA #IMPLIED>
<!ELEMENT body (p | action | resolved | div)* >
<!ATTLIST body %a.global; >
<!ELEMENT div (head, (p | action | resolved | div)*) >
<!ATTLIST div %a.global; >
<!ELEMENT action (person, what, when) >
<!ATTLIST action %a.global; >
<!ELEMENT resolved %phrases; >
<!ATTLIST resolved %a.global; >
<!ELEMENT what (#PCDATA) >
<!ATTLIST what %a.global; >
<!ELEMENT when (#PCDATA |date)* >
<!ATTLIST when %a.global; >
<!ELEMENT head (#PCDATA | date | i | b | person)* >
<!ATTLIST head %a.global; >
<!ELEMENT p %phrases; >
<!ATTLIST p %a.global; >
<!ELEMENT i %phrases; >
<!ATTLIST i %a.global; >
<!ELEMENT b %phrases; >
<!ATTLIST b %a.global; >

3. Image sentences

3.1. Nodes and traversal order

node(n01).
node(n02).
...
node(n99).
travord(n01,1).
...
travord(n99,99).

3.2. Generic identifiers

[N.B. many #pcdata nodes omitted here]
gi(n01,minutes).
gi(n02,'#pcdata').
gi(n03,revisionHistory).
gi(n04,'#pcdata').
gi(n05,rev).
gi(n07,date).
gi(n10,person).
gi(n13,what).
gi(n17,rev).
gi(n19,date).
gi(n22,person).
gi(n25,what).
gi(n29,rev).
gi(n31,date).
gi(n34,person).
gi(n37,what).
gi(n42,present).
gi(n44,person).
gi(n47,person).
gi(n51,absent).
gi(n53,person).
gi(n57,date).
gi(n60,body).
gi(n62,p).
gi(n63,person).
gi(n66,person).
gi(n69,resolved).
gi(n72,p).
gi(n75,p).
gi(n77,b).
gi(n79,date).
gi(n83,action).
gi(n85,person).
gi(n88,what).
gi(n91,when).
gi(n95,i).

3.3. Attributes and attribute values

attv(n01,lang,'en').
attv(n19,isoform,'2002-03-31').
attv(n31,isoform,'2002-03-21').
attv(n31,lang,'la').
attv(n57,isoform,'2002-03-21').
attv(n95,lang,'la').

3.4. Text (character data) content

[Many nodes with whitespace-only content omitted here]
content(n02,"\n").
content(n04,"\n").
content(n08,"2002-03-21").
content(n11,"Paul").
content(n14,"drafted minutes during meeting").
content(n20,"31 Mar").
content(n23,"Paul").
content(n26,"put draft minutes on server").
content(n32,"idem").
content(n35,"Paul").
content(n38,"found and fixed three typos, supply\nisoform attributes for some dates, put fresh copy on server").
content(n45,"John").
content(n48,"Paul").
content(n54,"Pius").
content(n58,"21 March 2002").
content(n64,"John").
content(n65," expressed concern about ").
content(n67,"Pius").
content(n68,"'s\nattendance record.\n").
content(n70,"to increase the fine for missed meetings to a nickel.").
content(n73,"We discussed the date and place of the next meeting.").
content(n76,"Agreed: ").
content(n78,"Next meeting is ").
content(n80,"28 March").
content(n81,", usual place").
content(n82,".\n").
content(n86,"Paul").
content(n89,"to order refreshments").
content(n92,"26 March").
content(n94,"\n(N.B. The kitchen is, ").
content(n96,"mirabile dictu").
content(n97,",\ndemanding two days' notice for refreshments now.)").

3.5. Tree structure

[Only samples are given.]
first_child(n01,n02).
first_child(n03,n04).
first_child(n05,n06).
first_child(n07,n08).
first_child(n10,n11).
first_child(n13,n14).
first_child(n17,n18).
first_child(n19,n20).
...
nsib(n02,n03).
nsib(n03,n41).
nsib(n04,n05).
nsib(n05,n16).
nsib(n06,n07).
nsib(n07,n09).
nsib(n09,n10).
nsib(n10,n12).
nsib(n12,n13).
nsib(n13,n15).
nsib(n16,n17).
nsib(n17,n28).
nsib(n18,n19).
nsib(n19,n21).
...
parent(n02,n01).
parent(n03,n01).
parent(n04,n03).
parent(n05,n03).
parent(n06,n05).
parent(n07,n05).
parent(n08,n07).
parent(n09,n05).
...
parent(n99,n01).
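
The three relations are partly redundant: parent can be recovered from first_child and nsib. The following is a minimal sketch of that derivation, using a few of the sample facts above. The predicate name derived_parent is our own illustration and does not appear elsewhere in these notes.

```prolog
/* Sample facts copied from the lists above. */
first_child(n01,n02).
first_child(n03,n04).
first_child(n05,n06).
nsib(n02,n03).
nsib(n04,n05).
nsib(n06,n07).
nsib(n07,n09).

/* derived_parent(Child,Parent): hypothetical helper, not in the notes.
   A first child has the node it hangs from as its parent; a later
   sibling shares the parent of the sibling just before it. */
derived_parent(C,P) :- first_child(P,C).
derived_parent(C,P) :- nsib(Prev,C), derived_parent(Prev,P).
```

With these facts the query derived_parent(n09,P) yields P = n05, which agrees with the sample fact parent(n09,n05).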

4. Property axioms and propagation sentences

4.1. Propagation sentences

4.1.1. Kinds of things

The document:
minutes(n01).
Its parts:
revision_list(n03).
present_list(n42).
absent_list(n51).
minutes_body(n60).
date(n57).
[Or alternatively:]
minutes(n1).
present_list(n1,n3). /* n3 is the present list OF n1! */
absent_list(n1,n4).
Items in the revision history:
date(n07).
date(n19).
date(n31).
person_name(n10).
person_name(n22).
person_name(n34).
what(n13).
what(n25).
what(n37).
revision_event(n05).
revision_event(n17).
revision_event(n29).
Items in the present/absent lists:
person_name(n44).
person_name(n47).
person_name(n53).
Items within the body:
p(n62).
p(n72).
p(n75).

resolution(n69).
action_item(n83).
person_name(n85).
what(n88).
when(n91).
Phrases in free prose:
person_name(n63).
person_name(n66).
date(n79).
bold_phrase(n77).
italic_phrase(n95).
Note that
Agreed: <b>Next meeting is <date>28 March</date>, usual place</b>.
is not tagged as a resolved element and is not translated into
resolution(...)

4.1.2. Part-of and sequence relations

Some things are constituent parts of their parents (and others are not). Some things have a meaningful sequence (and others do not). The part_whole and succ relations are subsets of parent and nsib.
part_whole(n05,n03). /* rev is part of revisionHistory */
part_whole(n07,n05). /* date is part of rev */
part_whole(n10,n05). /* person is part of rev */
part_whole(n13,n05). /* what is part of rev */

part_whole(n17,n03). /* rev is part of revisionHistory */
part_whole(n19,n17). /* date is part of rev */
part_whole(n22,n17). /* person is part of rev */
part_whole(n25,n17). /* what is part of rev */

part_whole(n29,n03). /* rev is part of revisionHistory */
part_whole(n31,n29). /* date is part of rev */
part_whole(n34,n29). /* person is part of rev */
part_whole(n37,n29). /* what is part of rev */

part_whole(n42,n01). /* present is part of minutes */
part_whole(n51,n01). /* absent is part of minutes */
part_whole(n57,n01). /* date is part of minutes */
part_whole(n60,n01). /* body is part of minutes */

part_whole(n44,n42). /* person is part of present */
part_whole(n47,n42). /* person is part of present */
part_whole(n53,n51). /* person is part of absent */

part_whole(n62,n60). /* p is part of body */
part_whole(n72,n60). /* p is part of body */
part_whole(n75,n60). /* p is part of body */

part_whole(n85,n83). /* person is part of action */
part_whole(n88,n83). /* what is part of action */
part_whole(n91,n83). /* when is part of action */
/* successor predicate: direct adjacency */
succ(n05,n17). /* the rev elements are ordered */
succ(n17,n29).

succ(n62,n72). /* the p elements within the body are ordered */
succ(n72,n75).

succ(n63,n65). /* the contents of each p element are ordered */
succ(n65,n66).
succ(n66,n68).
succ(n68,n69).
/* paragraph n72 has only one child, no sequencing applies */
succ(n76, n77).
succ(n77, n82).
succ(n82, n83).
succ(n83, n94).
succ(n94, n95).
succ(n95, n97).
N.B. the children of minutes (present, absent, date, and body) are not ordered by the succ predicate; the order of elements in the document carries no meaning.
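
Because part_whole is a subset of parent, it could in principle be generated from the parent facts plus a table saying which element types count as constituents of which. The sketch below assumes such a table; the predicates constituent and derived_part_whole are hypothetical, and the table entries are simply read off the comments above. The parent facts for n62 and n63 are not among the samples in section 3.5 but follow from the document instance.

```prolog
/* Image sentences in the style of sections 3.2 and 3.5. */
gi(n03,revisionHistory).
gi(n05,rev).
gi(n07,date).
gi(n60,body).
gi(n62,p).
gi(n63,person).
parent(n05,n03).
parent(n07,n05).
parent(n62,n60).
parent(n63,n62).

/* constituent(ChildGI,ParentGI): hypothetical table, not in the notes. */
constituent(rev,revisionHistory).
constituent(date,rev).
constituent(person,rev).
constituent(what,rev).
constituent(p,body).

/* Keep only those parent links that the table licenses. */
derived_part_whole(C,P) :-
    parent(C,P),
    gi(C,CG),
    gi(P,PG),
    constituent(CG,PG).
```

Under this table derived_part_whole(n05,n03) holds, while the person name n63 inside paragraph n62 is not counted as a part of that paragraph, matching the list of part_whole facts above.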

4.1.3. Sequence rule

[Tentative]
precedes(X,Y) :- succ(X,Y).
precedes(X,Y) :- succ(X,Z), precedes(Z,Y).
precedes(X,Y) :- parent(X,Z), precedes(Z,Y).
precedes(X,Y) :- parent(Y,Z), precedes(X,Z).
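
To see what these tentative rules yield, they can be loaded together with a handful of facts. The sketch below copies two succ facts from section 4.1.2 and adds the parent links for the date elements of the three rev elements (these follow from the document instance); the rules are exactly the four given above.

```prolog
/* The rev elements are ordered; each date belongs to one rev. */
succ(n05,n17).
succ(n17,n29).
parent(n07,n05).   /* date in the first rev */
parent(n19,n17).   /* date in the second rev */
parent(n31,n29).   /* date in the third rev */

precedes(X,Y) :- succ(X,Y).
precedes(X,Y) :- succ(X,Z), precedes(Z,Y).
precedes(X,Y) :- parent(X,Z), precedes(Z,Y).   /* X inherits its parent's position */
precedes(X,Y) :- parent(Y,Z), precedes(X,Z).   /* Y inherits its parent's position */
```

With these facts the query precedes(n07,n31) succeeds (the date of the first revision precedes that of the third), while precedes(n31,n07) fails.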

4.2. Property axioms

Whenever we have a minutes element, we have a set of minutes for a meeting.
minutes(X) :- node(X), gi(X,minutes).
Whenever we have a present element, we have the list of those present at the meeting. Similarly for absent elements.
present_list(X) :- node(X), gi(X,present).
absent_list(X) :- node(X), gi(X,absent).
When we have a date element, we have a date.
date(X) :- node(X), gi(X,date).
standard_value_of_date(X,Y) :- date(X), attv(X,isoform,Y).
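
The same pattern presumably extends to the other kinds of things listed in section 4.1.1. The sketch below shows how such axioms might look; only the date axioms are taken from these notes, and the remaining four rules are our extrapolation from the propagation sentences, not axioms the notes actually state.

```prolog
/* Sample image sentences from sections 3.1 to 3.3. */
node(n05).
node(n10).
node(n19).
node(n69).
node(n83).
gi(n05,rev).
gi(n10,person).
gi(n19,date).
gi(n69,resolved).
gi(n83,action).
attv(n19,isoform,'2002-03-31').

/* Axioms given in the notes. */
date(X) :- node(X), gi(X,date).
standard_value_of_date(X,Y) :- date(X), attv(X,isoform,Y).

/* Extrapolated axioms (assumptions, following the same pattern). */
person_name(X)    :- node(X), gi(X,person).
revision_event(X) :- node(X), gi(X,rev).
resolution(X)     :- node(X), gi(X,resolved).
action_item(X)    :- node(X), gi(X,action).
```

Loaded this way, the propagation sentences person_name(n10), revision_event(n05), resolution(n69), and action_item(n83) no longer need to be asserted by hand; they follow from the image sentences.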

5. Mapping axioms and application sentences

5.1. Application sentences in English

  • There was a meeting on 21 March 2002.
  • The minutes of the meeting are in the document rooted at n01.
  • John and Paul attended.
  • Pius was absent.
  • John is a person. (And Paul. And Pius.)
  • The meeting produced one resolution, namely to increase the fine for missed meetings to a nickel.
  • The meeting produced one action, namely Paul to order refreshments for the next meeting.

5.2. Application sentences in Prolog

One possible Prolog form (not in the same order):
meeting(m234).
meeting_date(m234,'2002-03-21').
meeting_minutes(m234,n01).
person(p1,"John").
person(p2,"Paul").
person(p3,"Pius").
meeting_attendees(m234,[p1,p2]).
meeting_absentees(m234,[p3]).
resolution(r33,"to increase the fine for missed meetings to a nickel.").
action(a347,p2,informal_date("26 March"),"to order refreshments").
meeting_resolutions(m234,[r33]).
meeting_actions(m234,[a347]).
Another form:
Declare the relevant classes:
class(document). /* ? minutes will be of type document */
class(minutes).
subclass(minutes,document).
class(meeting). 
Assign objects to classes:
object(o1).
obj_class(o1,minutes).

object(o2).
obj_class(o2,meeting).
The meeting took place on 21 March. One way to do this:
property_of(meeting,date,string).
opv(o2,date,"2002-03-21").
N.B. in the sample DTD dates have both a string form (the content of the date element) and a normalized ISO-format form, the latter being optional. We might describe dates this way:
class(date).
property_of(date,localform,string).
property_of(date,isoform,string). 
property_of(meeting,date,object(date)).
object(o3).
obj_class(o3,date).
opv(o2,date,o3).
opv(o3,localform,"21 March 2002").
opv(o3,isoform,"2002-03-21").
The minutes of the meeting are the ones at node n01.
property_of(meeting,minutes,object(minutes)).
opv(o2,minutes,o1).
John and Paul attended; Pius was absent. They are all persons. Since the number of people who attend or are absent can in principle be arbitrarily large, we express attendance and non-attendance not as properties but as relations.
class(person).
property_of(person,name,string).

object(o4).
obj_class(o4,person). 
opv(o4,name,"John").

object(o5).
obj_class(o5,person).
opv(o5,name,"Paul").

object(o6).
obj_class(o6,person).
opv(o6,name,"Pius").

relation(person_attends_meeting,[person, meeting]).
relation(person_misses_meeting,[person, meeting]).
relation_applies(person_attends_meeting,o4,o2).
relation_applies(person_attends_meeting,o5,o2).
relation_applies(person_misses_meeting,o6,o2).
Defining classes for resolutions and action items:
class(resolution).
class(action).
property_of(resolution, meeting, object(meeting)).
property_of(resolution, text, string).

property_of(action, meeting, object(meeting)).
property_of(action, who, string).
property_of(action, when, string).
property_of(action, what, string).
An inverse relation:
relation(meeting_makes_resolution,[meeting,resolution]).
relation(meeting_assigns_action,[meeting,action]).
Thus armed, we can specify the resolution and action item agreed at this meeting:
object(o7). 
obj_class(o7,resolution).
opv(o7,text,"to increase the fine for missed meetings to a nickel.").
opv(o7,meeting,o2).
relation_applies(meeting_makes_resolution,o2,o7).

object(o8).
obj_class(o8,action).
opv(o8,who,"Paul").
opv(o8,when,"26 March").
opv(o8,what,"to order refreshments").
opv(o8,meeting,o2).
relation_applies(meeting_assigns_action,o2,o8).

object(o9).
obj_class(o9,date).
opv(o9,localform,"26 March").
relation(action_duedate,[action,date]).
relation_applies(action_duedate,o8,o9).

5.3. Generating the application sentences

[to be supplied]
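
Until that account is supplied, the following gives a rough idea of the shape such a derivation could take. It is only a sketch under our own assumptions, not the method intended by these notes: the mapping rules, the predicate names attendee_name, absentee_name, and listed_name, and the choice to return names rather than person objects are all hypothetical.

```prolog
/* Image sentences for the present and absent lists
   (in the style of sections 3.2, 3.4, and 3.5). */
gi(n42,present).
gi(n44,person).
gi(n47,person).
gi(n51,absent).
gi(n53,person).
parent(n44,n42).
parent(n47,n42).
parent(n53,n51).
parent(n45,n44).
parent(n48,n47).
parent(n54,n53).
content(n45,"John").
content(n48,"Paul").
content(n54,"Pius").

/* Hypothetical mapping rules: a person element inside a present
   (absent) element names someone who attended (missed) the meeting. */
attendee_name(Name) :- listed_name(present,Name).
absentee_name(Name) :- listed_name(absent,Name).

/* listed_name(ListGI,Name): Name is the text of a person element
   whose parent is an element with generic identifier ListGI. */
listed_name(ListGI,Name) :-
    gi(List,ListGI),
    parent(Person,List),
    gi(Person,person),
    parent(Text,Person),
    content(Text,Name).
```

A query such as findall(N, attendee_name(N), L) collects the names in the present list; turning those names into person objects and relation_applies facts of the kind shown in section 5.2 would be a further step.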

5.4. World knowledge and further inferences

Some further inferences we make based on world knowledge:
  • Since there was a meeting organized enough to have written minutes, the chances are that this is a meeting of some standing organization. Counter-indications would be labels like ad hoc, open public meeting, etc., or the absence of a list of people absent. (For purposes of reference, let us refer to this putative organization as O, and to the hypothesis of O's existence as H1.)
  • The membership of O on the date of the meeting was (probably) John, Paul, and Pius.
  • Meetings take place on a given date, and (for conventional meetings, not teleconferences) at a particular location. (Let's call the location L and the date D.)
  • Since they attended the meeting, John and Paul were presumably physically in location L on date D, at least at the time of the meeting.
  • Since he did not attend the meeting, Pius may be presumed not in location L on date D at the time of the meeting. Note that for purposes of this inference, location L must be taken in a very narrow sense, not a city or even a building, but a room.
  • Since they attended the meeting, John and Paul may be presumed alive on date D, at least at the time of the meeting.
  • Since he did not attend the meeting, we cannot infer with certainty that Pius was alive at the time of the meeting; since he is listed as absent, though, we may infer that he was expected.
  • Since Pius was expected at the meeting, we may infer that (as far as John and Paul knew) he was alive at the time of the meeting.[1]
  • The names of the attendees and absentee, and their propensity for falling into Latin, suggest that O is perhaps a committee or club[2] whose members served as popes sometime during [the middle decades of?] the twentieth century. Let us refer to this hypothesis as H2.[3]
  • If H2 is correct, then there is only one place (well, theoretically two places; could this explain why Benedict XV, Pius X and XI, and John Paul I are not listed either as present or as absent?) where O could be holding its meetings.[4] In this case, we can conclude that the document either has a miraculous provenance, or is a fiction.
These inferences rely not only on the application sentences we have generated from this document, but also on real-world knowledge about meetings, the taking of minutes of meetings, etc. We do not know whether any knowledge base exists from which such inferences could be drawn; the use of application sentences for further inferences with the help of a general purpose knowledge base is one possible application of the system we have sketched out, but it lies, strictly speaking, beyond the scope of our project at present.
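
A couple of these inferences can at least be written down as rules over the application sentences. The sketch below is purely illustrative: the predicates presumed_alive and expected_at, and the rules themselves, are our assumptions standing in for a real knowledge base.

```prolog
/* Application sentences from section 5.2 (second form). */
relation_applies(person_attends_meeting,o4,o2).
relation_applies(person_attends_meeting,o5,o2).
relation_applies(person_misses_meeting,o6,o2).

/* Hypothetical world-knowledge rules. */

/* Whoever attends a meeting is presumed alive at the time of it. */
presumed_alive(Person,Meeting) :-
    relation_applies(person_attends_meeting,Person,Meeting).

/* Whoever is listed as absent was expected at the meeting. */
expected_at(Person,Meeting) :-
    relation_applies(person_misses_meeting,Person,Meeting).
```

Here presumed_alive(o4,o2) and presumed_alive(o5,o2) hold for John and Paul, whereas for Pius (o6) only expected_at(o6,o2) can be derived, mirroring the asymmetry described in the list above.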

Notes

[1] Opinions differ about whether it is conceivable that someone known to be recently deceased could be listed under “Absent” in the minutes of a meeting.
[2] The nickel fine for absenteeism really does strongly suggest a club, as does the emphasis on refreshments. But this observation relies on an understanding of the text, not just on the application sentences derived from the markup.
[3] If H2 holds, then the name “Pius” is ambiguous: Pius X, XI, or XII. There are, of course, many other candidate identifications for these three names; even among the popes, there are other Johns, Pauls, and Piuses. But it is not until the twentieth century that any three popes who took these three names were close enough in time to have known each other. But see also below.
[4] Note that John XXIII did not have the name John while Pius XII was alive, and Paul VI took that name only after both the others were dead. Note also that on the purported date, none of the three was alive on earth.