<p><del>It was be</del> <del>For</del> When we applied to Your Excellency for leave to adjourn it was because we foresaw that we <del>were</del> <add>should continue</add> wasting our own time ... </p>

From the del elements, the reader of the document is licensed to infer that the letters "It was be", "For", and "were" are marked as deleted; from the add element, the reader may infer that the words "should continue" have been added. Software might rely on these inferences in the course of making a concordance or displaying a clear text; human readers will rely on them in interpreting the historical document. Note that the markup here stops short of licensing the inference that "should continue" was substituted for "were". The editors could license that inference as well by appropriate markup, if they wished. Human readers may make the inference on their own, given the linguistic context; software cannot safely infer a substitution every time an addition is adjacent to a deletion.
node([1,5,2],element(p)).
node([1,5,2,1],element(del)).
node([1,5,2,1,1],pcdata("I")).
node([1,5,2,1,2],pcdata("t")).
node([1,5,2,1,3],pcdata(" ")).
node([1,5,2,1,4],pcdata("w")).
node([1,5,2,1,5],pcdata("a")).
node([1,5,2,1,6],pcdata("s")).
node([1,5,2,1,7],pcdata(" ")).
node([1,5,2,1,8],pcdata("b")).
node([1,5,2,1,9],pcdata("e")).
node([1,5,2,2],pcdata(" ")).
node([1,5,2,3],element(del)).
node([1,5,2,3,1],pcdata("F")).
node([1,5,2,3,2],pcdata("o")).
node([1,5,2,3,3],pcdata("r")).

The first argument of the node predicate is a numeric path expression, a sequence of numbers representing the position of the node in the tree. The path [1,5,2] denotes the second child of the fifth child of the root element; its children are [1,5,2,1], [1,5,2,2], etc. The second argument is a structure: either a term with the functor element and an argument giving the generic identifier of the element's type, or else a term with the functor pcdata and an argument consisting of a single character. The facts above can be read "There is a node at location 1.5.2, and it is an element of type p," "There is a node at location 1.5.2.1.1, and it is the character I," and so on.
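As a rough illustration of the same data model outside Prolog, the following Python sketch stores a few of the node facts in a dictionary keyed by path tuples. The names `nodes`, `children`, and `read_fact`, and the tuple encoding itself, are assumptions made for this sketch; they are not part of the system described here.

```python
# Hypothetical Python rendering of a few node/2 facts; paths are tuples of ints.
nodes = {
    (1, 5, 2): ("element", "p"),
    (1, 5, 2, 1): ("element", "del"),
    (1, 5, 2, 1, 1): ("pcdata", "I"),
    (1, 5, 2, 1, 2): ("pcdata", "t"),
    (1, 5, 2, 2): ("pcdata", " "),
    (1, 5, 2, 3): ("element", "del"),
}


def children(path):
    """Return the recorded paths exactly one step below `path`, in document order."""
    n = len(path)
    return sorted(p for p in nodes if len(p) == n + 1 and p[:n] == path)


def read_fact(path):
    """Spell out one fact in the style of the prose reading of node/2."""
    kind, value = nodes[path]
    location = ".".join(str(step) for step in path)
    if kind == "element":
        return f"There is a node at location {location}, and it is an element of type {value}"
    return f"There is a node at location {location}, and it is the character {value}"
```

Keying the dictionary by the full path keeps the parent-child relation implicit in the keys, just as it is implicit in the Prolog path lists.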
attr([1,5,2],id,implied).
attr([1,5,2],n,implied).
attr([1,5,2],lang,implied).
attr([1,5,2],rend,implied).
attr([1,5,2],teiform,"p").

The first argument is the path expression for the node; the second is the attribute name; the third is either the keyword implied or else the value, represented as a quoted string. These facts can be read "The element at node 1.5.2 has an attribute named id, the value of which is implied," "The element at 1.5.2 has an attribute named teiform, the value of which is the string "p"."
p([1,5,2]).
del([1,5,2,1]).
del([1,5,2,3]).
add([1,5,2,97]).

This notation is convenient for checking whether a known location has some specific property, or for finding all the locations which have some specific property. It is less convenient for finding all the properties which apply to some known location, however, so in practice we will express these facts in a different notation:
property_applies(p,[1,5,2]).
property_applies(del,[1,5,2,1]).
property_applies(del,[1,5,2,3]).
property_applies(add,[1,5,2,97]).

In this notation, the first argument is the name of the property predicated of some location; the second argument is the path expression for that location. These facts can be read "The property p is predicated of node 1.5.2," and so on.
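The practical difference between the two notations can be sketched in Python: once the property name is an ordinary value rather than a predicate name, the same table answers both "where does this property hold?" and "which properties hold here?". The function names `locations_with` and `properties_at` are hypothetical and chosen only for this illustration.

```python
# Hypothetical rendering of the property_applies/2 facts as (property, path) pairs.
property_applies = [
    ("p", (1, 5, 2)),
    ("del", (1, 5, 2, 1)),
    ("del", (1, 5, 2, 3)),
    ("add", (1, 5, 2, 97)),
]


def locations_with(prop):
    """All locations of which `prop` is directly predicated."""
    return [path for name, path in property_applies if name == prop]


def properties_at(path):
    """All properties directly predicated of `path`."""
    return [name for name, loc in property_applies if loc == path]
```

With one predicate per property, as in p([1,5,2]), the second query would require knowing every property name in advance.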
english([1]).

to record the fact that the document is in English.

language([1],english).
property_applies(language,[1],english).

In order to have a closer parallel with the one-argument properties, however, we choose yet another form for these facts:
infer(Property,Loc) :- node(Loc,element(Property)).
infer(Property,Loc) :- node(Anc,element(Property)), descendant(Loc,Anc).

A property can be inferred for a particular location if that location is an element and its generic identifier is the name of the property (in other words, if it is directly predicated of that location). It can also be inferred for that location if it is predicated of some other location of which that location is a descendant. Checking for the ancestor-descendant relation is simple, using path expressions: if the second path is a prefix of the first path, the two paths denote a descendant and an ancestor.
descendant([H,_|_],[H]).
descendant([H|TDesc],[H|TAnc]) :- descendant(TDesc,TAnc).

(Roughly: the first argument denotes a descendant of the second argument if the two paths start with the same numbers and the second one ends first.)
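The prefix test can be restated in Python as a minimal sketch, assuming paths are given as sequences of integers. Like the two Prolog clauses, it requires the ancestor path to be non-empty and strictly shorter than the descendant path, so a node does not count as its own descendant.

```python
def descendant(desc, anc):
    """True if `anc` is a proper, non-empty prefix of `desc`."""
    desc, anc = tuple(desc), tuple(anc)
    return 0 < len(anc) < len(desc) and desc[:len(anc)] == anc
```

For example, `descendant([1, 5, 2, 1], [1, 5, 2])` holds, while swapping the arguments or passing the same path twice does not.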
infer(Prop,Loc) :- attr(Loc,Att,Val), not(Val = implied), Prop =.. [Att,Val].
infer(Prop,Loc) :- attr(Anc,Att,Val), not(Val = implied), Prop =.. [Att,Val], descendant(Loc,Anc).

(Infer a property Prop for location Loc if Loc, or an ancestor of Loc, has an attribute Att with a value Val other than implied, and Prop has the form Att(Val).)
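Taken together, the element rules and the attribute rules amount to collecting everything predicated of a location or of any of its ancestors. The Python sketch below is one way to mimic that behaviour; the paths assigned to doc and docbody, and the placement of the id and lang values on the doc element, are assumptions made purely for illustration, as are the names `IMPLIED`, `at_or_above`, and `infer`.

```python
IMPLIED = object()  # stands in for the Prolog keyword `implied`

# Illustrative facts. The paths for doc and docbody, and the element that
# carries id and lang, are assumptions of this sketch.
nodes = {
    (1,): "doc",
    (1, 5): "docbody",
    (1, 5, 2): "p",
}
attrs = [
    ((1,), "id", "HL10305"),
    ((1,), "lang", "eng"),
    ((1, 5, 2), "id", IMPLIED),
    ((1, 5, 2), "teiform", "p"),
]


def at_or_above(loc, other):
    """True if `other` is `loc` itself or an ancestor of it (a prefix of its path)."""
    return len(other) <= len(loc) and loc[:len(other)] == other


def infer(loc):
    """Collect every property the inheritance rules license for `loc`."""
    found = set()
    for path, gi in nodes.items():
        if at_or_above(loc, path):
            found.add(gi)  # element-type property, direct or inherited
    for path, att, val in attrs:
        if val is not IMPLIED and at_or_above(loc, path):
            found.add((att, val))  # attribute property of the form Att(Val)
    return found
```

Implied attribute values are skipped, which corresponds to the `not(Val = implied)` goal in the Prolog rules.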
?- infer(Property,[1,5,2]).
Property = p ->;
Property = doc ->;
Property = docbody ->;
Property = teiform([112]) ->;
Property = id([72,76,49,48,51,48,53]) ->;
Property = lang([101,110,103]) ->;
no
?-

The lists of numbers are the default Prolog representation for the strings "p", "HL10305", and "eng". In this case, the paragraph in our example has the properties p (paragraph), doc (document), docbody (body of the document), teiform("p"), id("HL10305"), and lang("eng").
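The correspondence between the number lists and the strings is simply character codes. A small Python helper (the name `codes_to_string` is hypothetical) makes the decoding explicit:

```python
def codes_to_string(codes):
    """Turn a Prolog-style list of character codes back into a string."""
    return "".join(chr(code) for code in codes)
```

Applied to the lists in the session above, it recovers the id and lang values as readable strings.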
?- infer(Property,[1,5,2|Tail]).
Property = p Tail =  ->;
Property = del Tail =  ->;
Property = del Tail =  ->;
Property = del Tail =  ->;
Property = add Tail =  ->;
Property = person Tail =  ->;
Property = del Tail =  ->;
Property = add Tail =  ->

The inquiry asks what properties are associated with any node whose path begins with the prefix [1,5,2]; the results show the property and the continuation of the path (the Tail).
?- infer(del,Loc).
Loc = [1,5,2,1] ->;
Loc = [1,5,2,3] ->;
Loc = [1,5,2,95] ->;
Loc = [1,5,2,318] ->;
Loc = [1,5,2,348] ->;
Loc = [1,5,2,717] ->;
Loc = [1,5,2,719,57] ->;
Loc = [1,5,2,866] ->;
Loc = [1,5,2,917] -> ...
<P>Reader, I married him.</P>

we can infer the existence of one paragraph, but we cannot infer that the word "Reader" is itself a paragraph. We can, however, infer that it has the property of being within a paragraph.
<hi rend="gothic">And this Indenture further witnesseth</hi> that the said <hi rend="italic">Walter Shandy</hi>, merchant, in consideration of the said intended marriage ...

On the straw-man model, we can infer both that the words "And this" are rendered in black-letter ('gothic') type and that the word "Indenture" is similarly rendered. That is, the example as given above is strictly synonymous with the following example:
<P><HI REND="gothic">And</HI> <HI REND="gothic">this</HI> <HI REND="gothic">Indenture</HI> <HI REND="gothic">further</HI> <HI REND="gothic">witnesseth</HI> that the said <HI REND="italic">Walter Shandy</HI>, merchant, in consideration of the said intended marriage ... </P>

It makes no difference whether the phrase "And this Indenture further witnesseth" occurs in one or five hi elements: the property of typographic highlighting is distributed equally over each word (in fact, each letter) of the contents. It is as true to say "The word And is in black-letter" as to say "The phrase And this Indenture further witnesseth is in black-letter."
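The claimed synonymy of the two encodings can be checked with a small Python sketch. It assumes, for illustration only, that each hi element is reduced to a (rend, text) pair and that the property is distributed word by word; `word_properties`, `one_hi`, and `five_hi` are hypothetical names.

```python
def word_properties(segments):
    """Distribute each segment's rend value over the words it contains."""
    result = []
    for rend, text in segments:
        for word in text.split():
            result.append((word, rend))
    return result


# One hi element around the whole phrase ...
one_hi = [("gothic", "And this Indenture further witnesseth")]
# ... versus five hi elements, one per word.
five_hi = [("gothic", w) for w in ["And", "this", "Indenture", "further", "witnesseth"]]
```

On the straw-man model both encodings yield the same word-level properties, so `word_properties(one_hi)` and `word_properties(five_hi)` are equal.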
<doc lang="en"> <p>Wittgenstein wrote: <q lang="de"><ital>Die Welt ist alles, was der Fall ist.</ital></q> It is hard to escape, at first reading, the suspicion that Wittgenstein is guilty here of a gross platitude; it is only after reading the rest of the <title lang="la">Tractatus</title> that on returning to its famous first sentence one appreciates the depths of its intension.</p> </doc>

Given the definition of lang above, we are licensed to infer, from this document, that the contents of the doc element (path 1) are in English, and that the contents of the q element (path 1.1.22) are in German:
?- infer(lang("en"),[1]).
yes
?- infer(lang("de"),[1,1,22]).
yes
?-
?- infer(lang("en"),[1,1,22]). yes ?-
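The queries above can be mimicked with a short Python sketch of the inheritance rule applied to the lang attribute alone. The table `lang_attrs` and the function `inferred_langs` are hypothetical; the path for the q element is the one given in the text.

```python
# lang values recorded directly on elements of the sample document.
lang_attrs = {
    (1,): "en",        # <doc lang="en">
    (1, 1, 22): "de",  # <q lang="de">, path as given in the text
}


def inferred_langs(loc):
    """Every lang value predicated of `loc` itself or of one of its ancestors."""
    return {
        val
        for path, val in lang_attrs.items()
        if len(path) <= len(loc) and loc[:len(path)] == path
    }
```

Because the rule lets every ancestor's value through, the q element ends up with both "en" and "de" among its inferable languages, which is exactly what the last query reports.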
property_applies(title_of([1,2,5,3]),[1,2,5,3,4]).

in which the property predicated of element 1.2.5.3.4 is the property "being the title of the bibliographic item represented by element 1.2.5.3". This method is well known from functional programming; it amounts to replacing an n-ary function with a function of arity n-1 which in turn returns a unary function. We have split off the first argument of the title_of predicate; we could equally well split off the other, to yield
property_applies(has_title([1,2,5,3,4]),[1,2,5,3]).

which predicates, of element 1.2.5.3, the property of "being a bibliographic item whose title is in element 1.2.5.3.4".
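The same splitting of a two-place relation into one-place properties can be sketched in Python with closures. The set `TITLE` and the way the relation is stored are assumptions made for this illustration; only the names title_of and has_title come from the facts above.

```python
# Hypothetical two-place relation: (bibliographic item, element holding its title).
TITLE = {((1, 2, 5, 3), (1, 2, 5, 3, 4))}


def title_of(item):
    """Fix the item; return a one-place property of candidate title locations."""
    return lambda loc: (item, loc) in TITLE


def has_title(title):
    """Fix the title; return a one-place property of candidate item locations."""
    return lambda loc: (loc, title) in TITLE
```

Either function turns the binary relation into a unary property that can be predicated of a single location, mirroring the two property_applies facts.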