Report on Technical Seminar in Lisbon
C. M. Sperberg-McQueen
24 July 1994
The editors of the TEI have just returned from Lisbon, where they gave a
two-and-a-half day technical seminar or workshop on SGML and the TEI for
a research group at the New University, with interests focused on
terminological databases, dictionaries, and corpus-based lexicography
and lexicology. For the work group, the workshop represented a chance
to learn more about SGML and the Text Encoding Initiative. For the
editors, it provided an opportunity to experiment with the design of a
multi-day workshop for specialist users.
The first afternoon of the workshop, after we had clarified organizational
questions, was devoted to an introduction to SGML and to the overall
architecture of the TEI. We began 'in medias res' by examining together
two sample SGML documents (a very simple 'Hello, world!' document, which
we examined in an ASCII editor, and a more realistic, though still
relatively simple, document in TEI markup, containing the proposed
syllabus for the workshop, which we examined both in an ASCII editor and
in an SGML editor that provided nice on-screen formatting for the
document). Following this, LB presented some basic philosophical and
practical observations on the nature and necessity of markup, and the
design and goals of the TEI document type definitions. MSM then
outlined the process of document analysis with which any serious project
in electronic text encoding needs to begin, and we ended the day by
applying that process to a sample text from the text-base being built in
Lisbon for terminological work.
On the second day, we began with a description of corpus construction
and TEI facilities for corpus planning and documentation, with a side
glance at linguistic annotation as practiced in current corpora such as
the British National Corpus. The rest of the morning went to a survey
of the TEI tag set for terminological databases, which is the basis of
current work in ISO Technical Committee 37, and of fairly direct
relevance to the work of the host research group. After lunch, we
continued by tagging, with TEI markup, a portion of the document we had
analysed the previous afternoon, and then the participants in the
seminar were turned loose on machines equipped with Author/Editor and a
selection of pre-compiled versions of the TEI DTD.
The final day of the workshop was devoted to dictionary encoding, with
examples from Portuguese and French dictionaries, to a demonstration of
SARA, the SGML-aware interactive concordance software being developed
for the British National Corpus, and to more hands-on work. We finished
with a plenary discussion of the workshop, in which the participants
gave the editors a number of very useful suggestions, which will, we
hope, benefit participants in future workshops.
We thank the research group and in particular Prof. Theresa Lino for
their invitation and kind hospitality --- and for their patience with
our non-existent Portuguese and imperfect French (on average, that is:
LB's French is impeccable, MSM's is, well, highly peccable). Thanks are
also due to Softquad, for a set of temporary licenses for Author/Editor,
and to the British National Corpus for authorizing the demonstration of
SARA.
Research groups, professional societies, or others interested in
organizing workshops on the use of SGML and the TEI in their particular
fields should contact the editors. We are in the process of preparing a
workshop for this fall, in which we will train a number of individuals to
teach such workshops, and we hope to be in a position to accommodate as
many such requests as humanly possible in the next couple of years.