Grail for regular languages

[11 December 2009]

Every now and then — not constantly, but recurrently — I experience a strong need to have a running copy of Grail, a software package first written by Derick Wood and Darrell Raymond and described by its documentation as “a symbolic computation environment for finite-state machines, regular expressions, and other formal language theory objects.”

Among other things, Grail is handy for answering questions about the equivalence or non-equivalence of regular expressions, or about subset/superset relations holding between the languages recognized by them. A few years ago, for example, the W3C XML Schema Working Group found itself in possession of two different descriptions of the lexical space of the XSD duration type. The working group wished, not unreasonably, to check that the two really were equivalent.

The first description provided three regular expressions, and said the lexical space of duration included all the strings which matched all three expressions:

  • -?P([0-9]+Y)?([0-9]+M)?([0-9]+D)?(T([0-9]+H)?([0-9]+M)?([0-9]+(.[0-9]+)?S)?)? (strings in which the fields of an ISO 8601 duration appear in the correct order, and in which each field appears only if it has at least one digit present)
  • .*[YMDHS].* (strings in which at least one field is present)
  • [^T]+(T[^HMS]+[HMS].*)? (if the character T appears, it must be followed by one of the time-related fields)

The second description translated the context-free grammar into regular-expression form (I’ve introduced white space for legibility):

-?P(([0-9]+Y)([0-9]+M)?([0-9]+D)?
(T(([0-9]+H)([0-9]+M)?([0-9]+(.[0-9]+)?S)?
|([0-9]+M)?([0-9]+(.[0-9]+)?S)?
|([0-9]+(.[0-9]+)?S)))?
|([0-9]+M)([0-9]+D)?
(T(([0-9]+H)([0-9]+M)?([0-9]+(.[0-9]+)?S)?
|([0-9]+M)?([0-9]+(.[0-9]+)?S)?
|([0-9]+(.[0-9]+)?S)))?
|([0-9]+D)?
(T(([0-9]+H)([0-9]+M)?([0-9]+(.[0-9]+)?S)?
|([0-9]+M)?([0-9]+(.[0-9]+)?S)?
|([0-9]+(.[0-9]+)?S)))?
|T(([0-9]+H)([0-9]+M)?([0-9]+(.[0-9]+)?S)?
|([0-9]+M)?([0-9]+(.[0-9]+)?S)?
|([0-9]+(.[0-9]+)?S)))

Easy enough to eyeball, for some people, I guess, but the working group wanted a more reliable method.

After a few hours trying vainly to compile Grail for my Linux box, I found an RPM that worked for me, and in ten minutes or so I had used Grail to establish that the two descriptions are equivalent.

Today I realized that another problem I face could best be solved by using Grail, but I no longer have a Linux box (and have not, in any case, found that old RPM). Grail 2.5 is dated March 1996, and the C++ in which it is written does not seem to please GCC 4.0.1. Grail+ 3.0, a successor project in other hands, may have been touched as recently as 2002 or 2004, but most of the dates appear to be in summer or fall 1998. GCC doesn’t like it, either.

So I have thus far been unable to recompile this very helpful tool.

If anyone out there knows of anyone who has either massaged the source of Grail into a form more like what modern C++ compilers will compile, or found out what combination of compile-time flags will persuade GCC to put itself in a more forgiving frame of mind and compile the thing, please get in touch. (And no, -Wno-deprecated does not suffice to do the trick.)

And any C++ proficients looking for interesting and useful projects to undertake could do a lot worse for themselves and for the world than to bring Grail into the twenty-first century.

4 thoughts on “Grail for regular languages

  1. My proficiency in C++ is modest, but I do have too much time on my hands. So, I would be interested to see what I can do.

    However, I hesitate to put forth much effort because of the assertion that Grail+ is not free, merely gratis for academics and the just curious. The page does not say who owns the copyright, nor does it say anything about copyright on (the original) Grail.

    Do you have any thoughts on the matter?

    Cheers,
    Terry.

  2. The 2.5 source files contain a notice reading

    // This code copyright (c) by the Grail project.
    // No commercial use permitted without written consent.

    I assume the Grail project consists (or consisted), more or less, of Derick Wood and Darrell Raymond, though the institutions where they did the work may also have some claims. That means, I guess, that it was naive and over-simple to suggest as I did that anyone interested could and should consider updating the code. At the very least, before doing any work on the code with public distribution in view, one should get the blessing of the owners of the intellectual property rights.

    How licensing issues do complicate life!

    Murata Makoto (who first made me aware of Grail) has pointed out that several other packages for dealing with automata are available; at first glance, though, they all seem either aimed at providing a GUI interface or only to provide libraries to call from your own programs, none at the kind of pipelining for which the Grail tools were designed. Sigh.

Comments are closed.