• String-Based Macro Systems

    From Lawrence D'Oliveiro@21:1/5 to All on Sat Apr 13 02:29:56 2024
    I think most of us are familiar with the “#define” preprocessor in C and C++. There are more powerful macro processors around, like GNU m4. They
    all have the same basic concept: pass input text straight through to
    output, until something triggers a macro substitution on the text.

    The original m4 was created by the Unix folks at Bell Labs, modelled on an earlier concept called “Macrogenerator” by Christopher Strachey (one of
    the brains behind the programming language CPL, which led to BCPL, which
    led to B and then C). Macrogenerator had special symbols to indicate macro definition, and macro and argument expansion:

    §DEF,«name»,<«definition»>;

    where the “<” and “>” are actual quote symbols in the notation, while I use “«” and “»” as metasyntactic brackets. Within the «definition», occurrences of “~1”, “~2” etc. are replaced with the first, second etc. actual argument specified in the call. You then use this macro as

    §«name»,«args»;

    where multiple arguments are comma-separated.

    Simple example: given

    §DEF,greetings,<Hello, ~1!>;

    then

    I would just like to say, “§greetings,world;” to anybody listening

    should expand to

    I would just like to say, “Hello, world!” to anybody listening

    Here is a moderately interesting example, from the Bryan Higman book where
    I first heard about this. It uses a builtin called §UPDATE, which assigns a new definition to an existing macro name. Note also the occurrence of
    §DEFs within §DEFs, for local (temporary) macro definitions. Since the auxiliary macro §Q has to persist between invocations, it cannot be one of these:

    §DEF,Q,A;
    §DEF,AORB,<§§Q;;>,§DEF,A,<A§UPDATE,Q,B;>;,§DEF,B,<B§UPDATE,Q,A;>;;

    What this does is, each time you write “§AORB;”, it expands alternately to “A” and “B”.
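
    Both examples above can be tried out with a small Python sketch of a Macrogenerator-style expander. This is an illustration under simplifying assumptions, not Strachey's implementation: the function names (expand, _scan, _call, _quoted) are invented here, all definitions are kept in one global table rather than being scoped to the enclosing call, and only the §DEF and §UPDATE builtins are handled.

```python
def _quoted(text, pos):
    """text[pos] is '<'. Return (inner text, index just past the matching '>')."""
    depth, start = 1, pos + 1
    pos = start
    while pos < len(text):
        if text[pos] == "<":
            depth += 1
        elif text[pos] == ">":
            depth -= 1
            if depth == 0:
                return text[start:pos], pos + 1
        pos += 1
    raise ValueError("unpaired '<'")


def _scan(text, pos, macros, args, stops):
    """Expand text from pos until a character in stops (or the end of text)."""
    out = []
    while pos < len(text):
        ch = text[pos]
        if ch in stops:
            return "".join(out), pos
        if ch == "<":      # quoted text: strip one layer, copy unexpanded
            literal, pos = _quoted(text, pos)
            out.append(literal)
        elif ch == "§":    # macro call: evaluate it and splice in the result
            value, pos = _call(text, pos + 1, macros, args)
            out.append(value)
        elif ch == "~" and text[pos + 1:pos + 2].isdecimal():
            n = int(text[pos + 1])   # ~N: copy argument N of the current call
            out.append(args[n] if n < len(args) else "")
            pos += 2
        else:
            out.append(ch)
            pos += 1
    if stops:
        raise ValueError("unterminated macro call")
    return "".join(out), pos


def _call(text, pos, macros, outer_args):
    """Parse 'name,arg,...;' starting at pos; return (expansion, new position)."""
    args = []
    while True:
        arg, pos = _scan(text, pos, macros, outer_args, ",;")
        args.append(arg)
        pos += 1                     # step over the "," or ";" that stopped us
        if text[pos - 1] == ";":
            break
    name = args[0]
    if name in ("DEF", "UPDATE"):    # simplification: both just store globally
        macros[args[1]] = args[2]
        return "", pos
    # The body is scanned again, so any calls inside it are expanded in turn.
    value, _ = _scan(macros[name], 0, macros, args, "")
    return value, pos


def expand(text, macros):
    """Expand all macro calls in text, updating the macros table in place."""
    return _scan(text, 0, macros, [], "")[0]


macros = {}
expand("§DEF,greetings,<Hello, ~1!>;", macros)
print(expand("I would just like to say, “§greetings,world;” to anybody listening", macros))
expand("§DEF,Q,A;§DEF,AORB,<§§Q;;>,§DEF,A,<A§UPDATE,Q,B;>;,§DEF,B,<B§UPDATE,Q,A;>;;", macros)
print(expand("§AORB;§AORB;§AORB;§AORB;", macros))  # prints ABAB
```

    Because this sketch stores every definition globally, §A and §B simply persist once the outer §DEF has been evaluated; a faithful implementation would need to model the scoping of local definitions.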

    The big difference with m4 is that it does away with these special
    symbols; the mere occurrence of a name matching a defined macro (or an
    argument of the macro currently being expanded) is sufficient to trigger substitution. Do you think this is a good idea?
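
    To make the name-triggered style concrete, here is a deliberately tiny Python sketch. It only imitates the way names are recognised (a letter or underscore followed by letters, digits or underscores) and replaces any name that matches a definition. The function name expand_bare and the sample definition are invented for illustration; real m4 additionally handles arguments, quoting and rescanning of the result.

```python
import re

# A "name" here is a letter or underscore followed by letters, digits, underscores.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def expand_bare(text, macros):
    """Replace every bare name that has a definition; leave all other text alone."""
    return WORD.sub(lambda m: macros.get(m.group(0), m.group(0)), text)


print(expand_bare("Hello, world! Brave new world, worldwide.", {"world": "planet"}))
# prints: Hello, planet! Brave new planet, worldwide.
```

    Every free-standing “world” is replaced whether or not a substitution was intended, while “worldwide” is left alone only because it is a different token.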

    There are all kinds of pitfalls with such macro systems. The original Macrogenerator could not cope with substitutions containing unpaired
    “< ... >” quote symbols, and even GNU m4 lacks something as simple as a backslash-style “escape next single character, whatever it is”. While m4 lets you switch the quoting symbols, it still insists that they occur in
    pairs.

    Would adding such an escape character be useful?
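
    As a rough illustration of what such an escape could look like, the following Python sketch reads one quoted string with nesting “<” and “>” and a backslash that protects the next character. The escape is hypothetical (neither Macrogenerator nor GNU m4 provides it), and read_quoted is an invented name.

```python
def read_quoted(text, start, open_q="<", close_q=">", escape="\\"):
    """Read the quoted string opening at text[start].

    Returns (contents, index just past the closing quote). An escape
    character (a hypothetical extension) makes the next character literal.
    """
    if text[start] != open_q:
        raise ValueError("expected an opening quote")
    depth, out, pos = 1, [], start + 1
    while pos < len(text):
        ch = text[pos]
        if ch == escape and pos + 1 < len(text):
            out.append(text[pos + 1])    # escaped: copy verbatim, no counting
            pos += 2
            continue
        if ch == open_q:
            depth += 1
        elif ch == close_q:
            depth -= 1
            if depth == 0:
                return "".join(out), pos + 1
        out.append(ch)
        pos += 1
    raise ValueError("unpaired quote symbol")


print(read_quoted(r"<a \> b>", 0))   # ('a > b', 8): the escaped ">" is literal
print(read_quoted("<a > b>", 0))     # ('a ', 4): the bare ">" ends the quote early
```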

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Blue-Maned_Hawk@21:1/5 to Lawrence D'Oliveiro on Sat Apr 13 05:09:26 2024
    Lawrence D'Oliveiro wrote:

    > The big difference with m4 is that it does away with these special
    > symbols; the mere occurrence of a name matching a defined macro (or an argument of the macro currently being expanded) is sufficient to trigger substitution. Do you think this is a good idea?
    >
    > There are all kinds of pitfalls with such macro systems. The original Macrogenerator could not cope with substitutions containing unpaired “<
    > ... >” quote symbols, and even GNU m4 lacks something as simple as a backslash-style “escape next single character, whatever it is”. While m4 lets you switch the quoting symbols, it still insists that they occur in pairs.
    >
    > Would adding such an escape character be useful?

    Yes, of course.

    Whenever a system has a system to escape symbols, there are two ways to go about it: either the symbol is magic by default, and the escape makes it normal, or the symbol is normal by default, and the escape makes it magic.
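
    The two conventions can be put side by side in a short Python sketch. The set of special symbols, the backslash as escape and the name classify are all assumptions chosen for illustration, not taken from any particular macro processor.

```python
SPECIALS = set("<>,;")   # assumed set of symbols that can carry special meaning
ESC = "\\"               # assumed escape character


def classify(text, magic_by_default):
    """Label each symbol 'magic' or 'plain' under one of the two conventions."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        escaped = False
        if ch == ESC and i + 1 < len(text):
            escaped = True       # the escape applies to the following character
            i += 1
            ch = text[i]
        # magic by default: special unless escaped
        # magic by request: special only when escaped
        magic = ch in SPECIALS and (escaped != magic_by_default)
        tokens.append(("magic" if magic else "plain", ch))
        i += 1
    return tokens


print(classify(r"a<\<", magic_by_default=True))
# [('plain', 'a'), ('magic', '<'), ('plain', '<')]
print(classify(r"a<\<", magic_by_default=False))
# [('plain', 'a'), ('plain', '<'), ('magic', '<')]
```

    In both modes the escape character itself is always special: a doubled backslash is needed to obtain a literal one.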

    Having both of the systems at once is generally confusing, because it
    makes it difficult to remember which symbols are which. It's more
    practical to have all of them be one or the other.

    One could say that having the symbols only become magic upon escapement is better, because it clearly indicates when a symbol has magic properties.
    This is analogous to the logic used to defend sigils, a form of
    disambiguation repeatedly found to be pointless because names already do
    that disambiguation. Therefore, the correct choice is magic by default.

    One fallacious argument i've heard used to justify magic by default is
    that it means that the treatment of the escape symbol itself is consistent
    with all the other symbols in that it's magic by default unless escaped by itself. I consider this fallacious because in a system where magic must
    be explicit, the escape symbol would be the _only_ exception, and it would
    be _impossible_ to make any others—what i'd say is a worthwhile sacrifice.

    Either way, figuring out the solution to the problem of “Magic: by
    default or by request?” is almost certainly a lower priority than the majority of other problems.

    --
    Blue-Maned_Hawk│shortens to Hawk│/blu.mɛin.dʰak/│he/him/his/himself/Mr.
    blue-maned_hawk.srht.site
    (?<sigil> [&*\$\@\%])

  • From Lawrence D'Oliveiro@21:1/5 to All on Sat Apr 13 05:51:27 2024
    On Sat, 13 Apr 2024 05:09:26 -0000 (UTC), Blue-Maned_Hawk wrote:

    > Whenever a system has a system to escape symbols, there are two ways to
    > go about it: either the symbol is magic by default, and the escape
    > makes it normal, or the symbol is normal by default, and the escape
    > makes it magic.

    And here’s another question: is magic iterative? Is text produced by a
    macro substitution automatically subject to further macro substitutions?

    This is true of Macrogenerator and m4, but perhaps this is a source of a
    lot of the problems with string-based macro systems.

    On the other hand, if you didn’t do this, then how would you implement the example I gave?

    §DEF,AORB,<§§Q;;>,§DEF,A,<A§UPDATE,Q,B;>;,§DEF,B,<B§UPDATE,Q,A;>;;

    If “§A;” expands literally to “A§UPDATE,Q,B;” with no further special interpretation of the embedded “§”, then how would you explicitly request invocation of the “UPDATE” function?

    The answer would be, the body of the macro would not directly be
    interpreted as literal text, but would have to consist of a sequence of explicit directives, like “insert literal text”, “insert expansion of a further macro” and so on.
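
    A minimal Python sketch of that idea follows, with the AORB example recast as directive lists. The directive names (“text”, “call”, “update”) and the functions run and make_macros are invented for illustration, and macro arguments are left out to keep it short.

```python
def run(body, macros):
    """Execute a macro body given as a list of explicit directives."""
    out = []
    for directive in body:
        op = directive[0]
        if op == "text":        # insert literal text; it is never rescanned
            out.append(directive[1])
        elif op == "call":      # insert the expansion of a further macro, whose
            name = run(directive[1], macros)   # name is itself given by directives
            out.append(run(macros[name], macros))
        elif op == "update":    # assign a new body to an existing macro
            macros[directive[1]] = directive[2]
        else:
            raise ValueError("unknown directive: " + repr(op))
    return "".join(out)


def make_macros():
    """Directive-based counterparts of the Q, A, B and AORB definitions."""
    return {
        "Q": [("text", "A")],
        "A": [("text", "A"), ("update", "Q", [("text", "B")])],
        "B": [("text", "B"), ("update", "Q", [("text", "A")])],
        # call the macro whose name is whatever Q currently expands to
        "AORB": [("call", [("call", [("text", "Q")])])],
    }


macros = make_macros()
print([run(macros["AORB"], macros) for _ in range(4)])   # ['A', 'B', 'A', 'B']
print(run([("text", "§UPDATE,Q,B;")], macros))           # §UPDATE,Q,B; (left as-is)
```

    Since literal text is never rescanned, an embedded “§” in a “text” directive stays inert; any further expansion has to be requested with an explicit “call”.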
