• catcode questions

    From François Patte@21:1/5 to All on Thu Jan 16 14:44:41 2025
    Hello,

    For typographic purposes, I need to make “~” a normal character,
    redefine “_” and choose another character as non-breaking space; I have
    chosen “¬”.

    I then define an environment where these rules apply:

    % Both characters must already be active while the definition is read.
    \catcode`\_=13 %
    \catcode`\¬=13 %
    \newenvironment{toto}{%
      \catcode`\~=12 %
      \catcode`\¬=13 %
      \def¬{\kern 1ex}%
      \catcode`\_=13 %
      \def_{}%
      % ... and many other things ...
    }%
    {%
      \catcode`\_=8 %
      \catcode`\¬=12 %
    }%
    % Restore the usual catcodes after the definition.
    \catcode`\_=8 %
    \catcode`\¬=12 %

    It works, but it seems redundant: I can't put \catcode`\¬=13 and
    \def¬{\kern 1ex} (likewise for _) only inside the environment, otherwise
    LaTeX complains that a “control sequence” is missing; hence the
    \catcode`\_=13 and \catcode`\¬=13 before the environment.

    Likewise, resetting their initial catcodes only in the end-of-environment
    code isn't enough either; hence the redundancy after the environment
    definition.

    Hence my question: is this a normal way of proceeding or is there a more orthodox way?

    I repeat: it works the way I want it to and, so far, I haven't had any
    side effects.

    Thank you for your advice.

    F.P.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julian Bradfield@21:1/5 to [email protected] on Thu Jan 16 17:18:17 2025
    On 2025-01-16, François Patte <[email protected]> wrote:
    > It works, but it seems redundant: I can't define \catcode`\¬=13,
    > \def¬{\kern 1ex} (idem for _) only in the environment, otherwise
    > latex will protest that a “control sequence” is missing, hence the
    > \catcode`\_=13 % \catcode`\¬=13 before the environment.

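    That's because catcodes are attached to characters at the earliest stage
    of processing, when the input is turned into tokens (what Knuth calls
    TeX's mouth). The body of \newenvironment is tokenized when the
    definition is read, not when the environment is used later. So to be
    able to write \def¬, the ¬ has to have catcode 13 at the place where you
    write \def¬.
    It's often irritating, but that's the way it is.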
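
    A minimal sketch of this point, assuming a standard LaTeX format in
    which _ has catcode 8; the macro name \demo is made up for the
    illustration. The _ after \show is tokenized while the definition is
    read, so changing the catcode when the macro runs comes too late for it:

    ```latex
    \documentclass{article}
    \begin{document}
    \begingroup % keeps the catcode change local to this demonstration
    % The body is tokenized here, while _ still has catcode 8.
    \def\demo{\catcode`\_=13 \show_}
    % \demo changes the catcode, but the _ stored in its body stays a
    % subscript character, so \show does not report an active character.
    \demo
    \endgroup
    \end{document}
    ```

    The run pauses at \show; pressing the return key continues it.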

  • From François Patte@21:1/5 to All on Thu Jan 16 22:33:49 2025
    On 16/01/2025 at 18:18, Julian Bradfield wrote:

    > That's because catcodes are assigned at the early stage of processing
    > (what Knuth calls TeX's mouth). So to be able to write \def¬ , the ¬
    > has to have catcode 13 at the place you write \def¬ .
    > It's often irritating, but that's the way it is.

    May I consider that my syntax is correct?

    F.P.

  • From Ulrich D i e z@21:1/5 to All on Fri Jan 17 03:08:11 2025
    François Patte wrote:

    > Hence my question: is this a normal way of proceeding or is there a
    > more orthodox way?

    You are using the non-ASCII character "¬"?

    What is the input encoding of your .tex file?

    If the input encoding is UTF-8 and you don't use a TeX engine with native
    UTF-8 support (XeTeX, LuaTeX) but a traditional 8-bit TeX engine (TeX,
    pdfTeX) with the package inputenc and its option "utf8", this might be a
    problem: ¬ has code point 172 (decimal) in Unicode and is encoded in
    UTF-8 as the two bytes C2 (hex) AC (hex), which traditional TeX engines
    interpret as _two_ input characters.

    The backslash has code point 5C (hex) = 92 (decimal) in Unicode and is
    encoded in UTF-8 as the single byte 5C.

    Thus something like \¬ is interpreted as the three bytes/characters
    5C C2 AC.
    The first character is the backslash, which has category 0 (escape).
    The second and the third characters, when the inputenc package is loaded
    for interpreting UTF-8 input, have category 13 (active).

    Tokenizing this therefore yields a control symbol token whose name is
    the character with code C2 (hex) = 194 (decimal) in TeX's internal
    character representation scheme, followed by an active character token
    whose character code is AC (hex) = 172 (decimal). As the byte AC cannot
    be the first byte of a UTF-8 encoded character, the active AC triggers
    an error message.
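
    Which of the two cases applies can be checked from within the document.
    A small sketch, assuming the iftex package is installed (it is not
    mentioned elsewhere in this thread); its conditional \iftutex is true
    under XeTeX and LuaTeX:

    ```latex
    \documentclass{article}
    \usepackage{iftex}
    \iftutex
      \typeout{Unicode engine: a UTF-8 not sign is read as one character.}
    \else
      \typeout{8-bit engine: a UTF-8 not sign is read as the two bytes C2 AC.}
    \fi
    \begin{document}
    Test.
    \end{document}
    ```

    The message appears on the terminal and in the .log file.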

    If you
    - either use a TeX engine with native UTF-8 support, like XeTeX/LuaTeX,
      and encode the .tex input file in UTF-8,
    - or use a traditional 8-bit TeX engine, like TeX or pdfTeX, and encode
      the .tex input file in some single-byte encoding like ISO-8859-1 or
      Windows-1252,
    then you can try \lccode/\lowercase trickery:



    \documentclass{article}

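    % #1: one-letter control sequence naming the character to activate
    % #2: replacement text for the active character
    \newcommand\MyActivate[2]{%
      \begingroup
      \lccode`\~ =`#1 % the active ~ now lowercases to the character #1
      \lowercase{\endgroup\def~}{#2}% define the active version of #1
      \catcode`#1 =13 % finally make the character itself active
    }%

    \newenvironment{toto}{%
      \MyActivate{\¬}{\kern 1ex}%
      \MyActivate{\_}{}%
      \catcode`\~=12\relax
    }{}%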

    \begin{document}

    \message{^^JBefore toto^^J}

    \showthe\catcode`\¬
    \showthe\catcode`\_
    \showthe\catcode`\~
    \show ¬
    \show _
    \show ~

    \begin{toto}
    \message{^^JWithin toto^^J}
    \showthe\catcode`\¬
    \showthe\catcode`\_
    \showthe\catcode`\~
    \show ¬
    \show _
    \show ~
    \end{toto}

    \message{^^JAfter toto^^J}
    \showthe\catcode`\¬
    \showthe\catcode`\_
    \showthe\catcode`\~
    \show ¬
    \show _
    \show ~

    \end{document}



    Sincerely

    Ulrich

  • From Ulrich D i e z@21:1/5 to All on Fri Jan 17 16:18:07 2025
    Things like `\A and `A are alphabetic constants.

    Pitfalls with alphabetic constants are:

    An alphabetic constant formed by a one-letter control sequence token
    cannot be passed as a macro argument if that control sequence token is
    \outer, unless it is first "hit" by \noexpand.

    Forming an alphabetic constant from a character-token does not work
    out in case at the time of tokenization the category of the character
    is 0 or 5 or 9 or 14 or 15.

    Assuming % has category 14 and thus is a comment-character,
    something like

    \catcode`%=12

    does not work out.

    But

    \catcode`\%=12

    does work out.

    However, assuming \% is defined \outer, using the token \% within a
    macro argument can only be done when previously it is "hit" by
    \noexpand .


    The following variant of \MyActivate might be more flexible:

    Syntax:

    \MyActivate{<One letter control sequence whose
    name is the character to activate>}%
    {<Tokens where #1 is to be replaced by the
    corresponding active character token>}%

    The braces surrounding the arguments are mandatory. (Instead of the
    characters { and } you can use any other characters of category 1 and
    2, respectively.)

    <One letter control sequence whose name is the character to activate>
    may be \outer at the time of carrying out the environment toto, but
    due to \newenvironment being a macro may not be \outer at the time of
    defining the environment toto.

    The corresponding active character may be \outer at the time of
    defining and at the time of carrying out the environment toto.

    When using \MyActivate inside macro or environment definitions, the
    hashes of #1 denoting the corresponding active character token need to
    be doubled.

    If <Tokens where #1 is to be replaced by the corresponding active
    character token> is used for defining a macro with parameter-text,
    hashes belonging to macro parameters need to be doubled.
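
    The doubling is the general TeX rule for a definition nested inside
    another definition. A generic sketch, independent of \MyActivate; the
    names \MakeBracket and \Bracket are invented for the illustration:

    ```latex
    \documentclass{article}
    % Inside the replacement text of \MakeBracket, ##1 stands for the
    % parameter #1 of the inner macro \Bracket.
    \newcommand\MakeBracket{%
      \def\Bracket##1{[##1]}%
    }
    \begin{document}
    \MakeBracket   % defines \Bracket
    \Bracket{x}    % typesets [x]
    \end{document}
    ```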



    \documentclass{article}

    \newcommand\MyGobble[1]{}%
    \newcommand\MyActivate{%
    % The one-letter control sequence might be \outer.
    % So "hit" the one-letter control sequence with
    % \noexpand before expanding \MyActivateB - this
    % requires some brace-hacking:
    \expandafter\expandafter\expandafter\expandafter
    \expandafter\expandafter\expandafter\expandafter
    \expandafter\expandafter\expandafter\expandafter
    \expandafter\expandafter\expandafter\MyActivateB
    \expandafter\expandafter\expandafter\expandafter
    \expandafter\expandafter\expandafter\expandafter
    \expandafter\expandafter\expandafter\expandafter
    \expandafter\expandafter\expandafter{%
    \expandafter\expandafter\expandafter\expandafter
    \expandafter\expandafter\expandafter\noexpand
    \expandafter\expandafter\expandafter\iffalse
    \expandafter\expandafter\expandafter}%
    \expandafter\expandafter\expandafter\fi
    \expandafter\MyGobble
    \string
    }%
    \newcommand\MyActivateB[2]{%
    \begingroup
    \lccode`\~ =`#1 %
    \long\def\temp##1{\endgroup#2}%
    \lowercase\expandafter{%
    \expandafter\expandafter
    \expandafter \temp
    \expandafter\expandafter
    \expandafter {%
    \expandafter\noexpand
    \noexpand~}%
    }%
    \catcode`#1 =13 %
    }%

    \newenvironment{toto}{%
    \MyActivate{\¬}{\def##1{\kern 1ex}}%
    \MyActivate{\_}{\def##1{}}%
    \catcode`\~=12 %
    }{}%

    \begin{document}

    %\outer\def\¬{outer macro}
    %\outer\def\~{outer macro}
    %\catcode`\~=13 \outer\def~{outer macro}
    %\catcode`\¬=13 \outer\def¬{outer macro}

    \message{^^JBefore toto^^J}

    \showthe\catcode`\¬
    \showthe\catcode`\_
    \showthe\catcode`\~
    \show ¬
    \show _
    \show ~

    \begin{toto}
    \message{^^JWithin toto^^J}
    \showthe\catcode`\¬
    \showthe\catcode`\_
    \showthe\catcode`\~
    \show ¬
    \show _
    \show ~
    \end{toto}

    \message{^^JAfter toto^^J}
    \showthe\catcode`\¬
    \showthe\catcode`\_
    \showthe\catcode`\~
    \show ¬
    \show _
    \show ~

    \end{document}


    Sincerely

    Ulrich

  • From François Patte@21:1/5 to All on Sat Jan 18 10:14:43 2025
    On 17/01/2025 at 03:08, Ulrich D i e z wrote:

    Thank you for answering.

    > Using the non-ascii-character "¬" ?
    >
    > What is the input encoding of your .tex file?

    I use xelatex. So, as far as I understand your explanations (I am not
    conversant with programming...), using ¬, _, etc. is possible for my
    purpose, and as I said my environment construction works as I wish.

    What I am not sure about is the change of catcodes: it works, but is it
    the correct way to do it?

    Anyway, your new command "MyActivate" does the job; it is easier than my
    construction. Is it safer too?

    I don't understand your example: it produces no output and stops at every
    "\showthe\catcode`\"
    How do I use it?

    Thank you.

    F.P.

  • From Ulrich D i e z@21:1/5 to All on Sat Jan 18 15:19:06 2025
    François Patte wrote:

    > I use xelatex. So, as far as I understand your explanations (I am not
    > conversant with programming...), using ¬, _, etc. is possible for my
    > purpose

    Yes. ;-)

    > and as I said my environment construction works as I wish.

    Yes. But be aware that none of the tokens within the arguments of your
    command \newenvironment{toto}{...}{...} may be defined \outer.

    So making the characters _ and ¬ active before doing \newenvironment is
    probably not enough: you probably also need to ensure that these active
    characters and the one-letter control sequences \~ and \¬ and \_ are not
    already defined \outer before you define the environment by calling
    \newenvironment.

    > What I am not sure about is the change of catcodes: it works, but is it
    > the correct way to do it?

    As long as, at the time of defining the environment "toto", i.e., at the
    time of the call to \newenvironment{toto}{...}{...}, the two active
    character tokens _ and ¬ and the three one-letter control sequence
    tokens \~ and \¬ and \_ are not already defined \outer, your way of
    doing things is a correct way of doing things.
    These five tokens occur within the arguments of the call to
    \newenvironment; \newenvironment is a macro, and arguments of macros
    and the like cannot contain tokens that are defined \outer.

    I provided \MyActivate only so that you don't need to also change
    category codes of _ and ¬ before (and after) defining the environment
    "toto" because in your initial posting you indicated that you don't like
    this redundancy.

    The gist is that at the time of defining \MyActivate the tilde, ~, is
    active. So any tilde that goes into the definition of \MyActivate is an
    active character token.
    When \MyActivate is carried out, the directive "\lccode`\~ = ...", which
    comes from the definition of \MyActivate, first assigns to the tilde a
    lowercase code equal to the number, in TeX's internal character encoding
    scheme, of the character whose active counterpart you wish to obtain.
    The directive "\lowercase{~}", which also comes from the definition of
    \MyActivate, then yields the active counterpart of that character.
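
    The same trick in isolation, as a sketch: it assumes a standard LaTeX
    format in which ~ is active and _ has catcode 8, and the replacement
    text [underscore] is only a placeholder.

    ```latex
    \documentclass{article}
    \begin{document}
    \begingroup % outer group: keeps the demonstration local
      \begingroup
        \lccode`\~=`\_ % lowercasing ~ now yields the character code of _
        % \lowercase turns the active ~ into an active _ ; \endgroup then
        % restores the \lccode before \def is carried out.
        \lowercase{\endgroup\def~}{[underscore]}%
      \catcode`\_=13 % from here on, _ in the input is read as active
      \show_ % reports the macro defined above
    \endgroup
    \end{document}
    ```

    Note that _ never had to be active while the definition was read; only
    the \catcode assignment at run time makes it so.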

    > Anyway your newcommand "MyActivate" does the job; it is easier than my
    > construction, is it safer too?

    I think the code in my first posting is a little less safe than your construction.

    I think the code in my second posting is a little safer than your
    construction.

    The restrictions at the time of defining the environment "toto" and at
    the time of carrying out the environment "toto" are slightly different:

    With your code, at the time of defining the environment "toto", the
    active characters _ and ¬ and the one-letter control-sequences \¬ and \_
    and \~ are not allowed to be defined \outer.

    With the code of my first posting, at the time of defining the
    environment "toto", the one-letter control-sequences \~ and \¬ and \_
    and active ~ are not allowed to be defined \outer, but the active
    characters _ and ¬ are allowed to be \outer. The active character ~ also
    is not allowed to be \outer at the time of carrying out an instance of
    the environment "toto"!

    With the code of my second posting, at the time of defining the
    environment "toto", the one-letter control-sequences \~ and \¬ and \_
    are not allowed to be defined \outer, but the active characters _ and ¬
    and ~ are allowed to be \outer. The active character ~ also is allowed
    to be \outer at the time of carrying out an instance of the environment
    "toto".

    > I don't understand your example: it produces no output and stops at
    > every "\showthe\catcode`\"
    > How to use it?

    The environment "toto" is used as

    \begin{toto}
    ...
    <stuff where toto's catcode-settings and toto's definitions of active characters are in effect>
    ...
    \end{toto}
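
    A complete usage sketch that produces visible output, assuming XeLaTeX
    or LuaLaTeX with a UTF-8 encoded file and reusing \MyActivate from the
    first posting; the sample text is invented:

    ```latex
    \documentclass{article}

    \newcommand\MyActivate[2]{%
      \begingroup
      \lccode`\~ =`#1 %
      \lowercase{\endgroup\def~}{#2}%
      \catcode`#1 =13 %
    }

    \newenvironment{toto}{%
      \MyActivate{\¬}{\kern 1ex}%
      \MyActivate{\_}{}%
      \catcode`\~=12\relax
    }{}

    \begin{document}
    Outside: a~b

    \begin{toto}
    % Here ¬ inserts a kern of 1ex, _ produces nothing,
    % and ~ is an ordinary character.
    Inside: a¬b, c_d, e~f
    \end{toto}

    Outside again: a~b
    \end{document}
    ```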


    In my example \show and \showthe are used so that you can compare the
    settings in effect before and after the toto environment with the
    settings in effect inside the toto environment:

    \show writes the current meaning of the token that follows \show both to
    the terminal (the screen/console/window of the shell or command prompt)
    and to the .log file. The word "meaning" has a special meaning in TeX
    jargon. The "meaning" of a token is information about what that token
    is: whether it is a character token and, if so, of what category;
    whether it is a control sequence token and, if so, whether it is a
    primitive, a register, a \toksdef/\chardef/whatsoever-def thingie, or a
    macro. In case it is a macro, the definition's parameter text and the
    definition's replacement text are written also.
    Compilation is interrupted until you press the return key at the
    ?-prompt; after that, compilation continues. This way you have time to
    look at what \show has written to the terminal.

    \showthe needs to be followed by a token denoting a register or a
    parameter, or by something else whose current value can be accessed via
    \the. \showthe writes that current value in the same way in which \show
    writes the current meaning.
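
    For instance, with the default settings of a standard LaTeX format (an
    assumption; packages may change them):

    ```latex
    \documentclass{article}
    \begin{document}
    \showthe\catcode`\_ % writes the category code of _, normally 8
    \show_              % writes the meaning of the token _
    \show\kern          % a primitive: its meaning is reported as \kern
    \end{document}
    ```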

    (Whether compilation is actually interrupted and you need to press the
    return key to continue depends on the interaction mode, i.e., on whether
    TeX is run in errorstopmode/scrollmode/nonstopmode/batchmode.
    But in any case the .log file can be inspected.)
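
    To keep the run from pausing, the interaction mode can be switched
    before the first \show. A sketch using the TeX primitive \scrollmode;
    the results still end up in the .log file:

    ```latex
    \documentclass{article}
    \begin{document}
    \scrollmode % do not wait for the return key after \show and \showthe
    \showthe\catcode`\~
    \show~
    \end{document}
    ```

    With TeX Live or MiKTeX the same effect should be available from the
    command line, e.g. xelatex -interaction=scrollmode followed by the file
    name.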


    My example defines, in the preamble of the document, an environment
    "toto" so that (only) while an instance of that environment is carried
    out the category codes of _ and ¬ and ~ are changed and active _ and
    active ¬ are redefined.

    After the preamble, within the document environment, the "toto"
    environment is used. Outside and inside the "toto" environment some
    \showthe and \show commands are issued so that on the terminal and in
    the .log file you can see and compare which settings are in effect
    before/inside/after that environment:

    Within the document environment, before any instance of the environment
    "toto", \showthe displays the current category codes of _ and ¬ and ~,
    and \show displays the current meanings/definitions of _ and ¬ and ~;
    both are shown on the terminal and written to the .log file. So by
    looking at what was written you can see the settings in effect at the
    time of carrying out \showthe/\show.

    Then via \begin{toto} an instance of the environment "toto" is started.
    (Be aware that an instance of an environment like "toto", i.e.,
    \begin{toto}...\end{toto}, also forms a local scope. This is because the
    macro \begin, among other things, starts a local scope and the macro
    \end, among other things, closes that local scope.)
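
    Because of this local scope, the begin code alone is enough; nothing
    has to be reset in the end code. A small sketch, assuming a standard
    LaTeX format in which ~ is active (catcode 13) and using an invented
    environment name:

    ```latex
    \documentclass{article}
    \newenvironment{tildeother}{\catcode`\~=12 }{}
    \begin{document}
    \begin{tildeother}
    \showthe\catcode`\~ % 12 inside the environment
    \end{tildeother}
    \showthe\catcode`\~ % back to 13 without any explicit reset
    \end{document}
    ```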

    Within that instance of the environment "toto", i.e., within the local
    scope it forms, the category codes of _ and ¬ and ~ are changed and
    active _ and active ¬ are redefined.

    Inside that local scope \showthe and \show are used again to display on
    the terminal, and to write to the .log file, the current category codes
    and the current meanings/definitions of _ and ¬ and ~.

    After \end{toto}, when the local scope formed by that instance of the
    environment "toto" is closed, the changes to the category codes of _ and
    ¬ and ~ and to the meanings of active _ and ¬ and ~ are not in effect
    any more. Via \showthe the current category codes, and via \show the
    current meanings/definitions, of _ and ¬ and ~ are displayed on the
    terminal and written to the .log file once more.

    I interspersed this with \message{^^JBefore toto^^J} /
    \message{^^JWithin toto^^J} / \message{^^JAfter toto^^J} so that, when
    looking at the terminal output and/or the .log file, you can see more
    easily which category codes and which meanings of the characters _ and ¬
    and ~ were effective before/within/after the environment "toto". You can
    thus trace and verify that the changes to category codes and the
    (re)definitions of active characters done by the environment "toto" are
    indeed effective only within the environment "toto".

    Sincerely

    Ulrich
