• Interval Comparisons

    From Lawrence D'Oliveiro@21:1/5 to All on Tue Jun 4 07:14:02 2024
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an interval:

    def valid_char(c) :
        "is integer c the code for a valid Unicode character." \
        " This excludes surrogates."
        return \
            (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
            )
    #end valid_char

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Tue Jun 4 10:58:53 2024
    On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an interval:

    def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
    (
    0 <= c <= 0x10FFFF
    and
    not (0xD800 <= c < 0xE000)
    )
    #end valid_char

    Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
    without breaking existing code? The answer is no, C treats it as the
    expression "(a <= x) <= b". So you would be changing the meaning of
    existing C code. I think it's fair to say there is likely to be very
    little existing correct and working C code that relies on the current
    interpretation of such expressions, but the possibility is enough to
    rule out such a change ever happening in C. (And it would also
    complicate the grammar a fair bit.)
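
As a minimal C sketch of the point above, the first helper spells out how C parses "a <= x <= b" today and the second spells out the Python-style reading. The function names `chained` and `in_interval` are made up for illustration and appear nowhere in the thread.

```c
/* How C parses "a <= x <= b" today: the left comparison yields an int
   (0 or 1), and that int is then compared with b. The parentheses only
   make the existing parse explicit. */
static int chained(int a, int x, int b)
{
    return (a <= x) <= b;
}

/* The Python-style reading: x lies in the closed interval [a, b]. */
static int in_interval(int a, int x, int b)
{
    return a <= x && x <= b;
}
```

With a = 0, x = 50, b = 10, `chained` returns 1 (the left comparison gives 1, and 1 <= 10), whereas `in_interval` returns 0. Any existing code that depends on the first result would change meaning.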


    <https://c-faq.com/expr/transitivity.html>

  • From Mikko@21:1/5 to David Brown on Tue Jun 4 12:13:15 2024
    On 2024-06-04 08:58:53 +0000, David Brown said:

    On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an interval:

    def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
    (
    0 <= c <= 0x10FFFF
    and
    not (0xD800 <= c < 0xE000)
    )
    #end valid_char

    Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
    without breaking existing code? The answer is no, C treats it as the expression "(a <= x) <= b". So you would be changing the meaning of
    existing C code. I think it's fair to say there is likely to be very
    little existing correct and working C code that relies on the current interpretation of such expressions, but the possibility is enough to
    rule out such a change ever happening in C. (And it would also
    complicate the grammar a fair bit.)


    <https://c-faq.com/expr/transitivity.html>

    That does not prevent doing the same with a different syntax.
    The main difference is that in the current C syntax that cannot be
    said without mentioning c twice. In the example program C would
    require that c is mentioned four times but the shown Python code
    only needs it mentioned twice. An ideal syntax would only mention
    it once, perhaps

    return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

    or

    return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

    or something like that, preferably so that no new reserved word is
    needed.

    --
    Mikko

  • From bart@21:1/5 to Lawrence D'Oliveiro on Tue Jun 4 11:39:41 2024
    On 04/06/2024 08:14, Lawrence D'Oliveiro wrote:
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an interval:

    def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
    (
    0 <= c <= 0x10FFFF
    and
    not (0xD800 <= c < 0xE000)
    )
    #end valid_char

    Yes it would break compatibility. The first '0 <= c' yields a 0 or 1 value.

    But Python can also do it as `c in range(0, 0x10FFFF+1)`.

    That could conceivably be added; the main obstacle would be introducing
    that new `in` keyword, while a better solution than `range` would be likely.

    The chances of it actually happening are infinitesimal, and I'd be long
    dead before it became widely available.

    This is the upside of devising your own language; I daily use these forms:

    a <= b <= c
    b in a .. c

    in my systems language. The only stipulation with the first form is that
    if there are any angle brackets, then they all point the same way,
    otherwise the result is too confusing.

    The language also needs to ensure middle terms are evaluated only once.

    If I ever want to have the C meaning of 'a <= b <= c' (say I'm porting
    some code), then it can be written like this to break it up:

    (a <= b) <= c

  • From David Brown@21:1/5 to Mikko on Tue Jun 4 13:02:03 2024
    On 04/06/2024 11:13, Mikko wrote:
    On 2024-06-04 08:58:53 +0000, David Brown said:

    On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an
    interval:

    def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
    (
    0 <= c <= 0x10FFFF
    and
    not (0xD800 <= c < 0xE000)
    )
    #end valid_char

    Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
    without breaking existing code?  The answer is no, C treats it as the
    expression "(a <= x) <= b".  So you would be changing the meaning of
    existing C code.  I think it's fair to say there is likely to be very
    little existing correct and working C code that relies on the current
    interpretation of such expressions, but the possibility is enough to
    rule out such a change ever happening in C.  (And it would also
    complicate the grammar a fair bit.)


    <https://c-faq.com/expr/transitivity.html>

    That does not prevent doing the same with a different syntax.
    The main difference is that in the current C syntax that cannot be
    said without mentioning c twice. In the example program C would
    require that c is mentioned four times but the shown Python code
    only needs it mentioned twice. An ideal syntax would only mention
    it once, perhaps

     return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

    or

     return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

    or something like that, preferably so that no new reserved word is
    needed.


    Sure, you can always add new things to a language if they would
    previously have been syntax errors or constraint errors. But is there a
    use for it?

    It is fine if you have a language that has good support for lists, sets,
    ranges, and other higher-level features - then an "in" keyword is a
    great idea. But C is not such a language, and that kind of feature
    would be well outside the scope of the language.

    It would be easy enough to write a macro "in_range(a, x, b)" that would
    do the job. It is even easier, and more productive, to simply
    write the "valid_char" function and use it, if that's what you need.
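
A minimal C sketch of that suggestion, under stated assumptions: `in_range` is written here as an inline function rather than a macro, keeps the (a, x, b) argument order mentioned above, and uses `long` so that code points up to 0x10FFFF fit on any conforming implementation. None of these choices comes from the thread itself.

```c
#include <stdbool.h>

/* Hypothetical helper: true when lo <= x <= hi (closed interval).
   Being a function, it evaluates its x argument exactly once. */
static inline bool in_range(long lo, long x, long hi)
{
    return lo <= x && x <= hi;
}

/* Is c the code for a valid Unicode character? This excludes the
   surrogates 0xD800..0xDFFF, as in the Python original. */
static inline bool valid_char(long c)
{
    return in_range(0, c, 0x10FFFF) && !in_range(0xD800, c, 0xDFFF);
}
```

A caller then writes `if (valid_char(c))` and mentions c only once at the point of use.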

  • From bart@21:1/5 to David Brown on Tue Jun 4 12:23:15 2024
    On 04/06/2024 12:02, David Brown wrote:
    On 04/06/2024 11:13, Mikko wrote:
    On 2024-06-04 08:58:53 +0000, David Brown said:

    On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an
    interval:

    def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
    (
    0 <= c <= 0x10FFFF
    and
    not (0xD800 <= c < 0xE000)
    )
    #end valid_char

    Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
    without breaking existing code?  The answer is no, C treats it as the
    expression "(a <= x) <= b".  So you would be changing the meaning of
    existing C code.  I think it's fair to say there is likely to be very
    little existing correct and working C code that relies on the current
    interpretation of such expressions, but the possibility is enough to
    rule out such a change ever happening in C.  (And it would also
    complicate the grammar a fair bit.)


    <https://c-faq.com/expr/transitivity.html>

    That does not prevent doing the same with a different syntax.
    The main difference is that in the current C syntax that cannot be
    said without mentioning c twice. In the example program C would
    require that c is mentioned four times but the shown Python code
    only needs it mentioned twice. An ideal syntax would only mention
    it once, perhaps

      return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

    or

      return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

    or something like that, preferably so that no new reserved word is
    needed.


    Sure, you can always add new things to a language if they would
    previously have been syntax errors or constraint errors.  But is there a
    use for it?

    It is fine if you have a language that has good support for lists, sets, ranges, and other higher-level features - then an "in" keyword is a
    great idea.  But C is not such a language, and that kind of feature
    would be well outside the scope of the language.

    I disagree. I have a script language where 'in' works with all sorts of
    data types, and where ranges like a..b and sets like [a..b, c, d, e] are
    actual types.

    Yet I also introduced 'in' into my systems language, even though it is
    very restricted:

    if a in b..c then
    if a in [b, c, d] then

    This is limited to integer types. The set construct here doesn't allow
    ranges (it could have done). Neither the range nor the set is a datatype - it
    is just syntax. (I can't do range r := 1..10.)

    It is incredibly useful:

    if c in [' ', '\t', '\n'] then ... # whitespace
    if b in 0..255 then
    if b in u8.bounds then # alternative

    Not to forget:

    if x = y = 0 then # both x and y are zero

    It doesn't need the full spec of the higher level language.

    It would be easy enough to write a macro "in_range(a, x, b)" that would
    do the job.  It is even easier, and more productive, that you simply
    write the "valid_char" function and use it, if that's what you need.

    Yes it would be easier - to provide an ugly, half-assed solution that
    everyone will write a different way (I would use (x, a, b) for example),
    and which can go wrong as soon as someone writes (a, x(), b).

    That's the problem with the macro scheme, it stops the language properly evolving.
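
The (a, x(), b) hazard can be sketched in C as follows. `IN_RANGE`, `in_range_fn`, `next_value` and the call counter are hypothetical names invented for this illustration. The macro is only one plausible way such a helper might be written.

```c
/* Hypothetical macro in the (a, x, b) argument order. The x argument
   appears twice in the expansion, so it may be evaluated twice. */
#define IN_RANGE(a, x, b) ((a) <= (x) && (x) <= (b))

/* Function version: x is evaluated once, before the call. */
static inline int in_range_fn(int a, int x, int b)
{
    return a <= x && x <= b;
}

static int calls;                      /* counts next_value() calls */

static int next_value(void)
{
    ++calls;
    return 5;
}

/* Number of next_value() calls made by one macro-based test. */
static int calls_via_macro(void)
{
    calls = 0;
    (void)IN_RANGE(0, next_value(), 10);
    return calls;
}

/* Number of next_value() calls made by one function-based test. */
static int calls_via_function(void)
{
    calls = 0;
    (void)in_range_fn(0, next_value(), 10);
    return calls;
}
```

Here the macro form calls `next_value()` twice and the function form calls it once. The macro evaluates x only once when the first comparison fails, because `&&` short-circuits, which makes the behaviour harder still to reason about.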

  • From Blue-Maned_Hawk@21:1/5 to All on Tue Jun 4 11:41:54 2024
    Ignoring the concept of backcompat, operators as a concept are bad enough;
    i think that we need not worsen the matter with new ternary operators.



    --
    Blue-Maned_Hawk│shortens to Hawk│/blu.mɛin.dʰak/│he/him/his/himself/Mr. blue-maned_hawk.srht.site
    INVINCIBLE MOOSE NEXT FIVE KILOMETERS

  • From Mikko@21:1/5 to David Brown on Tue Jun 4 16:11:28 2024
    On 2024-06-04 11:02:03 +0000, David Brown said:

    On 04/06/2024 11:13, Mikko wrote:
    On 2024-06-04 08:58:53 +0000, David Brown said:

    On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an interval:

    def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
    (
    0 <= c <= 0x10FFFF
    and
    not (0xD800 <= c < 0xE000)
    )
    #end valid_char

    Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
    without breaking existing code? The answer is no, C treats it as the
    expression "(a <= x) <= b". So you would be changing the meaning of
    existing C code. I think it's fair to say there is likely to be very
    little existing correct and working C code that relies on the current
    interpretation of such expressions, but the possibility is enough to
    rule out such a change ever happening in C. (And it would also
    complicate the grammar a fair bit.)


    <https://c-faq.com/expr/transitivity.html>

    That does not prevent doing the same with a different syntax.
    The main difference is that in the current C syntax that cannot be
    said without mentioning c twice. In the example program C would
    require that c is mentioned four times but the shown Python code
    only needs it mentioned twice. An ideal syntax would only mention
    it once, perhaps

     return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

    or

     return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

    or something like that, preferably so that no new reserved word is
    needed.


    Sure, you can always add new things to a language if they would
    previously have been syntax errors or constraint errors. But is there
    a use for it?

    I don't see any need. That c must be mentioned twice for each interval is
    not a problem. If there is a complex expression in place of c it can be
    computed and stored to a variable before comparison to an interval.

    It is fine if you have a language that has good support for lists,
    sets, ranges, and other higher-level features - then an "in" keyword is
    a great idea. But C is not such a language, and that kind of feature
    would be well outside the scope of the language.

    Or, if one for some reason does it in C anyway, one should have or make
    a library of the essential functions, including membership tests.

    It would be easy enough to write a macro "in_range(a, x, b)" that would
    do the job. It is even easier, and more productive, that you simply
    write the "valid_char" function and use it, if that's what you need.

    Indeed.

    --
    Mikko

  • From Michael S@21:1/5 to [email protected] on Tue Jun 4 15:17:30 2024
    On Tue, 4 Jun 2024 11:41:54 -0000 (UTC)
    Blue-Maned_Hawk <[email protected]d> wrote:

    operators as a concept are bad enough;


    Those are fighting words! Care to suggest an alternative?

  • From Janis Papanagnou@21:1/5 to Mikko on Tue Jun 4 15:42:14 2024
    On 04.06.2024 11:13, Mikko wrote:
    On 2024-06-04 08:58:53 +0000, David Brown said:
    On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an
    interval:

    def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
    (
    0 <= c <= 0x10FFFF

    While nice to have it's just syntactic sugar.

    and
    not (0xD800 <= c < 0xE000)
    )
    #end valid_char

    Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
    without breaking existing code? The answer is no, C treats it as the
    expression "(a <= x) <= b". So you would be changing the meaning of
    existing C code. I think it's fair to say there is likely to be very
    little existing correct and working C code that relies on the current
    interpretation of such expressions, but the possibility is enough to
    rule out such a change ever happening in C. (And it would also
    complicate the grammar a fair bit.)


    <https://c-faq.com/expr/transitivity.html>

    That does not prevent doing the same with a different syntax.
    The main difference is that in the current C syntax that cannot be
    said without mentioning c twice. In the example program C would
    require that c is mentioned four times but the shown Python code
    only needs it mentioned twice. An ideal syntax would only mention
    it once, perhaps

    return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

    Introducing a new keyword 'in' would also break a lot of code, even
    more code than the syntactic change ( . <= . <= . ) mentioned above
    in the OP, don't you think?
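
To make the keyword concern concrete, here is a small sketch of C that is valid today and would stop compiling if `in` became a reserved word. The function `count_spaces` is a made-up example, not code from any real project.

```c
#include <stddef.h>

/* "in" is an ordinary identifier in current C, used here as the name
   of an input-buffer parameter. */
static size_t count_spaces(const char *in, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; ++i) {
        if (in[i] == ' ')
            ++n;
    }
    return n;
}
```

Parameter and variable names like this are presumably common in I/O code, which is one reason a new unprefixed keyword is hard to introduce.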


    or

    return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

    or something like that, preferably so that no new reserved word is
    needed.

    Not worth the hassle, IMO.

    Janis

  • From David Brown@21:1/5 to bart on Tue Jun 4 15:24:43 2024
    On 04/06/2024 13:23, bart wrote:
    On 04/06/2024 12:02, David Brown wrote:
    On 04/06/2024 11:13, Mikko wrote:
    On 2024-06-04 08:58:53 +0000, David Brown said:

    On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
    Would it break backward compatibility for C to add a feature like this
    from Python? Namely, the ability to check if a value lies in an
    interval:

    def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
    (
    0 <= c <= 0x10FFFF
    and
    not (0xD800 <= c < 0xE000)
    )
    #end valid_char

    Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
    without breaking existing code?  The answer is no, C treats it as
    the expression "(a <= x) <= b".  So you would be changing the
    meaning of existing C code.  I think it's fair to say there is
    likely to be very little existing correct and working C code that
    relies on the current interpretation of such expressions, but the
    possibility is enough to rule out such a change ever happening in
    C.  (And it would also complicate the grammar a fair bit.)


    <https://c-faq.com/expr/transitivity.html>

    That does not prevent doing the same with a different syntax.
    The main difference is that in the current C syntax that cannot be
    said without mentioning c twice. In the example program C would
    require that c is mentioned four times but the shown Python code
    only needs it mentioned twice. An ideal syntax would only mention
    it once, perhaps

      return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

    or

      return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

    or something like that, preferably so that no new reserved word is
    needed.


    Sure, you can always add new things to a language if they would
    previously have been syntax errors or constraint errors.  But is there
    a use for it?

    It is fine if you have a language that has good support for lists,
    sets, ranges, and other higher-level features - then an "in" keyword
    is a great idea.  But C is not such a language, and that kind of
    feature would be well outside the scope of the language.

    I disagree. I have a script language where 'in' works with all sorts of
    data types, and where ranges like a..b and sets like [a..b, c, d, e] are actual types.

    C is not a script language.


    Yet I also introduced 'in' into my systems language, even though it is
    very restricted:

        if a in b..c then
        if a in [b, c, d] then

    This is limited to integer types. The set construct here doesn't allow
    ranges (it could have done). Neither the range or set is a datatype - it
    just syntax. (I can't do range r := 1..10.)

    Adding such a feature to your own personal language, for your own
    personal use, is easy enough (relative to the rest of the work involved
    in designing your own personal language and making tools for it, which
    is of course no small feat). Adding it to C with its standards,
    existing code, toolchains, additional tools, developers, etc., is a
    whole different kettle of fish.

    I don't think it would be practical to add it to C in a way that is
    simple and restricted enough to be suitable for C, while also being
    useful enough to make it worth the effort.

    Remember, when you add these things to your own language, you have your
    own needs in mind and can ignore everything else, all corner cases, and
    all complications. Putting a feature in C means making decisions like
    figuring out what type the expression "b..c" has, whether the various
    bits and pieces have to be constants or if they can be variables, how
    the operator precedences work, how to treat floating point numbers or
    mixes of different types, and countless other factors. If a language
    already has the concepts, rules and grammar for ranges or lists, adding
    an "in" operator is natural - if not, then it's a huge amount of extra
    junk pulled into the language and syntax for a very minor gain.

    I don't disagree that it could be useful, and I'm sure I'd use it if it
    existed in C, I just disagree that it makes sense in C.



    It is incredibly useful:

       if c in [' ', '\t', '\n'] then ... # whitespace
       if b in 0..255 then
       if b in u8.bounds then             # alternative

    Not to forget:

       if x = y = 0 then                  # both x and y are zero

    It doesn't need the full spec of the higher level language.

    It would be easy enough to write a macro "in_range(a, x, b)" that
    would do the job.  It is even easier, and more productive, that you
    simply write the "valid_char" function and use it, if that's what you
    need.

    Yes it would be easier - to provide an ugly, half-assed solution that

    You and I are British - the term is "half-arsed" :-)

    everyone will write a different way (I would use (x, a, b) for example),
    and which can go wrong as soon as someone writes (a, x(), b).

    That's the problem with the macro scheme, it stops the language properly evolving.


    If it were considered useful enough, it could be standardised in the C
    library. If it is not useful enough to standardise in the library, it
    is certainly not useful enough to put in the language itself.

    In practice, while I would put something like this in a new language, I
    don't think it is important enough to try to add to C. When you need to
    do a lot of checks, you'd put them within a function (or macro if you
    prefer), such as "isspace()".

  • From bart@21:1/5 to David Brown on Tue Jun 4 15:16:57 2024
    On 04/06/2024 14:24, David Brown wrote:
    On 04/06/2024 13:23, bart wrote:
    On 04/06/2024 12:02, David Brown wrote:

    It is fine if you have a language that has good support for lists,
    sets, ranges, and other higher-level features - then an "in" keyword
    is a great idea.  But C is not such a language, and that kind of
    feature would be well outside the scope of the language.

    I disagree. I have a script language where 'in' works with all sorts
    of data types, and where ranges like a..b and sets like [a..b, c, d,
    e] are actual types.

    C is not a script language.


    Yet I also introduced 'in' into my systems language, even though it is
    very restricted:

         if a in b..c then
         if a in [b, c, d] then

    This is limited to integer types. The set construct here doesn't allow
    ranges (it could have done). Neither the range or set is a datatype -
    it just syntax. (I can't do range r := 1..10.)

    Adding such a feature to your own personal language, for your own
    personal use, is easy enough (relative to the rest of the work involved
    in designing your own personal language and making tools for it, which
    is of course no small feat).  Adding it to C with its standards,
    existing code, toolchains, additional tools, developers, etc., is a
    whole different kettle of fish.

    I was responding to your comment:

    "and that kind of feature would be well outside the scope of the language."

    I think it can suit that level of language if you avoid being too ambitious.

    I agree it is not practical to apply to C at this point, not without
    making it ugly or unwieldy enough that people might as well use existing solutions.

    (Such a feature also aids simpler non-optimising compilers. Take these
    examples that all do the same thing:

    if a <= f() and f() <= c then fi

    if a <= f() <= c then fi

    if f() in a..c then fi

    If the two f() calls in the first example were considered common
    subexpressions, I don't have the means in my compiler to detect
    that and evaluate them just once.

    In the other two examples, the language lets you express that directly.

    Even for a simpler 'b in a..c' example, it is easier to generate more
    efficient code, and do that more efficiently too than building something
    up only to tear it down again.)


    It would be easy enough to write a macro "in_range(a, x, b)" that
    would do the job.  It is even easier, and more productive, that you
    simply write the "valid_char" function and use it, if that's what you
    need.

    Yes it would be easier - to provide an ugly, half-assed solution that

    You and I are British - the term is "half-arsed" :-)

    I'm catering for a wider readership.

    (Actually I'm not quite considered British enough to be allowed in the
    upcoming election.)

  • From David Brown@21:1/5 to bart on Tue Jun 4 17:40:22 2024
    On 04/06/2024 16:16, bart wrote:
    On 04/06/2024 14:24, David Brown wrote:
    On 04/06/2024 13:23, bart wrote:
    On 04/06/2024 12:02, David Brown wrote:

    It is fine if you have a language that has good support for lists,
    sets, ranges, and other higher-level features - then an "in" keyword
    is a great idea.  But C is not such a language, and that kind of
    feature would be well outside the scope of the language.

    I disagree. I have a script language where 'in' works with all sorts
    of data types, and where ranges like a..b and sets like [a..b, c, d,
    e] are actual types.

    C is not a script language.


    Yet I also introduced 'in' into my systems language, even though it
    is very restricted:

         if a in b..c then
         if a in [b, c, d] then

    This is limited to integer types. The set construct here doesn't
    allow ranges (it could have done). Neither the range or set is a
    datatype - it just syntax. (I can't do range r := 1..10.)

    Adding such a feature to your own personal language, for your own
    personal use, is easy enough (relative to the rest of the work
    involved in designing your own personal language and making tools for
    it, which is of course no small feat).  Adding it to C with its
    standards, existing code, toolchains, additional tools, developers,
    etc., is a whole different kettle of fish.

    I was responding to your comment:

    "and that kind of feature would be well outside the scope of the language."

    I think it can suit that level of language if you avoid being too
    ambitious.


    It might be that we would agree on that if we worked hard enough to find
    a common definition for "that level of language". But I think that
    would be a lot of time and effort for little purpose. I do agree that
    with enough limitation in the scope of the feature, it is less
    unreasonable for a low-level language. But I think I would want to
    limit the scope until there is little point in the "in" operator - or I
    would want to go the other direction and define something like Pascal's
    sets with many more operators and uses.

    I agree it is not practical to apply to C at this point, not without
    making it ugly or unwieldy enough that people might as well use existing solutions.

    Yes.


    (Such a feature also aids simpler non-optimising compilers. Take these examples that all do the same thing:

        if a <= f() and f() <= c then fi

        if a <= f() <= c then fi

        if f() in a..c then fi

    If the two f() calls in the first example were considered common
    subexpressions, I don't have the means in my compiler to detect
    that and evaluate them just once.


    I see your point, but I rate the design and use of a language as /much/
    more important than the ease of implementation. I realise the balance
    is a bit different when the user is the implementer.

    In the other two examples, the language lets you express that directly.

    Even for a simpler 'b in a..c' example, it is easier to generate more efficient code, and do that more efficiently too than building something
    up only to tear it down again.)


    It would be easy enough to write a macro "in_range(a, x, b)" that
    would do the job.  It is even easier, and more productive, that you
    simply write the "valid_char" function and use it, if that's what
    you need.

    Yes it would be easier - to provide an ugly, half-assed solution that

    You and I are British - the term is "half-arsed" :-)

    I'm catering for a wider readership.

    We can educate them!


    (Actually I'm not quite considered British enough to be allowed in the upcoming election.)


    I can't vote either, but that's because I don't live in the UK. And
    given the state of UK politics these days, I'm happy to be out of it.
    For quite a while, the Scottish Parliament were looking like the adults
    in the room, but they've managed to mess things up for themselves too.

  • From bart@21:1/5 to Scott Lurndal on Tue Jun 4 16:58:43 2024
    On 04/06/2024 16:27, Scott Lurndal wrote:
    David Brown <[email protected]> writes:
    On 04/06/2024 13:23, bart wrote:

    It is incredibly useful:

       if c in [' ', '\t', '\n'] then ... # whitespace

    if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

    That doesn't do the same thing. In my example, c is a character, not a
    string.

    To achieve the same thing using strpbrk requires code like this:

    char c[2];

    c[0]=rand()&255; // Create a string
    c[1]=0;

    if (strpbrk(c, " \t\n") != NULL) puts("whitespace");

    If I compile this with gcc -O3, then the checking part is this:

    lea rcx, 46[rsp]
    mov BYTE PTR 47[rsp], 0
    lea rdx, .LC0[rip]
    mov BYTE PTR 46[rsp], al
    call strpbrk // CALL TO LIBRARY FUNCTION
    test rax, rax
    je .L2
    lea rcx, .LC1[rip]
    call puts

    I don't know what it gets up to inside strpbrk. If I write this in my
    language:

    if c in [9,10,32] then
    puts("whitespace")
    fi

    The generated code is this (using alternate register names, D0 = rax):

    mov D0, D3 # (could have tested D3 (= c) directly.)
    cmp D0, 9
    jz L4
    cmp D0, 10
    jz L4
    cmp D0, 32
    jnz L3
    L4:
    lea D10, [L5]
    call puts*
    L3:

    Anyway, the construct is not limited to character codes that can be
    contained within a string. It works for 64-bit values which can include
    0. And it could be extended to other scalar types like floats and pointers.

  • From Scott Lurndal@21:1/5 to David Brown on Tue Jun 4 15:27:31 2024
    David Brown <[email protected]> writes:
    On 04/06/2024 13:23, bart wrote:

    It is incredibly useful:

       if c in [' ', '\t', '\n'] then ... # whitespace

    if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.


    If it were considered useful enough, it could be standardised in the C
    library. If it is not useful enough to standardise in the library, it
    is certainly not useful enough to put in the language itself.

    indeed.

  • From bart@21:1/5 to Michael S on Tue Jun 4 17:54:33 2024
    On 04/06/2024 17:25, Michael S wrote:
    On Tue, 4 Jun 2024 16:58:43 +0100
    bart <[email protected]> wrote:

    On 04/06/2024 16:27, Scott Lurndal wrote:
    David Brown <[email protected]> writes:
    On 04/06/2024 13:23, bart wrote:

    It is incredibly useful:

       if c in [' ', '\t', '\n'] then ... # whitespace

    if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

    That doesn't do the same thing. In my example, c is a character, not
    a string.



    Will that be better?
    if (memchr(" \t\n", c, 3) != NULL)


    It's a better match. But on gcc-O3, it still calls the library function
    up to version 12.x. After that, it's smart enough to generate similar
    code to my non-optimising compiler using the built-in feature.

    It is also still limited to byte values.

    My approach is to fix a language, which is easier, than to expend
    magnitudes more effort in elaborate tools and ultra-smart compilers.

  • From Michael S@21:1/5 to bart on Tue Jun 4 19:25:47 2024
    On Tue, 4 Jun 2024 16:58:43 +0100
    bart <[email protected]> wrote:

    On 04/06/2024 16:27, Scott Lurndal wrote:
    David Brown <[email protected]> writes:
    On 04/06/2024 13:23, bart wrote:

    It is incredibly useful:

       if c in [' ', '\t', '\n'] then ... # whitespace

    if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

    That doesn't do the same thing. In my example, c is a character, not
    a string.



    Will that be better?
    if (memchr(" \t\n", c, 3) != NULL)

  • From Lawrence D'Oliveiro@21:1/5 to All on Tue Jun 4 23:12:07 2024
    On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

    i think that we need not worsen the matter with new ternary operators.

    These are not ternary operators.

  • From bart@21:1/5 to Lawrence D'Oliveiro on Wed Jun 5 00:22:36 2024
    On 05/06/2024 00:12, Lawrence D'Oliveiro wrote:
    On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

    i think that we need not worsen the matter with new ternary operators.

    These are not ternary operators.

    So what are they?

    I've implemented them several times, and found they really need to be
    treated as a special kind of n-ary operator.

    An AST node for A+B say would have single operator "+", and two operands.

    One for "A<=B<=C" would have two operators, "<=" and "<=", and three operands.
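
    A node of that shape could be sketched in C roughly as follows. This is an illustration under assumptions, not the representation any compiler in this thread actually uses: the type and function names are invented, and the operands are taken as already-evaluated long values so that each one is looked at as a value only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operator tags for a comparison chain. */
enum cmp_op { CMP_EQ, CMP_NE, CMP_LT, CMP_LE, CMP_GT, CMP_GE };

/* One chain node: n operands and n - 1 operators, so "A <= B <= C"
   is stored as operands {A, B, C} and ops {CMP_LE, CMP_LE}. */
struct cmp_chain {
    size_t n;               /* number of operands, at least 2 */
    const long *operands;   /* already-evaluated operand values */
    const enum cmp_op *ops; /* n - 1 operators */
};

/* Apply a single two-operand comparison. */
static bool apply_cmp(enum cmp_op op, long x, long y)
{
    switch (op) {
    case CMP_EQ: return x == y;
    case CMP_NE: return x != y;
    case CMP_LT: return x < y;
    case CMP_LE: return x <= y;
    case CMP_GT: return x > y;
    case CMP_GE: return x >= y;
    }
    return false;
}

/* True only if every adjacent pair satisfies its operator;
   stops at the first failing pair. */
static bool eval_chain(const struct cmp_chain *c)
{
    for (size_t i = 0; i + 1 < c->n; i++) {
        if (!apply_cmp(c->ops[i], c->operands[i], c->operands[i + 1]))
            return false;
    }
    return true;
}
```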

  • From Lawrence D'Oliveiro@21:1/5 to bart on Wed Jun 5 01:30:57 2024
    On Wed, 5 Jun 2024 00:22:36 +0100, bart wrote:

    On 05/06/2024 00:12, Lawrence D'Oliveiro wrote:

    On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

    i think that we need not worsen the matter with new ternary operators.

    These are not ternary operators.

    So what are they?

    A special case in the syntax rules for the comparison operators <https://docs.python.org/3/reference/expressions.html#comparisons>.

    I've implemented them several times, and found they really need to be
    treated as a special kind of n-ary operator.

    Remember, Python allows users to define custom overloads for the standard operators. For comparisons, these functions always take two operands, and
    the compiler takes care of invoking them correctly to handle interval comparisons.

  • From Lawrence D'Oliveiro@21:1/5 to bart on Wed Jun 5 03:29:18 2024
    On Tue, 4 Jun 2024 12:23:15 +0100, bart wrote:

    That's the problem with the macro scheme, it stops the language properly evolving.

    The problem is the way C does macros. Other languages with powerful macros (*cough* Lisp *cough*) aren’t stopped from evolving; quite the opposite.

  • From bart@21:1/5 to Lawrence D'Oliveiro on Thu Jun 6 19:48:42 2024
    On 05/06/2024 02:30, Lawrence D'Oliveiro wrote:
    On Wed, 5 Jun 2024 00:22:36 +0100, bart wrote:

    On 05/06/2024 00:12, Lawrence D'Oliveiro wrote:

    On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

    i think that we need not worsen the matter with new ternary operators.

    These are not ternary operators.

    So what are they?

    A special case in the syntax rules for the comparison operators <https://docs.python.org/3/reference/expressions.html#comparisons>.

    I've implemented them several times, and found they really need to be
    treated as a special kind of n-ary operator.

    Remember, Python allows users to define custom overloads for the standard operators. For comparisons, these functions always take two operands, and
    the compiler takes care of invoking them correctly to handle interval comparisons.


    Well, for these 3 lines in my scripting language:


    if a = b then end # universal
    if a = b < c then end # chained (like Python, unlike C)
    if (a = b) < c then end # emulate C behaviour

    These are the ASTs produced (2: is the empty True branch; 3: would be
    for the 'else' branch, not present here):

    - 1 if:
    - - 1 eq:
    - - - 1 name: a
    - - - 2 name: b
    - - 2 block:

    - 1 if:
    - - 1 cmpchain: eq lt
    - - - 1 name: a
    - - - 1 name: b
    - - - 1 name: c
    - - 2 block:

    - 1 if:
    - - 1 lt:
    - - - 1 eq:
    - - - - 1 name: a
    - - - - 2 name: b
    - - - 2 name: c
    - - 2 block:

    Notice the middle one is one linear group with N operands and N-1
    comparisons.

    No operator overloads are allowed, but if they were, it would still
    work, but a comparison operator would be required to return True or
    False from its two operands. It would be unwise for it to return a
    string for example.

  • From Lawrence D'Oliveiro@21:1/5 to bart on Thu Jun 6 22:54:38 2024
    On Thu, 6 Jun 2024 19:48:42 +0100, bart wrote:

    if a = b then end # universal
    if a = b < c then end

    Really?? You allow that?

  • From bart@21:1/5 to Lawrence D'Oliveiro on Fri Jun 7 01:52:34 2024
    On 06/06/2024 23:54, Lawrence D'Oliveiro wrote:
    On Thu, 6 Jun 2024 19:48:42 +0100, bart wrote:

    if a = b then end # universal
    if a = b < c then end

    Really?? You allow that?

    Which bit: The 'a=b' (that's equality); the 'a=b<c' (just like Python);
    or 'then end' (just an empty block)?

  • From Lawrence D'Oliveiro@21:1/5 to bart on Fri Jun 7 02:17:22 2024
    On Fri, 7 Jun 2024 01:52:34 +0100, bart wrote:

    The 'a=b' (that's equality)

    Not in C it isn’t.

  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Fri Jun 7 04:25:17 2024
    On Thu, 06 Jun 2024 20:53:41 -0700, Keith Thompson wrote:

    Lawrence D'Oliveiro <[email protected]d> writes:
    On Fri, 7 Jun 2024 01:52:34 +0100, bart wrote:
    The 'a=b' (that's equality)

    Not in C it isn’t.

    Of course not. You snipped the part where Bart very clearly said that
    he was talking about his own scripting language.

    Oh, sorry.

  • From David Brown@21:1/5 to Keith Thompson on Fri Jun 7 11:22:12 2024
    On 07/06/2024 05:53, Keith Thompson wrote:
    Lawrence D'Oliveiro <[email protected]d> writes:
    On Fri, 7 Jun 2024 01:52:34 +0100, bart wrote:
    The 'a=b' (that's equality)

    Not in C it isn’t.

    Of course not. You snipped the part where Bart very clearly said that
    he was talking about his own scripting language. And we're talking
    about a proposed new C feature, so I have no problem with references to
    other languages.

    And there's precedent in other languages (Python, as Bart already
    pointed out) for `a == b < c` being equivalent to `a == b && b < c`,
    but with b evaluated only once.

    *If* C were to adopt chained comparisons, I would have no problem
    with `a == b < c` being supported with the obvious meaning.
    I dislike arbitrary restrictions. (Though the fact that == and
    < have different precedences would have to be resolved somehow.)
    In principle it could quietly change the behavior of existing code,
    but it's likely that most such code was already wrong. I don't
    advocate making such a change, and I don't think it's likely to
    happen, but I wouldn't object to it.


    While it is true that such an addition to C would be very unlikely to
    break existing code (any current code that uses "a == b < c" or "a < b <
    c" is probably incorrect), there is a potentially serious consequence
    that has not been considered here.

    Suppose C26 allows "a < b < c" and "a == b < c" chains, like Python, and
    some people start using it in real code. You are going to get two
    effects. One is that some people will read that new code but not know
    the new interpretation. They will think the code parses as "a == (b <
    c)", and is likely a mistake, or does something different from what it
    now actually does.

    The other is that some people will get used to it and think this is how
    C treats chained operators. The code or similar expressions will end up
    in C code that is compiled under different standards. Old C standards
    are used all the time - there are still some people who seem to think
    new coding in C89/C90 is a /feature/, rather than historical
    re-enactment. You would get code that is tested and correct in C26 used incorrectly as C23 or older.

    There is also the C++ compatibility question. C++ provides flexible
    operator overloading combined with a poverty of available operators, so
    the relational operators <, >, <= and >= are sometimes used for
    completely different purposes, similar to the >> and << operators.
    Chaining relational operators would complicate this significantly, I
    think. If C++ adopted the feature it would be a mess to support
    overloading, and if they did not, "a < b < c" in C and C++ would be
    valid but with completely different semantics. Neither option is good.

    To me, this possibility, along with the confusion it would cause,
    totally outweighs any potential convenience of chained comparisons. I
    have never found them helpful in Python coding, and I can't imagine them
    being of any interest in my C code.

    Even in a new language, I would not want to see chained relational
    operators unless you had a strict requirement that relational operators evaluate to a boolean type with no support for relational operators
    between booleans, and no support for comparison or other operators
    between booleans and other types. And even then, what is "a == b == c" supposed to mean, or "a != b != c" ?
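
    The difference between the current C parse and the chained reading can be checked with a few lines of C. This is a minimal sketch with made-up function names; the parentheses only spell out the grouping C already applies to "a < b < c" and "a == b < c".

```c
#include <stdbool.h>

/* What C evaluates today for "a < b < c": the first comparison
   yields 0 or 1, and that int is then compared with c. */
static int c_parse_lt(int a, int b, int c)
{
    return (a < b) < c;
}

/* What C evaluates today for "a == b < c": relational operators
   bind more tightly than equality operators. */
static int c_parse_eq_lt(int a, int b, int c)
{
    return a == (b < c);
}

/* The chained (Python-style) readings of the same two expressions. */
static bool chained_lt(int a, int b, int c)
{
    return a < b && b < c;
}

static bool chained_eq_lt(int a, int b, int c)
{
    return a == b && b < c;
}
```

    With a = 5, b = 3, c = 2 the current parse of "a < b < c" is true (0 < 2) while the chained reading is false, which is the kind of silent disagreement between language versions described above.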

  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Fri Jun 7 10:42:03 2024
    On Fri, 7 Jun 2024 11:22:12 +0200, David Brown wrote:

    C++ provides flexible
    operator overloading combined with a poverty of available operators, so
    the relational operators <, >, <= and >= are sometimes used for
    completely different purposes, similar to the >> and << operators.
    Chaining relational operators would complicate this significantly, I
    think.

    Python offers just as flexible operator overloading, and it can handle
    this feature without any trouble.

    Of course, Python has only a fraction of the complexity of C++. It could
    be that an unexpected interaction with some other weird C++ feature could
    cause problems.

  • From Lawrence D'Oliveiro@21:1/5 to bart on Fri Jun 7 10:45:05 2024
    On Fri, 7 Jun 2024 11:28:23 +0100, bart wrote:

    However, 'a > b <= c' is not clear.

    One thing it would have been handy to have is some way of saying

    b < a or b > c

    in a chained comparison somehow. In other words, “not (a < b < c)” without the negation and need for the parentheses (and correspondingly for the
    obvious usages of “<=” and “>=”).

  • From bart@21:1/5 to David Brown on Fri Jun 7 11:28:23 2024
    On 07/06/2024 10:22, David Brown wrote:

    To me, this possibility, along with the confusion it would cause,
    totally outweighs any potential convenience of chained comparisons.

    Even in a new language, I would not want to see chained relational
    operators unless you had a strict requirement that relational operators evaluate to a boolean type with no support for relational operators
    between booleans, and no support for comparison or other operators
    between booleans and other types.


    And even then, what is "a == b == c"
    supposed to mean, or "a != b != c" ?

    Some combinations are confusing, and in my languages I would suggest
    they are avoided (but stop short of banning them).

    Ones like 'a == b == c' are straightforward: you're testing that all a,
    b, c have the same value.

    With relational ops like 'a < b <= c', that means:

    a < b' && b' <= c

    with b' indicating that the same evaluated copy of b (which can be an
    arbitrary term) should be used in both comparisons.

    However, 'a > b <= c' is not clear. While the above indicate a
    relationship between a and c when the whole expression is True, here you
    can't deduce any such relationship; all of these could be True:

    a > c, a == c, a < c

    And there was one more I'd forgotten about:

    a != b != c

    This looks very much like the exact opposite of ' a == b == c', but it
    isn't! (I think it would need a != b != c != a).

    So the restrictions I would suggest are:

    * Not mixing <, <= with >, >= in the same chain (any angle brackets
    should point the same way)

    * Not allowing != in the chain.


    Such a chain also requires that all 6 (or 5) operators have the same precedence, as you can't have 'a = b <= c' mean 'a = (b <= c)'.


    I have never found them helpful in Python coding, and I can't imagine them
    being of any interest in my C code.

    The most common uses for me are comparing that N terms are equal (here =
    means equality):

    if x.tag = y.tag = ti64
    if a = b = 0

    I also used them for range-checking:

    if a <= b <= c

    until I introduced 'if b in a..c' for that purpose. (However I would
    still need 'a <= b <= c' for floating point.)

    What I might then suggest for C, as a first step, is to allow only
    chained '==' comparisons. That avoids those ambiguous cases, and also
    the problem with mixed precedence, while still providing a handy new
    feature.
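
    The point that 'a != b != c' is not the opposite of 'a == b == c' can be made concrete in C. The helpers below are only a sketch with invented names; they spell out the pairwise meaning a chained comparison would have.

```c
#include <stdbool.h>

/* Chained reading of a == b == c: all three values are the same. */
static bool all_equal3(int a, int b, int c)
{
    return a == b && b == c;
}

/* Chained reading of a != b != c: only adjacent pairs are compared,
   so a and c may still be equal. */
static bool chained_ne3(int a, int b, int c)
{
    return a != b && b != c;
}

/* Chained reading of a != b != c != a: all three values differ. */
static bool all_distinct3(int a, int b, int c)
{
    return a != b && b != c && c != a;
}
```

    For (1, 2, 1) chained_ne3 is true although a equals c, and for (1, 1, 2) both all_equal3 and chained_ne3 are false, so neither is the negation of the other.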

  • From David Brown@21:1/5 to Keith Thompson on Fri Jun 7 13:04:34 2024
    On 07/06/2024 11:55, Keith Thompson wrote:
    David Brown <[email protected]> writes:
    On 07/06/2024 05:53, Keith Thompson wrote:
    [...]


    There is also the C++ compatibility question. C++ provides flexible
    operator overloading combined with a poverty of available operators,
    so the relational operators <, >, <= and >= are sometimes used for
    completely different purposes, similar to the >> and <<
    operators. Chaining relational operators would complicate this
    significantly, I think. If C++ adopted the feature it would be a mess
    to support overloading, and if they did not, "a < b < c" in C and C++
    would be valid but with completely different semantics. Neither
    option is good.

    I mentioned earlier that someone did a study of open source C++ code and found no occurrences of things like "a < b < c", except for 4 cases that
    were intended to be chained but would behave incorrectly. I presume
    that this study would have covered overloaded operators.


    You helpfully quoted from that study, and it included :

    - Many instances of using successive comparison operators in DSLs that
    overloaded these operators to give meaning unrelated to comparisons.

    So yes, it seems that overloading the relational operators for other
    purposes and then chaining them is a real thing.

    To me, this possibility, along with the confusion it would cause,
    totally outweighs any potential convenience of chained comparisons. I
    have never found them helpful in Python coding, and I can't imagine
    them being of any interest in my C code.

    I agree. I wouldn't mind being able to use the feature, and I think
    I've actually used it in Python, but its lack isn't a big problem.

    Even in a new language, I would not want to see chained relational
    operators unless you had a strict requirement that relational
    operators evaluate to a boolean type with no support for relational
    operators between booleans, and no support for comparison or other
    operators between booleans and other types.

    In Python, all comparison operators (<, >, ==, >=, <=, !=, is, is not,
    in, not in) have the same precedence, and chained operations are
    specified straightforwardly. They evaluate to a bool result. Boolean
    values can be compared (False < True), which doesn't seem to cause any problems.

    https://docs.python.org/3/reference/expressions.html#comparisons

    And even then, what is "a
    == b == c" supposed to mean, or "a != b != c" ?

    "a == b && b == c", and "a != b && b != c", respectively, except that b
    is only evaluated once.


    If "c" is a boolean, some might think the "natural" interpretation of "a
    == b == c" is "(a == b) == c" - it is the current semantics in C. Some
    people may think that "a != b != c" should be interpreted as "(a != b) &
    (b != c) & (a != c)".

    It's one thing to make a rigid definition of the meaning in a language,
    picking a consistent set of rules of precedence and syntax. It is
    another thing to make sure it matches up with the interpretations people
    have from normal mathematics, natural language, and other programming languages. When there is a mismatch, you need good reasons to accept
    the syntax as a good language design idea - the more likely the misunderstanding, the stronger reasons you need.

    To me, the potential misunderstandings of including != in chains is far
    too high in comparison to the meagre benefits. The use of == could be
    clear in some situations (combined with strong type checking to help
    catch mistakes) but not others. I could see a chain of a mix of < and
    <= making sense, or of > and >=, and occasionally being useful. I don't
    think there is a point in allowing more than that.

    After all, if all you need is to avoid evaluating "b" more than once,
    you can just do:

    auto const b_ = b;
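
    Spelled out a little further, the temporary looks like this in C. The sketch assumes a made-up operand, read_sensor, that has a side effect (it bumps a counter), and it uses an explicit const int rather than C23's auto so that it also compiles under older standards.

```c
#include <stdbool.h>

static int read_count; /* counts how often the middle operand is evaluated */

/* Hypothetical operand with a side effect. */
static int read_sensor(void)
{
    read_count++;
    return 10;
}

/* Range check that evaluates the middle operand exactly once. */
static bool sensor_in_range(int lo, int hi)
{
    const int b_ = read_sensor();
    return lo <= b_ && b_ <= hi;
}
```

    Writing lo <= read_sensor() && read_sensor() <= hi instead would call the function twice whenever the first comparison succeeds.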

  • From Michael S@21:1/5 to Lawrence D'Oliveiro on Fri Jun 7 14:51:09 2024
    On Fri, 7 Jun 2024 10:45:05 -0000 (UTC)
    Lawrence D'Oliveiro <[email protected]d> wrote:

    On Fri, 7 Jun 2024 11:28:23 +0100, bart wrote:

    However, 'a > b <= c' is not clear.

    One thing it would have been handy to have is some way of saying

    b < a or b > c

    in a chained comparison somehow. In other words, “not (a < b < c)” without the negation and need for the parentheses (and
    correspondingly for the obvious usages of “<=” and “>=”).

    "Out of range" expressed as the negation of "in range" looks to me the least confusing.

  • From David Brown@21:1/5 to bart on Fri Jun 7 13:17:50 2024
    On 07/06/2024 12:28, bart wrote:
    On 07/06/2024 10:22, David Brown wrote:



    So the restrictions I would suggest are:

    * Not mixing <, <= with >, >= in the same chain (any angle brackets
      should point the same way)

    * Not allowing != in the chain.


    I think these are good ideas.


    Such a chain also requires that all 6 (or 5) operators have the same precedence, as you can't have 'a = b <= c' mean 'a = (b <= c)'.


    I have never found them helpful in Python coding, and I can't imagine them
    being of any interest in my C code.

    The most common uses for me are comparing that N terms are equal (here = means equality):

      if x.tag = y.tag = ti64
      if a = b = 0


    These do not correspond to what you want to say.

    If someone has balloons, and you want to check that they have 2 red
    balloons and two yellow balloons, you do start off by checking if the
    number of red balloons is the same as the number of yellow balloons, and
    then that the number of yellow balloons is 2.

    Code syntax should, where practical, reflect its purpose and intent.
    You should therefore write (adjusting to C, Python, or your own syntax
    as desired) :

    if red_balloons == 2 and yellow_balloons == 2 ...

    If you don't want to write the "magic number" 2 twice, you give it a name :

    expected_balloons = 2
    if red_balloons == expected_balloons
    and yellow_balloons == expected_balloons...


    I also used them for range-checking:

      if a <= b <= c

    until I introduced 'if b in a..c' for that purpose. (However I would
    still need 'a <= b <= c' for floating point.)

    Using "in" and a range or interval syntax would usually be clearer, and
    closer to the intended meaning, IMHO.

    Both of these chained shortcuts are used in mathematics, so they are not unfamiliar. But in writing mathematics, compared to programming, you
    have a lot more freedom to expect the reader to interpret things
    sensibly, and vastly more freedom in layout while also having an
    incentive to keep things compact. Good or common mathematical syntax
    does not necessarily translate directly to good programming syntax.


    What I might then suggest for C, as a first step, is to allow only
    chained '==' comparisons. That avoids those ambiguous cases, and also
    the problem with mixed precedence, while still providing a handy new
    feature.


  • From bart@21:1/5 to David Brown on Fri Jun 7 13:20:33 2024
    On 07/06/2024 12:17, David Brown wrote:
    On 07/06/2024 12:28, bart wrote:
    On 07/06/2024 10:22, David Brown wrote:

    I have never found them helpful in Python coding, and I can't imagine
    them being of any interest in my C code.

    The most common uses for me are comparing that N terms are equal (here
    = means equality):

       if x.tag = y.tag = ti64
       if a = b = 0


    These do not correspond to what you want to say.

    If someone has balloons, and you want to check that they have 2 red
    balloons and two yellow balloons, you do start off by checking if the
    number of red balloons is the same as the number of yellow balloons, and
    then that the number of yellow balloons is 2.

    ("you don't"?)

    That's not quite the intent of my examples, which is:

    (1) That x.tag/y.tag or a/b are equal to each other

    (2) That they also have a particular value


    Code syntax should, where practical, reflect its purpose and intent. You should therefore write (adjusting to C, Python, or your own syntax as desired) :

        if red_balloons == 2 and yellow_balloons == 2 ...

    Here that connection is lost. You might infer it, but you don't know
    whether, at some point, the program could be changed to require 3 red
    balloons, while still needing 2 yellow ones.

    Or someone could just write 3 by mistake. By repeating such terms, there
    is more opportunity for mistakes, and the connection between terms is
    looser.

    Here is another real example, first written in static code (not in C;
    I've shortened 'sample' to avoid line-wrap):

    if hdr.hsamp[2] = hdr.vsamp[2] = hdr.hsamp[3] = hdr.vsamp[3] and
    (hdr.hsamp[1] <= 2 and hdr.vsamp[1] <= 2) then

    pimage := loadcolour(fs, hdr.hsamp[1], hdr.vsamp[1])

    and here in dynamic code:

    (vsample1, vsample2, vsample3) := hdr.vsample
    (hsample1, hsample2, hsample3) := hdr.hsample
    ...
    if hsample2 = vsample2 = hsample3 = vsample3 and
    (hsample1 <= 2 and vsample1 <=2 ) then
    ....

    Both check that 4 terms are identical, but there is no specific value
    they have to be.

    (It would have been nice to avoid that repetition of '2' in that second
    test, but that's more difficult to achieve. There, hsample1/vsample1
    don't need to have the same value.

    It's expressible as something like 'reduce(and, map(<=, (a, b), 2)' but
    that's using a sledgehammer just to avoid that duplicate '2'. It's not
    suited to lower-level code either.)
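
    In C, without chained comparisons, the two tests above could be approximated by small array helpers. This is a sketch only; all_equal and all_le are invented names, and packing the terms into an array is an assumption about how the caller would use them.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if all n values are identical (trivially true for n <= 1). */
static bool all_equal(const int *v, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (v[i] != v[0])
            return false;
    }
    return true;
}

/* True if every value is <= bound, so the bound is written only once. */
static bool all_le(const int *v, size_t n, int bound)
{
    for (size_t i = 0; i < n; i++) {
        if (v[i] > bound)
            return false;
    }
    return true;
}
```

    The four-term equality test then becomes all_equal((const int[]){h2, v2, h3, v3}, 4), which keeps each term written once but is arguably heavier than the chained form.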

    If you don't want to write the "magic number" 2 twice, you give it a name :

        expected_balloons = 2
        if red_balloons == expected_balloons
            and yellow_balloons == expected_balloons...


    I also used them for range-checking:

       if a <= b <= c

    until I introduced 'if b in a..c' for that purpose. (However I would
    still need 'a <= b <= c' for floating point.)

    Using "in" and a range or interval syntax would usually be clearer, and closer to the intended meaning, IMHO.

    Both of these chained shortcuts are used in mathematics, so they are not unfamiliar.  But in writing mathematics, compared to programming, you
    have a lot more freedom to expect the reader to interpret things
    sensibly, and vastly more freedom in layout while also having an
    incentive to keep things compact.  Good or common mathematical syntax
    does not necessarily translate directly to good programming syntax.


    What I might then suggest for C, as a first step, is to allow only
    chained '==' comparisons. That avoids those ambiguous cases, and also
    the problem with mixed precedence, while still providing a handy new
    feature.





  • From Scott Lurndal@21:1/5 to David Brown on Fri Jun 7 14:00:30 2024
    David Brown <[email protected]> writes:
    On 07/06/2024 12:28, bart wrote:
    On 07/06/2024 10:22, David Brown wrote:



    So the restrictions I would suggest are:

    * Not mixing <, <= with >, >= in the same chain (any angle brackets
      should point the same way)

    * Not allowing != in the chain.


    I think these are good ideas.

    The most common idiom that I would find useful for chained
    comparisons is range checking:

    if ((address >= base) && (address < limit)) ...

    I'd love a new 'in' operator:

    if (address in [base, limit)) ...

    Otherwise I agree with you and Keith that a general chained
    comparison syntax will be problematic.
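
    Until something like that exists, the half-open range check is usually wrapped in a helper. The sketch below uses invented names and uint64_t for the addresses; the second function is the common single-comparison variant, which assumes base <= limit and relies on unsigned wraparound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Straightforward test for address in [base, limit). */
static bool in_half_open(uint64_t address, uint64_t base, uint64_t limit)
{
    return address >= base && address < limit;
}

/* Single-comparison variant; assumes base <= limit. If address is
   below base, the unsigned subtraction wraps to a very large value,
   so the comparison fails as required. */
static bool in_half_open_1cmp(uint64_t address, uint64_t base, uint64_t limit)
{
    return address - base < limit - base;
}
```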

  • From David Brown@21:1/5 to Keith Thompson on Sat Jun 8 17:42:19 2024
    On 07/06/2024 20:57, Keith Thompson wrote:
    David Brown <[email protected]> writes:
    On 07/06/2024 11:55, Keith Thompson wrote:
    David Brown <[email protected]> writes:
    On 07/06/2024 05:53, Keith Thompson wrote:
    [...]

    If "c" is a boolean, some might think the "natural" interpretation of
    "a == b == c" is "(a == b) == c" - it is the current semantics in C.
    Some people may think that "a != b != c" should be interpreted as "(a
    != b) & (b != c) & (a != c)".

    Yes, some people might be wrong.

    While you should expect people reading code - and certainly those
    writing it - to be fairly familiar with the programming language in
    question, not everyone is an expert. So when designing a language or considering its features, you have to look at what a significant
    fraction of users might get wrong. I don't know how often people might
    get these cases wrong, but I think it has enough potential that I'd be
    very wary of allowing it - at least as an addition to C.


    It's one thing to make a rigid definition of the meaning in a
    language, picking a consistent set of rules of precedence and syntax.
    It is another thing to make sure it matches up with the
    interpretations people have from normal mathematics, natural language,
    and other programming languages. When there is a mismatch, you need
    good reasons to accept the syntax as a good language design idea - the
    more likely the misunderstanding, the stronger reasons you need.

    To me, the potential misunderstandings of including != in chains is
    far too high in comparison to the meagre benefits. The use of ==
    could be clear in some situations (combined with strong type checking
    to help catch mistakes) but not others. I could see a chain of a mix
    of < and <= making sense, or of > and >=, and occasionally being
    useful. I don't think there is a point in allowing more than that.

    After all, if all you need is to avoid evaluating "b" more than once,
    you can just do:

    auto const b_ = b;

    There are two separate issues here.

    One is adding chained comparisons to C. We both agree that this is impractical because it would silently change the meaning of valid code.

    Yes.

    (Changing the meaning of old code isn't likely to be much of an issue,
    but any new code using the feature would quietly change behavior when compiled under older C standards or when ported to C++.)

    The other (arguably off-topic) is providing chained comparisons in other languages.

    I agree that this is a different matter, and for languages that don't
    have the level of established history and practice that C does, it could
    be a lot less problematic to add chained relational operators.

    Python does this well, in my opinion. All comparison
    operators have the same precedence, and the semantics of chained
    comparisons are defined straightforwardly. There are no arbitrary restrictions, so you can write things that some people might find ugly
    or confusing (if you have a language that bans ugly code, I'd like to
    see it). The meaning of `a <= b > c` or `a != b == c` is perfectly clear
    once you understand the rules, and it doesn't change if any of the
    operands are of type bool. `a != b != c` *doesn't* mean
    `a != b and a != c and b != c`. (If you want to test whether all three
    are unequal to each other, you can write `a != b != c != a`, though that
    evaluates `a` twice.)


    Fair enough.

    I /do/ find the chained relational operators (especially mixes of
    operators) ugly - or at least, I have not had occasion to write Python
    code myself where I thought a chain would look clearer than alternatives.

    But "ugly" is obviously highly subjective. The only way to ban ugly
    code is to do as Bart has done - develop your own language and tools for
    your own personal use, and never have to consider any one else's
    opinions or preferences!

  • From David Brown@21:1/5 to bart on Sun Jun 9 13:26:03 2024
    On 07/06/2024 14:20, bart wrote:
    On 07/06/2024 12:17, David Brown wrote:
    On 07/06/2024 12:28, bart wrote:
    On 07/06/2024 10:22, David Brown wrote:

    I have never found them helpful in Python coding, and I can't
    imagine them being of any interest in my C code.

    The most common uses for me are comparing that N terms are equal
    (here = means equality):

       if x.tag = y.tag = ti64
       if a = b = 0


    These do not correspond to what you want to say.

    If someone has balloons, and you want to check that they have 2 red
    balloons and two yellow balloons, you do start off by checking if the
    number of red balloons is the same as the number of yellow balloons,
    and then that the number of yellow balloons is 2.

    ("you don't"?)

    Yes, sorry about that!


    That's not quite the intent of my examples, which is:

    (1) That x.tag/y.tag or a/b are equal to each other

    (2) That they also have a particular value


    If that is your intent, then fair enough. But I think that is an
    unusual intent. Of course, I have no idea what "x.tag" or "ti64"
    represent, and if I had, the code might have made more sense to me.

    At least for the second example, if you did not have chaining equality,
    I think you would have preferred to write :

    if (a = 0) and (b = 0) ...

    rather than

    if (a = b) and (b = 0) ...

    If that is indeed the case, then IMHO the chaining version is less clear.


    Code syntax should, where practical, reflect its purpose and intent.
    You should therefore write (adjusting to C, Python, or your own syntax
    as desired) :

         if red_balloons == 2 and yellow_balloons == 2 ...

    Here that connection is lost. You might infer it, but you don't know
    whether, at some point, the program could be changed to require 3 red
    balloons, while still needing 2 yellow ones.

    Or someone could just write 3 by mistake. By repeating such terms, there
    is more opportunity for mistakes, and the connection between terms is
    looser.

    Sure. If this is a risk, use a separate constant (with appropriate
    name) for the numbers you want, and write that name rather than the
    number 2.


    Here is another real example, first written in static code (not in C;
    I've shortened 'sample' to avoid line-wrap):

        if hdr.hsamp[2] = hdr.vsamp[2] = hdr.hsamp[3] = hdr.vsamp[3] and
                (hdr.hsamp[1] <= 2 and hdr.vsamp[1] <= 2) then

            pimage := loadcolour(fs, hdr.hsamp[1], hdr.vsamp[1])

    and here in dynamic code:

        (vsample1, vsample2, vsample3) := hdr.vsample
        (hsample1, hsample2, hsample3) := hdr.hsample
        ...
        if hsample2 = vsample2 = hsample3 = vsample3 and
                (hsample1 <= 2 and vsample1 <=2 ) then
            ....

    Both check that 4 terms are identical, but there is no specific value
    they have to be.

    (It would have been nice to avoid that repetition of '2' in that second
    test, but that's more difficult to achieve. There, hsample1/vsample1
    don't need to have the same value.

    It's expressible as something like 'reduce(and, map(<=, (a, b), 2))' but
    that's using a sledgehammer just to avoid that duplicate '2'. It's not
    suited to lower-level code either.)
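For comparison, here is roughly what that reduce/map idea looks like in Python. This is only a sketch: the function names `all_at_most` and `all_at_most_reduce` are invented for illustration, and the first form is the one most Python code would use.

```python
from functools import reduce
import operator


def all_at_most(limit, *values):
    """Return True if every value is <= limit; the limit is written once."""
    return all(v <= limit for v in values)


def all_at_most_reduce(limit, *values):
    """Same test, spelled as a literal reduce-over-map (the sledgehammer)."""
    return reduce(operator.and_, (v <= limit for v in values), True)


# Hypothetical sample values standing in for hsample1 / vsample1.
hsample1, vsample1 = 1, 2
ok = all_at_most(2, hsample1, vsample1)
print(ok)  # True
```

Either way the '2' appears once, at the cost of a function call where the original code had two plain comparisons.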



    There comes a point when a function called "all_equal", or "rising", is
    the right choice. How those functions might be implemented is a matter
    of language, and also whether you want them to work with a variable
    parameter list or over arrays, lists, slices, or whatever the language
    supports. High-level functions like map and reduce could be part of
    this, along with folds. For example, in C++17 (which supports a fold
    syntax), you might write:

        template <typename H1>
        constexpr bool rising(H1)
        {
            return true;
        }

        template <typename H1, typename H2, typename ... T>
        constexpr bool rising(H1 head1, H2 head2, T... tail)
        {
            return head1 <= head2 && rising(head2, tail...);
        }


    Then

    rising(a, b, c, d)

    is interpreted as

    a <= b && rising(b, c, d)

    and so on.
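The same pair of helpers can be sketched in Python with a variable parameter list. This is an illustrative analogue, not a translation of the C++ above: it compares each adjacent pair, which is the same pattern the recursion unfolds into.

```python
def rising(*values):
    """True if each value is <= the next (vacuously true for 0 or 1 values)."""
    return all(x <= y for x, y in zip(values, values[1:]))


def all_equal(*values):
    """True if every adjacent pair is equal, i.e. all values are equal."""
    return all(x == y for x, y in zip(values, values[1:]))


print(rising(1, 2, 2, 5))   # True
print(rising(1, 3, 2))      # False
print(all_equal(4, 4, 4))   # True
```

Swapping `<=` for `<` gives a strictly rising variant; passing the operator in as a parameter would be another option.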


    I don't think it is fair to claim particular ways of writing these
    things are always clearer, or better, or uglier, or unclear - it will
    depend on the rest of the language, and how the code is used. But in
    general I think it helps to write code that follows the logic of what
    the writer really means, rather than alternative constructions that give
    the same result.

  • From bart@21:1/5 to David Brown on Mon Jun 10 16:33:12 2024
    On 09/06/2024 12:26, David Brown wrote:
    On 07/06/2024 14:20, bart wrote:
    On 07/06/2024 12:17, David Brown wrote:
    On 07/06/2024 12:28, bart wrote:
    On 07/06/2024 10:22, David Brown wrote:

    I have never found them helpful in Python coding, and I can't
    imagine them being of any interest in my C code.

    The most common uses for me are comparing that N terms are equal
    (here = means equality):

       if x.tag = y.tag = ti64
       if a = b = 0


    These do not correspond to what you want to say.

    If someone has balloons, and you want to check that they have 2 red
    balloons and two yellow balloons, you do start off by checking if the
    number of red balloons is the same as the number of yellow balloons,
    and then that the number of yellow balloons is 2.

    ("you don't"?)

    Yes, sorry about that!


    That's not quite the intent of my examples, which is:

    (1) That x.tag/y.tag or a/b are equal to each other

    (2) That they also have a particular value


    If that is your intent, then fair enough.  But I think that is an
    unusual intent.

    Really, checking that A and B both have the same value X is that unusual?

    Of course, I have no idea what "x.tag" or "ti64"
    represent,

    Here .tag contains type info, and it is checking that objects x and y
    both have type i64.

    and if I had, the code might have made more sense to me.

    At least for the second example, if you did not have chaining equality,
    I think you would have preferred to write :

        if (a = 0) and (b = 0) ...

    rather than

        if (a = b) and (b = 0) ...

    I think both alternates are fine if you did not have the feature being
    discussed. Although you might prefer the first if b was a more elaborate
    expression than is shown here.

    It does put the onus on the compiler to ensure that repeated terms
    (there are four rather than three) are coded efficiently.

    Note that the chained version can also be expressed as 'if a = 0 = b' for
    a 3-way comparison.

    If that is indeed the case, then IMHO the chaining version is less clear.


    Code syntax should, where practical, reflect its purpose and intent.
    You should therefore write (adjusting to C, Python, or your own
    syntax as desired) :

         if red_balloons == 2 and yellow_balloons == 2 ...

    Here that connection is lost. You might infer it, but you don't know
    whether, at some point, the program could be changed to require 3 red
    balloons, while still needing 2 yellow ones.

    Or someone could just write 3 by mistake. By repeating such terms,
    there is more opportunity for mistakes, and the connection between
    terms is looser.

    Sure.  If this is a risk, use a separate constant (with appropriate
    name) for the numbers you want, and write that name rather than the
    number 2.

    If you have several such names, you can still write the wrong one!

    Where you have a relationship that naturally involves 3 terms, and you
    express it with 4, then one of them needs to be repeated. But the
    language cannot enforce that extra term being one of the first three.

    The programmer also now has to deal with a repeated term that might have
    side-effects.
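The side-effect point can be seen in Python, which evaluates each operand of a chain at most once. In this sketch `middle` is a made-up function that records its own calls.

```python
calls = []


def middle():
    """Stand-in for a term with a side effect: records each evaluation."""
    calls.append("middle")
    return 5


# Chained form: the middle term is evaluated once.
chained = 0 <= middle() <= 10
chained_calls = len(calls)

# Hand-expanded form: the middle term is written, and evaluated, twice.
calls.clear()
expanded = 0 <= middle() and middle() <= 10
expanded_calls = len(calls)

print(chained_calls, expanded_calls)  # 1 2
```

Without chaining, the usual workaround is to store the term in a temporary first.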



    There comes a point when a function called "all_equal", or "rising", is
    the right choice.  How those functions might be implemented is a matter
    of language, and also whether you want them to work with a variable
    parameter list or over arrays, lists, slices, or whatever the language
    supports.  High-level functions like map and reduce could be part of
    this, along with folds.  For example, in C++ 17 (which supports a fold
    syntax), you might write :

        template <typename H1>
        constexpr bool rising(H1)
        {
            return true;
        }

        template <typename H1, typename H2, typename ... T>
        constexpr bool rising(H1 head1, H2 head2, T... tail)
        {
            return head1 <= head2 && rising(head2, tail...);
        }


    Then

        rising(a, b, c, d)

    is interpreted as

        a <= b && rising(b, c, d)

    and so on.


    I don't think it is fair to claim particular ways of writing these
    things are always clearer, or better, or uglier, or unclear - it will
    depend on the rest of the language, and how the code is used.  But in
    general I think it helps to write code that follows the logic of what
    the writer really means, rather than alternative constructions that give
    the same result.

    Using function-like syntax is OK when you have the same operator between
    multiple terms. 'rising' could have '<' or '<='.

    All-equal would have the same operator too, but it looks clunkier, and a
    bit over-the-top:

       a = b                2 terms
       all_equal(a, b, c)   3 terms using your feature
       a = b = c            3 terms using chained ops

       rising(a, b, c)      Using your other feature
       a <= b <= c          Using the same chained-op feature

    Your solution requires a heavyweight language feature. It also looks
    like it will generate a lot of intermediate code that will need a
    heavyweight optimiser to tear down again.

  • From David Brown@21:1/5 to bart on Mon Jun 10 17:56:19 2024
    On 10/06/2024 17:33, bart wrote:
    On 09/06/2024 12:26, David Brown wrote:
    On 07/06/2024 14:20, bart wrote:
    On 07/06/2024 12:17, David Brown wrote:


    If that is your intent, then fair enough.  But I think that is an
    unusual intent.

    Really, checking that A and B both have the same value X is that unusual?

    No. But checking that A and B have the same value, then checking that
    one of them has the same value as a constant X, is - I would say -
    definitely an unusual way to think about things. It is more natural to
    check if A is equal to X, and if B is equal to X.

    You seem to disagree with that. Fair enough, it is a subjective opinion.


    I don't think it is fair to claim particular ways of writing these
    things are always clearer, or better, or uglier, or unclear - it will
    depend on the rest of the language, and how the code is used.  But in
    general I think it helps to write code that follows the logic of what
    the writer really means, rather than alternative constructions that
    give the same result.

    Using function-like syntax is OK when you have the same operator between
    multiple terms. 'rising' could have '<' or '<='.

    Yes. I think that is almost certainly what you would want, except in
    the case of checking if a value is in a half-open interval. I would be
    happy with an "in" operator and ranges of some sort for that case.
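Python happens to offer something close to this already, for integers at least: `range` is half-open and supports `in`. The sketch below rewrites the `valid_char` example from the start of the thread that way; the constant names are invented here, and it assumes `c` is an int.

```python
# Half-open ranges: range(lo, hi) contains lo <= c < hi.
CODE_POINTS = range(0, 0x110000)     # 0 <= c <= 0x10FFFF
SURROGATES = range(0xD800, 0xE000)   # 0xD800 <= c < 0xE000


def valid_char(c):
    """Is integer c the code for a valid Unicode character (no surrogates)?"""
    return c in CODE_POINTS and c not in SURROGATES


print(valid_char(0x41))    # True
print(valid_char(0xD800))  # False
print(valid_char(0xE000))  # True
```

For int operands the membership test on a `range` is a bounds check rather than a scan, so it costs about the same as the chained comparison.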


    All-equal would have the same operator too, but it looks clunkier, and a
    bit over-the-top:

       a = b                2 terms
       all_equal(a, b, c)   3 terms using your feature
       a = b = c            3 terms using chained ops

       rising(a, b, c)      Using your other feature
       a <= b <= c          Using the same chained-op feature


    I guess a lot of this ends up as a matter of taste.

    Your solution requires a heavyweight language feature. It also looks
    like it will generate a lot of intermediate code that will need a
    heavyweight optimiser to tear down again.


    I place approximately zero weight on requirements for code generators to
    be optimising. It is irrelevant to the user - it only matters to the
    implementer of the tools. It does have to be /possible/ to implement
    the feature, but it does not matter if the compiler has to optimise well
    to make it efficient.
