Forum: >>> Magnum BBS <<<

Interval Comparisons

From Lawrence D'Oliveiro@21:1/5 to All on Tue Jun 4 07:14:02 2024

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From David Brown@21:1/5 to Lawrence D'Oliveiro on Tue Jun 4 10:58:53 2024

On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
without breaking existing code? The answer is no, C treats it as the expression "(a <= x) <= b". So you would be changing the meaning of
existing C code. I think it's fair to say there is likely to be very
little existing correct and working C code that relies on the current interpretation of such expressions, but the possibility is enough to
rule out such a change ever happening in C. (And it would also
complicate the grammar a fair bit.)
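
As a minimal sketch of the point above, the two readings can be compared directly in C. The helper names `chained` and `in_interval` are invented for this illustration and are not from the thread.

```c
#include <stdbool.h>

/* What C computes today for "a <= x <= b": the first comparison
   yields 0 or 1, and that int is then compared against b. */
static bool chained(int a, int x, int b)
{
    return (a <= x) <= b;
}

/* The interval test that the Python expression performs. */
static bool in_interval(int a, int x, int b)
{
    return a <= x && x <= b;
}
```

With a = 0, x = 50, b = 10, `chained` is true (because 1 <= 10) while `in_interval` is false, so changing the meaning of the chained form would silently change the result of such an expression.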

<https://c-faq.com/expr/transitivity.html>


From Mikko@21:1/5 to David Brown on Tue Jun 4 12:13:15 2024

On 2024-06-04 08:58:53 +0000, David Brown said:

On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
without breaking existing code? The answer is no, C treats it as the expression "(a <= x) <= b". So you would be changing the meaning of
existing C code. I think it's fair to say there is likely to be very
little existing correct and working C code that relies on the current interpretation of such expressions, but the possibility is enough to
rule out such a change ever happening in C. (And it would also
complicate the grammar a fair bit.)

<https://c-faq.com/expr/transitivity.html>

That does not prevent doing the same with a different syntax.
The main difference is that in the current C syntax that cannot be
said without mentioning c twice. In the example program C would
require that c is mentioned four times but the shown Python code
only needs it mentioned twice. An ideal syntax would only mention
it once, perhaps

return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

or

return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

or something like that, preferably so that no new reserved word is
needed.

--
Mikko


From bart@21:1/5 to Lawrence D'Oliveiro on Tue Jun 4 11:39:41 2024

On 04/06/2024 08:14, Lawrence D'Oliveiro wrote:

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

Yes it would break compatibility. The first '0 <= c' yields a 0 or 1 value.

But Python can also do it as `c in range(0, 0x10FFFF+1)`.

That could conceivably be added; the main obstacle would be introducing
that new `in` keyword, while a better solution than `range` would be likely.

The chances of it actually happening are infinitesimal, and I'd be long
dead before it becomes widely available.

This is the upside of devising your own language; I daily use these forms:

a <= b <= c
b in a .. c

in my systems language. The only stipulation with the first form is that
if there are any angle brackets, then they all point the same way,
otherwise the result is too confusing.

The language also needs to ensure middle terms are evaluated only once.

If I ever want to have the C meaning of 'a <= b <= c' (say I'm porting
some code), then it can be written like this to break it up:

(a <= b) <= c


From David Brown@21:1/5 to Mikko on Tue Jun 4 13:02:03 2024

On 04/06/2024 11:13, Mikko wrote:

On 2024-06-04 08:58:53 +0000, David Brown said:

On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an
interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
without breaking existing code? The answer is no, C treats it as the
expression "(a <= x) <= b". So you would be changing the meaning of
existing C code. I think it's fair to say there is likely to be very
little existing correct and working C code that relies on the current
interpretation of such expressions, but the possibility is enough to
rule out such a change ever happening in C. (And it would also
complicate the grammar a fair bit.)

<https://c-faq.com/expr/transitivity.html>

That does not prevent doing the same with a different syntax.
The main difference is that in the current C syntax that cannot be
said without mentioning c twice. In the example program C would
require that c is mentioned four times but the shown Python code
only needs it mentioned twice. An ideal syntax would only mention
it once, perhaps

return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

or

return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

or something like that, preferably so that no new reserved word is
needed.

Sure, you can always add new things to a language if they would
previously have been syntax errors or constraint errors. But is there a
use for it?

It is fine if you have a language that has good support for lists, sets, ranges, and other higher-level features - then an "in" keyword is a
great idea. But C is not such a language, and that kind of feature
would be well outside the scope of the language.

It would be easy enough to write a macro "in_range(a, x, b)" that would
do the job. It is even easier, and more productive, that you simply
write the "valid_char" function and use it, if that's what you need.
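
For reference, a direct C rendering of the Python "valid_char" from the original post might look like the sketch below. The parameter type `long` is my assumption (it is guaranteed to hold values up to 0x10FFFF); the thread does not specify one.

```c
#include <stdbool.h>

/* Is integer c the code for a valid Unicode character?
   This excludes surrogates (0xD800 up to but not including 0xE000). */
static bool valid_char(long c)
{
    return 0 <= c && c <= 0x10FFFF
        && !(0xD800 <= c && c < 0xE000);
}
```

Note that c is spelled four times here, against twice in the Python version, which is the repetition discussed earlier in the thread.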


From bart@21:1/5 to David Brown on Tue Jun 4 12:23:15 2024

On 04/06/2024 12:02, David Brown wrote:

On 04/06/2024 11:13, Mikko wrote:

On 2024-06-04 08:58:53 +0000, David Brown said:

On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an
interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
without breaking existing code? The answer is no, C treats it as the
expression "(a <= x) <= b". So you would be changing the meaning of
existing C code. I think it's fair to say there is likely to be very
little existing correct and working C code that relies on the current
interpretation of such expressions, but the possibility is enough to
rule out such a change ever happening in C. (And it would also
complicate the grammar a fair bit.)

<https://c-faq.com/expr/transitivity.html>

That does not prevent doing the same with a different syntax.
The main difference is that in the current C syntax that cannot be
said without mentioning c twice. In the example program C would
require that c is mentioned four times but the shown Python code
only needs it mentioned twice. An ideal syntax would only mention
it once, perhaps

return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

or

return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

or something like that, preferably so that no new reserved word is
needed.

Sure, you can always add new things to a language if they would
previously have been syntax errors or constraint errors. But is there a
use for it?

It is fine if you have a language that has good support for lists, sets, ranges, and other higher-level features - then an "in" keyword is a
great idea. But C is not such a language, and that kind of feature
would be well outside the scope of the language.

I disagree. I have a script language where 'in' works with all sorts of
data types, and where ranges like a..b and sets like [a..b, c, d, e] are
actual types.

Yet I also introduced 'in' into my systems language, even though it is
very restricted:

if a in b..c then
if a in [b, c, d] then

This is limited to integer types. The set construct here doesn't allow
ranges (it could have done). Neither the range nor the set is a datatype - it
is just syntax. (I can't do range r := 1..10.)

It is incredibly useful:

if c in [' ', '\t', '\n'] then ... # whitespace
if b in 0..255 then
if b in u8.bounds then # alternative

Not to forget:

if x = y = 0 then # both x and y are zero

It doesn't need the full spec of the higher level language.

It would be easy enough to write a macro "in_range(a, x, b)" that would
do the job. It is even easier, and more productive, that you simply
write the "valid_char" function and use it, if that's what you need.

Yes it would be easier - to provide an ugly, half-assed solution that
everyone will write a different way (I would use (x, a, b) for example),
and which can go wrong as soon as someone writes (a, x(), b).

That's the problem with the macro scheme, it stops the language properly evolving.
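
A minimal sketch of the (a, x(), b) pitfall, in C. The names `IN_RANGE`, `in_range`, `calls` and `next_value` are hypothetical, chosen only for this illustration; the argument order follows the "in_range(a, x, b)" suggestion quoted above.

```c
#include <stdbool.h>

/* Naive macro: x appears twice in the expansion, so an argument
   with side effects may be evaluated twice. */
#define IN_RANGE(a, x, b) ((a) <= (x) && (x) <= (b))

/* Function version: each argument is evaluated once at the call site. */
static inline bool in_range(long a, long x, long b)
{
    return a <= x && x <= b;
}

static int calls;               /* counts how often next_value() runs */

static long next_value(void)
{
    return ++calls;             /* returns 1, 2, 3, ... */
}
```

Starting from calls == 0, `IN_RANGE(1, next_value(), 1)` calls next_value() twice and compares 1 against the lower bound but 2 against the upper bound, so it is false. `in_range(1, next_value(), 1)` from the same starting state calls it once and is true.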


From Blue-Maned_Hawk@21:1/5 to All on Tue Jun 4 11:41:54 2024

Ignoring the concept of backcompat, operators as a concept are bad enough;
i think that we need not worsen the matter with new ternary operators.

--
Blue-Maned_Hawk│shortens to Hawk│/blu.mɛin.dʰak/│he/him/his/himself/Mr. blue-maned_hawk.srht.site
INVINCIBLE MOOSE NEXT FIVE KILOMETERS


From Mikko@21:1/5 to David Brown on Tue Jun 4 16:11:28 2024

On 2024-06-04 11:02:03 +0000, David Brown said:

On 04/06/2024 11:13, Mikko wrote:

On 2024-06-04 08:58:53 +0000, David Brown said:

On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
without breaking existing code? The answer is no, C treats it as the
expression "(a <= x) <= b". So you would be changing the meaning of
existing C code. I think it's fair to say there is likely to be very
little existing correct and working C code that relies on the current
interpretation of such expressions, but the possibility is enough to
rule out such a change ever happening in C. (And it would also
complicate the grammar a fair bit.)

<https://c-faq.com/expr/transitivity.html>

That does not prevent doing the same with a different syntax.
The main difference is that in the current C syntax that cannot be
said without mentioning c twice. In the example program C would
require that c is mentioned four times but the shown Python code
only needs it mentioned twice. An ideal syntax would only mention
it once, perhaps

  return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

or

  return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

or something like that, preferably so that no new reserved word is
needed.

Sure, you can always add new things to a language if they would
previously have been syntax errors or constraint errors. But is there
a use for it?

I don't see any need. That c must be mentioned twice for each interval is
not a problem. If there is a complex expression in place of c it can be computed and stored to a variable before comparison to an interval.

It is fine if you have a language that has good support for lists,
sets, ranges, and other higher-level features - then an "in" keyword is
a great idea. But C is not such a language, and that kind of feature
would be well outside the scope of the language.

Or, if one for some reason does it in C anyway, one should have or make
a library of the essential functions, including membership tests.

It would be easy enough to write a macro "in_range(a, x, b)" that would
do the job. It is even easier, and more productive, that you simply
write the "valid_char" function and use it, if that's what you need.

Indeed.

--
Mikko


From Michael S@21:1/5 to Blue-Maned_Hawk on Tue Jun 4 15:17:30 2024

On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

operators as a concept are bad enough;

Those are fighting words! Care to suggest an alternative?


From Janis Papanagnou@21:1/5 to Mikko on Tue Jun 4 15:42:14 2024

On 04.06.2024 11:13, Mikko wrote:

On 2024-06-04 08:58:53 +0000, David Brown said:

On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an
interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF

While nice to have it's just syntactic sugar.

            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
without breaking existing code? The answer is no, C treats it as the
expression "(a <= x) <= b". So you would be changing the meaning of
existing C code. I think it's fair to say there is likely to be very
little existing correct and working C code that relies on the current
interpretation of such expressions, but the possibility is enough to
rule out such a change ever happening in C. (And it would also
complicate the grammar a fair bit.)

<https://c-faq.com/expr/transitivity.html>

That does not prevent doing the same with a different syntax.
The main difference is that in the current C syntax that cannot be
said without mentioning c twice. In the example program C would
require that c is mentioned four times but the shown Python code
only needs it mentioned twice. An ideal syntax would only mention
it once, perhaps

return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

Introducing a new keyword 'in' would also break a lot of code, even
more code than the syntactic change ( . <= . <= . ) mentioned above
in the OP, don't you think?

or

return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

or something like that, preferably so that no new reserved word is
needed.

Not worth the hassle, IMO.

Janis


From David Brown@21:1/5 to bart on Tue Jun 4 15:24:43 2024

On 04/06/2024 13:23, bart wrote:

On 04/06/2024 12:02, David Brown wrote:

On 04/06/2024 11:13, Mikko wrote:

On 2024-06-04 08:58:53 +0000, David Brown said:

On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:

Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an
interval:

def valid_char(c) :
    "is integer c the code for a valid Unicode character." \
    " This excludes surrogates."
    return \
        (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
        )
#end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
without breaking existing code? The answer is no, C treats it as
the expression "(a <= x) <= b". So you would be changing the
meaning of existing C code. I think it's fair to say there is
likely to be very little existing correct and working C code that
relies on the current interpretation of such expressions, but the
possibility is enough to rule out such a change ever happening in
C. (And it would also complicate the grammar a fair bit.)

<https://c-faq.com/expr/transitivity.html>

That does not prevent doing the same with a different syntax.
The main difference is that in the current C syntax that cannot be
said without mentioning c twice. In the example program C would
require that c is mentioned four times but the shown Python code
only needs it mentioned twice. An ideal syntax would only mention
it once, perhaps

  return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

or

  return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

or something like that, preferably so that no new reserved word is
needed.

Sure, you can always add new things to a language if they would
previously have been syntax errors or constraint errors. But is there
a use for it?

It is fine if you have a language that has good support for lists,
sets, ranges, and other higher-level features - then an "in" keyword
is a great idea. But C is not such a language, and that kind of
feature would be well outside the scope of the language.

I disagree. I have a script language where 'in' works with all sorts of
data types, and where ranges like a..b and sets like [a..b, c, d, e] are actual types.

C is not a script language.

Yet I also introduced 'in' into my systems language, even though it is
very restricted:

    if a in b..c then
    if a in [b, c, d] then

This is limited to integer types. The set construct here doesn't allow
ranges (it could have done). Neither the range nor the set is a datatype - it
is just syntax. (I can't do range r := 1..10.)

Adding such a feature to your own personal language, for your own
personal use, is easy enough (relative to the rest of the work involved
in designing your own personal language and making tools for it, which
is of course no small feat). Adding it to C with its standards,
existing code, toolchains, additional tools, developers, etc., is a
whole different kettle of fish.

I don't think it would be practical to add it to C in a way that is
simple and restricted enough to be suitable for C, while also being
useful enough to make it worth the effort.

Remember, when you add these things to your own language, you have your
own needs in mind and can ignore everything else, all corner cases, and
all complications. Putting a feature in C means making decisions like
figuring out what type the expression "b..c" has, whether the various
bits and pieces have to be constants or if they can be variables, how
the operator precedences work, how to treat floating point numbers or
mixes of different types, and countless other factors. If a language
already has the concepts, rules and grammar for ranges or lists, adding
an "in" operator is natural - if not, then it's a huge amount of extra
junk pulled into the language and syntax for a very minor gain.

I don't disagree that it could be useful, and I'm sure I'd use it if it
existed in C, I just disagree that it makes sense in C.

It is incredibly useful:

   if c in [' ', '\t', '\n'] then ... # whitespace
   if b in 0..255 then
   if b in u8.bounds then             # alternative

Not to forget:

   if x = y = 0 then                  # both x and y are zero

It doesn't need the full spec of the higher level language.

It would be easy enough to write a macro "in_range(a, x, b)" that
would do the job. It is even easier, and more productive, that you
simply write the "valid_char" function and use it, if that's what you
need.

Yes it would be easier - to provide an ugly, half-assed solution that

You and I are British - the term is "half-arsed" :-)

everyone will write a different way (I would use (x, a, b) for example),
and which can go wrong as soon as someone writes (a, x(), b).

That's the problem with the macro scheme, it stops the language properly evolving.

If it were considered useful enough, it could be standardised in the C
library. If it is not useful enough to standardise in the library, it
is certainly not useful enough to put in the language itself.

In practice, while I would put something like this in a new language, I
don't think it is important enough to try to add to C. When you need to
do a lot of checks, you'd put them within a function (or macro if you
prefer), such as "isspace()".


From bart@21:1/5 to David Brown on Tue Jun 4 15:16:57 2024

On 04/06/2024 14:24, David Brown wrote:

On 04/06/2024 13:23, bart wrote:

On 04/06/2024 12:02, David Brown wrote:

It is fine if you have a language that has good support for lists,
sets, ranges, and other higher-level features - then an "in" keyword
is a great idea. But C is not such a language, and that kind of
feature would be well outside the scope of the language.

I disagree. I have a script language where 'in' works with all sorts
of data types, and where ranges like a..b and sets like [a..b, c, d,
e] are actual types.

C is not a script language.

Yet I also introduced 'in' into my systems language, even though it is
very restricted:

if a in b..c then
if a in [b, c, d] then

This is limited to integer types. The set construct here doesn't allow
ranges (it could have done). Neither the range nor the set is a datatype -
it is just syntax. (I can't do range r := 1..10.)

Adding such a feature to your own personal language, for your own
personal use, is easy enough (relative to the rest of the work involved
in designing your own personal language and making tools for it, which
is of course no small feat). Adding it to C with its standards,
existing code, toolchains, additional tools, developers, etc., is a
whole different kettle of fish.

I was responding to your comment:

"and that kind of feature would be well outside the scope of the language."

I think it can suit that level of language if you avoid being too ambitious.

I agree it is not practical to apply to C at this point, not without
making it ugly or unwieldy enough that people might as well use existing solutions.

(Such a feature also aids simpler non-optimising compilers. Take these
examples that all do the same thing:

if a <= f() and f() <= c then fi

if a <= f() <= c then fi

if f() in a..c then fi

If the two f() calls in the first example were considered common
subexpressions, I don't have the means in my compiler to detect
that and evaluate them just once.

In the other two examples, the language lets you express that directly.

Even for a simpler 'b in a..c' example, it is easier to generate more
efficient code, and do that more efficiently too than building something
up only to tear it down again.)

It would be easy enough to write a macro "in_range(a, x, b)" that
would do the job. It is even easier, and more productive, that you
simply write the "valid_char" function and use it, if that's what you
need.

Yes it would be easier - to provide an ugly, half-assed solution that

You and I are British - the term is "half-arsed" :-)

I'm catering for a wider readership.

(Actually I'm not quite considered British enough to be allowed in the
upcoming election.)


From David Brown@21:1/5 to bart on Tue Jun 4 17:40:22 2024

On 04/06/2024 16:16, bart wrote:

On 04/06/2024 14:24, David Brown wrote:

On 04/06/2024 13:23, bart wrote:

On 04/06/2024 12:02, David Brown wrote:

It is fine if you have a language that has good support for lists,
sets, ranges, and other higher-level features - then an "in" keyword
is a great idea. But C is not such a language, and that kind of
feature would be well outside the scope of the language.

I disagree. I have a script language where 'in' works with all sorts
of data types, and where ranges like a..b and sets like [a..b, c, d,
e] are actual types.

C is not a script language.

Yet I also introduced 'in' into my systems language, even though it
is very restricted:

     if a in b..c then
     if a in [b, c, d] then

This is limited to integer types. The set construct here doesn't
allow ranges (it could have done). Neither the range nor the set is a
datatype - it is just syntax. (I can't do range r := 1..10.)

Adding such a feature to your own personal language, for your own
personal use, is easy enough (relative to the rest of the work
involved in designing your own personal language and making tools for
it, which is of course no small feat). Adding it to C with its
standards, existing code, toolchains, additional tools, developers,
etc., is a whole different kettle of fish.

I was responding to your comment:

"and that kind of feature would be well outside the scope of the language."

I think it can suit that level of language if you avoid being too
ambitious.

It might be that we would agree on that if we worked hard enough to find
a common definition for "that level of language". But I think that
would be a lot of time and effort for little purpose. I do agree that
with enough limitation in the scope of the feature, it is less
unreasonable for a low-level language. But I think I would want to
limit the scope until there is little point in the "in" operator - or I
would want to go the other direction and define something like Pascal's
sets with many more operators and uses.

I agree it is not practical to apply to C at this point, not without
making it ugly or unwieldy enough that people might as well use existing solutions.

Yes.

(Such a feature also aids simpler non-optimising compilers. Take these examples that all do the same thing:

    if a <= f() and f() <= c then fi

    if a <= f() <= c then fi

    if f() in a..c then fi

If the two f() calls in the first example were considered common
subexpressions, I don't have the means in my compiler to detect
that and evaluate them just once.

I see your point, but I rate the design and use of a language as /much/
more important than the ease of implementation. I realise the balance
is a bit different when the user is the implementer.

In the other two examples, the language lets you express that directly.

Even for a simpler 'b in a..c' example, it is easier to generate more efficient code, and do that more efficiently too than building something
up only to tear it down again.)

It would be easy enough to write a macro "in_range(a, x, b)" that
would do the job. It is even easier, and more productive, that you
simply write the "valid_char" function and use it, if that's what
you need.

Yes it would be easier - to provide an ugly, half-assed solution that

You and I are British - the term is "half-arsed" :-)

I'm catering for a wider readership.

We can educate them!

(Actually I'm not quite considered British enough to be allowed in the upcoming election.)

I can't vote either, but that's because I don't live in the UK. And
given the state of UK politics these days, I'm happy to be out of it.
For quite a while, the Scottish Parliament were looking like the adults
in the room, but they've managed to mess things up for themselves too.


From bart@21:1/5 to Scott Lurndal on Tue Jun 4 16:58:43 2024

On 04/06/2024 16:27, Scott Lurndal wrote:

David Brown writes:

On 04/06/2024 13:23, bart wrote:

It is incredibly useful:

if c in [' ', '\t', '\n'] then ... # whitespace

if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

That doesn't do the same thing. In my example, c is a character, not a
string.

To achieve the same thing using strpbrk requires code like this:

char c[2];

c[0]=rand()&255; // Create a string
c[1]=0;

if (strpbrk(c, " \t\n") != NULL) puts("whitespace");

If I compile this with gcc -O3, then the checking part is this:

lea rcx, 46[rsp]
mov BYTE PTR 47[rsp], 0
lea rdx, .LC0[rip]
mov BYTE PTR 46[rsp], al
call strpbrk // CALL TO LIBRARY FUNCTION
test rax, rax
je .L2
lea rcx, .LC1[rip]
call puts

I don't know what it gets up to inside strpbrk. If I write this in my
language:

if c in [9,10,32] then
puts("whitespace")
fi

The generated code is this (using alternate register names, D0 = rax):

mov D0, D3 # (could have tested D3 (= c) directly.)
cmp D0, 9
jz L4
cmp D0, 10
jz L4
cmp D0, 32
jnz L3
L4:
lea D10, [L5]
call puts*
L3:

Anyway, the construct is not limited to character codes that can be
contained within a string. It works for 64-bit values which can include
0. And it could be extended to other scalar types like floats and pointers.


From Scott Lurndal@21:1/5 to David Brown on Tue Jun 4 15:27:31 2024

David Brown writes:

On 04/06/2024 13:23, bart wrote:

It is incredibly useful:

if c in [' ', '\t', '\n'] then ... # whitespace

if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

If it were considered useful enough, it could be standardised in the C
library. If it is not useful enough to standardise in the library, it
is certainly not useful enough to put in the language itself.

indeed.


From bart@21:1/5 to Michael S on Tue Jun 4 17:54:33 2024

On 04/06/2024 17:25, Michael S wrote:

On Tue, 4 Jun 2024 16:58:43 +0100, bart wrote:

On 04/06/2024 16:27, Scott Lurndal wrote:

David Brown writes:

On 04/06/2024 13:23, bart wrote:

It is incredibly useful:

if c in [' ', '\t', '\n'] then ... # whitespace

if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

That doesn't do the same thing. In my example, c is a character, not
a string.

Will that be better?
if (memchr(" \t\n", c, 3) != NULL)

It's a better match. But on gcc-O3, it still calls the library function
up to version 12.x. After that, it's smart enough to generate similar
code to my non-optimising compiler using the built-in feature.

It is also still limited to byte values.
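
To make the byte-value limitation concrete, here is a small C sketch comparing the memchr form with plain comparisons. The function names `is_ws_memchr` and `is_ws_compare` are made up for this example.

```c
#include <stdbool.h>
#include <string.h>

/* Library version: memchr converts c to unsigned char before
   searching, so only the low byte of c takes part in the test. */
static bool is_ws_memchr(int c)
{
    return memchr(" \t\n", c, 3) != NULL;
}

/* Plain comparisons: the full int value is compared. */
static bool is_ws_compare(int c)
{
    return c == ' ' || c == '\t' || c == '\n';
}
```

Both agree for ordinary character codes. On a typical platform with 8-bit bytes and ASCII, a wider value such as 0x120 is reduced to 0x20 (space) by memchr and so is reported as whitespace, while the comparison version rejects it.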

My approach is to fix the language, which is easier than expending
orders of magnitude more effort on elaborate tools and ultra-smart compilers.


From Michael S@21:1/5 to bart on Tue Jun 4 19:25:47 2024

On Tue, 4 Jun 2024 16:58:43 +0100, bart wrote:

On 04/06/2024 16:27, Scott Lurndal wrote:

David Brown writes:

On 04/06/2024 13:23, bart wrote:

It is incredibly useful:

if c in [' ', '\t', '\n'] then ... # whitespace

if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

That doesn't do the same thing. In my example, c is a character, not
a string.

Will that be better?
if (memchr(" \t\n", c, 3) != NULL)


From Lawrence D'Oliveiro@21:1/5 to All on Tue Jun 4 23:12:07 2024

On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

i think that we need not worsen the matter with new ternary operators.

These are not ternary operators.


From bart@21:1/5 to Lawrence D'Oliveiro on Wed Jun 5 00:22:36 2024

On 05/06/2024 00:12, Lawrence D'Oliveiro wrote:

On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

i think that we need not worsen the matter with new ternary operators.

These are not ternary operators.

So what are they?

I've implemented them several times, and found they really need to be
treated as a special kind of n-ary opterator.

An AST node for A+B, say, would have a single operator "+" and two operands.

One for "A<=B<=C" would have two operators "<=" and "<=", and three operands.
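
Python's own parser takes the same view, for what it is worth. A minimal sketch, assuming CPython's standard `ast` module (the helper name `describe_compare` is made up for illustration): a chained comparison parses to a single `Compare` node holding a list of operators and a list of operands, rather than nested binary nodes.

```python
import ast

def describe_compare(source):
    """Return (operator names, operand count) for a comparison expression."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Compare):
        raise ValueError("not a comparison expression")
    ops = [type(op).__name__ for op in node.ops]
    # The operands are node.left plus every entry in node.comparators.
    return ops, 1 + len(node.comparators)

# Chained form: one node, two operators, three operands.
chained = describe_compare("a <= b <= c")
# Parenthesised form: the outer node has one operator and two operands.
nested = describe_compare("(a <= b) <= c")
```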

From Lawrence D'Oliveiro@21:1/5 to bart on Wed Jun 5 01:30:57 2024

On Wed, 5 Jun 2024 00:22:36 +0100, bart wrote:

On 05/06/2024 00:12, Lawrence D'Oliveiro wrote:

On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

i think that we need not worsen the matter with new ternary operators.

These are not ternary operators.

So what are they?

A special case in the syntax rules for the comparison operators <https://docs.python.org/3/reference/expressions.html#comparisons>.

I've implemented them several times, and found they really need to be
treated as a special kind of n-ary operator.

Remember, Python allows users to define custom overloads for the standard operators. For comparisons, these functions always take two operands, and
the compiler takes care of invoking them correctly to handle interval comparisons.
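
A sketch of that behaviour in Python. The class `Logged` and the function `middle` are hypothetical, written only to record what gets called: each link of the chain is an ordinary two-operand call, and the middle operand is evaluated once.

```python
class Logged:
    """Wraps a number and records every two-operand comparison made on it."""
    calls = []  # shared log, for illustration only

    def __init__(self, name, value):
        self.name, self.value = name, value

    def __le__(self, other):
        Logged.calls.append(f"{self.name} <= {other.name}")
        return self.value <= other.value

evaluations = []

def middle():
    """Stand-in for an arbitrary middle operand with a visible side effect."""
    evaluations.append("b")
    return Logged("b", 5)

# One chain, two calls to __le__, one evaluation of middle().
result = Logged("a", 1) <= middle() <= Logged("c", 9)
```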

From Lawrence D'Oliveiro@21:1/5 to bart on Wed Jun 5 03:29:18 2024

On Tue, 4 Jun 2024 12:23:15 +0100, bart wrote:

That's the problem with the macro scheme, it stops the language properly evolving.

The problem is the way C does macros. Other languages with powerful macros (*cough* Lisp *cough*) aren’t stopped from evolving; quite the opposite.

From bart@21:1/5 to Lawrence D'Oliveiro on Thu Jun 6 19:48:42 2024

On 05/06/2024 02:30, Lawrence D'Oliveiro wrote:

On Wed, 5 Jun 2024 00:22:36 +0100, bart wrote:

On 05/06/2024 00:12, Lawrence D'Oliveiro wrote:

On Tue, 4 Jun 2024 11:41:54 -0000 (UTC), Blue-Maned_Hawk wrote:

i think that we need not worsen the matter with new ternary operators.

These are not ternary operators.

So what are they?

A special case in the syntax rules for the comparison operators <https://docs.python.org/3/reference/expressions.html#comparisons>.

I've implemented them several times, and found they really need to be
treated as a special kind of n-ary operator.

Remember, Python allows users to define custom overloads for the standard operators. For comparisons, these functions always take two operands, and
the compiler takes care of invoking them correctly to handle interval comparisons.

Well, for these 3 lines in my scripting language:

if a = b then end # universal
if a = b < c then end # chained (like Python, unlike C)
if (a = b) < c then end # emulate C behaviour

These are the ASTs produced (2: is the empty True branch; 3: would be
for the 'else' branch, not present here):

- 1 if:
- - 1 eq:
- - - 1 name: a
- - - 2 name: b
- - 2 block:

- 1 if:
- - 1 cmpchain: eq lt
- - - 1 name: a
- - - 1 name: b
- - - 1 name: c
- - 2 block:

- 1 if:
- - 1 lt:
- - - 1 eq:
- - - - 1 name: a
- - - - 2 name: b
- - - 2 name: c
- - 2 block:

Notice the middle one is one linear group with N operands and N-1
comparisons.

No operator overloads are allowed, but if they were, it would still
work, but a comparison operator would be required to return True or
False from its two operands. It would be unwise for it to return a
string for example.
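
Python shows what happens when that requirement is not enforced. A sketch with a deliberately badly behaved class (`Loose` is a made-up name): since `a < b < c` is evaluated like `(a < b) and (b < c)`, the chain yields whatever value `and` would yield, which need not be a bool.

```python
class Loose:
    """Comparison returns a string instead of a bool (deliberately unwise)."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return "yes" if self.value < other.value else ""

a, b, c = Loose(1), Loose(2), Loose(3)

# Both links are truthy, so the chain yields the value of the last link.
ascending = a < b < c
# The first link is falsy, so the chain stops there and yields that value.
descending = c < b < a
```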

From Lawrence D'Oliveiro@21:1/5 to bart on Thu Jun 6 22:54:38 2024

On Thu, 6 Jun 2024 19:48:42 +0100, bart wrote:

if a = b then end # universal
if a = b < c then end

Really?? You allow that?

From bart@21:1/5 to Lawrence D'Oliveiro on Fri Jun 7 01:52:34 2024

On 06/06/2024 23:54, Lawrence D'Oliveiro wrote:

On Thu, 6 Jun 2024 19:48:42 +0100, bart wrote:

if a = b then end # universal
if a = b < c then end

Really?? You allow that?

Which bit: The 'a=b' (that's equality); the 'a=b<c' (just like Python);
or 'then end' (just an empty block)?

From Lawrence D'Oliveiro@21:1/5 to bart on Fri Jun 7 02:17:22 2024

On Fri, 7 Jun 2024 01:52:34 +0100, bart wrote:

The 'a=b' (that's equality)

Not in C it isn’t.

From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Fri Jun 7 04:25:17 2024

On Thu, 06 Jun 2024 20:53:41 -0700, Keith Thompson wrote:

Lawrence D'Oliveiro <[email protected]d> writes:

On Fri, 7 Jun 2024 01:52:34 +0100, bart wrote:

The 'a=b' (that's equality)

Not in C it isn’t.

Of course not. You snipped the part where Bart very clearly said that
he was talking about his own scripting language.

Oh, sorry.

From David Brown@21:1/5 to Keith Thompson on Fri Jun 7 11:22:12 2024

On 07/06/2024 05:53, Keith Thompson wrote:

Lawrence D'Oliveiro <[email protected]d> writes:

On Fri, 7 Jun 2024 01:52:34 +0100, bart wrote:

The 'a=b' (that's equality)

Not in C it isn’t.

Of course not. You snipped the part where Bart very clearly said that
he was talking about his own scripting language. And we're talking
about a proposed new C feature, so I have no problem with references to
other languages.

And there's precedent in other languages (Python, as Bart already
pointed out) for `a == b < c` being equivalent to `a == b && b < c`,
but with b evaluated only once.

*If* C were to adopt chained comparisons, I would have no problem
with `a == b < c` being supported with the obvious meaning.
I dislike arbitrary restrictions. (Though the fact that == and
< have different precedences would have to be resolved somehow.)
In principle it could quietly change the behavior of existing code,
but it's likely that most such code was already wrong. I don't
advocate making such a change, and I don't think it's likely to
happen, but I wouldn't object to it.

While it is true that such an addition to C would be very unlikely to
break existing code (any current code that uses "a == b < c" or "a < b <
c" is probably incorrect), there is a potentially serious consequence
that has not been considered here.

Suppose C26 allows "a < b < c" and "a == b < c" chains, like Python, and
some people start using it in real code. You are going to get two
effects. One is that some people will read that new code but not know
the new interpretation. They will think the code parses as "a == (b <
c)", and is likely a mistake, or does something different from what it
now actually does.

The other is that some people will get used to it and think this is how
C treats chained operators. The code or similar expressions will end up
in C code that is compiled under different standards. Old C standards
are used all the time - there are still some people who seem to think
new coding in C89/C90 is a /feature/, rather than historical
re-enactment. You would get code that is tested and correct as C26 being compiled, with a different meaning, as C23 or older.
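
To make the hazard concrete, here is a small Python model of the two readings. This is a sketch: `c_style` only imitates C's rule that a comparison yields 0 or 1, and both function names are invented for illustration.

```python
def chained(a, b, c):
    """Python-style reading: a < b and b < c."""
    return a < b < c

def c_style(a, b, c):
    """C's current reading: (a < b) < c, where (a < b) is 0 or 1."""
    return int(a < b) < c
```

The two agree on `(1, 2, 3)` but disagree on `(5, 3, 2)` and on `(-3, -2, -1)`, so the same source text would pass tests under one standard and misbehave under the other.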

There is also the C++ compatibility question. C++ provides flexible
operator overloading combined with a poverty of available operators, so
the relational operators <, >, <= and >= are sometimes used for
completely different purposes, similar to the >> and << operators.
Chaining relational operators would complicate this significantly, I
think. If C++ adopted the feature it would be a mess to support
overloading, and if they did not, "a < b < c" in C and C++ would be
valid but with completely different semantics. Neither option is good.

To me, this possibility, along with the confusion it would cause,
totally outweighs any potential convenience of chained comparisons. I
have never found them helpful in Python coding, and I can't imagine them
being of any interest in my C code.

Even in a new language, I would not want to see chained relational
operators unless you had a strict requirement that relational operators evaluate to a boolean type with no support for relational operators
between booleans, and no support for comparison or other operators
between booleans and other types. And even then, what is "a == b == c" supposed to mean, or "a != b != c" ?

From Lawrence D'Oliveiro@21:1/5 to David Brown on Fri Jun 7 10:42:03 2024

On Fri, 7 Jun 2024 11:22:12 +0200, David Brown wrote:

C++ provides flexible
operator overloading combined with a poverty of available operators, so
the relational operators <, >, <= and >= are sometimes used for
completely different purposes, similar to the >> and << operators.
Chaining relational operators would complicate this significantly, I
think.

Python offers just as flexible operator overloading, and it can handle
this feature without any trouble.

Of course, Python has only a fraction of the complexity of C++. It could
be that an unexpected interaction with some other weird C++ feature could
cause problems.

From Lawrence D'Oliveiro@21:1/5 to bart on Fri Jun 7 10:45:05 2024

On Fri, 7 Jun 2024 11:28:23 +0100, bart wrote:

However, 'a > b <= c' is not clear.

One thing it would have been handy to have is some way of saying

b < a or b > c

in a chained comparison somehow. In other words, “not (a <= b <= c)” without the negation and need for the parentheses (and correspondingly for the
obvious usages of “<=” and “>=”).
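
Spelled out as a sketch in Python (function names are hypothetical): by De Morgan, the negation of `a <= b <= c` is `b < a or b > c`, and the negation of `a < b < c` is `b <= a or b >= c`. The equivalence assumes totally ordered values such as integers; it does not hold for a floating-point NaN.

```python
def out_of_range(a, b, c):
    """Negated chain: b lies outside the closed interval [a, c]."""
    return not (a <= b <= c)

def out_of_range_spelled_out(a, b, c):
    """Same test without the negation or the parentheses."""
    return b < a or b > c

def not_strictly_inside(a, b, c):
    """Negated chain: b is not strictly between a and c."""
    return not (a < b < c)

def not_strictly_inside_spelled_out(a, b, c):
    """Same test without the negation or the parentheses."""
    return b <= a or b >= c
```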

From bart@21:1/5 to David Brown on Fri Jun 7 11:28:23 2024

On 07/06/2024 10:22, David Brown wrote:

To me, this possibility, along with the confusion it would cause,
totally outweighs any potential convenience of chained comparisons.

Even in a new language, I would not want to see chained relational
operators unless you had a strict requirement that relational operators evaluate to a boolean type with no support for relational operators
between booleans, and no support for comparison or other operators
between booleans and other types.

And even then, what is "a == b == c"
supposed to mean, or "a != b != c" ?

Some combinations are confusing, and in my languages I would suggest
they are avoided (but stop short of banning them).

Ones like 'a == b == c' are straightforward: you're testing that all a,
b, c have the same value.

With relational ops like 'a < b <= c', that means:

a < b' && b' <= c

with b' indicating that the same single evaluation of b (which can be an
arbitrary term) is used in both places.

However, 'a > b <= c' is not clear. While the above indicate a
relationship between a and c when the whole expression is True, here you
can't deduce any such relationship; all of these could be True:

a > c, a == c, a < c

And there was one more I'd forgotten about:

a != b != c

This looks very much like the exact opposite of ' a == b == c', but it
isn't! (I think it would need a != b != c != a).
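
A quick Python check of the three candidate meanings (a sketch; the function names are invented for illustration):

```python
def chain_ne(a, b, c):
    """Only adjacent pairs are compared: a != b and b != c."""
    return a != b != c

def all_distinct(a, b, c):
    """Closing the loop also compares c with a."""
    return a != b != c != a

def not_all_equal(a, b, c):
    """The actual negation of a == b == c."""
    return not (a == b == c)
```

With `(1, 2, 1)` the plain chain is true even though a equals c, and with `(1, 1, 2)` the plain chain is false even though the three values are not all equal.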

So the restrictions I would suggest are:

* Not mixing <, <= with >, >= in the same chain (any angle brackets
should point the same way)

* Not allowing != in the chain.

Such a chain also requires that all 6 (or 5) operators have the same precedence, as you can't have 'a = b <= c' mean 'a = (b <= c)'.

I have never found them helpful in Python coding, and I can't imagine them
being of any interest in my C code.

The most common uses for me are comparing that N terms are equal (here =
means equality):

if x.tag = y.tag = ti64
if a = b = 0

I also used them for range-checking:

if a <= b <= c

until I introduced 'if b in a..c' for that purpose. (However I would
still need 'a <= b <= c' for floating point.)

What I might then suggest for C, as a first step, is to allow only
chained '==' comparisons. That avoids those ambiguous cases, and also
the problem with mixed precedence, while still providing a handy new
feature.

From David Brown@21:1/5 to Keith Thompson on Fri Jun 7 13:04:34 2024

On 07/06/2024 11:55, Keith Thompson wrote:

David Brown <[email protected]> writes:

On 07/06/2024 05:53, Keith Thompson wrote:

[...]

There is also the C++ compatibility question. C++ provides flexible
operator overloading combined with a poverty of available operators,
so the relational operators <, >, <= and >= are sometimes used for
completely different purposes, similar to the >> and <<
operators. Chaining relational operators would complicate this
significantly, I think. If C++ adopted the feature it would be a mess
to support overloading, and if they did not, "a < b < c" in C and C++
would be valid but with completely different semantics. Neither
option is good.

I mentioned earlier that someone did a study of open source C++ code and found no occurrences of things like "a < b < c", except for 4 cases that
were intended to be chained but would behave incorrectly. I presume
that this study would have covered overloaded operators.

You helpfully quoted from that study, and it included :

- Many instances of using successive comparison operators in DSLs that
overloaded these operators to give meaning unrelated to comparisons.

So yes, it seems that overloading the relational operators for other
purposes and then chaining them is a real thing.

To me, this possibility, along with the confusion it would cause,
totally outweighs any potential convenience of chained comparisons. I
have never found them helpful in Python coding, and I can't imagine
them being of any interest in my C code.

I agree. I wouldn't mind being able to use the feature, and I think
I've actually used it in Python, but its lack isn't a big problem.

Even in a new language, I would not want to see chained relational
operators unless you had a strict requirement that relational
operators evaluate to a boolean type with no support for relational
operators between booleans, and no support for comparison or other
operators between booleans and other types.

In Python, all comparison operators (<, >, ==, >=, <=, !=, is, is not,
in, not in) have the same precedence, and chained operations are
specified straightforwardly. They evaluate to a bool result. Boolean
values can be compared (False < True), which doesn't seem to cause any problems.

https://docs.python.org/3/reference/expressions.html#comparisons

And even then, what is "a
== b == c" supposed to mean, or "a != b != c" ?

"a == b && b == c", and "a != b && b != c", respectively, except that b
is only evaluated once.

If "c" is a boolean, some might think the "natural" interpretation of "a
== b == c" is "(a == b) == c" - it is the current semantics in C. Some
people may think that "a != b != c" should be interpreted as "(a != b) &
(b != c) & (a != c)".
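
The two readings really do diverge once a boolean is involved. A small Python model (hypothetical function names; `c_style_eq` imitates the current C grouping):

```python
def chained_eq(a, b, c):
    """Chained reading: a == b and b == c."""
    return a == b == c

def c_style_eq(a, b, c):
    """Grouped reading: compare the truth value of (a == b) with c."""
    return (a == b) == c
```

For `(2, 2, True)` the chained reading is false (2 is not equal to True) while the grouped reading is true; for `(0, 0, False)` it is the other way round.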

It's one thing to make a rigid definition of the meaning in a language,
picking a consistent set of rules of precedence and syntax. It is
another thing to make sure it matches up with the interpretations people
have from normal mathematics, natural language, and other programming languages. When there is a mismatch, you need good reasons to accept
the syntax as a good language design idea - the more likely the misunderstanding, the stronger reasons you need.

To me, the potential for misunderstanding when != is included in chains is far
too high in comparison to the meagre benefits. The use of == could be
clear in some situations (combined with strong type checking to help
catch mistakes) but not others. I could see a chain of a mix of < and
<= making sense, or of > and >=, and occasionally being useful. I don't
think there is a point in allowing more than that.

After all, if all you need is to avoid evaluating "b" more than once,
you can just do:

auto const b_ = b;

From Michael S@21:1/5 to Lawrence D'Oliveiro on Fri Jun 7 14:51:09 2024

On Fri, 7 Jun 2024 10:45:05 -0000 (UTC)
Lawrence D'Oliveiro <[email protected]d> wrote:

On Fri, 7 Jun 2024 11:28:23 +0100, bart wrote:

However, 'a > b <= c' is not clear.

One thing it would have been handy to have is some way of saying

b < a or b > c

in a chained comparison somehow. In other words, “not (a <= b <= c)” without the negation and need for the parentheses (and
correspondingly for the obvious usages of “<=” and “>=”).

"Out of range" expressed as negation of "in range" looks to me the least confusing.

From David Brown@21:1/5 to bart on Fri Jun 7 13:17:50 2024

On 07/06/2024 12:28, bart wrote:

On 07/06/2024 10:22, David Brown wrote:

So the restrictions I would suggest are:

* Not mixing <, <= with >, >= in the same chain (any angle brackets
should point the same way)

* Not allowing != in the chain.

I think these are good ideas.

Such a chain also requires that all 6 (or 5) operators have the same precedence, as you can't have 'a = b <= c' mean 'a = (b <= c)'.

I have never found them helpful in Python coding, and I can't imagine them
being of any interest in my C code.

The most common uses for me are comparing that N terms are equal (here = means equality):

if x.tag = y.tag = ti64
if a = b = 0

These do not correspond to what you want to say.

If someone has balloons, and you want to check that they have 2 red
balloons and two yellow balloons, you do start off by checking if the
number of red balloons is the same as the number of yellow balloons, and
then that the number of yellow balloons is 2.

Code syntax should, where practical, reflect its purpose and intent.
You should therefore write (adjusting to C, Python, or your own syntax
as desired) :

if red_balloons == 2 and yellow_balloons == 2 ...

If you don't want to write the "magic number" 2 twice, you give it a name :

expected_balloons = 2
if red_balloons == expected_balloons
and yellow_balloons == expected_balloons...

I also used them for range-checking:

if a <= b <= c

until I introduced 'if b in a..c' for that purpose. (However I would
still need 'a <= b <= c' for floating point.)

Using "in" and a range or interval syntax would usually be clearer, and
closer to the intended meaning, IMHO.

Both of these chained shortcuts are used in mathematics, so they are not unfamiliar. But in writing mathematics, compared to programming, you
have a lot more freedom to expect the reader to interpret things
sensibly, and vastly more freedom in layout while also having an
incentive to keep things compact. Good or common mathematical syntax
does not necessarily translate directly to good programming syntax.

What I might then suggest for C, as a first step, is to allow only
chained '==' comparisons. That avoids those ambiguous cases, and also
the problem with mixed precedence, while still providing a handy new
feature.

From bart@21:1/5 to David Brown on Fri Jun 7 13:20:33 2024

On 07/06/2024 12:17, David Brown wrote:

On 07/06/2024 12:28, bart wrote:

On 07/06/2024 10:22, David Brown wrote:

I have never found them helpful in Python coding, and I can't imagine them
being of any interest in my C code.

The most common uses for me are comparing that N terms are equal (here
= means equality):

   if x.tag = y.tag = ti64
   if a = b = 0

These do not correspond to what you want to say.

If someone has balloons, and you want to check that they have 2 red
balloons and two yellow balloons, you do start off by checking if the
number of red balloons is the same as the number of yellow balloons, and
then that the number of yellow balloons is 2.

("you don't"?)

That's not quite the intent of my examples, which is:

(1) That x.tag/y.tag or a/b are equal to each other

(2) That they also have a particular value

Code syntax should, where practical, reflect its purpose and intent. You should therefore write (adjusting to C, Python, or your own syntax as desired) :

    if red_balloons == 2 and yellow_balloons == 2 ...

Here that connection is lost. You might infer it, but you don't know
whether, at some point, the program could be changed to require 3 red
balloons, while still needing 2 yellow ones.

Or someone could just write 3 by mistake. By repeating such terms, there
is more opportunity for mistakes, and the connection between terms is
looser.

Here is another real example, first written in static code (not in C;
I've shortened 'sample' to avoid line-wrap):

if hdr.hsamp[2] = hdr.vsamp[2] = hdr.hsamp[3] = hdr.vsamp[3] and
(hdr.hsamp[1] <= 2 and hdr.vsamp[1] <= 2) then

pimage := loadcolour(fs, hdr.hsamp[1], hdr.vsamp[1])

and here in dynamic code:

(vsample1, vsample2, vsample3) := hdr.vsample
(hsample1, hsample2, hsample3) := hdr.hsample
...
if hsample2 = vsample2 = hsample3 = vsample3 and
(hsample1 <= 2 and vsample1 <=2 ) then
....

Both check that 4 terms are identical, but there is no specific value
they have to be.

(It would have been nice to avoid that repetition of '2' in that second
test, but that's more difficult to achieve. There, hsample1/vsample1
don't need to have the same value.

It's expressible as something like 'reduce(and, map(<=, (a, b), 2))' but
that's using a sledgehammer just to avoid that duplicate '2'. It's not
suited to lower-level code either.)
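
In Python, at least, the sledgehammer is fairly small: the built-in `all` over a generator expression plays the role of the reduce/map combination. A sketch, with `all_at_most` as a hypothetical helper name:

```python
def all_at_most(limit, *values):
    """True when every value is <= limit; the limit is written only once."""
    return all(v <= limit for v in values)
```

The second test above would then read `all_at_most(2, hsample1, vsample1)`, which still does not require the two values to be equal to each other.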

If you don't want to write the "magic number" 2 twice, you give it a name :

    expected_balloons = 2
    if red_balloons == expected_balloons
        and yellow_balloons == expected_balloons...

I also used them for range-checking:

   if a <= b <= c

until I introduced 'if b in a..c' for that purpose. (However I would
still need 'a <= b <= c' for floating point.)

Using "in" and a range or interval syntax would usually be clearer, and closer to the intended meaning, IMHO.

Both of these chained shortcuts are used in mathematics, so they are not unfamiliar. But in writing mathematics, compared to programming, you
have a lot more freedom to expect the reader to interpret things
sensibly, and vastly more freedom in layout while also having an
incentive to keep things compact. Good or common mathematical syntax
does not necessarily translate directly to good programming syntax.

What I might then suggest for C, as a first step, is to allow only
chained '==' comparisons. That avoids those ambiguous cases, and also
the problem with mixed precedence, while still providing a handy new
feature.

From Scott Lurndal@21:1/5 to David Brown on Fri Jun 7 14:00:30 2024

David Brown <[email protected]> writes:

On 07/06/2024 12:28, bart wrote:

On 07/06/2024 10:22, David Brown wrote:

So the restrictions I would suggest are:

* Not mixing <, <= with >, >= in the same chain (any angle brackets
should point the same way)

* Not allowing != in the chain.

I think these are good ideas.

The most common idiom that I would find useful for chained
comparisons is range checking:

if ((address >= base) && (address < limit)) ...

I'd love a new 'in' operator:

if (address in [base, limit)) ...

Otherwise I agree with you and Keith that a general chained
comparison syntax will be problematic.
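
For reference, Python's chained comparison already expresses the half-open interval directly. A sketch (the function name is made up; for integers, `address in range(base, limit)` is an equivalent spelling):

```python
def in_half_open(address, base, limit):
    """True when base <= address < limit, i.e. address is in [base, limit)."""
    return base <= address < limit
```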

From David Brown@21:1/5 to Keith Thompson on Sat Jun 8 17:42:19 2024

On 07/06/2024 20:57, Keith Thompson wrote:

David Brown <[email protected]> writes:

On 07/06/2024 11:55, Keith Thompson wrote:

David Brown <[email protected]> writes:

On 07/06/2024 05:53, Keith Thompson wrote:

[...]

If "c" is a boolean, some might think the "natural" interpretation of
"a == b == c" is "(a == b) == c" - it is the current semantics in C.
Some people may think that "a != b != c" should be interpreted as "(a
!= b) & (b != c) & (a != c)".

Yes, some people might be wrong.

While you should expect people reading code - and certainly those
writing it - to be fairly familiar with the programming language in
question, not everyone is an expert. So when designing a language or considering its features, you have to look at what a significant
fraction of users might get wrong. I don't know how often people might
get these cases wrong, but I think it has enough potential that I'd be
very wary of allowing it - at least as an addition to C.

It's one thing to make a rigid definition of the meaning in a
language, picking a consistent set of rules of precedence and syntax.
It is another thing to make sure it matches up with the
interpretations people have from normal mathematics, natural language,
and other programming languages. When there is a mismatch, you need
good reasons to accept the syntax as a good language design idea - the
more likely the misunderstanding, the stronger reasons you need.

To me, the potential for misunderstanding when != is included in chains is
far too high in comparison to the meagre benefits. The use of ==
could be clear in some situations (combined with strong type checking
to help catch mistakes) but not others. I could see a chain of a mix
of < and <= making sense, or of > and >=, and occasionally being
useful. I don't think there is a point in allowing more than that.

After all, if all you need is to avoid evaluating "b" more than once,
you can just do:

auto const b_ = b;

There are two separate issues here.

One is adding chained comparisons to C. We both agree that this is impractical because it would silently change the meaning of valid code.

Yes.

(Changing the meaning of old code isn't likely to be much of an issue,
but any new code using the feature would quietly change behavior when compiled under older C standards or when ported to C++.)

The other (arguably off-topic) is providing chained comparisons in other languages.

I agree that this is a different matter, and for languages that don't
have the level of established history and practice that C does, it could
be a lot less problematic to add chained relational operators.

Python does this well, in my opinion. All comparison
operators have the same precedence, and the semantics of chained
comparisons are defined straightforwardly. There are no arbitrary restrictions, so you can write things that some people might find ugly
or confusing (if you have a language that bans ugly code, I'd like to
see it). The meaning of `a <= b > c` or `a != b == c` is perfectly clear once you understand the rules, and it doesn't change if any of the
operands are of type bool. `a != b != c` *doesn't* mean
`a != b and a != c and b != c`. (If you want to test whether all three
are unequal to each other, you can write `a != b != c != a`, though that evaluates `a` twice.)

Fair enough.

I /do/ find the chained relational operators (especially mixes of
operators) ugly - or at least, I have not had occasion to write Python
code myself where I thought a chain would look clearer than alternatives.

But "ugly" is obviously highly subjective. The only way to ban ugly
code is to do as Bart has done - develop your own language and tools for
your own personal use, and never have to consider anyone else's
opinions or preferences!

From David Brown@21:1/5 to bart on Sun Jun 9 13:26:03 2024

On 07/06/2024 14:20, bart wrote:

On 07/06/2024 12:17, David Brown wrote:

On 07/06/2024 12:28, bart wrote:

On 07/06/2024 10:22, David Brown wrote:

I have never found them helpful in Python coding, and I can't imagine them
being of any interest in my C code.

The most common uses for me are comparing that N terms are equal
(here = means equality):

   if x.tag = y.tag = ti64
   if a = b = 0

These do not correspond to what you want to say.

If someone has balloons, and you want to check that they have 2 red
balloons and two yellow balloons, you do start off by checking if the
number of red balloons is the same as the number of yellow balloons,
and then that the number of yellow balloons is 2.

("you don't"?)

Yes, sorry about that!

That's not quite the intent of my examples, which is:

(1) That x.tag/y.tag or a/b are equal to each other

(2) That they also have a particular value

If that is your intent, then fair enough. But I think that is an
unusual intent. Of course, I have no idea what "x.tag" or "ti64"
represent, and if I had, the code might have made more sense to me.

At least for the second example, if you did not have chaining equality,
I think you would have preferred to write :

if (a = 0) and (b = 0) ...

rather than

if (a = b) and (b = 0) ...

If that is indeed the case, then IMHO the chaining version is less clear.

Code syntax should, where practical, reflect its purpose and intent.
You should therefore write (adjusting to C, Python, or your own syntax
as desired) :

     if red_balloons == 2 and yellow_balloons == 2 ...

Here that connection is lost. You might infer it, but you don't know
whether, at some point, the program could be changed to require 3 red balloons, while still needing 2 yellow ones.

Or someone could just write 3 by mistake. By repeating such terms, there
is more opportunity for mistakes, and the connection between terms is
looser.

Sure. If this is a risk, use a separate constant (with appropriate
name) for the numbers you want, and write that name rather than the
number 2.

Here is another real example, first written in static code (not in C;
I've shortened 'sample' to avoid line-wrap):

    if hdr.hsamp[2] = hdr.vsamp[2] = hdr.hsamp[3] = hdr.vsamp[3] and
            (hdr.hsamp[1] <= 2 and hdr.vsamp[1] <= 2) then

        pimage := loadcolour(fs, hdr.hsamp[1], hdr.vsamp[1])

and here in dynamic code:

    (vsample1, vsample2, vsample3) := hdr.vsample
    (hsample1, hsample2, hsample3) := hdr.hsample
    ...
    if hsample2 = vsample2 = hsample3 = vsample3 and
            (hsample1 <= 2 and vsample1 <=2 ) then
        ....

Both check that 4 terms are identical, but there is no specific value
they have to be.

(It would have been nice to avoid that repetition of '2' in that second
test, but that's more difficult to achieve. There, hsample1/vsample1
don't need to have the same value.

It's expressible as something like 'reduce(and, map(<=, (a, b), 2))' but that's using a sledgehammer just to avoid that duplicate '2'. It's not
suited to lower-level code either.)

There comes a point when a function called "all_equal", or "rising", is
the right choice. How those functions might be implemented is a matter
of language, and also whether you want them to work with a variable
parameter list or over arrays, lists, slices, or whatever the language supports. High-level functions like map and reduce could be part of
this, along with folds. For example, in C++17 (which supports a fold
syntax), you might write :

    template <typename H1>
    constexpr bool rising(H1)
    {
        return true;
    }

    template <typename H1, typename H2, typename ... T>
    constexpr bool rising(H1 head1, H2 head2, T... tail)
    {
        return head1 <= head2 && rising(head2, tail...);
    }

Then

rising(a, b, c, d)

is interpreted as

a <= b && rising(b, c, d)

and so on.
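
The same two helpers are easy to sketch in Python, for comparison with the C++ version. This is one possible implementation, not a standard library facility; `adjacent_pairs`, `rising` and `all_equal` are names chosen for illustration. Each argument is evaluated once, at the call.

```python
def adjacent_pairs(values):
    """Yield (values[0], values[1]), (values[1], values[2]), and so on."""
    seq = list(values)
    return zip(seq, seq[1:])

def rising(*values):
    """True when every value is <= the one that follows it."""
    return all(x <= y for x, y in adjacent_pairs(values))

def all_equal(*values):
    """True when every value equals the one that follows it."""
    return all(x == y for x, y in adjacent_pairs(values))
```

As with the C++ overload for a single argument, `rising(a)` is trivially true because there are no adjacent pairs to compare.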

I don't think it is fair to claim particular ways of writing these
things are always clearer, or better, or uglier, or unclear - it will
depend on the rest of the language, and how the code is used. But in
general I think it helps to write code that follows the logic of what
the writer really means, rather than alternative constructions that give
the same result.

From bart@21:1/5 to David Brown on Mon Jun 10 16:33:12 2024

On 09/06/2024 12:26, David Brown wrote:

On 07/06/2024 14:20, bart wrote:

On 07/06/2024 12:17, David Brown wrote:

On 07/06/2024 12:28, bart wrote:

On 07/06/2024 10:22, David Brown wrote:

I have never found them helpful in Python coding, and I can't imagine them
being of any interest in my C code.

The most common uses for me are comparing that N terms are equal
(here = means equality):

   if x.tag = y.tag = ti64
   if a = b = 0

These do not correspond to what you want to say.

If someone has balloons, and you want to check that they have 2 red
balloons and two yellow balloons, you do start off by checking if the
number of red balloons is the same as the number of yellow balloons,
and then that the number of yellow balloons is 2.

("you don't"?)

Yes, sorry about that!

That's not quite the intent of my examples, which is:

(1) That x.tag/y.tag or a/b are equal to each other

(2) That they also have a particular value

If that is your intent, then fair enough. But I think that is an
unusual intent.

Really, checking that A and B both have the same value X is that unusual?

Of course, I have no idea what "x.tag" or "ti64" represent,

Here .tag contains type info, and it is checking that objects x and y
both have type i64.

and if I had, the code might have made more sense to me.

At least for the second example, if you did not have chaining equality,
I think you would have preferred to write :

    if (a = 0) and (b = 0) ...

rather than

    if (a = b) and (b = 0) ...

I think both alternatives are fine if you did not have the feature being
discussed. Although you might prefer the first if b was a more elaborate
expression than is shown here.

It does put the onus on the compiler to ensure that repeated terms
(there are four rather than three) are coded efficiently.

Note that the chained version can also be expressed as 'if a = 0 = b' for
a 3-way comparison.

If that is indeed the case, then IMHO the chaining version is less clear.

Code syntax should, where practical, reflect its purpose and intent.
You should therefore write (adjusting to C, Python, or your own
syntax as desired) :

     if red_balloons == 2 and yellow_balloons == 2 ...

Here that connection is lost. You might infer it, but you don't know
whether, at some point, the program could be changed to require 3 red
balloons, while still needing 2 yellow ones.

Or someone could just write 3 by mistake. By repeating such terms,
there is more opportunity for mistakes, and the connection between
terms is looser.

Sure. If this is a risk, use a separate constant (with appropriate
name) for the numbers you want, and write that name rather than the
number 2.

If you have several such names, you can still write the wrong one!

Where you have a relationship that naturally involves 3 terms, and you
express it with 4, then one of them needs to be repeated. But the
language cannot enforce that extra term being one of the first three.

The programmer also now has to deal with a repeated term that might have
side-effects.
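
To make the side-effect point concrete, here is a small C++ sketch. The names g_calls, read_b and both_equal_to are hypothetical, chosen only for illustration. It counts how often the repeated term is evaluated when the test is spelled out with four terms, compared with passing the term once to a helper function.

```cpp
#include <cassert>

// Hypothetical counter, used only to make evaluations visible.
static int g_calls = 0;

// Hypothetical term with a side effect: counts each evaluation.
int read_b()
{
    ++g_calls;
    return 0;
}

// Hypothetical helper: each argument is evaluated once, at the call.
constexpr bool both_equal_to(int a, int b, int x)
{
    return a == x && b == x;
}

// Spelled out with four terms: the repeated term is evaluated twice
// whenever the first comparison succeeds.
int evaluations_spelled_out(int a)
{
    g_calls = 0;
    bool ok = (a == read_b()) && (read_b() == 0);
    (void)ok;
    return g_calls;
}

// Via the helper: the term appears once in the source and is
// evaluated once.
int evaluations_with_helper(int a)
{
    g_calls = 0;
    bool ok = both_equal_to(a, read_b(), 0);
    (void)ok;
    return g_calls;
}
```

With a equal to 0 the spelled-out form calls read_b() twice and the helper form calls it once. A chained comparison in a language that supports it would normally be expected to behave like the helper here, although that depends on how the language defines it.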

There comes a point when a function called "all_equal", or "rising", is
the right choice. How those functions might be implemented is a matter
of language, and also whether you want them to work with a variable
parameter list or over arrays, lists, slices, or whatever the language
supports. High-level functions like map and reduce could be part of
this, along with folds. For example, in C++17 (which supports a fold
syntax), you might write :

    template <typename H1>
    constexpr bool rising(H1)
    {
        return true;
    }

    template <typename H1, typename H2, typename ... T>
    constexpr bool rising(H1 head1, H2 head2, T... tail)
    {
        return head1 <= head2 && rising(head2, tail...);
    }

Then

    rising(a, b, c, d)

is interpreted as

    a <= b && rising(b, c, d)

and so on.

I don't think it is fair to claim particular ways of writing these
things are always clearer, or better, or uglier, or unclear - it will
depend on the rest of the language, and how the code is used. But in general I think it helps to write code that follows the logic of what
the writer really means, rather than alternative constructions that give
the same result.

Using function-like syntax is OK when you have the same operator between multiple terms. 'rising' could have '<' or '<='.

All-equal would have the same operator too, but it looks clunkier, and a
bit over-the-top:

   a = b                2 terms
   all_equal(a, b, c)   3 terms using your feature
   a = b = c            3 terms using chained ops

   rising(a, b, c)      Using your other feature
   a <= b <= c          Using the same chained-op feature

Your solution requires a heavyweight language feature. It also looks
like it will generate a lot of intermediate code that will need a
heavyweight optimiser to tear down again.


From David Brown@21:1/5 to bart on Mon Jun 10 17:56:19 2024

On 10/06/2024 17:33, bart wrote:

On 09/06/2024 12:26, David Brown wrote:

On 07/06/2024 14:20, bart wrote:

On 07/06/2024 12:17, David Brown wrote:

If that is your intent, then fair enough. But I think that is an
unusual intent.

Really, checking that A and B both have the same value X is that unusual?

No. But checking that A and B have the same value, then checking that
one of them has the same value as a constant X, is - I would say -
definitely an unusual way to think about things. It is more natural to
check if A is equal to X, and if B is equal to X.

You seem to disagree with that. Fair enough, it is a subjective opinion.

I don't think it is fair to claim particular ways of writing these
things are always clearer, or better, or uglier, or unclear - it will
depend on the rest of the language, and how the code is used. But in
general I think it helps to write code that follows the logic of what
the writer really means, rather than alternative constructions that
give the same result.

Using function-like syntax is OK when you have the same operator between multiple terms. 'rising' could have '<' or '<='.

Yes. I think that is almost certainly what you would want, except in
the case of checking if a value is in a half-open interval. I would be
happy with an "in" operator and ranges of some sort for that case.
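
As a rough sketch of what such a range check could look like in C++, applied to the valid_char example that opened the thread. The HalfOpen and Closed types and their contains() members are made up for illustration. They are not an existing library or a proposed language feature.

```cpp
#include <cassert>

// Hypothetical half-open range [lo, hi): lo is included, hi is not.
struct HalfOpen
{
    long lo;
    long hi;

    constexpr bool contains(long x) const
    {
        return lo <= x && x < hi;
    }
};

// Hypothetical closed range [lo, hi]: both ends are included.
struct Closed
{
    long lo;
    long hi;

    constexpr bool contains(long x) const
    {
        return lo <= x && x <= hi;
    }
};

// The opening example of the thread: a valid Unicode code point,
// excluding the surrogates 0xD800 to 0xDFFF.
constexpr bool valid_char(long c)
{
    return Closed{0, 0x10FFFF}.contains(c)
        && !HalfOpen{0xD800, 0xE000}.contains(c);
}

static_assert(valid_char(0x41), "'A' is valid");
static_assert(!valid_char(0xD800), "first surrogate is excluded");
static_assert(!valid_char(0xDFFF), "last surrogate is excluded");
static_assert(valid_char(0xE000), "0xE000 is outside the half-open range");
static_assert(!valid_char(0x110000), "above the last code point");
```

Naming the two kinds of range separately keeps the choice between "<" and "<=" at the upper bound visible at the point of use.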

All-equal would have the same operator too, but it looks clunkier, and a
bit over-the-top:

   a = b                2 terms
   all_equal(a, b, c)   3 terms using your feature
   a = b = c            3 terms using chained ops

   rising(a, b, c)      Using your other feature
   a <= b <= c          Using the same chained-op feature

I guess a lot of this ends up as a matter of taste.

Your solution requires a heavyweight language feature. It also looks
like it will generate a lot of intermediate code that will need a
heavyweight optimiser to tear down again.

I place approximately zero weight on requirements for code generators to
be optimising. It is irrelevant to the user - it only matters to the implementer of the tools. It does have to be /possible/ to implement
the feature, but it does not matter if the compiler has to optimise well
to make it efficient.

