Forum: >>> Magnum BBS <<<

Regarding assignment to struct

From Lew Pitcher@21:1/5 to All on Fri May 2 18:34:52 2025

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?

Would code like
    struct ab {
        int a;
        char *b;
    } result, function(void);

if ((result = function()).a == 10) puts(result.b);

be understandable, or even legal?

--
Lew Pitcher
"In Skills We Trust"

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Barry Schwarz@21:1/5 to [email protected] on Fri May 2 13:35:50 2025

On Fri, 2 May 2025 18:34:52 -0000 (UTC), Lew Pitcher <[email protected]> wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?

Would code like
struct ab {
int a;
char *b;
} result, function(void);

if ((result = function()).a == 10) puts(result.b);

be understandable, or even legal?

Wouldn't it be quicker and easier to write a simple program to test
this rather than wait for someone to compose a response? You already
have 90% of the code written.

--
Remove del for email

From Waldek Hebisch@21:1/5 to Lew Pitcher on Fri May 2 21:35:31 2025

Lew Pitcher <[email protected]> wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?

Typically this is fine. However, in sdcc-4.2 manual one can find
the following statement:

: Deviations from standard compliance:
: structures and unions cannot be passed as function parameters
: and cannot be a return value from a function,....

--
Waldek Hebisch

From Lew Pitcher@21:1/5 to Waldek Hebisch on Sat May 3 01:43:54 2025

On Fri, 02 May 2025 21:35:31 +0000, Waldek Hebisch wrote:

Lew Pitcher <[email protected]> wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?

Typically this is fine. However, in sdcc-4.2 manual one can find
the following statement:

: Deviations from standard compliance:
: structures and unions cannot be passed as function parameters
: and cannot be a return value from a function,....

Not a problem. I don't foresee that the code I'm working on would be
compiled with a non-standard C compiler (assuming that I've read
the standards correctly wrt struct pass-by-value).

--
Lew Pitcher
"In Skills We Trust"

From Andrey Tarasevich@21:1/5 to Lew Pitcher on Sat May 3 01:14:46 2025

On Fri 5/2/2025 11:34 AM, Lew Pitcher wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

Weird. Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment... assignment
is used by everyone everywhere without even giving it a second thought.

One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead to
some interesting consequences related to structs with arrays inside.

--
Best regards,
Andrey

From David Brown@21:1/5 to Lew Pitcher on Sat May 3 11:46:30 2025

On 02/05/2025 20:34, Lew Pitcher wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I use these features regularly. I have no problem passing structs
around if that is the convenient way to structure the code.

Some people mistakenly think that it is very inefficient, in comparison
to passing around pointers to structs (which is the usual alternative).
There are circumstances where you might end up with an extra struct and
an extra copy, but unless you are dealing with very big structs and
otherwise very fast functions, it's unlikely to be significant. Modern
ABIs support passing small structs around in registers, and bigger
structs get passed around using hidden pointers - using the structs in
your code, rather than pointers to structs, makes the code clearer,
safer, and gives the optimiser more information for better static
analysis and code generation.

I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?

Would code like
struct ab {
int a;
char *b;
} result, function(void);

if ((result = function()).a == 10) puts(result.b);

be understandable, or even legal?

I'd immediately reject any code that mixes declaration of a variable and
a function in the same declaration. I'd immediately reject any code
that defines a type and declares a function in one shot. I'd question
code that defines a type and a variable in one go. But that's my way of
coding - other people have different rules, and your declarations are legal.

Personally, I'd have :

typedef struct {
int a;
char * b;
} ab;

ab result;

ab function(void);

(Obviously "ab" would not be a likely name in real code.)

Once the type "ab" is defined, I am quite happy making variables of it, assigning them, and using it for function parameters and return types.

From Lawrence D'Oliveiro@21:1/5 to Andrey Tarasevich on Sat May 3 22:46:55 2025

On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.

There is a caveat, to do with alignment padding: will this always have a defined value?

From Richard Damon@21:1/5 to Lew Pitcher on Sat May 3 21:42:37 2025

On 5/2/25 2:34 PM, Lew Pitcher wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?

Would code like
struct ab {
int a;
char *b;
} result, function(void);

if ((result = function()).a == 10) puts(result.b);

be understandable, or even legal?

I will say that I have used the feature, but in very limited conditions,
mostly where the structure is no bigger than one or two typical words.

It could be a structure with a bitfield to make accesses clearer than
low level masking, or for a "point" with x and y tied into one object.

Bigger than that, and you likely want to pass the object by address, not
by value, passing just a pointer to it.

From James Kuyper@21:1/5 to Keith Thompson on Sat May 3 23:38:47 2025

On 5/3/25 20:37, Keith Thompson wrote:

Lawrence D'Oliveiro <[email protected]d> writes:

On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.

There is a caveat, to do with alignment padding: will this always have a
defined value?

I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.

"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).

That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."

Note that, even when writing to a single member, the representations in
the padding bytes might be affected. A plausible reason for this to
happen would be, for example, when a value is written to an 8-bit struct
field followed by 8 bits of padding on a machine where the word size is
16 bits. The wording of that clause permits the use of instructions that
change the contents of an entire word to be used when updating that field.

Finally, why would you care?

The fact that an implementation does not have to do the equivalent of
memcpy() to perform a struct copy means that successful assignment
cannot be checked by using memcmp().

From Michael S@21:1/5 to Richard Damon on Sun May 4 11:01:17 2025

On Sat, 3 May 2025 21:42:37 -0400
Richard Damon <[email protected]> wrote:

Bigger than that, and you likely want to pass the object by address,
not by value, passing just a pointer to it.

That sort of thinking is an example of Knutian premature optimization.

From Lawrence D'Oliveiro@21:1/5 to Michael S on Sun May 4 08:34:11 2025

On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

That sort of thinking is an example of Knutian premature optimization.

Trying to hold back the optimization tide?

From Kenny McCormack@21:1/5 to [email protected] on Sun May 4 09:25:08 2025

In article <vv6ng8$1410m$[email protected]>,
James Kuyper <[email protected]> wrote:
...

The fact that an implementation does not have to do the equivalent of
memcpy() to perform a struct copy means that successful assignment
cannot be checked by using memcmp().

Which then begs two questions:

1) Why wouldn't an implementation do it with memcpy()? That is likely
to be as good or better than any other method, including, especially, a
member-by-member copy.

2) Why wouldn't you, the programmer, just use memcpy() instead of
struct assignment? Yes, I realize there are other cases to consider,
but in the simple one:

struct something foo,bar;
foo = bar;

memcpy() seems like it would always be easier and more reliable.

--
The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/Pedantic

From David Brown@21:1/5 to Lawrence D'Oliveiro on Sun May 4 14:06:30 2025

On 04/05/2025 10:34, Lawrence D'Oliveiro wrote:

On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

That sort of thinking is an example of Knutian premature optimization.

Trying to hold back the optimization tide?

I think he meant Knuthian, rather than Knutian :-)

From Tim Rentsch@21:1/5 to Andrey Tarasevich on Sun May 4 06:48:13 2025

Andrey Tarasevich <[email protected]> writes:

On Fri 5/2/2025 11:34 AM, Lew Pitcher wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

Weird. Virtually every C project relies on assignment of
structures. Passing-returning structs by value might be more rare
(although perfectly valid and often appropriate too), but
assignment... assignment is used by everyone everywhere without even
giving it a second thought.

One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.

I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?

From Scott Lurndal@21:1/5 to James Kuyper on Sun May 4 14:27:01 2025

James Kuyper <[email protected]> writes:

On 5/3/25 20:37, Keith Thompson wrote:

Lawrence D'Oliveiro <[email protected]d> writes:

On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.

There is a caveat, to do with alignment padding: will this always have a
defined value?

I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.

"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).

That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."

Are there any C implementations in common use that don't just
use memcpy or an optimized version thereof?

From Tim Rentsch@21:1/5 to Lew Pitcher on Sun May 4 07:49:15 2025

Lew Pitcher <[email protected]> writes:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?

Would code like
struct ab {
int a;
char *b;
} result, function(void);

if ((result = function()).a == 10) puts(result.b);

be understandable, or even legal?

The style is unorthodox, but the code is understandable.

Also it is both legal and well-defined, back to and
including C90.

From David Brown@21:1/5 to Scott Lurndal on Sun May 4 18:45:32 2025

On 04/05/2025 16:27, Scott Lurndal wrote:

James Kuyper <[email protected]> writes:

On 5/3/25 20:37, Keith Thompson wrote:

Lawrence D'Oliveiro <[email protected]d> writes:

On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.

There is a caveat, to do with alignment padding: will this always have a
defined value?

I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.

"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).

That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."

Are there any C implementations in common use that don't just
use memcpy or an optimized version thereof?

Sometimes small structs never make it to memory, or are handled by the
compiler as though they were individual variables (as long as that is
within "as-if" usage, of course). Copying a struct might merely mean
the compiler keeps track of the logical copy without actually copying
any memory. (You could argue that the compiler is still treating it
like memcpy, as memcpy calls don't always copy something.)

I think it would be unusual to see a significant difference between a
struct assignment copy and a memcpy on a compiler that optimises memcpy
well.

But on a compiler that does not handle memcpy well, then a struct
assignment could be inlined while a memcpy could mean an external
library call with significant overhead.

From James Kuyper@21:1/5 to Keith Thompson on Sun May 4 21:08:43 2025

On 5/4/25 16:20, Keith Thompson wrote:

James Kuyper <[email protected]> writes:

On 5/3/25 20:37, Keith Thompson wrote:

...

I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.

...

Finally, why would you care?

The fact that an implementation does not have to do the equivalent of
memcpy() to perform a struct copy means that successful assignment
cannot be checked by using memcmp().

Are you referring to checking whether an assignment was performed
or not, due to uncertainty about what the program has done? If you
mean doing an assignment and then checking whether it succeeded,
I can't think of a context where that makes sense.

Sorry, I didn't explain what I was thinking about in any detail. I've
seen code that allows a data structure to be modified by one section of
the code, and then periodically checks each object in that data
structure (including aggregate objects) to see whether it has been
modified by using memcmp() versus a saved copy. If so, it updated the
saved copy, including a timestamp when it was updated. If it weren't for
the need to keep track of the timestamp, it would always be simpler, and
not much slower, to always replace the saved copy, whether or not
there'd been a change.

I should have made it clear that I basically understand and agree with
your "why would you care" criticism. But it's part of my nature to look
for the edge cases where differences that ordinarily don't matter, could matter.

From Scott Lurndal@21:1/5 to Keith Thompson on Mon May 5 00:41:03 2025

Keith Thompson <[email protected]> writes:

James Kuyper <[email protected]> writes:

On 5/3/25 20:37, Keith Thompson wrote:

Lawrence D'Oliveiro <[email protected]d> writes:

On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.

There is a caveat, to do with alignment padding: will this always have a
defined value?

I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.

"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).

That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."

Yes, that's what I missed.

It's interesting that the footnote refers to padding *bits* rather than
padding *bytes*. I presume this was unintentional.

Padding bits:

struct A {
uint64_t tlen : 16,
: 20,
pkind : 6,
fsz : 6,
gsz : 14,
g : 1,
ptp : 1;
} s;

There are 20 padding bits in this declaration. Perhaps that's
what they're referring to?

From Andrey Tarasevich@21:1/5 to Tim Rentsch on Sun May 4 22:22:12 2025

On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.

I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?

I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion (and
use `[]` operator), lvalues and addresses quickly come into the picture.

The standard says in 6.2.4/8:

"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and its
initial value is the value of the expression. Its lifetime ends when the
evaluation of the containing full expression ends. [...] Such an object
need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

I am wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.

But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.

And when I make the following experiment with GCC and Clang

#include <stdio.h>

struct S { int a[10]; };

int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
}

I consistently get the following output from GCC

0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

And this is what I get from Clang

0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

As you can see, GCC apparently took C++-like approach to this situation.
The returned "temporary" is not really a separate temporary at all, but actually `a` itself.

Meanwhile, in Clang all three pointers are different, i.e. Clang decided
to actually create a separate temporary object for the result of the assignment.

I have a strong feeling that GCC's behavior is non-conforming. The last sentence of 6.2.4/8 is not supposed to permit "projecting" the resultant temporaries onto existing named objects. I could be wrong...

--
Best regards,
Andrey

From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 11:12:13 2025

On Sun, 4 May 2025 22:22:12 -0700
Andrey Tarasevich <[email protected]> wrote:

On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

One dark corner this feature has, is that in C (as opposed to C++)
the result of an assignment operator is an rvalue, which can
easily lead to some interesting consequences related to structs
with arrays inside.

I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?

I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.

The standard says in 6.2.4/8:

"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and
its initial value is the value of the expression. Its lifetime ends
when the evaluation of the containing full expression ends. [...]
Such an object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

I am wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.

But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.

And when I make the following experiment with GCC and Clang

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];

printf("%p %p %p\n", pa, pb, pc);
}

I consistently get the following output from GCC

0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

And this is what I get from Clang

0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

As you can see, GCC apparently took C++-like approach to this
situation. The returned "temporary" is not really a separate
temporary at all, but actually `a` itself.

Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.

I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...

According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.

From [email protected]@21:1/5 to All on Mon May 5 08:50:39 2025

On Sat, 3 May 2025 11:46:30 +0200
David Brown <[email protected]> gabbled:

On 02/05/2025 20:34, Lew Pitcher wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I use these features regularly. I have no problem passing structs
around if that is the convenient way to structure the code.

If you want to pass an actual array to a function instead of a pointer
to it, embedding it in a structure is the only way to do it.

From Andrey Tarasevich@21:1/5 to Michael S on Mon May 5 01:29:47 2025

On Mon 5/5/2025 1:12 AM, Michael S wrote:

According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.

Er... What? What specifically do you mean by "taking pointers"?

The whole functionality of `[]` operator in C is based on pointers. This expression

(a = b).a[5]

is already doing your "taking pointers of non-lvalue" (if I understood
you correctly) as part of array-to-pointer conversion. And no, it is not UB.

This is not UB either

struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);

So, what you are basing your "UB" claim on is not clear to me.

--
Best regards,
Andrey


From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 12:01:45 2025

On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:

On Mon 5/5/2025 1:12 AM, Michael S wrote:

According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is conforming.

Er... What? What specifically do you mean by "taking pointers"?

The whole functionality of `[]` operator in C is based on pointers.
This expression

(a = b).a[5]

is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion. And
no, it is not UB.

This is not UB either

struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);

That is not UB:
int a5 = (a = b).a[5];

That is UB:
int* pa5 = &(a = b).a[5];

So, what you are basing your "UB" claim on is not clear to me.

If you read the post of Keith Thompson and it is still not clear to
you then I can not help.


From Michael S@21:1/5 to Keith Thompson on Mon May 5 12:03:31 2025

On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:

And more obviously, "%p" requires an argument of type void*, not int*.

That part of otherwise very good comment is unreasonably pedantic.


From Kenny McCormack@21:1/5 to [email protected] on Mon May 5 11:30:43 2025

In article <[email protected]>,
Michael S <[email protected]> wrote:

On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:

And more obviously, "%p" requires an argument of type void*, not int*.

That part of otherwise very good comment is unreasonably pedantic.

That's KT for you. That's his reason for existence.

Welcome to CLC!

--
Alice was something of a handful to her father, Theodore Roosevelt. He was once asked by a visiting dignitary about parenting his spitfire of a daughter and he replied, "I can be President of the United States, or I can control Alice. I cannot possibly do both."


From David Brown@21:1/5 to [email protected] on Mon May 5 13:34:45 2025

On 05/05/2025 10:50, [email protected] wrote:

On Sat, 3 May 2025 11:46:30 +0200
David Brown <[email protected]> gabbled:

On 02/05/2025 20:34, Lew Pitcher wrote:

Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".

The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."

From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.

I use these features regularly. I have no problem passing structs
around if that is the convenient way to structure the code.

If you want to pass an actual array to a function instead of a pointer
to it, embedding it in a structure is the only way to do it.

Yes.

(Well, you could embed it in a union if you prefer, but a struct seems
more likely.)


From Tim Rentsch@21:1/5 to Michael S on Mon May 5 07:14:17 2025

Michael S <[email protected]> writes:

On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:

On Mon 5/5/2025 1:12 AM, Michael S wrote:

According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.

Er... What? What specifically do you mean by "taking pointers"?

The whole functionality of `[]` operator in C is based on pointers.
This expression

(a = b).a[5]
[...]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.

This is not UB either

struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);

That is not UB:
int a5 = (a = b).a[5];

That is UB:
int* pa5 = &(a = b).a[5];

So, what you are basing your "UB" claim on is not clear to me.

If you read the post of Keith Thompson and it is still not clear to
you then I can not help.

Under C11 semantics, both

int a5 = (a = b).a[5];

and

int* pa5 = &(a = b).a[5];

have well-defined behavior. The undefined behavior of the
upthread example comes later, only after the statement assigning
to the pointer (or here, initializing) completes. It isn't
taking the address with & that has undefined behavior; it is
using the stored pointer value in a subsequent statement, /after
the full expression containing the & operator has completed/,
that results in undefined behavior.


From Tim Rentsch@21:1/5 to Michael S on Mon May 5 07:03:40 2025

Michael S <[email protected]> writes:

On Sun, 4 May 2025 22:22:12 -0700
Andrey Tarasevich <[email protected]> wrote:

On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

One dark corner this feature has, is that in C (as opposed to C++)
the result of an assignment operator is an rvalue, which can
easily lead to some interesting consequences related to structs
with arrays inside.

I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?

I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.

The standard says in 6.2.4/8:

"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and
its initial value is the value of the expression. Its lifetime ends
when the evaluation of the containing full expression ends. [...]
Such an object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At the first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.

But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.

And when I make the following experiment with GCC and Clang

#include <stdio.h>

struct S { int a[10]; };

int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
}

I consistently get the following output from GCC

0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

And this is what I get from Clang

0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

As you can see, GCC apparently took C++-like approach to this
situation. The returned "temporary" is not really a separate
temporary at all, but actually `a` itself.

Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.

I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...

According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.

Maybe you are thinking of C90.

In both C99 and C11, the expression

(a = b).a[5]

is an lvalue, so taking its address with & is allowed.

It's easy to verify this assertion using gcc -std=c99 -pedantic. If
the given expression were not an lvalue then taking its address with
& would be a constraint violation, requiring a diagnostic. But no
diagnostic is produced. (Using clang in place of gcc also produces
no diagnostic.)

The behavior under C99 semantics is arguably murky. But under C11
semantics the behavior is well-defined.


From Tim Rentsch@21:1/5 to Keith Thompson on Mon May 5 06:34:49 2025

Keith Thompson <[email protected]> writes:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
}

[...]

I think that code has undefined behavior.

Right. [*]

(a = b) is an rvalue that refers to an object of type struct S with
temporary lifetime. pc holds the address of a subobject of that
temporary object. The object reaches the end of its lifetime at the end
of the evaluation of the full expression. You then print its value.

Even if the printf() statement were replaced by

(void)pc;

the behavior would be undefined, because the pointer held in pc
becomes indeterminate as soon as the statement containing the
assignment to pc completes.

[*] Assuming C11 semantics. At best inadvisable under C99
semantics, and a constraint violation under C90 semantics.


From Tim Rentsch@21:1/5 to Andrey Tarasevich on Mon May 5 07:56:40 2025

Andrey Tarasevich <[email protected]> writes:

On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.

I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?

I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.

The standard says in 6.2.4/8:

"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and its
initial value is the value of the expression. Its lifetime ends when
the evaluation of the containing full expression ends. [...] Such an
object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

The last sentence there is not present in N1570. Apparently it was
introduced later, in C17. (My appreciation to Keith Thompson for
reporting this.)

I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At the first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.

Ahh, I see now what your concern is.

But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.

My reading of the post-C11 standards is that they allow the "new"
object to overlap with already existing objects, including both
declared objects and objects whose storage was allocated using
malloc().

And when I make the following experiment with GCC and Clang

#include <stdio.h>

struct S { int a[10]; };

int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
}

I consistently get the following output from GCC

0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

And this is what I get from Clang

0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

As you can see, GCC apparently took C++-like approach to this
situation. The returned "temporary" is not really a separate temporary
at all, but actually `a` itself.

Yeah.

Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.

Which in my reading of the standard is required under C11 rules.
I have reproduced your results under -std=c11 -pedantic, for both
gcc and clang.

I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...

My judgment is that the behavior under gcc is non-conforming if the
compilation was done using C11 semantics. Under C17 or later rules
the gcc behavior is allowed (and may have been what prompted the
change in C17, but that is just speculation on my part). In any
case I understand now what you were getting at. Thank you for
bringing this hazard to the group's attention.

I hope someone files a bug report for gcc using -std=c11 rules,
because what gcc does under that setting (along with -pedantic)
is surely at odds with the plain reading of the C11 standard,
for the situation being discussed here.

Editorial comment: here is yet another case where post-C11 changes
to the C standard seem ill advised, and another reason not to use
any version of the ISO C standard for C17 or later. And it's
disappointing that gcc -std=c11 -pedantic strays into the realm of non-conforming behavior.


From Andrey Tarasevich@21:1/5 to Michael S on Mon May 5 08:45:09 2025

On Mon 5/5/2025 2:01 AM, Michael S wrote:

On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:

On Mon 5/5/2025 1:12 AM, Michael S wrote:

According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.

Er... What? What specifically do you mean by "taking pointers"?

The whole functionality of `[]` operator in C is based on pointers.
This expression

(a = b).a[5]

is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion. And
no, it is not UB.

This is not UB either

struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);

That is not UB:
int a5 = (a = b).a[5];

That is UB:
int* pa5 = &(a = b).a[5];

No, it isn't.

If you read the post of Keith Thompson and it is still not clear to
you then I can not help.

The only valid "UB" claim in Keith's post is my printing the value of
`pc` pointer, which by that time happens to point nowhere, since the
lifetime of the temporary is over. (And, of course, lack of conversion
to `void *` is an issue).

As for the expressions like

&(a = b).a[5];

and

&foo().a[2]

- these by themselves are perfectly valid. There's no UB in these
expressions. (And this is not a debate.)

Here's a version of the same code that corrects the above distracting issues

#include <stdio.h>

struct S { int a[10]; };

int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

--
Best regards,
Andrey


From Andrey Tarasevich@21:1/5 to Keith Thompson on Mon May 5 10:14:40 2025

On Mon 5/5/2025 1:26 AM, Keith Thompson wrote:

I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At the first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.

You snipped this: "Any attempt to modify an object with temporary
lifetime results in undefined behavior.". Which means, I think,
that an implementation that shared storage for "such an object"
with something else probably isn't going to cause problems for any
code with defined behavior.

It is going to cause problems, if the code relies on the address
identity of the object, assuming the standard intends to provide such guarantees.

Though I can imagine the possibility of code that modifies `a` and
reads via `pc` within the same full expression.

That's easy (in the context of declarations from my previous example):

pc = &(a = b).a[5], a.a[5] = 42, printf("%d\n", *pc);

As one would expect, this produces different output in GCC and Clang for
the reasons I already described.

But unless I've somehow missed it, the "Such an object need not
have a unique address." wording doesn't appear on that web page or
in my copy of n1570.pdf. C17 does add these two sentences:

An object with temporary lifetime behaves as if it were declared
with the type of its value for the purposes of effective type. Such
an object need not have a unique address.

Normally any two objects with overlapping lifetime must have distinct
addresses. This addition, I think, gives compilers permission to have
temporary lifetime objects overlap with other existing objects, but not
to have a modification to one object affect the value of the other
(unless the modification invokes UB, of course).

If so, that would be extremely underspecified. A mere "such an object
need not have a unique address" is insufficient to fully convey the
permission to overlap existing named objects. And that's probably what
led to difference in interpretation between GCC and Clang.

Modification of the temporary is "prohibited" (as UB), but modification
of the overlapped named object is not. The consequences can be quite surprising.

--
Best regards,
Andrey


From David Brown@21:1/5 to Tim Rentsch on Mon May 5 20:00:39 2025

On 05/05/2025 16:56, Tim Rentsch wrote:

Andrey Tarasevich <[email protected]> writes:

On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.

I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?

I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.

The standard says in 6.2.4/8:

"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and its
initial value is the value of the expression. Its lifetime ends when
the evaluation of the containing full expression ends. [...] Such an
object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

The last sentence there is not present in N1570. Apparently it was
introduced later, in C17. (My appreciation to Keith Thompson for
reporting this.)

I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At the first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.

Ahh, I see now what your concern is.

But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.

My reading of the post-C11 standards is that they allow the "new"
object to overlap with already existing objects, including both
declared objects and objects whose storage was allocated using
malloc().

And when I make the following experiment with GCC and Clang

#include <stdio.h>

struct S { int a[10]; };

int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
}

I consistently get the following output from GCC

0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

And this is what I get from Clang

0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

As you can see, GCC apparently took C++-like approach to this
situation. The returned "temporary" is not really a separate temporary
at all, but actually `a` itself.

Yeah.

Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.

Which in my reading of the standard is required under C11 rules.
I have reproduced your results under -std=c11 -pedantic, for both
gcc and clang.

Compilers don't have to follow the behaviour specified by the standard
in a "direct translation" manner in order to be correct and conforming.
They have to generate code that in the absence of any attempt to execute something with undefined behaviour, will give the same observable
behaviour as a "direct translation" would.

The result of the "(a = b)" expression should be a temporary object
distinct from "a" and "b", with a lifetime extending only to the end of
the expression assigning to "pc" (prior to C17).

Is there any way to distinguish between "pc" pointing to an int inside
this now dead temporary object, and it pointing to an int inside "a",
without invoking undefined behaviour?

By the time you are using "pc" to print it, the pointer itself has an indeterminate value - the compiler can quite happily give it the same
value as "pa", so looking at the pointer in the printf() statement does
not show a non-conformance.

Attempting to modify the temporary lifetime object, such as by writing
"*(pc = &(a = b).a[5]) = 42;", is undefined behaviour.

It is entirely possible that there /is/ some way to determine that the
compiler is not making a distinct temporary object while avoiding any
undefined behaviour or indeterminate values. But I don't think the code
here does show that - and it is therefore not an example of
non-conforming behaviour. I think GCC and clang can be viewed as having
simply picked different ways to generate their indeterminate values.

I will be happy to change that opinion if someone has a better argument
or example.

I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...

My judgment is that the behavior under gcc is non-conforming if the compilation was done using C11 semantics. Under C17 or later rules
the gcc behavior is allowed (and may have been what prompted the
change in C17, but that is just speculation on my part). In any
case I understand now what you were getting at. Thank you for
bringing this hazard to the group's attention.

I hope someone files a bug report for gcc using -std=c11 rules,
because what gcc does under that setting (along with -pedantic)
is surely at odds with the plain reading of the C11 standard,
for the situation being discussed here.

Editorial comment: here is yet another case where post-C11 changes
to the C standard seem ill advised, and another reason not to use
any version of the ISO C standard for C17 or later. And it's
disappointing that gcc -std=c11 -pedantic strays into the realm of non-conforming behavior.


From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 20:20:38 2025

On Mon, 5 May 2025 08:45:09 -0700
Andrey Tarasevich <[email protected]> wrote:

On Mon 5/5/2025 2:01 AM, Michael S wrote:

On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:

On Mon 5/5/2025 1:12 AM, Michael S wrote:

According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.

Er... What? What specifically do you mean by "taking pointers"?

The whole functionality of `[]` operator in C is based on pointers.
This expression

(a = b).a[5]

is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.

This is not UB either

struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);

That is not UB:
int a5 = (a = b).a[5];

That is UB:
int* pa5 = &(a = b).a[5];

No, it isn't.

If you read the post of Keith Thompson and it is still not clear to
you then I can not help.

The only valid "UB" claim in Keith's post is my printing the value of
`pc` pointer, which by that time happens to point nowhere, since the
lifetime of the temporary is over. (And, of course, lack of
conversion to `void *` is an issue).

As for the expressions like

&(a = b).a[5];

and

&foo().a[2]

Expressions by themselves are valid. But since there is no situation in
which the value produced by expressions is valid outside of expressions
the compiler can generate any value it wants, even NULL or value
completely outside of address space of current process.

- these by themselves are perfectly valid. There's no UB in these
expressions. (And this is not a debate.)

Here's a version of the same code that corrects the above distracting
issues

#include <stdio.h>

struct S { int a[10]; };

int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

It's only not UB in the nasal demons sense.
It's UB in the sense that we can't predict the values of expressions
like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
that is a form of UB.


From Kaz Kylheku@21:1/5 to Keith Thompson on Mon May 5 21:10:07 2025

On 2025-05-05, Keith Thompson <[email protected]> wrote:

Michael S <[email protected]> writes:

On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:

And more obviously, "%p" requires an argument of type void*, not int*.

That part of otherwise very good comment is unreasonably pedantic.

I disagree. I suggest it's a bad habit to use "%p" without ensuring,
by a cast if necessary, that the argument is of type void*.

In most implementations, it's likely that all pointers have the same
size and representation and are passed as arguments in the same way,
but getting the types right means one less thing to worry about.

If the codebase assumes all data pointers are the same size and bit
pattern and are treated the same in the calling conventions / ABI, then
it is probably moot.

That code is doomed on a platform where the assumption doesn't hold, and
the printf statements are probably not independently reusable.

(I mostly put in these casts just to communicate to others that
an ISO C language lawyer works here, if you happen to need one.)

Also, it would be amazingly stupid of any such platform not to just
make those printfs work: to promote variadic arguments of
pointer-to-object type to a common representation which is the same as
void *, combined with a matching behavior in the va_arg macro for
extracting the value back into any pointer-to-object type.

Mountains of non-standard-conforming code exert tremendous pressure on
both hardware platforms and the way C implementations are adapted to
those platforms.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @[email protected]


From Tim Rentsch@21:1/5 to Keith Thompson on Mon May 5 17:04:06 2025

Keith Thompson <[email protected]> writes:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. [...]

If you look again carefully, I expect you will reach a
different conclusion.


From Tim Rentsch@21:1/5 to Scott Lurndal on Mon May 5 21:57:47 2025

[email protected] (Scott Lurndal) writes:

Keith Thompson <[email protected]> writes:

James Kuyper <[email protected]> writes:

On 5/3/25 20:37, Keith Thompson wrote:

Lawrence D'Oliveiro <[email protected]d> writes:

On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving
it a second thought.

There is a caveat, to do with alignment padding: will this
always have a defined value?

I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.

"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object
representation that correspond to any padding bytes take
unspecified values.56)" (6.2.6.1p6).

That refers to footnote 56, which says "Thus, for example,
structure assignment need not copy any padding bits."

Yes, that's what I missed.

It's interesting that the footnote refers to padding *bits* rather
than padding *bytes*. I presume this was unintentional.

Padding bits:

struct A {
    uint64_t tlen  : 16,
                   : 20,
             pkind : 6,
             fsz   : 6,
             gsz   : 14,
             g     : 1,
             ptp   : 1;
} s;

There are 20 padding bits in this declaration. Perhaps that's
what they're referring to?

To me it seems clear that the "padding bits" here is meant to refer
to all of the following:

unoccupied bytes between members, due to member alignment
unoccupied bytes at the end of a structure or union
bits corresponding to unnamed bit-field members
unoccupied bits or bytes caused by explicit bit-field alignment
unoccupied bits or bytes caused by other bit-field alignment

Any member objects may have their own internal padding bits. Any
assignment of a struct or union follows the usual rule that any
padding bits that are part of a target member have unspecified
values (as long as the member doesn't become a trap representation
as a result).

Considering all these parts together, I think it makes sense to say
that the padding bits of an object are those bits that do not
participate in determining the abstract value of the object (not
counting that some combination of padding bits might cause the
object to become a trap representation, which never happens for
structs or unions).

(Yes I know that the term "trap representation" has been changed in
later versions of the C standard. Please make any needed editorial
changes internally, without having to post a followup.)


From Tim Rentsch@21:1/5 to Michael S on Mon May 5 21:25:16 2025

Michael S <[email protected]> writes:

On Sat, 3 May 2025 21:42:37 -0400
Richard Damon <[email protected]> wrote:

Bigger than that, and you likely want to pass the object by address,
not by value, passing just a pointer to it.

That sort of thinking is an example of Knuthian premature optimization.

I don't agree with this assessment. First, the given suggestion is
a rule of thumb. By their nature rules of thumb offer heuristics
that give guidelines likely to yield good results, but not
guaranteed to do so. Second, a decision about whether to pass a
struct object or a pointer to said object is often one that is a
fair amount of work to undo, and so tends to be made early during
the time period of program development. As such, it is useful to
follow a guideline likely to give good results, even if not always
optimal, because on average it will mean less work done overall.

I second Richard Damon's recommendation, with the understanding that
it is only a guideline, not an absolute, and as always subject to
later revision should that turn out to be called for (no pun
intended).
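The rule of thumb can be sketched as follows, assuming two hypothetical types: a small `struct point` passed and returned by value, and a larger `struct samples` passed by pointer to `const`. The size threshold is a judgment call and is not taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

struct point { int x, y; };                    /* small: pass by value */
struct samples { double v[256]; size_t n; };   /* large: pass by address */

/* Small structs are cheap to copy, so value semantics stay simple. */
struct point point_add(struct point p, struct point q)
{
    return (struct point){ p.x + q.x, p.y + q.y };
}

/* A pointer to const avoids copying about 2 KB on every call. */
double samples_sum(const struct samples *s)
{
    double total = 0.0;
    assert(s != NULL && s->n <= 256);
    for (size_t i = 0; i < s->n; i++)
        total += s->v[i];
    return total;
}
```

Switching `samples_sum` to by-value later would touch every caller, which is the "fair amount of work to undo" mentioned above.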


From Tim Rentsch@21:1/5 to Michael S on Mon May 5 22:40:57 2025

Michael S <[email protected]> writes:

On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:

And more obviously, "%p" requires an argument of type void*, not
int*.

That part of otherwise very good comment is unreasonably pedantic.

I don't have the same reaction. My sense is Keith was just being
thorough. Speaking for myself his statement wasn't needed, but
that condition might not hold for other readers. Given that his
comment is just one not-overly-long sentence, I don't think it's
too much to ask that readers already familiar with the point
simply skip over it.
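For readers who have not met the point before, a short sketch of the cast in question. The helper `format_pointer` is hypothetical; it uses `snprintf` so the result can be checked without printing.

```c
#include <assert.h>
#include <stdio.h>

/* Formats an object pointer into buf and returns snprintf()'s result. */
int format_pointer(char *buf, size_t size, int *p)
{
    assert(buf != NULL && size > 0);
    /* "%p" expects an argument of type void *, so convert explicitly. */
    return snprintf(buf, size, "%p", (void *)p);
}
```

The text produced for a pointer is implementation-defined, so only the return value and the fact that something was written can be relied on.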


From Tim Rentsch@21:1/5 to Michael S on Mon May 5 22:26:22 2025

Michael S <[email protected]> writes:

On Mon, 5 May 2025 08:45:09 -0700
Andrey Tarasevich <[email protected]> wrote:

On Mon 5/5/2025 2:01 AM, Michael S wrote:

On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:

On Mon 5/5/2025 1:12 AM, Michael S wrote:

According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.

Er... What? What specifically do you mean by "taking pointers"?

The whole functionality of `[]` operator in C is based on pointers.
This expression

(a = b).a[5]
[...]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.

This is not UB either

struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);

That is not UB:
int a5 = (a = b).a[5];

That is UB:
int* pa5 = &(a = b).a[5];

No, it isn't.

If you read the post of Keith Thompson and it is still not clear to
you, then I cannot help.

The only valid "UB" claim in Keith's post is my printing the value of
`pc` pointer, which by that time happens to point nowhere, since the
lifetime of the temporary is over. (And, of course, lack of
conversion to `void *` is an issue).

As for the expressions like

&(a = b).a[5];

and

&foo().a[2]

The expressions by themselves are valid. But since there is no situation in
which the value produced by expressions is valid outside of expressions
the compiler can generate any value it wants, even NULL or value
completely outside of address space of current process.

These expressions produce valid values as long as they are used
before the end of each full expression containing the given
expression; within that context they may not produce NULL or a
value outside of the program's address space.

- these by themselves are perfectly valid. There's no UB in these
expressions. (And this is not a debate.)
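A small sketch of using such a pointer before the end of the full expression, assuming C11 temporary-lifetime semantics. The names `make_s`, `sum3` and `sum_of_temporary` are hypothetical and not taken from the posts.

```c
#include <assert.h>

struct S { int a[3]; };

/* Returns a struct by value; the result is not an lvalue. */
struct S make_s(void)
{
    struct S s = { { 1, 2, 3 } };
    return s;
}

int sum3(const int *p)
{
    assert(p != 0);
    return p[0] + p[1] + p[2];
}

/* make_s().a converts to a pointer into an object with temporary
   lifetime (C11 6.2.4p8). The call to sum3() finishes before the
   enclosing full expression ends, so the pointer is still valid. */
int sum_of_temporary(void)
{
    return sum3(make_s().a);
}
```

The pointer never escapes the full expression here; storing it in a variable and reading through it in a later statement would be a different matter.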

Here's a version of the same code that corrects the above distracting
issues

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

It's only not UB in the nasal demons sense.
It's UB in the sense that we can't predict the values of expressions
like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
it is a form of UB.

The term used in the C standard is "unspecified behavior". If
this kind of expression is something you don't want to use that
is understandable, but it would help communication to use the
appropriate standard-defined term to describe it.

Essentially all non-trivial programs have unspecified behaviors, and
plenty of them. Most are benign, some are problematic, but in no
case does an unspecified behavior, by itself, represent a danger to
program semantics as severe as executing a construct that has
undefined behavior.
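A familiar example of benign unspecified behavior, sketched under the assumption that a log of side effects is an acceptable way to observe it: the order in which function arguments are evaluated is unspecified, yet every permitted outcome is well defined. All names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static char order_log[3];
static size_t order_len;

/* Records c in the log and returns 0. */
int note(char c)
{
    assert(order_len < 2);
    order_log[order_len++] = c;
    order_log[order_len] = '\0';
    return 0;
}

int add(int x, int y)
{
    return x + y;
}

/* The two note() calls are indeterminately sequenced: the log ends up
   as "ab" or "ba", and the standard does not say which. */
const char *evaluation_order(void)
{
    order_len = 0;
    order_log[0] = '\0';
    (void)add(note('a'), note('b'));
    return order_log;
}
```

A program that works with either ordering is fine; one that depends on a particular ordering has a portability bug, but still not undefined behavior.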


From Tim Rentsch@21:1/5 to Kaz Kylheku on Mon May 5 22:57:14 2025

Kaz Kylheku <[email protected]> writes:

On 2025-05-05, Keith Thompson <[email protected]> wrote:

Michael S <[email protected]> writes:

On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:

And more obviously, "%p" requires an argument of type void*, not
int*.

That part of otherwise very good comment is unreasonably pedantic.

I disagree. I suggest it's a bad habit to use "%p" without
ensuring, by a cast if necessary, that the argument is of type
void*.

In most implementations, it's likely that all pointers have the
same size and representation and are passed as arguments in the
same way, but getting the types right means one less thing to worry
about.

If the codebase assumes all data pointers are the same size, bit
pattern and are treated the same in the calling conventions / ABI,
then it is probably moot.

That code is doomed on a platform where the assumption doesn't
hold, and the printf statements are probably not independently
reusable.

(I mostly put in these casts just to communicate to others that
an ISO C language lawyer works here, if you happen to need one.)

Also, it would be amazingly stupid of any such platform not to just
make those printfs work: to promote variadic arguments of
pointer-to-object type to a common representation which is the
same as void *, combined with a matching behavior in the va_arg
macro for extracting the value back into any pointer-to-object
type.

This statement strikes me as would an utterance coming from a
resident of Fantasyland.


From [email protected]@21:1/5 to All on Tue May 6 07:16:24 2025

On Mon, 05 May 2025 13:53:10 -0700
Keith Thompson <[email protected]> wibbled:

[email protected] writes:
[...]

If you want to pass an actual array to a function instead of a pointer to it,
embedding it in a structure is the only way to do it.

Yes, but that's not necessarily useful. An array that's a member

Depends what you're doing. Passing an array in a structure will copy the
array, saving you from having to do it yourself if you don't want to work
on the original version. Obviously that doesn't happen if you just pass a
pointer.
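A sketch of the difference, using a hypothetical wrapper `struct int4`: the by-value version works on its own copy of the array, while the pointer version modifies the caller's data.

```c
#include <assert.h>

struct int4 { int v[4]; };   /* hypothetical wrapper around an array */

/* Receives a copy of the whole array; writes do not reach the caller. */
int sum_after_clearing_first(struct int4 copy)
{
    copy.v[0] = 0;
    return copy.v[0] + copy.v[1] + copy.v[2] + copy.v[3];
}

/* Receives a pointer; writes change the caller's array. */
int sum_after_clearing_first_in_place(int *v)
{
    assert(v != 0);
    v[0] = 0;
    return v[0] + v[1] + v[2] + v[3];
}
```

With `struct int4 w = { { 1, 2, 3, 4 } };`, the first call leaves `w.v[0]` at 1 and the second sets it to 0; both return 9.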


From David Brown@21:1/5 to Keith Thompson on Tue May 6 11:46:21 2025

On 05/05/2025 22:53, Keith Thompson wrote:

[email protected] writes:
[...]

If you want to pass an actual array to a function instead of a pointer to it,
embedding it in a structure is the only way to do it.

Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.

I regularly use arrays with known fixed sizes. In fact, in my code
those are absolutely dominant - it is very rare for me to see or use an
array whose size is /not/ fixed at compile time. Sometimes I will have
general functions that take parameters that are arrays of arbitrary
length, but not often.

So this is very much dependent on the kind of code you are working with,
and other people will have very different experiences for their own code.

However, I think it is not unlikely that people will see use of structs
like :

struct vector4int { int vs[4]; };
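A sketch of how such a type might be used, passing and returning it by value. The helper functions `vector4int_add` and `vector4int_get` are assumptions made for illustration and do not come from the post.

```c
#include <assert.h>

struct vector4int { int vs[4]; };

/* Element-wise sum; both arguments and the result travel by value. */
struct vector4int vector4int_add(struct vector4int x, struct vector4int y)
{
    struct vector4int r;
    for (int i = 0; i < 4; i++)
        r.vs[i] = x.vs[i] + y.vs[i];
    return r;
}

/* Bounds-checked element access. */
int vector4int_get(struct vector4int v, int i)
{
    assert(i >= 0 && i < 4);
    return v.vs[i];
}
```

Because the array length is part of the type, the functions need no separate length parameter.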


From David Brown@21:1/5 to Keith Thompson on Tue May 6 11:35:30 2025

On 05/05/2025 22:27, Keith Thompson wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

It seems clear to me that "pc" has an indeterminate value after the
expression assigning, since it points to an object with temporary lifetime.

And attempting to use the value of an object with automatic storage
while it has an indeterminate value is undefined behaviour.

As far as I can see, simply reading the value in "pc" to print it out is
UB according to the C standards. It is clearly going to be a harmless
operation on most hardware, but there are processors where pointer
registers are more complicated than simple linear addresses - they can
track some kind of segment structure describing the range of a data
block, or permissions for access to the data, and such structures could
have been deactivated or deallocated when the temporary lifetime object
died. Even attempting to read the value of the pointer, without
dereferencing it, would then cause some kind of fault or trap.
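One way to stay clear of the question altogether is to copy the returned struct into a named object before taking any address. The sketch below assumes hypothetical helpers `make_s` and `element_in_named_copy`.

```c
#include <assert.h>

struct S { int a[10]; };

/* Hypothetical producer that returns a struct by value. */
struct S make_s(void)
{
    struct S s = { { 0 } };
    s.a[5] = 55;
    return s;
}

/* Copies the returned struct into the caller's named object first.
   Pointers into *out stay valid for as long as *out itself lives. */
int *element_in_named_copy(struct S *out, int index)
{
    assert(out != 0 && index >= 0 && index < 10);
    *out = make_s();
    return &out->a[index];
}
```

The returned pointer refers to an ordinary object with a known lifetime, so reading or printing it in a later statement raises no lifetime concerns.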


From [email protected]@21:1/5 to All on Tue May 6 10:18:44 2025

On Tue, 6 May 2025 11:46:21 +0200
David Brown <[email protected]> wibbled:

On 05/05/2025 22:53, Keith Thompson wrote:

[email protected] writes:
[...]

If you want to pass an actual array to a function instead of a pointer to it,
embedding it in a structure is the only way to do it.

Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.

I regularly use arrays with known fixed sizes. In fact, in my code
those are absolutely dominant - it is very rare for me to see or use an
array whose size is /not/ fixed at compile time. Sometimes I will have

I do a lot of networking code and with packet structures the arrays are
almost always of fixed size. Also with arrays the data is inline so a
simple memcpy() can copy the data from the struct to the output buffer.
You can't do that if you have pointers in the struct. Ditto a simple cast
to char * to use it directly as the output.
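A sketch of that pattern with a hypothetical `struct packet` made only of byte-sized members, so that typical ABIs insert no padding. Real protocols also have to deal with byte order and alignment, which this example leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet: the payload array is stored inline. */
struct packet {
    uint8_t type;
    uint8_t length;
    uint8_t payload[6];
};

/* Copies the object representation into out. Returns the number of
   bytes written, or 0 if the buffer is too small. */
size_t packet_serialize(const struct packet *p, unsigned char *out,
                        size_t out_size)
{
    assert(p != NULL && out != NULL);
    if (out_size < sizeof *p)
        return 0;
    memcpy(out, p, sizeof *p);
    return sizeof *p;
}
```

Had `payload` been a `uint8_t *`, the `memcpy()` would have copied the pointer value rather than the data it points to.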


From Michael S@21:1/5 to Keith Thompson on Tue May 6 16:34:56 2025

On Mon, 05 May 2025 13:53:10 -0700
Keith Thompson <[email protected]> wrote:

[email protected] writes:
[...]

If you want to pass an actual array to a function instead of a
pointer to it, embedding it in a structure is the only way to do
it.

Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.

It seems, C++ authorities were feeling that the pattern "struct with
array of constant length as an only member" is very common.
Otherwise they wouldn't bother to add <array> to their standard library.


From Waldek Hebisch@21:1/5 to Keith Thompson on Tue May 6 17:36:36 2025

Keith Thompson <[email protected]> wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.

--
Waldek Hebisch


From David Brown@21:1/5 to Waldek Hebisch on Tue May 6 20:46:48 2025

On 06/05/2025 19:36, Waldek Hebisch wrote:

Keith Thompson <[email protected]> wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.

I must admit I had not noticed that detail.


From Nick Bowler@21:1/5 to Keith Thompson on Tue May 6 19:06:20 2025

On Mon, 05 May 2025 13:43:31 -0700, Keith Thompson wrote:

Tim Rentsch <[email protected]> writes:

Keith Thompson <[email protected]> writes:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];

printf("%p %p %p\n", pa, pb, pc);
}

[...]

I think that code has undefined behavior.

Right. [*]

[...]

[*] Assuming C11 semantics. At best inadvisable under C99
semantics, and a constraint violation under C90 semantics.

What C90 constraint does it violate? Both gcc and clang reject it
with "-std=c90 -pedantic-errors", with an error message "ISO C90
forbids subscripting non-lvalue array", but I don't see a relevant
constraint in the C90 standard.

I don't know about C90, but in C89 the above code violates the
constraint on the [] operator that "one of the expressions shall
have type ``pointer to object type.''" (3.3.2.1, first paragraph)

C89 (3.2.2.1, third paragraph) only describes conversion of lvalues with
array type into pointers. No similar rule applies for an expression
with array type which is not an lvalue, so such expressions are not
converted to pointers.

So, given:

struct { int a[10]; } a, b;
/* ... */
(a = b).a[5];

Since (a = b).a is not an lvalue, it is not converted to a pointer, so
neither operand of [] has pointer type, so a diagnostic is required.

I know that C11 introduced "temporary lifetime" to cover cases
like this. In C99, the wording for the indexing operator implicitly
assumes that there's an array object; if there isn't, I'd argue the
behavior is undefined by omission. I'm not aware of any relevant
change from C90 to C99.

The rule about conversions from arrays to pointers is different in C99
(n1124 6.3.2.1, third paragraph) compared to C89. In particular,
"an lvalue that has type ``array of type'' ..." was changed to
"an expression that has type ``array of type'' ...".


From Scott Lurndal@21:1/5 to David Brown on Tue May 6 19:22:34 2025

David Brown <[email protected]> writes:

On 06/05/2025 19:36, Waldek Hebisch wrote:

Keith Thompson <[email protected]> wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.

I must admit I had not noticed that detail.

That would get an immediate downcheck during review for exactly
that reason.


From David Brown@21:1/5 to Scott Lurndal on Wed May 7 09:37:57 2025

On 06/05/2025 21:22, Scott Lurndal wrote:

David Brown <[email protected]> writes:

On 06/05/2025 19:36, Waldek Hebisch wrote:

Keith Thompson <[email protected]> wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.

I must admit I had not noticed that detail.

That would get an immediate downcheck during review for exactly
that reason.

Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was done
maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall ever
coming across a single situation when I thought the use of the comma
operator was appropriate in the kind of code I work with.

Other people, projects, and teams work with different standards,
different requirements, and different kinds of code - there are a lot of
things that are common practice in some C coding that are strongly
rejected in my field (and perhaps vice versa). So I am not suggesting
that the comma operator is always bad in C - just that it is pretty much
always bad in my line of work.

And of course Andrey was using it here to make a specific point in a
discussion about C details, rather than real-life code.


From Nick Bowler@21:1/5 to Keith Thompson on Wed May 7 19:09:40 2025

On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:

Nick Bowler <[email protected]> writes:

The rule about conversions from arrays to pointers is different in C99
(n1124 6.3.2.1, third paragraph) compared to C89. In particular,
"an lvalue that has type ``array of type'' ..." was changed to
"an expression that has type ``array of type'' ...".

[...]

The change from "lvalue" to "expression" was made in C99. I wonder why
that was done.

It's not mentioned in the rationale, so we can only guess. But it is
called out in the list of major changes in the C99 foreword.

BTW, you have a copy of ANSI C89? Hard or soft copy? Do you know if
it's still available in some form?

Hint: look for FIPS 160 on the NIST website. This is the same standard
as ANSI X3.159-1989 Programming Language - C.


From Tim Rentsch@21:1/5 to Nick Bowler on Wed May 7 21:17:17 2025

Nick Bowler <[email protected]> writes:

On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:

Nick Bowler <[email protected]> writes:

The rule about conversions from arrays to pointers is different
in C99 (n1124 6.3.2.1, third paragraph) compared to C89. In
particular, "an lvalue that has type ``array of type'' ..." was
changed to "an expression that has type ``array of type'' ...".

[...]

The change from "lvalue" to "expression" was made in C99. I
wonder why that was done.

It's not mentioned in the rationale, so we can only guess. [...]

To me it seems obvious. The change in C99 was meant to allow
access to an array inside a non-lvalue struct. When C99 was
done the committee didn't realize all the ramifications of
accessing non-lvalue structs (which apparently has problems
even for scalar members, not just array members). Later, when
they did realize the resulting problems, they fixed things up
in C11.

See also n1253.htm, by Clark Nelson.


From Nick Bowler@21:1/5 to Keith Thompson on Thu May 8 12:58:56 2025

On Wed, 07 May 2025 14:23:57 -0700, Keith Thompson wrote:

Nick Bowler <[email protected]> writes:

On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:

The change from "lvalue" to "expression" was made in C99. I wonder why
that was done.

It's not mentioned in the rationale, so we can only guess. But it is
called out in the list of major changes in the C99 foreword.

I've just looked at the foreword of the C99 standard and the n1256
draft, and I couldn't find it. Can you quote the precise wording?

N1256 page xiii. Fourth to last bullet point:

"-- conversion of array to pointer not limited to lvalues"


From Tim Rentsch@21:1/5 to Andrey Tarasevich on Thu May 8 12:45:58 2025

Andrey Tarasevich <[email protected]> writes:

On Mon 5/5/2025 1:26 AM, Keith Thompson wrote:

I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.

You snipped this: "Any attempt to modify an object with temporary
lifetime results in undefined behavior.". Which means, I think,
that an implementation that shared storage for "such an object"
with something else probably isn't going to cause problems for any
code with defined behavior.

It is going to cause problems, if the code relies on the address
identity of the object, assuming the standard intends to provide such
guarantees.

Though I can imagine the possibility of code that modifies `a` and
reads via `pc` within the same full expression.

That's easy (in the context of declarations from my previous example):

pc = &(a = b).a[5], a.a[5] = 42, printf("%d\n", *pc);

As one would expect, this produces different output in GCC and Clang
for the reasons I already described.

But unless I've somehow missed it, the "Such an object need not
have a unique address." wording doesn't appear on that web page or
in my copy of n1570.pdf. C17 does add these two sentences:

An object with temporary lifetime behaves as if it were declared
with the type of its value for the purposes of effective type. Such
an object need not have a unique address.

Normally any two objects with overlapping lifetime must have distinct
addresses. This addition, I think, gives compilers permission to have
temporary lifetime objects overlap with other existing objects, but not
to have a modification to one object affect the value of the other
(unless the modification invokes UB, of course).

If so, that would be extremely underspecified. A mere "such an object
need not have a unique address" is insufficient to fully convey the
permission to overlap existing named objects.

I don't see why you say that. The statement says objects with
temporary lifetime need not have a unique address. In the absence
of any other statement on the subject, this statement admits the
inference that an object with temporary lifetime might have the same
address as any other object. Removing the constraint (that the
addresses of those objects must be distinct from the addresses
of all other objects), /and doing nothing else/, can only mean that
the addresses of such objects might match the address of any other
object in the environment.

If you think there should be a non-normative footnote explaining
that point, I expect I would vote in favor of that, but as far
as normative text goes I don't see any fuzziness about what is
allowed under the existing wording.

And that's probably what led to difference in interpretation
between GCC and Clang.

I suspect the implication actually goes the other way. It is
because what gcc has done (past tense) violates the rules of the C11
standard that someone had the bright idea that the C standard should
be changed to allow this stupidity.

Modification of the temporary is "prohibited" (as UB), but
modification of the overlapped named object is not. The
consequences can be quite surprising.

In my view the problem is not that what is allowed is unclear, but
that the whole idea of possibly overlapping objects is a crock.
It's a sad statement on the quality of gcc that it does the wrong
thing even when -std=c11 and -pedantic are given as compilation
options. Bleah.


From David Brown@21:1/5 to Tim Rentsch on Thu May 8 22:20:02 2025

On 08/05/2025 21:45, Tim Rentsch wrote:

Andrey Tarasevich <[email protected]> writes:

And that's probably what led to difference in interpretation
between GCC and Clang.

I suspect the implication actually goes the other way. It is
because what gcc has done (past tense) violates the rules of the C11
standard that someone had the bright idea that the C standard should
be changed to allow this stupidity.

Modification of the temporary is "prohibited" (as UB), but
modification of the overlapped named object is not. The
consequences can be quite surprising.

In my view the problem is not that what is allowed is unclear, but
that the whole idea of possibly overlapping objects is a crock.
It's a sad statement on the quality of gcc that it does the wrong
thing even when -std=c11 and -pedantic are given as compilation
options. Bleah.

While I think it is important that compilers try to follow the C
standards (at least when you specify conforming modes), are there any
potential realistic consequences of this?

Posters here have gone far out of their way to make hypothetical code
that demonstrates this flaw in gcc without invoking undefined behaviour.
Is there any risk that anyone would come across this in real code?

In addition, is it reasonable to suppose that C programmers that have
not studied the C standards here would be expecting the behaviour of
gcc, or the behaviour of clang here? Certainly if /I/ saw "pc = &(a =
b).a[5]" prior to this thread, I would expect the contents of the struct
"b" to be copied to the memory of the struct "a", and "pc" set to point
to the member of the array within "a". I would expect the code to work
as gcc works, and would find clang's behaviour completely unexpected. I
would be surprised if I were alone in that.
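If the intent really is "copy `b` into `a`, then point into `a`", it can be spelled out in two steps so that the result does not depend on how a compiler treats the temporary. A sketch, with a hypothetical helper `assign_then_point`:

```c
#include <assert.h>

struct S { int a[10]; };

/* Assign first, then take the address inside the named destination.
   No object with temporary lifetime is involved. */
int *assign_then_point(struct S *dst, const struct S *src, int index)
{
    assert(dst != 0 && src != 0 && index >= 0 && index < 10);
    *dst = *src;
    return &dst->a[index];
}
```

With this form the pointer is guaranteed to refer to the destination's own array, so a later `a.a[5] = 42;` is visible through it on any conforming compiler.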

So to me, it makes sense that the C standard has changed to support a
more sane approach to such situations. It would be unreasonable to
change it to guarantee the sensible behaviour - that would mean
compilers like clang that generated technically correct but surprising
(to many) code would now be wrong.

(gcc's behaviour is also more efficient, but of course correctness
trumps efficiency every time.)


From Rosario19@21:1/5 to Michael S on Mon May 12 11:23:02 2025

On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

On Sat, 3 May 2025 21:42:37 -0400
Richard Damon <[email protected]> wrote:

Bigger than that, and you likely want to pass the object by address,
not by value, passing just a pointer to it.

That sort of thinking is an example of Knuthian premature optimization.

I prefer to pass memory (if it is big enough) by a single address or
reference.


From Andrey Tarasevich@21:1/5 to Michael S on Thu May 29 05:11:01 2025

On Mon 5/5/2025 10:20 AM, Michael S wrote:

Here's a version of the same code that corrects the above distracting
issues

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

It's only not UB in the nasal demons sense.
It's UB in the sense that we can't predict the values of expressions
like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
it is a form of UB.

Whether we can or cannot predict the values of `(pa==pc)` and `(pb==pc)`
has very little impact on the usability of such expressions. The
practical usability of such expressions is very high without relying on
`(pa==pc)` and `(pb==pc)`.

--
Best regards,
Andrey


From Andrey Tarasevich@21:1/5 to Keith Thompson on Thu May 29 05:14:37 2025

On Mon 5/5/2025 1:27 PM, Keith Thompson wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;

pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

Nope. Nowhere in this code is the value of `pc` used beyond the lifetime
of the object with temporary lifetime.

Pay attention to the fact that the last four lines of the above code are a
single expression joined by comma operators, which is the whole point
of the corrections that differentiate it from the original version.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

Not applicable in this case.

--
Best regards,
Andrey

From Andrey Tarasevich@21:1/5 to Waldek Hebisch on Thu May 29 05:21:04 2025

On Tue 5/6/2025 10:36 AM, Waldek Hebisch wrote:

Keith Thompson <[email protected]> wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
  struct S a, b = { 0 };
  int *pa, *pb, *pc;

  pa = &a.a[5],
  pb = &b.a[5],
  pc = &(a = b).a[5],
  printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.

Exactly. I thought the nature of the corrections I made (i.e. the
deliberate usage of comma operator) would be strikingly obvious to the participants of the thread. But alas...

--
Best regards,
Andrey

From Andrey Tarasevich@21:1/5 to David Brown on Thu May 29 05:19:13 2025

On Tue 5/6/2025 2:35 AM, David Brown wrote:

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

It seems clear to me that "pc" has an indeterminate value after the expression assigning, since it points to an object with temporary lifetime.

And attempting to use the value of an object with automatic storage
while it has an indeterminate value is undefined behaviour.

Again, this makes no sense. Please, pay attention to the code and the corrections made after the initial version (e.g. usage of comma
operator). No, the value of `pc` is not indeterminate, and no, there's
no undefined behavior in the above version of the code.

As far as I can see, simply reading the value in "pc" to print it out is
UB according to the C standards. It is clearly going to be a harmless operation on most hardware, but there are processors where pointer
registers are more complicated than simple linear addresses - they can
track some kind of segment structure describing the range of a data
block, or permissions for access to the data, and such structures could
have been deactivated or deallocated when the temporary lifetime object died. Even attempting to read the value of the pointer, without dereferencing it, would then cause some kind of fault or trap.

Again, irrelevant. In the above code the temporary object does not die
during the entire period when pointer `pc` is used in any way.

--
Best regards,
Andrey

From Andrey Tarasevich@21:1/5 to Keith Thompson on Thu May 29 05:36:34 2025

On Mon 5/5/2025 1:43 PM, Keith Thompson wrote:

What C90 constraint does it violate? Both gcc and clang reject it
with "-std=c90 -pedantic-errors", with an error message "ISO C90
forbids subscripting non-lvalue array", but I don't see a relevant
constraint in the C90 standard.

The "constraint" in C89/90 is simply the fact that C89/90 _requires_ an
lvalue (of array type) in order to apply array to pointer conversion.
Here is the original wording:

Except when it is the operand of the sizeof operator or the unary &
operator, or is a character string literal used to initialize an array
of character type, or is a wide string literal used to initialize an
array with element type compatible with wchar_t, an *lvalue* that has
type “array of type” is converted to an expression that has type
“pointer to type” that points to the initial element of the array
object and is not an lvalue.

The presence of that "*lvalue*" requirement is what prevented us from
using `[]` operator on non-lvalue arrays in C89/90, because `[]`
critically relies on that conversion.

In C11 the wording has changed:

Except when it is the operand of the sizeof operator, the _Alignof
operator, or the unary & operator, or is a string literal used to
initialize an array, an expression that has type “array of type” is
converted to an expression with type “pointer to type” that points to
the initial element of the array object and is not an lvalue. If the
array object has register storage class, the behavior is undefined.

Note that the "lvalue" requirement has disappeared from this wording.
That is exactly why since C99 we can apply `[]` to non-lvalue arrays.

--
Best regards,
Andrey

From Andrey Tarasevich@21:1/5 to David Brown on Thu May 29 05:49:00 2025

On Wed 5/7/2025 12:37 AM, David Brown wrote:

That would get an immediate downcheck during review for exactly
that reason.

Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was done
maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall ever
coming across a single situation when I thought the use of the comma
operator was appropriate in the kind of code I work with.

Wow! That's catastrophically bad.

As it has been stated many times before, both C and C++ are programming languages that embrace both statement-level and expression-level
programming. Expression-level programming (e.g. where `?:` is used for branching and `,` for sequencing) is a very valuable and massively
important programming paradigm in these languages. The fact that
elaborate expression-level programming is not in any way abandoned or
shunned today is pretty obvious in C++, since C++ took major steps
lately to develop its expression-level capabilities. But it has always
been and will always remain important in C as well.

The proclivity to stick exclusively to statement-level programming in C
and, God forbid, impose it on others through so-called "code reviews"...
that would be a trait specific to "sweatshop" development outfits, which
strive to replace quality with quantity. I'd agree that in a revolving
door employment environment relying on a large number of low-competence developers such code might be seen as "too confusing". But I don't see
why we should set our standards that low here, in `comp.lang.c`.

--
Best regards,
Andrey

From Janis Papanagnou@21:1/5 to Andrey Tarasevich on Thu May 29 16:43:24 2025

On 29.05.2025 14:21, Andrey Tarasevich wrote:

On Tue 5/6/2025 10:36 AM, Waldek Hebisch wrote:

Keith Thompson <[email protected]> wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
  struct S a, b = { 0 };
  int *pa, *pb, *pc;

  pa = &a.a[5],
  pb = &b.a[5],
  pc = &(a = b).a[5],
  printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.

Exactly. I thought the nature of the corrections I made (i.e. the
deliberate usage of comma operator) would be strikingly obvious to the participants of the thread. But alas...

I wouldn't call it "strikingly obvious". Programmers typically skim code
at an abstract level, and if they're used to semicolon-separated
statements, the small difference between ';' and ',' may get missed (as
some replies also indicated). As I recall from that older post, I also
missed it at first glance. Only after I asked myself what the
_intention_ of the post had been did I take a closer look at all the
inconspicuous "details" and notice the subtle difference.

I don't think there's anything wrong with it, to be sure. But if I
were to post such a subtle difference, I'd have added a comment or used
formatting to point out that subtle but crucial difference.

Janis

From Janis Papanagnou@21:1/5 to Andrey Tarasevich on Thu May 29 16:33:01 2025

On 29.05.2025 14:49, Andrey Tarasevich wrote:

On Wed 5/7/2025 12:37 AM, David Brown wrote:

That would get an immediate downcheck during review for exactly
that reason.

Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was
done maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall
ever coming across a single situation when I thought the use of the
comma operator was appropriate in the kind of code I work with.

Wow! That's catastrophically bad.

As it has been stated many times before, both C and C++ are programming languages that embrace both statement-level and expression-level
programming. Expression-level programming (e.g. where `?:` is used for branching and `,` for sequencing) is a very valuable and massively
important programming paradigm in these languages. The fact that
elaborate expression-level programming is not in any way abandoned or
shunned today is pretty obvious in C++, since C++ took major steps
lately to develop its expression-level capabilities. But it has always
been and will always remain important in C as well.

The proclivity to stick exclusively to statement-level programming in C
and, God forbid, impose it on others through so-called "code reviews"...
that would be a trait specific to "sweatshop" development outfits, which strive to replace quality with quantity. I'd agree that in a revolving
door employment environment relying on a large number of low-competence developers such code might be seen as "too confusing". But I don't see
why we should set our standards that low here, in `comp.lang.c`.

Well said.

Janis

From James Kuyper@21:1/5 to Michael S on Thu May 29 12:57:06 2025

On Mon 5/5/2025 10:20 AM, Michael S wrote:

Here's a version of the same code that corrects the above distracting
issues

#include <stdio.h>

struct S { int a[10]; };

int main()
{
  struct S a, b = { 0 };
  int *pa, *pb, *pc;

  pa = &a.a[5],
  pb = &b.a[5],
  pc = &(a = b).a[5],
  printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

It's only not UB in the nasal demons sense.

UB is UB only in the nasal demons sense. UB means that "This
international standard imposes no requirements on the behavior". That
is, anything could happen, at least as far as the standard is concerned.
Code with undefined behavior should not be able to produce nasal demons
because there's no such thing as nasal demons (I think). However, if
they did exist, producing them would not violate any requirements
imposed by the standard, because it quite explicitly imposes none on
such code.

It's UB in a sense that we can't predict values of expressions
like (pa==pc) and (pb==pc). ...

Why not? Because of the comma operators, the lifetime of the temporary
extends all the way till the end of the printf() call, long enough to
make use of pc in that call safe.

... I.e. pc is completely useless. In my book
it is a form of UB.

If the problem were only that there are no restrictions on the value of
an expression, but that the code is otherwise safe to use, that would be
indicated by a much weaker term: "unspecified value". Calling it "a
form of UB" would serve no useful purpose.

From David Brown@21:1/5 to Andrey Tarasevich on Thu May 29 21:05:37 2025

On 29/05/2025 14:19, Andrey Tarasevich wrote:

On Tue 5/6/2025 2:35 AM, David Brown wrote:

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

It seems clear to me that "pc" has an indeterminate value after the
expression assigning, since it points to an object with temporary
lifetime.

And attempting to use the value of an object with automatic storage
while it has an indeterminate value is undefined behaviour.

Again, this makes no sense. Please, pay attention to the code and the corrections made after the initial version (e.g. usage of comma
operator). No, the value of `pc` is not indeterminate, and no, there's
no undefined behavior in the above version of the code.

As far as I can see, simply reading the value in "pc" to print it out
is UB according to the C standards. It is clearly going to be a
harmless operation on most hardware, but there are processors where
pointer registers are more complicated than simple linear addresses -
they can track some kind of segment structure describing the range of
a data block, or permissions for access to the data, and such
structures could have been deactivated or deallocated when the
temporary lifetime object died. Even attempting to read the value of
the pointer, without dereferencing it, would then cause some kind of
fault or trap.

Again, irrelevant. In the above code the temporary object does not die
during the entire period when pointer `pc` is used in any way.

I posted later that I had made a mistake, and not noticed the use of the
comma instead of a semicolon.

Why are you dredging up an outdated thread? I made an error three weeks
ago, and realised the mistake shortly afterwards. What do you expect me
or anyone else to learn from that now, after all this time?

From David Brown@21:1/5 to Andrey Tarasevich on Thu May 29 21:20:59 2025

On 29/05/2025 14:49, Andrey Tarasevich wrote:

On Wed 5/7/2025 12:37 AM, David Brown wrote:

That would get an immediate downcheck during review for exactly
that reason.

Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was
done maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall
ever coming across a single situation when I thought the use of the
comma operator was appropriate in the kind of code I work with.

Wow! That's catastrophically bad.

As it has been stated many times before, both C and C++ are programming languages that embrace both statement-level and expression-level
programming. Expression-level programming (e.g. where `?:` is used for branching and `,` for sequencing) is a very valuable and massively
important programming paradigm in these languages. The fact that
elaborate expression-level programming is not in any way abandoned or
shunned today is pretty obvious in C++, since C++ took major steps
lately to develop its expression-level capabilities. But it has always
been and will always remain important in C as well.

No, expression-level programming has always been and will likely always
remain a very minor part of C programming. Yes, some people make use of
the comma operator. Some people do so extensively - and they are often,
but not necessarily, considered "smart-arse" programmers rather than
"smart" programmers. If the comma operator were removed from the C
language, I guess some 95% of programmers would barely notice - at
worst, they would have to add an extra line inside an occasional "for"
loop. (The ternary operator is used much more.)

I did not say that the use of comma operators is always bad - I said I
do not recall seeing it in the kind of code I work with in a situation
where I thought it was a good way to write the code. A significant part
of that is the kind of code I work with - in code for small systems
where high reliability and safety are vital, code clarity is of utmost
importance. Code that does not do what it first appears to do is
severely frowned upon. Code is written in a very imperative style.

In my world, code that uses "malloc" is rarely acceptable, and for most programs, "double" is very seldom an appropriate choice of type. But
that does not mean these are not usable for other kinds of C
programming. There are many reasons why different styles of coding are
used in different circumstances.

Even when C++ is used, with its significantly broader support for a
variety of programming paradigms, I do not recall seeing the comma
operator used.

The proclivity to stick exclusively to statement-level programming in C
and, God forbid, impose it on others through so-called "code reviews"...
that would be a trait specific to "sweatshop" development outfits, which strive to replace quality with quantity. I'd agree that in a revolving
door employment environment relying on a large number of low-competence developers such code might be seen as "too confusing". But I don't see
why we should set our standards that low here, in `comp.lang.c`.

I don't quite see how you are in any position to judge the coding styles
used by people you know nothing about, working in fields that you know
nothing about.

I am happy that different types of programming styles and paradigms are
used for different purposes - imperative C is not suitable for most
coding tasks. Equally, expression-style programming is not appropriate
for all coding tasks.

However, one thing that is never suitable for any real-world programming
is deceptive code that is not what it appears to be.

From Scott Lurndal@21:1/5 to David Brown on Thu May 29 21:15:56 2025

David Brown <[email protected]> writes:

On 29/05/2025 14:49, Andrey Tarasevich wrote:

On Wed 5/7/2025 12:37 AM, David Brown wrote:

That would get an immediate downcheck during review for exactly
that reason.

Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was
done maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall
ever coming across a single situation when I thought the use of the
comma operator was appropriate in the kind of code I work with.

Wow! That's catastrophically bad.

As it has been stated many times before, both C and C++ are programming
languages that embrace both statement-level and expression-level
programming. Expression-level programming (e.g. where `?:` is used for
branching and `,` for sequencing) is a very valuable and massively
important programming paradigm in these languages. The fact that
elaborate expression-level programming is not in any way abandoned or
shunned today is pretty obvious in C++, since C++ took major steps
lately to develop its expression-level capabilities. But it has always
been and will always remain important in C as well.

No, expression-level programming has always been and will likely always
remain a very minor part of C programming. Yes, some people make use of
the comma operator. Some people do so extensively - and they are often,
but not necessarily, considered "smart-arse" programmers rather than
"smart" programmers. If the comma operator were removed from the C
language, I guess some 95% of programmers would barely notice - at
worst, they would have to add an extra line inside an occasional "for"
loop. (The ternary operator is used much more.)

And sometimes, excessive use of the comma operator causes
compiler failures.

cfront generated the comma operator extensively, and expression trees
would grow to very large sizes. There was a bug in PCC (for the
88100) where it would run out of temporary registers while generating
code for some cfront generated comma expressions (which were -far- from
human readable). I had to fix the temporary register allocation
code in PCC to spill registers when the Sethi-Ullman number for an
expression exceeded the number of registers.

That was circa 1990, and I've generally not found any arguments
favoring their general use persuasive in the years since, including
Andrey's and Kaz's responses recently posted here.

The simple fact that experienced programmers that read this usenet
newsgroup missed the comma operators in the original example speaks
volumes.

From David Brown@21:1/5 to Scott Lurndal on Fri May 30 10:50:56 2025

On 29/05/2025 23:15, Scott Lurndal wrote:

David Brown <[email protected]> writes:

On 29/05/2025 14:49, Andrey Tarasevich wrote:

On Wed 5/7/2025 12:37 AM, David Brown wrote:

That would get an immediate downcheck during review for exactly
that reason.

Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was
done maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall
ever coming across a single situation when I thought the use of the
comma operator was appropriate in the kind of code I work with.

Wow! That's catastrophically bad.

As it has been stated many times before, both C and C++ are programming
languages that embrace both statement-level and expression-level
programming. Expression-level programming (e.g. where `?:` is used for
branching and `,` for sequencing) is a very valuable and massively
important programming paradigm in these languages. The fact that
elaborate expression-level programming is not in any way abandoned or
shunned today is pretty obvious in C++, since C++ took major steps
lately to develop its expression-level capabilities. But it has always
been and will always remain important in C as well.

No, expression-level programming has always been and will likely always
remain a very minor part of C programming. Yes, some people make use of
the comma operator. Some people do so extensively - and they are often,
but not necessarily, considered "smart-arse" programmers rather than
"smart" programmers. If the comma operator were removed from the C
language, I guess some 95% of programmers would barely notice - at
worst, they would have to add an extra line inside an occasional "for"
loop. (The ternary operator is used much more.)

And sometimes, excessive use of the comma operator causes
compiler failures.

That is also an issue in the world of small-systems embedded programming.

While a lot of it these days is on ARM, and most of that is done using
gcc, there are hundreds of C compilers of varying quality (and price,
which is no indication of quality) for embedded systems. Many of these
other toolchains have bugs, non-conformities, inconsistencies and
weaknesses. (gcc is not perfect either!) People programming for 8-bit
and 16-bit microcontrollers using such tools will - and should - use a conservative subset of the C language. Obscure and rarely used features
of the language will be avoided, and code will be written in a simpler
manner. You don't write code in a way that increases the risk of
catching a bug in a poorly tested part of the compiler, or in a way that
might lead to unexpectedly inefficient results.

With 32-bit ARM now dominating the industry, along with more reliable
tools (primarily gcc, with clang a distant second), there is much less
need to pander to flaws in the toolchain, but you still need to consider
weaknesses in humans. When you are writing software that is intended to
run continuously without problems for years, or where a misunderstanding
by a new maintainer a decade later can lead to safety risks, you don't
write "cool" code!

It is well known that "Debugging is twice as hard as writing the code in
the first place. Therefore, if you write the code as cleverly as
possible, you are, by definition, not smart enough to debug it." I
would also say that understanding and maintaining other people's code is
often a lot more than twice as hard as writing the code yourself. I aim
to code accordingly.

cfront generated the comma operator extensively, and expression trees
would grow to very large sizes. There was a bug in PCC (for the
88100) where it would run out of temporary registers while generating
code for some cfront generated comma expressions (which were -far- from
human readable). I had to fix the temporary register allocation
code in PCC to spill registers when the Sethi-Ullman number for an
expression exceeded the number of registers.

That was circa 1990, and I've generally not found any arguments
favoring their general use persuasive in the years since, including
Andrey's and Kaz's responses recently posted here.

The simple fact that experienced programmers that read this usenet
newsgroup missed the comma operators in the original example speaks
volumes.

Of course people are experienced in different things - "programming",
even when limited to a single language, is a broad field.

From Scott Lurndal@21:1/5 to Keith Thompson on Fri May 30 14:29:22 2025

Keith Thompson <[email protected]> writes:

[email protected] (Scott Lurndal) writes:
[...]

And sometimes, excessive use of the comma operator causes
compiler failures.

cfront generated the comma operator extensively, and expression trees
would grow to very large sizes. There was a bug in PCC (for the
88100) where it would run out of temporary registers while generating
code for some cfront generated comma expressions (which were -far- from
human readable). I had to fix the temporary register allocation
code in PCC to spill registers when the Sethi-Ullman number for an
expression exceeded the number of registers.

That was circa 1990, and I've generally not found any arguments
favoring their general use persuasive in the years since, including
Andrey's and Kaz's responses recently posted here.

So a compiler you used circa 1990 had problems with comma expressions.

That's hardly an argument against using comma operators today.

True, it was an anecdote, not an argument, which conditioned
my opinion of comma operators.

The simple fact that experienced programmers that read this usenet
newsgroup missed the comma operators in the original example speaks
volumes.

*That's* a valid argument.

Indeed.

From Tim Rentsch@21:1/5 to Andrey Tarasevich on Fri Jun 6 17:44:14 2025

Andrey Tarasevich <[email protected]> writes:

On Tue 5/6/2025 10:36 AM, Waldek Hebisch wrote:

Keith Thompson <[email protected]> wrote:

Andrey Tarasevich <[email protected]> writes:
[...]

#include <stdio.h>

struct S { int a[10]; };

int main()
{
  struct S a, b = { 0 };
  int *pa, *pb, *pc;

  pa = &a.a[5],
  pb = &b.a[5],
  pc = &(a = b).a[5],
  printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}

This version has no UB.

I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.

N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."

Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.

Exactly. I thought the nature of the corrections I made (i.e. the
deliberate usage of comma operator) would be strikingly obvious to the participants of the thread. But alas...

My own reaction is that the changes were not by themselves
strikingly obvious. But in combination with the explicit
statement that "This version has no UB" it seems obvious
enough.
