Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?
Would code like
struct ab {
int a;
char *b;
} result, function(void);
if ((result = function()).a == 10) puts(result.b);
be understandable, or even legal?
Lew Pitcher <[email protected]> wrote:
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?
Typically this is fine. However, in sdcc-4.2 manual one can find
the following statement:
: Deviations from standard compliance:
: structures and unions cannot be passed as function parameters
: and cannot be a return value from a function,....
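For a compiler with a restriction like the one quoted above, the usual workaround is to pass pointers instead of struct values. Here is a minimal sketch of that idea. The names `get_ab` and `read_a` are made up for illustration, and I have not checked how any particular sdcc version behaves beyond what the manual excerpt says.

```c
#include <assert.h>
#include <stddef.h>

struct ab {
    int a;
    const char *b;
};

/* Out-parameter variant: writes the result through a pointer
   instead of returning a struct by value. */
static void get_ab(struct ab *out)
{
    assert(out != NULL);
    out->a = 10;
    out->b = "ten";
}

/* Pointer-to-const parameter in place of a by-value parameter. */
static int read_a(const struct ab *in)
{
    assert(in != NULL);
    return in->a;
}
```

A caller would declare `struct ab result;`, call `get_ab(&result);`, and then test `read_a(&result) == 10`, which mirrors the one-line test in the original question.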
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I have a project in which these capabilities might come in handy; has
anyone had experience with assigning to structures, passing them as
arguments to functions, and/or having a function return a structure?
Would code like
struct ab {
int a;
char *b;
} result, function(void);
if ((result = function()).a == 10) puts(result.b);
be understandable, or even legal?
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
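To make the three operations concrete, here is a small sketch that returns a struct, assigns it, and passes it by value. The helper names (`make_ab`, `bump_and_read`, `demo_struct_values`) are invented for this example; only `struct ab` comes from the original question.

```c
#include <assert.h>
#include <string.h>

struct ab {
    int a;
    const char *b;
};

/* Returns a struct by value; the caller receives its own copy. */
static struct ab make_ab(int a, const char *b)
{
    struct ab r = { a, b };
    return r;
}

/* Takes a struct by value; changes here do not reach the caller's object. */
static int bump_and_read(struct ab x)
{
    x.a += 1;
    return x.a;
}

/* Exercises return, assignment, and pass by value together. */
static void demo_struct_values(void)
{
    struct ab first = make_ab(10, "ten");
    struct ab second;
    int got;

    second = first;                       /* plain structure assignment */
    assert(second.a == 10);
    assert(strcmp(second.b, "ten") == 0);

    assert(bump_and_read(second) == 11);
    assert(second.a == 10);               /* caller's copy is unchanged */

    /* The value of an assignment expression can be used directly. */
    got = (second = make_ab(7, "seven")).a;
    assert(got == 7);
}
```

Calling `demo_struct_values()` from `main` should run without tripping any assertion on a conforming compiler.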
Lawrence D'Oliveiro <[email protected]d> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
There is a caveat, to do with alignment padding: will this always have a
defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
Finally, why would you care?
Bigger than that, and you likely want to pass the object by address,
not by value, passing just a pointer to it.
That sort of thinking is an example of Knuthian premature optimization.
The fact that an implementation does not have to do the equivalent of
memcpy() to perform a struct copy means that successful assignment
cannot be checked by using memcmp().
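As a sketch of what to do instead, compare the members rather than the bytes. The struct `rec` and the helper `rec_equal` below are hypothetical. Whether `rec` actually contains padding depends on the implementation, though on most common ABIs there will be some after `tag`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rec {
    char tag;    /* typically followed by padding before 'value' */
    int value;
};

/* Member-wise comparison: never looks at padding bytes, so it does not
   depend on how the implementation performs structure assignment. */
static bool rec_equal(const struct rec *x, const struct rec *y)
{
    assert(x != NULL && y != NULL);
    return x->tag == y->tag && x->value == y->value;
}
```

After `b = a;`, `rec_equal(&a, &b)` is true, whereas `memcmp(&a, &b, sizeof a) == 0` is not guaranteed, because the padding bytes of `b` take unspecified values.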
On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:
That sort of thinking is an example of Knuthian premature optimization.
Trying to hold back the optimization tide?
On Fri 5/2/2025 11:34 AM, Lew Pitcher wrote:
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
Weird. Virtually every C project relies on assignment of
structures. Passing-returning structs by value might be more rare
(although perfectly valid and often appropriate too), but
assignment... assignment is used by everyone everywhere without even
giving it a second thought.
One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.
On 5/3/25 20:37, Keith Thompson wrote:
Lawrence D'Oliveiro <[email protected]d> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
There is a caveat, to do with alignment padding: will this always have a
defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).
That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."
James Kuyper <[email protected]> writes:
On 5/3/25 20:37, Keith Thompson wrote:
Lawrence D'Oliveiro <[email protected]d> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
There is a caveat, to do with alignment padding: will this always have a
defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).
That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."
Are there any C implementations in common use that don't just
use memcpy or an optimized version thereof?
James Kuyper <[email protected]> writes:...
On 5/3/25 20:37, Keith Thompson wrote:
...I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
Finally, why would you care?
The fact that an implementation does not have to do the equivalent of
memcpy() to perform a struct copy means that successful assignment
cannot be checked by using memcmp().
Are you referring to checking whether an assignment was performed
or not, due to uncertainty about what the program has done? If you
mean doing an assignment and then checking whether it succeeded,
I can't think of a context where that makes sense.
James Kuyper <[email protected]> writes:
On 5/3/25 20:37, Keith Thompson wrote:
Lawrence D'Oliveiro <[email protected]d> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving it a
second thought.
There is a caveat, to do with alignment padding: will this always have a
defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object representation
that correspond to any padding bytes take unspecified values.56)"
(6.2.6.1p6).
That refers to footnote 56, which says "Thus, for example, structure
assignment need not copy any padding bits."
Yes, that's what I missed.
It's interesting that the footnote refers to padding *bits* rather than
padding *bytes*. I presume this was unintentional.
One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:
One dark corner this feature has, is that in C (as opposed to C++)
the result of an assignment operator is an rvalue, which can
easily lead to some interesting consequences related to structs
with arrays inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.
The standard says in 6.2.4/8:
"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and
its initial value is the value of the expression. Its lifetime ends
when the evaluation of the containing full expression ends. [...]
Such an object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.
And when I make the following experiment with GCC and Clang
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
I consistently get the following output from GCC
0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544
And this is what I get from Clang
0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4
As you can see, GCC apparently took a C++-like approach to this
situation. The returned "temporary" is not really a separate
temporary at all, but actually `a` itself.
Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.
I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...
On 02/05/2025 20:34, Lew Pitcher wrote:
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I use these features regularly. I have no problem passing structs
around if that is the convenient way to structure the code.
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion. And
no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
So, what you are basing your "UB" claim on is not clear to me.
And more obviously, "%p" requires an argument of type void*, not int*.
On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:
And more obviously, "%p" requires an argument of type void*, not int*.
That part of otherwise very good comment is unreasonably pedantic.
On Sat, 3 May 2025 11:46:30 +0200
David Brown <[email protected]> gabbled:
On 02/05/2025 20:34, Lew Pitcher wrote:
Back in the days of K&R, Kernighan and Ritchie published an addendum
to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
in which they detailed some differences in the C language post "The
C Programming Language".
The first difference they noted was that
"Structures may be assigned, passed as arguments to functions, and
returned by functions."
From what I can see of the ISO C standards, the current C language
has kept these features. However, I don't see many C projects
using them.
I use these features regularly. I have no problem passing structs
around if that is the convenient way to structure the code.
If you want to pass an actual array to a function instead of a pointer
to it, embedding it in a structure is the only way to do it.
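A short sketch of that technique, next to a bare array parameter for contrast. The type `vec4` and both function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

struct vec4 {
    int v[4];
};

/* Receives a copy of the whole array; the caller's array is untouched. */
static int sum_after_zeroing_first(struct vec4 x)
{
    x.v[0] = 0;
    return x.v[0] + x.v[1] + x.v[2] + x.v[3];
}

/* For contrast: a bare array parameter is adjusted to a pointer,
   so this writes into the caller's elements. */
static void zero_first(int v[4])
{
    assert(v != NULL);
    v[0] = 0;
}
```

With `struct vec4 a = { { 1, 2, 3, 4 } };`, the call `sum_after_zeroing_first(a)` returns 9 and leaves `a.v[0]` equal to 1, while `zero_first(a.v)` changes `a.v[0]` to 0.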
On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
[...]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
That is not UB:
int a5 = (a = b).a[5];
That is UB:
int* pa5 = &(a = b).a[5];
So, what you are basing your "UB" claim on is not clear to me.
If you read the post of Keith Thompson and it is still not clear to
you, then I cannot help.
On Sun, 4 May 2025 22:22:12 -0700
Andrey Tarasevich <[email protected]> wrote:
On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:
One dark corner this feature has, is that in C (as opposed to C++)
the result of an assignment operator is an rvalue, which can
easily lead to some interesting consequences related to structs
with arrays inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.
The standard says in 6.2.4/8:
"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and
its initial value is the value of the expression. Its lifetime ends
when the evaluation of the containing full expression ends. [...]
Such an object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.
And when I make the following experiment with GCC and Clang
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
I consistently get the following output from GCC
0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544
And this is what I get from Clang
0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4
As you can see, GCC apparently took a C++-like approach to this
situation. The returned "temporary" is not really a separate
temporary at all, but actually `a` itself.
Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.
I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
[...]
I think that code has undefined behavior.
(a = b) is an rvalue that refers to an object of type struct S with
temporary lifetime. pc holds the address of a subobject of that
temporary object. The object reaches the end of its lifetime at the end
of the evaluation of the full expression. You then print its value.
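For anyone who wants to stay clear of the lifetime question entirely, here is a sketch of two patterns that avoid holding a pointer into the temporary past the end of the full expression. The function names are hypothetical, and `struct S` matches the one in the quoted program.

```c
#include <assert.h>

struct S {
    int a[10];
};

/* Returns a struct containing an array, by value. */
static struct S make_s(void)
{
    struct S s = { { 1, 2, 3 } };
    return s;
}

/* Reads the element while the temporary is still alive:
   the access and the temporary share one full expression. */
static int third_element(void)
{
    return make_s().a[2];
}

/* Copies the result into a named object first, then points into
   that object, whose lifetime lasts until the function returns. */
static int third_element_via_copy(void)
{
    struct S keep = make_s();
    const int *p = &keep.a[2];
    return *p;
}
```

Both functions return 3. The difference from the quoted program is that no pointer into the temporary is used after the full expression that created it.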
On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:
One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.
The standard says in 6.2.4/8:
"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and its
initial value is the value of the expression. Its lifetime ends when
the evaluation of the containing full expression ends. [...] Such an
object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.
And when I make the following experiment with GCC and Clang
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
I consistently get the following output from GCC
0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544
And this is what I get from Clang
0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4
As you can see, GCC apparently took a C++-like approach to this
situation. The returned "temporary" is not really a separate temporary
at all, but actually `a` itself.
Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.
I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...
On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion. And
no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
That is not UB:
int a5 = (a = b).a[5];
That is UB:
int* pa5 = &(a = b).a[5];
If you read the post of Keith Thompson and it is still not clear to
you, then I cannot help.
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
You snipped this: "Any attempt to modify an object with temporary
lifetime results in undefined behavior.". Which means, I think,
that an implementation that shared storage for "such an object"
with something else probably isn't going to cause problems for any
code with defined behavior.
Though I can imagine the possibility of code that modifies `a` and
reads via `pc` within the same full expression.
But unless I've somehow missed it, the "Such an object need not
have a unique address." wording doesn't appear on that web page or
in my copy of n1570.pdf. C17 does add these two sentences:
An object with temporary lifetime behaves as if it were declared
with the type of its value for the purposes of effective type. Such
an object need not have a unique address.
Normally any two objects with overlapping lifetime must have distinct
addresses. This addition, I think, gives compilers permission to have
temporary lifetime objects overlap with other existing objects, but not
to have a modification to one object affect the value of the other
(unless the modification invokes UB, of course).
Andrey Tarasevich <[email protected]> writes:
On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:
One dark corner this feature has, is that in C (as opposed to C++) the
result of an assignment operator is an rvalue, which can easily lead
to some interesting consequences related to structs with arrays
inside.
I'm curious to know what interesting consequences you mean here. Do
you mean something other than cases that have undefined behavior?
I'm referring to the matter of the address identity of the resultant
rvalue object. At first, "address identity of rvalue" might sound
strange, but the standard says that there's indeed an object tied to
such rvalue, and once we start applying array-to-pointer conversion
(and use `[]` operator), lvalues and addresses quickly come into the
picture.
The standard says in 6.2.4/8:
"A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type [...]
refers to an object with automatic storage duration and temporary
lifetime. Its lifetime begins when the expression is evaluated and its
initial value is the value of the expression. Its lifetime ends when
the evaluation of the containing full expression ends. [...] Such an
object need not have a unique address."
https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8
The last sentence there is not present in N1570. Apparently it was
introduced later, in C17. (My appreciation to Keith Thompson for
reporting this.)
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
Ahh, I see now what your concern is.
But is it, perhaps, intended to also allow such temporaries to have
addresses identical to regular named objects? It is not immediately
clear to me.
My reading of the post-C11 standards is that they allow the "new"
object to overlap with already existing objects, including both
declared objects and objects whose storage was allocated using
malloc().
And when I make the following experiment with GCC and Clang
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
I consistently get the following output from GCC
0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544
And this is what I get from Clang
0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4
As you can see, GCC apparently took a C++-like approach to this
situation. The returned "temporary" is not really a separate temporary
at all, but actually `a` itself.
Yeah.
Meanwhile, in Clang all three pointers are different, i.e. Clang
decided to actually create a separate temporary object for the result
of the assignment.
Which in my reading of the standard is required under C11 rules.
I have reproduced your results under -std=c11 -pedantic, for both
gcc and clang.
I have a strong feeling that GCC's behavior is non-conforming. The
last sentence of 6.2.4/8 is not supposed to permit "projecting" the
resultant temporaries onto existing named objects. I could be wrong...
My judgment is that the behavior under gcc is non-conforming if the
compilation was done using C11 semantics. Under C17 or later rules
the gcc behavior is allowed (and may have been what prompted the
change in C17, but that is just speculation on my part). In any
case I understand now what you were getting at. Thank you for
bringing this hazard to the group's attention.
I hope someone files a bug report for gcc using -std=c11 rules,
because what gcc does under that setting (along with -pedantic)
is surely at odds with the plain reading of the C11 standard,
for the situation being discussed here.
Editorial comment: here is yet another case where post-C11 changes
to the C standard seem ill advised, and another reason not to use
any version of the ISO C standard for C17 or later. And it's
disappointing that gcc -std=c11 -pedantic strays into the realm of
non-conforming behavior.
On Mon 5/5/2025 2:01 AM, Michael S wrote:
On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
That is not UB:
int a5 = (a = b).a[5];
That is UB:
int* pa5 = &(a = b).a[5];
No, it isn't.
If you read the post of Keith Thompson and it is still not clear to
you then I cannot help.
The only valid "UB" claim in Keith's post is my printing the value of
`pc` pointer, which by that time happens to point nowhere, since the
lifetime of the temporary is over. (And, of course, lack of
conversion to `void *` is an issue).
As for the expressions like
&(a = b).a[5];
and
&foo().a[2]
- these by themselves are perfectly valid. There's no UB in these
expressions. (And this is not a debate.)
Here's a version of the same code that corrects the above distracting
issues
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
Michael S <[email protected]> writes:
On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:
And more obviously, "%p" requires an argument of type void*, not int*.
That part of otherwise very good comment is unreasonably pedantic.
I disagree. I suggest it's a bad habit to use "%p" without ensuring,
by a cast if necessary, that the argument is of type void*.
In most implementations, it's likely that all pointers have the same
size and representation and are passed as arguments in the same way,
but getting the types right means one less thing to worry about.
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. [...]
Keith Thompson <[email protected]> writes:
James Kuyper <[email protected]> writes:
On 5/3/25 20:37, Keith Thompson wrote:
Lawrence D'Oliveiro <[email protected]d> writes:
On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
Virtually every C project relies on assignment of structures.
Passing-returning structs by value might be more rare (although
perfectly valid and often appropriate too), but assignment...
assignment is used by everyone everywhere without even giving
it a second thought.
There is a caveat, to do with alignment padding: will this
always have a defined value?
I don't believe so. In a quick look, I don't see anything in
the standard that explicitly addresses this, but I believe that a
conforming implementation could implement structure assignment by
copying the individual members, leaving any padding in the target
undefined.
"When a value is stored in an object of structure or union type,
including in a member object, the bytes of the object
representation that correspond to any padding bytes take
unspecified values.56)" (6.2.6.1p6).
That refers to footnote 56, which says "Thus, for example,
structure assignment need not copy any padding bits."
Yes, that's what I missed.
It's interesting that the footnote refers to padding *bits* rather
than padding *bytes*. I presume this was unintentional.
Padding bits:
struct A {
uint64_t tlen : 16,
: 20,
pkind : 6,
fsz : 6,
gsz : 14,
g : 1,
ptp : 1;
} s;
There are 20 padding bits in this declaration. Perhaps that's
what they're referring to?
On Sat, 3 May 2025 21:42:37 -0400
Richard Damon <[email protected]> wrote:
Bigger than that, and you likely want to pass the object by address,
not by value, passing just a pointer to it.
That sort of thinking is an example of Knuthian premature optimization.
On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:
And more obviously, "%p" requires an argument of type void*, not
int*.
That part of otherwise very good comment is unreasonably pedantic.
On Mon, 5 May 2025 08:45:09 -0700
Andrey Tarasevich <[email protected]> wrote:
On Mon 5/5/2025 2:01 AM, Michael S wrote:
On Mon, 5 May 2025 01:29:47 -0700
Andrey Tarasevich <[email protected]> wrote:
On Mon 5/5/2025 1:12 AM, Michael S wrote:
According to my understanding, you are wrong.
Taking pointer of non-lvalue is UB, so anything compiler does is
conforming.
Er... What? What specifically do you mean by "taking pointers"?
The whole functionality of `[]` operator in C is based on pointers.
This expression
(a = b).a[5]
[...]
is already doing your "taking pointers of non-lvalue" (if I
understood you correctly) as part of array-to-pointer conversion.
And no, it is not UB.
This is not UB either
struct S foo(void) { return (struct S) { 1, 2, 3 }; }
...
int *p;
p = &foo().a[2], printf("%d\n", *p);
That is not UB:
int a5 = (a = b).a[5];
That is UB:
int* pa5 = &(a = b).a[5];
No, it isn't.
If you read the post of Keith Thompson and it is still not clear to
you then I cannot help.
The only valid "UB" claim in Keith's post is my printing the value of
`pc` pointer, which by that time happens to point nowhere, since the
lifetime of the temporary is over. (And, of course, lack of
conversion to `void *` is an issue).
As for the expressions like
&(a = b).a[5];
and
&foo().a[2]
Expressions by themselves are valid. But since there is no situation in
which the value produced by expressions is valid outside of expressions
the compiler can generate any value it wants, even NULL or value
completely outside of address space of current process.
- these by themselves are perfectly valid. There's no UB in these
expressions. (And this is not a debate.)
Here's a version of the same code that corrects the above distracting
issues
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
It's only not UB in the nasal demons sense.
It's UB in a sense that we can't predict values of expressions
like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
it is a form of UB.
On 2025-05-05, Keith Thompson <[email protected]> wrote:
Michael S <[email protected]> writes:
On Mon, 05 May 2025 01:34:16 -0700
Keith Thompson <[email protected]> wrote:
And more obviously, "%p" requires an argument of type void*, not
int*.
That part of otherwise very good comment is unreasonably pedantic.
I disagree. I suggest it's a bad habit to use "%p" without
ensuring, by a cast if necessary, that the argument is of type
void*.
In most implementations, it's likely that all pointers have the
same size and representation and are passed as arguments in the
same way, but getting the types right means one less thing to worry
about.
If the codebase assumes all data pointers are the same size, bit
pattern and are treated the same in the calling conventions / ABI,
then it is probably moot.
That code is doomed on a platform where the assumption doesn't
hold, and the printf statements are probably not independently
reusable.
(I mostly put in these casts just to communicate to others that
an ISO C language lawyer works here, if you happen to need one.)
Also, it would be amazingly stupid of any such platform not to just
make those printfs work: to promote variadic arguments of
pointer-to-object type to a common representation which is the
same as void *, combined with a matching behavior in the va_arg
macro for extracting the value back into any pointer-to-object
type.
[email protected] writes:
[...]
If you want to pass an actual array to a function instead of a pointer to it,
embedding it in a structure is the only way to do it.
Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
On 05/05/2025 22:53, Keith Thompson wrote:
[email protected] writes:
[...]
If you want to pass an actual array to a function instead of a pointer to it,
embedding it in a structure is the only way to do it.
Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.
I regularly use arrays with known fixed sizes. In fact, in my code
those are absolutely dominant - it is very rare for me to see or use an
array whose size is /not/ fixed at compile time. Sometimes I will have
[email protected] writes:
[...]
If you want to pass an actual array to a function instead of a
pointer to it, embedding it in a structure is the only way to do
it.
Yes, but that's not necessarily useful. An array that's a member
of a struct can only be of a constant length (unless it's a flexible
array member, but that doesn't help). Functions that work with
arrays typically need to deal with arrays of arbitrary length.
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Keith Thompson <[email protected]> wrote:
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.
Tim Rentsch <[email protected]> writes:
[...]
Keith Thompson <[email protected]> writes:
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5];
pb = &b.a[5];
pc = &(a = b).a[5];
printf("%p %p %p\n", pa, pb, pc);
}
[...]
I think that code has undefined behavior.
Right. [*]
[*] Assuming C11 semantics. At best inadvisable under C99
semantics, and a constraint violation under C90 semantics.
What C90 constraint does it violate? Both gcc and clang reject it
with "-std=c90 -pedantic-errors", with an error message "ISO C90
forbids subscripting non-lvalue array", but I don't see a relevant
constraint in the C90 standard.
I know that C11 introduced "temporary lifetime" to cover cases
like this. In C99, the wording for the indexing operator implicitly
assumes that there's an array object; if there isn't, I'd argue the
behavior is undefined by omission. I'm not aware of any relevant
change from C90 to C99.
On 06/05/2025 19:36, Waldek Hebisch wrote:
Keith Thompson <[email protected]> wrote:
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.
I must admit I had not noticed that detail.
David Brown <[email protected]> writes:
On 06/05/2025 19:36, Waldek Hebisch wrote:
Keith Thompson <[email protected]> wrote:
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.
I must admit I had not noticed that detail.
That would get an immediate downcheck during review for exactly
that reason.
Nick Bowler <[email protected]> writes:
[...]
The rule about conversions from arrays to pointers is different in C99
(n1124 6.3.2.1, third paragraph) compared to C89. In particular,
"an lvalue that has type ``array of type'' ..." was changed to
"an expression that has type ``array of type'' ...".
The change from "lvalue" to "expression" was made in C99. I wonder why
that was done.
BTW, you have a copy of ANSI C89? Hard or soft copy? Do you know if
it's still available in some form?
On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:
Nick Bowler <[email protected]> writes:
The rule about conversions from arrays to pointers is different
in C99 (n1124 6.3.2.1, third paragraph) compared to C89. In
particular, "an lvalue that has type ``array of type'' ..." was
changed to "an expression that has type ``array of type'' ...".
[...]
The change from "lvalue" to "expression" was made in C99. I
wonder why that was done.
It's not mentioned in the rationale, so we can only guess. [...]
Nick Bowler <[email protected]> writes:
On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:
The change from "lvalue" to "expression" was made in C99. I wonder why
that was done.
It's not mentioned in the rationale, so we can only guess. But it is
called out in the list of major changes in the C99 foreword.
I've just looked at the foreword of the C99 standard and the n1256
draft, and I couldn't find it. Can you quote the precise wording?
On Mon 5/5/2025 1:26 AM, Keith Thompson wrote:
I'm wondering what the last sentence is intended to mean ("... need not
have a unique address"). At first sight, the intent seems to be
obvious: it simply says that such temporary objects might repeatedly
appear (and disappear) at the same location in storage, which is a
natural thing to expect.
You snipped this: "Any attempt to modify an object with temporary
lifetime results in undefined behavior.". Which means, I think,
that an implementation that shared storage for "such an object"
with something else probably isn't going to cause problems for any
code with defined behavior.
It is going to cause problems, if the code relies on the address
identity of the object, assuming the standard intends to provide such guarantees.
Though I can imagine the possibility of code that modifies `a` and
reads via `pc` within the same full expression.
That's easy (in the context of declarations from my previous example):
pc = &(a = b).a[5], a.a[5] = 42, printf("%d\n", *pc);
As one would expect, this produces different output in GCC and Clang
for the reasons I already described.
But unless I've somehow missed it, the "Such an object need not
have a unique address." wording doesn't appear on that web page or
in my copy of n1570.pdf. C17 does add these two sentences:
An object with temporary lifetime behaves as if it were declared
with the type of its value for the purposes of effective type. Such
an object need not have a unique address.
Normally any two objects with overlapping lifetime must have distinct
addresses. This addition, I think, gives compilers permission to have
temporary lifetime objects overlap with other existing objects, but not
to have a modification to one object affect the value of the other
(unless the modification invokes UB, of course).
If so, that would be extremely underspecified. A mere "such an object
need not have a unique address" is insufficient to fully convey the permission to overlap existing named objects.
And that's probably what led to difference in interpretation
between GCC and Clang.
Modification of the temporary is "prohibited" (as UB), but
modification of the overlapped named object is not. The
consequences can be quite surprising.
Andrey Tarasevich <[email protected]> writes:
And that's probably what led to difference in interpretation
between GCC and Clang.
I suspect the implication actually goes the other way. It is
because what gcc has done (past tense) violates the rules of the C11
standard that someone had the bright idea that the C standard should
be changed to allow this stupidity.
Modification of the temporary is "prohibited" (as UB), but
modification of the overlapped named object is not. The
consequences can be quite surprising.
In my view the problem is not that what is allowed is unclear, but
that the whole idea of possibly overlapping objects is a crock.
It's a sad statement on the quality of gcc that it does the wrong
thing even when -std=c11 and -pedantic are given as compilation
options. Bleah.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
It seems clear to me that "pc" has an indeterminate value after the
expression assigning, since it points to an object with temporary
lifetime.
And attempting to use the value of an object with automatic storage
while it has an indeterminate value is undefined behaviour.
As far as I can see, simply reading the value in "pc" to print it out
is UB according to the C standards. It is clearly going to be a
harmless operation on most hardware, but there are processors where
pointer registers are more complicated than simple linear addresses -
they can track some kind of segment structure describing the range of
a data block, or permissions for access to the data, and such
structures could have been deactivated or deallocated when the
temporary lifetime object died. Even attempting to read the value of
the pointer, without dereferencing it, would then cause some kind of
fault or trap.
What C90 constraint does it violate? Both gcc and clang reject it
with "-std=c90 -pedantic-errors", with an error message "ISO C90
forbids subscripting non-lvalue array", but I don't see a relevant
constraint in the C90 standard.
That would get an immediate downcheck during review for exactly
that reason.
Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was
done maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall ever
coming across a single situation when I thought the use of the comma
operator was appropriate in the kind of code I work with.
On Tue 5/6/2025 10:36 AM, Waldek Hebisch wrote:
Keith Thompson <[email protected]> wrote:
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.
Exactly. I thought the nature of the corrections I made (i.e. the
deliberate usage of comma operator) would be strikingly obvious to the participants of the thread. But alas...
On Wed 5/7/2025 12:37 AM, David Brown wrote:
That would get an immediate downcheck during review for exactly
that reason.
Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was
done maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall
ever coming across a single situation when I thought the use of the
comma operator was appropriate in the kind of code I work with.
Wow! That's catastrophically bad.
As it has been stated many times before, both C and C++ are programming
languages that embrace both statement-level and expression-level
programming. Expression-level programming (e.g. where `?:` is used for
branching and `,` for sequencing) is a very valuable and massively
important programming paradigm in these languages. The fact that
elaborate expression-level programming is not in any way abandoned or
shunned today is pretty obvious in C++, since C++ took major steps
lately to develop its expression-level capabilities. But it has always
been and will always remain important in C as well.
The proclivity to stick exclusively to statement-level programming in C
and, God forbid, impose it on others through so called "code reviews"...
that would be a trait specific to "sweatshop" development outfits, which
strive to replace quality with quantity. I'd agree that in a revolving
door employment environment relying on a large number of low-competence
developers such code might be seen as "too confusing". But I don't see
why we should set our standards that low here, in `comp.lang.c`.
Here's a version of the same code that corrects the above distracting
issues
#include <stdio.h>
struct S { int a[10]; };
int main()
{
struct S a, b = { 0 };
int *pa, *pb, *pc;
pa = &a.a[5],
pb = &b.a[5],
pc = &(a = b).a[5],
printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
It's only not UB in the nasal demons sense.
It's UB in a sense that we can't predict values of expressions
like (pa==pc) and (pb==pc). ...
... I.e. pc is completely useless. In my book
it is a form of UB.
On Tue 5/6/2025 2:35 AM, David Brown wrote:
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
It seems clear to me that "pc" has an indeterminate value after the
expression assigning, since it points to an object with temporary
lifetime.
And attempting to use the value of an object with automatic storage
while it has an indeterminate value is undefined behaviour.
Again, this makes no sense. Please, pay attention to the code and the corrections made after the initial version (e.g. usage of comma
operator). No, the value of `pc` is not indeterminate, and no, there's
no undefined behavior in the above version of the code.
As far as I can see, simply reading the value in "pc" to print it out
is UB according to the C standards. It is clearly going to be a
harmless operation on most hardware, but there are processors where
pointer registers are more complicated than simple linear addresses -
they can track some kind of segment structure describing the range of
a data block, or permissions for access to the data, and such
structures could have been deactivated or deallocated when the
temporary lifetime object died. Even attempting to read the value of
the pointer, without dereferencing it, would then cause some kind of
fault or trap.
Again, irrelevant. In the above code the temporary object does not die
during the entire period when pointer `pc` is used in any way.
On 29/05/2025 14:49, Andrey Tarasevich wrote:
On Wed 5/7/2025 12:37 AM, David Brown wrote:
That would get an immediate downcheck during review for exactly
that reason.
Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was
done maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall
ever coming across a single situation when I thought the use of the
comma operator was appropriate in the kind of code I work with.
Wow! That's catastrophically bad.
As it has been stated many times before, both C and C++ are programming
languages that embrace both statement-level and expression-level
programming. Expression-level programming (e.g. where `?:` is used for
branching and `,` for sequencing) is a very valuable and massively
important programming paradigm in these languages. The fact that
elaborate expression-level programming is not in any way abandoned or
shunned today is pretty obvious in C++, since C++ took major steps
lately to develop its expression-level capabilities. But it has always
been and will always remain important in C as well.
No, expression-level programming has always been and will likely always
remain a very minor part of C programming. Yes, some people make use of
the comma operator. Some people do so extensively - and they are often,
but not necessarily, considered "smart-arse" programmers rather than
"smart" programmers. If the comma operator were removed from the C
language, I guess some 95% of programmers would barely notice - at
worst, they would have to add an extra line inside an occasional "for"
loop. (The ternary operator is used much more.)
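To make the "extra line inside an occasional for loop" concrete, here is a small sketch with two hypothetical helpers, `reverse_comma` and `reverse_plain`, that reverse an array in place. Only the first uses the comma operator.

```c
#include <stddef.h>

/* With the comma operator: both indices are updated in the loop header.
 * Note that the comma in `size_t i = 0, j = n - 1` is a declarator
 * separator, not the comma operator; `i++, j--` is the operator. */
void reverse_comma(int *v, size_t n)
{
    if (n < 2)
        return;
    for (size_t i = 0, j = n - 1; i < j; i++, j--) {
        int tmp = v[i];
        v[i] = v[j];
        v[j] = tmp;
    }
}

/* Without the comma operator: the second update becomes one more
 * statement at the end of the loop body. */
void reverse_plain(int *v, size_t n)
{
    if (n < 2)
        return;
    size_t j = n - 1;
    for (size_t i = 0; i < j; i++) {
        int tmp = v[i];
        v[i] = v[j];
        v[j] = tmp;
        j--;
    }
}
```

One difference worth noting: if the body of `reverse_plain` ever gained a `continue`, the `j--` would be skipped, whereas the header form in `reverse_comma` always runs both updates.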
David Brown <[email protected]> writes:
On 29/05/2025 14:49, Andrey Tarasevich wrote:
On Wed 5/7/2025 12:37 AM, David Brown wrote:
That would get an immediate downcheck during review for exactly
that reason.
Of course. In fact, if someone presented such code for review (and
assuming I noticed the commas!) I'd have to consider whether it was
done maliciously, intentionally deceptively, due to incompetence, or
smart-arse coding. In all my C coding experience, I can't recall
ever coming across a single situation when I thought the use of the
comma operator was appropriate in the kind of code I work with.
Wow! That's catastrophically bad.
As it has been stated many times before, both C and C++ are programming
languages that embrace both statement-level and expression-level
programming. Expression-level programming (e.g. where `?:` is used for
branching and `,` for sequencing) is a very valuable and massively
important programming paradigm in these languages. The fact that
elaborate expression-level programming is not in any way abandoned or
shunned today is pretty obvious in C++, since C++ took major steps
lately to develop its expression-level capabilities. But it has always
been and will always remain important in C as well.
No, expression-level programming has always been and will likely always
remain a very minor part of C programming. Yes, some people make use of
the comma operator. Some people do so extensively - and they are often,
but not necessarily, considered "smart-arse" programmers rather than
"smart" programmers. If the comma operator were removed from the C
language, I guess some 95% of programmers would barely notice - at
worst, they would have to add an extra line inside an occasional "for"
loop. (The ternary operator is used much more.)
And sometimes, excessive use of the comma operator causes
compiler failures.
cfront generated the comma operator extensively, and expression trees
would grow to very large sizes. There was a bug in PCC (for the
88100) where it would run out of temporary registers while generating
code for some cfront generated comma expressions (which were -far- from
human readable). I had to fix the temporary register allocation
code in PCC to spill registers when the Sethi-Ullman number for an
expression exceeded the number of registers.
That was circa 1990, and I've generally not found any arguments
favoring their general use persuasive in the years since, including
Andrey's and Kaz's responses recently posted here.
The simple fact that experienced programmers that read this usenet
newsgroup missed the comma operators in the original example speaks
volumes.
[email protected] (Scott Lurndal) writes:
[...]
And sometimes, excessive use of the comma operator causes
compiler failures.
cfront generated the comma operator extensively, and expression trees
would grow to very large sizes. There was a bug in PCC (for the
88100) where it would run out of temporary registers while generating
code for some cfront generated comma expressions (which were -far- from
human readable). I had to fix the temporary register allocation
code in PCC to spill registers when the Sethi-Ullman number for an
expression exceeded the number of registers.
That was circa 1990, and I've generally not found any arguments
favoring their general use persuasive in the years since, including
Andrey's and Kaz's responses recently posted here.
So a compiler you used circa 1990 had problems with comma expressions.
That's hardly an argument against using comma operators today.
The simple fact that experienced programmers that read this usenet
newsgroup missed the comma operators in the original example speaks
volumes.
*That's* a valid argument.
On Tue 5/6/2025 10:36 AM, Waldek Hebisch wrote:
Keith Thompson <[email protected]> wrote:
Andrey Tarasevich <[email protected]> writes:
[...]
#include <stdio.h>
struct S { int a[10]; };
int main()
{
    struct S a, b = { 0 };
    int *pa, *pb, *pc;
    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
}
This version has no UB.
I believe it does. pc points to an element of an object with
temporary lifetime. The value of pc is then used after the object
it points to has reached the end of its lifetime. At that point,
pc has an indeterminate value.
N3096 6.2.4p2: "If a pointer value is used in an evaluation after
the object the pointer points to (or just past) reaches the end of
its lifetime, the behavior is undefined. The representation of a
pointer object becomes indeterminate when the object the pointer
points to (or just past) reaches the end of its lifetime."
Note commas above. Assignment to pc and call to printf are parts
of a single expression, so use of pc is within lifetime of the
temporary object.
Exactly. I thought the nature of the corrections I made (i.e. the
deliberate use of the comma operator) would be strikingly obvious to
the participants of the thread. But alas...
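For anyone who would rather not depend on where the full expression ends, here is a hedged sketch of two alternatives. It assumes the C11-and-later rules for objects with temporary lifetime; the helper names `read_via_assignment` and `element_after_assignment` are invented for this illustration.

```c
struct S { int a[10]; };

/* Reads one element of the assignment result inside a single full
 * expression. The result of `*dst = *src` is a non-lvalue struct with an
 * array member, so it refers to an object with temporary lifetime; that
 * lifetime lasts until the end of the full expression, which covers the
 * read of a[5] here. No pointer into the temporary escapes. */
int read_via_assignment(struct S *dst, const struct S *src)
{
    return (*dst = *src).a[5];
}

/* Takes the address from the named destination object instead of from
 * the assignment result, so the pointer stays valid for as long as
 * *dst itself does. */
const int *element_after_assignment(struct S *dst, const struct S *src)
{
    *dst = *src;
    return &dst->a[5];
}
```

The second form never produces a pointer into the temporary at all, so it does not matter whether the reader notices a comma or a semicolon at the end of the line.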