• Regarding assignment to struct

    From Lew Pitcher@21:1/5 to All on Fri May 2 18:34:52 2025
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
        int a;
        char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?


    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Barry Schwarz@21:1/5 to [email protected] on Fri May 2 13:35:50 2025
    On Fri, 2 May 2025 18:34:52 -0000 (UTC), Lew Pitcher <[email protected]> wrote:

    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
    int a;
    char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?

    Wouldn't it be quicker and easier to write a simple program to test
    this rather than wait for someone to compose a response? You already
    have 90% of the code written.

    --
    Remove del for email

  • From Waldek Hebisch@21:1/5 to Lew Pitcher on Fri May 2 21:35:31 2025
    Lew Pitcher <[email protected]> wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Typically this is fine. However, in sdcc-4.2 manual one can find
    the following statement:

    : Deviations from standard compliance:
    : structures and unions cannot be passed as function parameters
    : and cannot be a return value from a function,....

    --
    Waldek Hebisch

  • From Lew Pitcher@21:1/5 to Waldek Hebisch on Sat May 3 01:43:54 2025
    On Fri, 02 May 2025 21:35:31 +0000, Waldek Hebisch wrote:

    Lew Pitcher <[email protected]> wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Typically this is fine. However, in sdcc-4.2 manual one can find
    the following statement:

    : Deviations from standard compliance:
    : structures and unions cannot be passed as function parameters
    : and cannot be a return value from a function,....

    Not a problem. I don't foresee that the code I'm working on would be
    compiled with a non-standard C compiler (assuming that I've read
    the standards correctly wrt struct pass-by-value).

    --
    Lew Pitcher
    "In Skills We Trust"

  • From Andrey Tarasevich@21:1/5 to Lew Pitcher on Sat May 3 01:14:46 2025
    On Fri 5/2/2025 11:34 AM, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    Weird. Virtually every C project relies on assignment of structures. Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment... assignment
    is used by everyone everywhere without even giving it a second thought.

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead to
    some interesting consequences related to structs with arrays inside.

    --
    Best regards,
    Andrey

  • From David Brown@21:1/5 to Lew Pitcher on Sat May 3 11:46:30 2025
    On 02/05/2025 20:34, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.


    I use these features regularly. I have no problem passing structs
    around if that is the convenient way to structure the code.

    Some people mistakenly think that it is very inefficient, in comparison
    to passing around pointers to structs (which is the usual alternative).
    There are circumstances where you might end up with an extra struct and
    an extra copy, but unless you are dealing with very big structs and
    otherwise very fast functions, it's unlikely to be significant. Modern
    ABIs support passing small structs around in registers, and bigger
    structs get passed around using hidden pointers - using the structs in
    your code, rather than pointers to structs, makes the code clearer,
    safer, and gives the optimiser more information for better static
    analysis and code generation.


    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
    int a;
    char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?


    I'd immediately reject any code that mixes declaration of a variable and
    a function in the same declaration. I'd immediately reject any code
    that defines a type and declares a function in one shot. I'd question
    code that defines a type and a variable in one go. But that's my way of
    coding - other people have different rules, and your declarations are legal.

    Personally, I'd have :

    typedef struct {
        int a;
        char * b;
    } ab;

    ab result;

    ab function(void);

    (Obviously "ab" would not be a likely name in real code.)

    Once the type "ab" is defined, I am quite happy making variables of it, assigning them, and using it for function parameters and return types.

  • From Lawrence D'Oliveiro@21:1/5 to Andrey Tarasevich on Sat May 3 22:46:55 2025
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

    Virtually every C project relies on assignment of structures. Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a defined value?

  • From Richard Damon@21:1/5 to Lew Pitcher on Sat May 3 21:42:37 2025
    On 5/2/25 2:34 PM, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
    int a;
    char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?



    I will say that I have used the feature, but in very limited conditions,
    mostly where the structure is no bigger than one or two typical words.

    It could be a structure with a bitfield to make accesses clearer than
    low level masking, or for a "point" with x and y tied into one object.

    Bigger than that, and you likely want to pass the object by address, not
    by value, passing just a pointer to it.

  • From James Kuyper@21:1/5 to Keith Thompson on Sat May 3 23:38:47 2025
    On 5/3/25 20:37, Keith Thompson wrote:
    Lawrence D'Oliveiro <[email protected]d> writes:
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a
    defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values.56)"
    (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example, structure
    assignment need not copy any padding bits."

    Note that, even when writing to a single member, the representations in
    the padding bytes might be affected. A plausible reason for this to
    happen would be, for example, when a value is written to an 8-bit struct
    field followed by 8 bits of padding on a machine where the word size is
    16 bits. The wording of that clause permits the use of instructions that
    change the contents of an entire word to be used when updating that field.

    Finally, why would you care?

    The fact that an implementation does not have to do the equivalent of
    memcpy() to perform a struct copy means that successful assignment
    cannot be checked by using memcmp().

  • From Michael S@21:1/5 to Richard Damon on Sun May 4 11:01:17 2025
    On Sat, 3 May 2025 21:42:37 -0400
    Richard Damon <[email protected]> wrote:


    Bigger than that, and you likely want to pass the object by address,
    not by value, passing just a pointer to it.

    That sort of thinking is an example of Knutian premature optimization.

  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Sun May 4 08:34:11 2025
    On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

    That sort of thinking is an example of Knutian premature optimization.

    Trying to hold back the optimization tide?

  • From Kenny McCormack@21:1/5 to [email protected] on Sun May 4 09:25:08 2025
    In article <vv6ng8$1410m$[email protected]>,
    James Kuyper <[email protected]> wrote:
    ...
    The fact that an implementation does not have to do the equivalent of
    memcpy() to perform a struct copy means that successful assignment
    cannot be checked by using memcmp().

    Which then begs two questions:

    1) Why wouldn't an implementation do it with memcpy()? That is likely
    to be as good or better than any other method, including, especially, a
    member-by-member copy.

    2) Why wouldn't you, the programmer, just use memcpy() instead of
    struct assignment? Yes, I realize there are other cases to consider,
    but in the simple one:

    struct something foo,bar;
    foo = bar;

    memcpy() seems like it would always be easier and more reliable.

    --
    The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
    http://user.xmission.com/~gazelle/Sigs/Pedantic

  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Sun May 4 14:06:30 2025
    On 04/05/2025 10:34, Lawrence D'Oliveiro wrote:
    On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

    That sort of thinking is an example of Knutian premature optimization.

    Trying to hold back the optimization tide?

    I think he meant Knuthian, rather than Knutian :-)

  • From Tim Rentsch@21:1/5 to Andrey Tarasevich on Sun May 4 06:48:13 2025
    Andrey Tarasevich <[email protected]> writes:

    On Fri 5/2/2025 11:34 AM, Lew Pitcher wrote:

    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    Weird. Virtually every C project relies on assignment of
    structures. Passing-returning structs by value might be more rare
    (although perfectly valid and often appropriate too), but
    assignment... assignment is used by everyone everywhere without even
    giving it a second thought.

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead
    to some interesting consequences related to structs with arrays
    inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

  • From Scott Lurndal@21:1/5 to James Kuyper on Sun May 4 14:27:01 2025
    James Kuyper <[email protected]> writes:
    On 5/3/25 20:37, Keith Thompson wrote:
    Lawrence D'Oliveiro <[email protected]d> writes:
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a
    defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values.56)"
    (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example, structure
    assignment need not copy any padding bits."

    Are there any C implementations in common use that don't just
    use memcpy or an optimized version thereof?

  • From Tim Rentsch@21:1/5 to Lew Pitcher on Sun May 4 07:49:15 2025
    Lew Pitcher <[email protected]> writes:

    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.

    I have a project in which these capabilities might come in handy; has
    anyone had experience with assigning to structures, passing them as
    arguments to functions, and/or having a function return a structure?

    Would code like
    struct ab {
    int a;
    char *b;
    } result, function(void);

    if ((result = function()).a == 10) puts(result.b);

    be understandable, or even legal?

    The style is unorthodox, but the code is understandable.

    Also it is both legal and well-defined, back to and
    including C90.

  • From David Brown@21:1/5 to Scott Lurndal on Sun May 4 18:45:32 2025
    On 04/05/2025 16:27, Scott Lurndal wrote:
    James Kuyper <[email protected]> writes:
    On 5/3/25 20:37, Keith Thompson wrote:
    Lawrence D'Oliveiro <[email protected]d> writes:
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a
    defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values.56)"
    (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example, structure
    assignment need not copy any padding bits."

    Are there any C implementations in common use that don't just
    use memcpy or an optimized version thereof?


    Sometimes small structs never make it to memory, or are handled by the
    compiler as though they were individual variables (as long as that is
    within "as-if" usage, of course). Copying a struct might merely mean
    the compiler keeps track of the logical copy without actually copying
    any memory. (You could argue that the compiler is still treating it
    like memcpy, as memcpy calls don't always copy something.)

    I think it would be unusual to see a significant difference between a
    struct assignment copy and a memcpy on a compiler that optimises memcpy
    well.

    But on a compiler that does not handle memcpy well, then a struct
    assignment could be inlined while a memcpy could mean an external
    library call with significant overhead.

  • From James Kuyper@21:1/5 to Keith Thompson on Sun May 4 21:08:43 2025
    On 5/4/25 16:20, Keith Thompson wrote:
    James Kuyper <[email protected]> writes:
    On 5/3/25 20:37, Keith Thompson wrote:
    ...
    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.
    ...
    Finally, why would you care?

    The fact that an implementation does not have to do the equivalent of
    memcpy() to perform a struct copy means that successful assignment
    cannot be checked by using memcmp().

    Are you referring to checking whether an assignment was performed
    or not, due to uncertainty about what the program has done? If you
    mean doing an assignment and then checking whether it succeeded,
    I can't think of a context where that makes sense.

    Sorry, I didn't explain what I was thinking about in any detail. I've
    seen code that allows a data structure to be modified by one section of
    the code, and then periodically checks each object in that data
    structure (including aggregate objects) to see whether it has been
    modified by using memcmp() versus a saved copy. If so, it updated the
    saved copy, including a timestamp when it was updated. If it weren't for
    the need to keep track of the timestamp, it would always be simpler, and
    not much slower, to always replace the saved copy, whether or not
    there'd been a change.

    I should have made it clear that I basically understand and agree with
    your "why would you care" criticism. But it's part of my nature to look
    for the edge cases where differences that ordinarily don't matter, could matter.

  • From Scott Lurndal@21:1/5 to Keith Thompson on Mon May 5 00:41:03 2025
    Keith Thompson <[email protected]> writes:
    James Kuyper <[email protected]> writes:
    On 5/3/25 20:37, Keith Thompson wrote:
    Lawrence D'Oliveiro <[email protected]d> writes:
    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:
    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving it a
    second thought.

    There is a caveat, to do with alignment padding: will this always have a
    defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object representation
    that correspond to any padding bytes take unspecified values.56)"
    (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example, structure
    assignment need not copy any padding bits."

    Yes, that's what I missed.

    It's interesting that the footnote refers to padding *bits* rather than
    padding *bytes*. I presume this was unintentional.

    Padding bits:

    struct A {
        uint64_t tlen  : 16,
                       : 20,
                 pkind : 6,
                 fsz   : 6,
                 gsz   : 14,
                 g     : 1,
                 ptp   : 1;
    } s;

    There are 20 padding bits in this declaration. Perhaps that's
    what they're referring to?

  • From Andrey Tarasevich@21:1/5 to Tim Rentsch on Sun May 4 22:22:12 2025
    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead
    to some interesting consequences related to structs with arrays
    inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion (and
    use `[]` operator), lvalues and addresses quickly come into the picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and its
    initial value is the value of the expression. Its lifetime ends when the
    evaluation of the containing full expression ends. [...] Such an object
    need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main(void)
    {
        struct S a, b = { 0 };
        int *pa, *pb, *pc;

        pa = &a.a[5];
        pb = &b.a[5];
        pc = &(a = b).a[5];

        /* %p expects void *, so cast each pointer explicitly */
        printf("%p %p %p\n", (void *)pa, (void *)pb, (void *)pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this situation.
    The returned "temporary" is not really a separate temporary at all, but actually `a` itself.

    Meanwhile, in Clang all three pointers are different, i.e. Clang decided
    to actually create a separate temporary object for the result of the assignment.

    I have a strong feeling that GCC's behavior is non-conforming. The last sentence of 6.2.4/8 is not supposed to permit "projecting" the resultant temporaries onto existing named objects. I could be wrong...

    --
    Best regards,
    Andrey

  • From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 11:12:13 2025
    On Sun, 4 May 2025 22:22:12 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++)
    the result of an assignment operator is an rvalue, which can
    easily lead to some interesting consequences related to structs
    with arrays inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion
    (and use `[]` operator), lvalues and addresses quickly come into the
    picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and
    its initial value is the value of the expression. Its lifetime ends
    when the evaluation of the containing full expression ends. [...]
    Such an object need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main(void)
    {
        struct S a, b = { 0 };
        int *pa, *pb, *pc;

        pa = &a.a[5];
        pb = &b.a[5];
        pc = &(a = b).a[5];

        /* %p expects void *, so cast each pointer explicitly */
        printf("%p %p %p\n", (void *)pa, (void *)pb, (void *)pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this
    situation. The returned "temporary" is not really a separate
    temporary at all, but actually `a` itself.

    Meanwhile, in Clang all three pointers are different, i.e. Clang
    decided to actually create a separate temporary object for the result
    of the assignment.

    I have a strong feeling that GCC's behavior is non-conforming. The
    last sentence of 6.2.4/8 is not supposed to permit "projecting" the
    resultant temporaries onto existing named objects. I could be wrong...


    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.

  • From [email protected]@21:1/5 to All on Mon May 5 08:50:39 2025
    On Sat, 3 May 2025 11:46:30 +0200
    David Brown <[email protected]> gabbled:
    On 02/05/2025 20:34, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
    "Structures may be assigned, passed as arguments to functions, and
    returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.


    I use these features regularly. I have no problem passing structs
    around if that is the convenient way to structure the code.

    If you want to pass an actual array to a function instead of a pointer
    to it, embedding it in a structure is the only way to do it.

  • From Andrey Tarasevich@21:1/5 to Michael S on Mon May 5 01:29:47 2025
    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.


    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers. This expression

    (a = b).a[5]

    is already doing your "taking pointers of non-lvalue" (if I understood
    you correctly) as part of array-to-pointer conversion. And no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);

    So, what you are basing your "UB" claim on is not clear to me.

    --
    Best regards,
    Andrey

  • From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 12:01:45 2025
    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is conforming.


    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]


    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion. And
    no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);



    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    So, what you are basing your "UB" claim on is not clear to me.


    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

  • From Michael S@21:1/5 to Keith Thompson on Mon May 5 12:03:31 2025
    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <[email protected]> wrote:


    And more obviously, "%p" requires an argument of type void*, not int*.


    That part of otherwise very good comment is unreasonably pedantic.

  • From Kenny McCormack@21:1/5 to [email protected] on Mon May 5 11:30:43 2025
    In article <[email protected]>,
    Michael S <[email protected]> wrote:
    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <[email protected]> wrote:


    And more obviously, "%p" requires an argument of type void*, not int*.


    That part of otherwise very good comment is unreasonably pedantic.

    That's KT for you. That's his reason for existence.

    Welcome to CLC!

    --
    Alice was something of a handful to her father, Theodore Roosevelt. He was once asked by a visiting dignitary about parenting his spitfire of a daughter and he replied, "I can be President of the United States, or I can control Alice. I cannot possibly do both."

  • From David Brown@21:1/5 to [email protected] on Mon May 5 13:34:45 2025
    On 05/05/2025 10:50, [email protected] wrote:
    On Sat, 3 May 2025 11:46:30 +0200
    David Brown <[email protected]> gabbled:
    On 02/05/2025 20:34, Lew Pitcher wrote:
    Back in the days of K&R, Kernighan and Ritchie published an addendum
    to the "C Reference Manual" titled "Recent Changes to C" (November 1978)
    in which they detailed some differences in the C language post "The
    C Programming Language".

    The first difference they noted was that
       "Structures may be assigned, passed as arguments to functions, and
        returned by functions."

    From what I can see of the ISO C standards, the current C language
    has kept these features. However, I don't see many C projects
    using them.


    I use these features regularly.  I have no problem passing structs
    around if that is the convenient way to structure the code.

    If you want to pass an actual array to a function instead of a pointer
    to it, embedding it in a structure is the only way to do it.

    Yes.

    (Well, you could embed it in a union if you prefer, but a struct seems
    more likely.)

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 07:14:17 2025
    Michael S <[email protected]> writes:

    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.

    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]
    [...]
    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion.
    And no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);

    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    So, what you are basing your "UB" claim on is not clear to me.

    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

    Under C11 semantics, both

    int a5 = (a = b).a[5];

    and

    int* pa5 = &(a = b).a[5];

    have well-defined behavior. The undefined behavior of the
    upthread example comes later, only after the statement assigning
    to the pointer (or here, initializing) completes. It isn't
    taking the address with & that has undefined behavior; it is
    using the stored pointer value in a subsequent statement, /after
    the full expression containing the & operator has completed/,
    that results in undefined behavior.

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 07:03:40 2025
    Michael S <[email protected]> writes:

    On Sun, 4 May 2025 22:22:12 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++)
    the result of an assignment operator is an rvalue, which can
    easily lead to some interesting consequences related to structs
    with arrays inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion
    (and use `[]` operator), lvalues and addresses quickly come into the
    picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and
    its initial value is the value of the expression. Its lifetime ends
    when the evaluation of the containing full expression ends. [...]
    Such an object need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this
    situation. The returned "temporary" is not really a separate
    temporary at all, but actually `a` itself.

    Meanwhile, in Clang all three pointers are different, i.e. Clang
    decided to actually create a separate temporary object for the result
    of the assignment.

    I have a strong feeling that GCC's behavior is non-conforming. The
    last sentence of 6.2.4/8 is not supposed to permit "projecting" the
    resultant temporaries onto existing named objects. I could be wrong...

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.

    Maybe you are thinking of C90.

    In both C99 and C11, the expression

    (a = b).a[5]

    is an lvalue, so taking its address with & is allowed.

    It's easy to verify this assertion using gcc -std=c99 -pedantic. If
    the given expression were not an lvalue then taking its address with
    & would be a constraint violation, requiring a diagnostic. But no
    diagnostic is produced. (Using clang in place of gcc also produces
    no diagnostic.)

    The behavior under C99 semantics is arguably murky. But under C11
    semantics the behavior is well-defined.

  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon May 5 06:34:49 2025
    Keith Thompson <[email protected]> writes:

    Andrey Tarasevich <[email protected]> writes:
    [...]

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    [...]

    I think that code has undefined behavior.

    Right. [*]

    (a = b) is an rvalue that refers to an object of type struct S with
    temporary lifetime. pc holds the address of a subobject of that
    temporary object. The object reaches the end of its lifetime at the end
    of the evaluation of the full expression. You then print its value.

    Even if the printf() statement were replaced by

    (void)pc;

    the behavior would be undefined, because the pointer held in pc
    becomes indeterminate as soon as the statement containing the
    assignment to pc completes.


    [*] Assuming C11 semantics. At best inadvisable under C99
    semantics, and a constraint violation under C90 semantics.

  • From Tim Rentsch@21:1/5 to Andrey Tarasevich on Mon May 5 07:56:40 2025
    Andrey Tarasevich <[email protected]> writes:

    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead
    to some interesting consequences related to structs with arrays
    inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion
    (and use `[]` operator), lvalues and addresses quickly come into the
    picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and
    its initial value is the value of the expression. Its lifetime ends
    when the evaluation of the containing full expression ends. [...]
    Such an object need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    The last sentence there is not present in N1570. Apparently it was
    introduced later, in C17. (My appreciation to Keith Thompson for
    reporting this.)

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    Ahh, I see now what your concern is.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    My reading of the post-C11 standards is that they allow the "new"
    object to overlap with already existing objects, including both
    declared objects and objects whose storage was allocated using
    malloc().

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this
    situation. The returned "temporary" is not really a separate temporary
    at all, but actually `a` itself.

    Yeah.

    Meanwhile, in Clang all three pointers are different, i.e. Clang
    decided to actually create a separate temporary object for the result
    of the assignment.

    Which in my reading of the standard is required under C11 rules.
    I have reproduced your results under -std=c11 -pedantic, for both
    gcc and clang.

    I have a strong feeling that GCC's behavior is non-conforming. The
    last sentence of 6.2.4/8 is not supposed to permit "projecting" the
    resultant temporaries onto existing named objects. I could be wrong...

    My judgment is that the behavior under gcc is non-conforming if the
    compilation was done using C11 semantics. Under C17 or later rules
    the gcc behavior is allowed (and may have been what prompted the
    change in C17, but that is just speculation on my part). In any
    case I understand now what you were getting at. Thank you for
    bringing this hazard to the group's attention.

    I hope someone files a bug report for gcc using -std=c11 rules,
    because what gcc does under that setting (along with -pedantic)
    is surely at odds with the plain reading of the C11 standard,
    for the situation being discussed here.

    Editorial comment: here is yet another case where post-C11 changes
    to the C standard seem ill advised, and another reason not to use
    any version of the ISO C standard for C17 or later. And it's
    disappointing that gcc -std=c11 -pedantic strays into the realm of non-conforming behavior.

  • From Andrey Tarasevich@21:1/5 to Michael S on Mon May 5 08:45:09 2025
    On Mon 5/5/2025 2:01 AM, Michael S wrote:
    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.


    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]


    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion. And
    no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);



    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    No, it isn't.

    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

    The only valid "UB" claim in Keith's post is my printing the value of
    `pc` pointer, which by that time happens to point nowhere, since the
    lifetime of the temporary is over. (And, of course, lack of conversion
    to `void *` is an issue).

    As for the expressions like

    &(a = b).a[5];

    and

    &foo().a[2]

    - these by themselves are perfectly valid. There's no UB in these
    expressions. (And this is not a debate.)

    Here's a version of the same code that corrects the above distracting issues

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
        struct S a, b = { 0 };
        int *pa, *pb, *pc;

        pa = &a.a[5],
        pb = &b.a[5],
        pc = &(a = b).a[5],
        printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    --
    Best regards,
    Andrey

  • From Andrey Tarasevich@21:1/5 to Keith Thompson on Mon May 5 10:14:40 2025
    On Mon 5/5/2025 1:26 AM, Keith Thompson wrote:

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    You snipped this: "Any attempt to modify an object with temporary
    lifetime results in undefined behavior.". Which means, I think,
    that an implementation that shared storage for "such an object"
    with something else probably isn't going to cause problems for any
    code with defined behavior.

    It is going to cause problems, if the code relies on the address
    identity of the object, assuming the standard intends to provide such guarantees.

    Though I can imagine the possibility of code that modifies `a` and
    reads via `pc` within the same full expression.

    That's easy (in the context of declarations from my previous example):

    pc = &(a = b).a[5], a.a[5] = 42, printf("%d\n", *pc);

    As one would expect, this produces different output in GCC and Clang for
    the reasons I already described.

    But unless I've somehow missed it, the "Such an object need not
    have a unique address." wording doesn't appear on that web page or
    in my copy of n1570.pdf. C17 does add these two sentences:

    An object with temporary lifetime behaves as if it were declared
    with the type of its value for the purposes of effective type. Such
    an object need not have a unique address.

    Normally any two objects with overlapping lifetime must have distinct
    addresses. This addition, I think, gives compilers permission to have
    temporary lifetime objects overlap with other existing objects, but not
    to have a modification to one object affect the value of the other
    (unless the modification invokes UB, of course).

    If so, that would be extremely underspecified. A mere "such an object
    need not have a unique address" is insufficient to fully convey the
    permission to overlap existing named objects. And that's probably what
    led to difference in interpretation between GCC and Clang.

    Modification of the temporary is "prohibited" (as UB), but modification
    of the overlapped named object is not. The consequences can be quite surprising.

    --
    Best regards,
    Andrey

  • From David Brown@21:1/5 to Tim Rentsch on Mon May 5 20:00:39 2025
    On 05/05/2025 16:56, Tim Rentsch wrote:
    Andrey Tarasevich <[email protected]> writes:

    On Sun 5/4/2025 6:48 AM, Tim Rentsch wrote:

    One dark corner this feature has, is that in C (as opposed to C++) the
    result of an assignment operator is an rvalue, which can easily lead
    to some interesting consequences related to structs with arrays
    inside.

    I'm curious to know what interesting consequences you mean here. Do
    you mean something other than cases that have undefined behavior?

    I'm referring to the matter of the address identity of the resultant
    rvalue object. At first, "address identity of rvalue" might sound
    strange, but the standard says that there's indeed an object tied to
    such rvalue, and once we start applying array-to-pointer conversion
    (and use `[]` operator), lvalues and addresses quickly come into the
    picture.

    The standard says in 6.2.4/8:

    "A non-lvalue expression with structure or union type, where the
    structure or union contains a member with array type [...]
    refers to an object with automatic storage duration and temporary
    lifetime. Its lifetime begins when the expression is evaluated and its
    initial value is the value of the expression. Its lifetime ends when
    the evaluation of the containing full expression ends. [...] Such an
    object need not have a unique address."
    https://port70.net/~nsz/c/c11/n1570.html#6.2.4p8

    The last sentence there is not present in N1570. Apparently it was
    introduced later, in C17. (My appreciation to Keith Thompson for
    reporting this.)

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    Ahh, I see now what your concern is.

    But is it, perhaps, intended to also allow such temporaries to have
    addresses identical to regular named objects? It is not immediately
    clear to me.

    My reading of the post-C11 standards is that they allow the "new"
    object to overlap with already existing objects, including both
    declared objects and objects whose storage was allocated using
    malloc().

    And when I make the following experiment with GCC and Clang

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    I consistently get the following output from GCC

    0x7fff73eb5544 0x7fff73eb5574 0x7fff73eb5544

    And this is what I get from Clang

    0x7ffd2b8dbf44 0x7ffd2b8dbf14 0x7ffd2b8dbee4

    As you can see, GCC apparently took C++-like approach to this
    situation. The returned "temporary" is not really a separate temporary
    at all, but actually `a` itself.

    Yeah.

    Meanwhile, in Clang all three pointers are different, i.e. Clang
    decided to actually create a separate temporary object for the result
    of the assignment.

    Which in my reading of the standard is required under C11 rules.
    I have reproduced your results under -std=c11 -pedantic, for both
    gcc and clang.


    Compilers don't have to follow the behaviour specified by the standard
    in a "direct translation" manner in order to be correct and conforming.
    They have to generate code that, in the absence of any attempt to
    execute something with undefined behaviour, will give the same
    observable behaviour as a "direct translation" would.

    The result of the "(a = b)" expression should be a temporary object
    distinct from "a" and "b", with a lifetime extending only to the end of
    the expression assigning to "pc" (prior to C17).

    Is there any way to distinguish between "pc" pointing to an int inside
    this now dead temporary object, and it pointing to an int inside "a",
    without invoking undefined behaviour?

    By the time you are using "pc" to print it, the pointer itself has an indeterminate value - the compiler can quite happily give it the same
    value as "pa", so looking at the pointer in the printf() statement does
    not show a non-conformance.

    Attempting to modify the temporary lifetime object, such as by writing
    "*(pc = &(a = b).a[5]) = 42;", is undefined behaviour.

    It is entirely possible that there /is/ some way to determine that the
    compiler is not making a distinct temporary object while avoiding any
    undefined behaviour or indeterminate values. But I don't think the code
    here does show that - and it is therefore not an example of
    non-conforming behaviour. I think GCC and clang can be viewed as having
    simply picked different ways to generate their indeterminate values.

    I will be happy to change that opinion if someone has a better argument
    or example.


    I have a strong feeling that GCC's behavior is non-conforming. The
    last sentence of 6.2.4/8 is not supposed to permit "projecting" the
    resultant temporaries onto existing named objects. I could be wrong...

    My judgment is that the behavior under gcc is non-conforming if the compilation was done using C11 semantics. Under C17 or later rules
    the gcc behavior is allowed (and may have been what prompted the
    change in C17, but that is just speculation on my part). In any
    case I understand now what you were getting at. Thank you for
    bringing this hazard to the group's attention.

    I hope someone files a bug report for gcc using -std=c11 rules,
    because what gcc does under that setting (along with -pedantic)
    is surely at odds with the plain reading of the C11 standard,
    for the situation being discussed here.

    Editorial comment: here is yet another case where post-C11 changes
    to the C standard seem ill advised, and another reason not to use
    any version of the ISO C standard for C17 or later. And it's
    disappointing that gcc -std=c11 -pedantic strays into the realm of non-conforming behavior.

  • From Michael S@21:1/5 to Andrey Tarasevich on Mon May 5 20:20:38 2025
    On Mon, 5 May 2025 08:45:09 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 5/5/2025 2:01 AM, Michael S wrote:
    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.


    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]


    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion.
    And no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);



    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    No, it isn't.

    If you read the post of Keith Thompson and it is still not clear to
    you, then I cannot help.

    The only valid "UB" claim in Keith's post is my printing the value of
    `pc` pointer, which by that time happens to point nowhere, since the
    lifetime of the temporary is over. (And, of course, lack of
    conversion to `void *` is an issue).

    As for the expressions like

    &(a = b).a[5];

    and

    &foo().a[2]


    Expressions by themselves are valid. But since there is no situation in
    which the value produced by expressions is valid outside of expressions
    the compiler can generate any value it wants, even NULL or value
    completely outside of address space of current process.

    - these by themselves are perfectly valid. There's no UB in these
    expressions. (And this is not a debate.)

    Here's a version of the same code that corrects the above distracting
    issues

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.


    It's only not UB in the nasal demons sense.
    It's UB in the sense that we can't predict the values of expressions
    like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
    it is a form of UB.

  • From Kaz Kylheku@21:1/5 to Keith Thompson on Mon May 5 21:10:07 2025
    On 2025-05-05, Keith Thompson <[email protected]> wrote:
    Michael S <[email protected]> writes:
    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <[email protected]> wrote:
    And more obviously, "%p" requires an argument of type void*, not int*.

    That part of otherwise very good comment is unreasonably pedantic.

    I disagree. I suggest it's a bad habit to use "%p" without ensuring,
    by a cast if necessary, that the argument is of type void*.

    In most implementations, it's likely that all pointers have the same
    size and representation and are passed as arguments in the same way,
    but getting the types right means one less thing to worry about.

    If the codebase assumes all data pointers are the same size, bit pattern
    and are treated the same in the calling conventions / ABI, then it
    is probably moot.

    That code is doomed on a platform where the assumption doesn't hold, and
    the printf statements are probably not independently reusable.

    (I mostly put in these casts just to communicate to others that
    an ISO C language lawyer works here, if you happen to need one.)

    Also, it would be amazingly stupid of any such platform not to just
    make those printfs work: to promote variadic arguments of
    pointer-to-object type to a common representation which is the same as
    void *, combined with a matching behavior in the va_arg macro for
    extracting the value back into any pointer-to-object type.

    Mountains of non-standard-conforming code exert tremendous pressure on
    both hardware platforms and the way C implementations are adapted to
    those platforms.


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon May 5 17:04:06 2025
    Keith Thompson <[email protected]> writes:

    Andrey Tarasevich <[email protected]> writes:
    [...]

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. [...]

    If you look again carefully, I expect you will reach a
    different conclusion.

  • From Tim Rentsch@21:1/5 to Scott Lurndal on Mon May 5 21:57:47 2025
    [email protected] (Scott Lurndal) writes:

    Keith Thompson <[email protected]> writes:

    James Kuyper <[email protected]> writes:

    On 5/3/25 20:37, Keith Thompson wrote:

    Lawrence D'Oliveiro <[email protected]d> writes:

    On Sat, 3 May 2025 01:14:46 -0700, Andrey Tarasevich wrote:

    Virtually every C project relies on assignment of structures.
    Passing-returning structs by value might be more rare (although
    perfectly valid and often appropriate too), but assignment...
    assignment is used by everyone everywhere without even giving
    it a second thought.

    There is a caveat, to do with alignment padding: will this
    always have a defined value?

    I don't believe so. In a quick look, I don't see anything in
    the standard that explicitly addresses this, but I believe that a
    conforming implementation could implement structure assignment by
    copying the individual members, leaving any padding in the target
    undefined.

    "When a value is stored in an object of structure or union type,
    including in a member object, the bytes of the object
    representation that correspond to any padding bytes take
    unspecified values.56)" (6.2.6.1p6).

    That refers to footnote 56, which says "Thus, for example,
    structure assignment need not copy any padding bits."

    Yes, that's what I missed.

    It's interesting that the footnote refers to padding *bits* rather
    than padding *bytes*. I presume this was unintentional.

    Padding bits:

    struct A {
    uint64_t tlen : 16,
    : 20,
    pkind : 6,
    fsz : 6,
    gsz : 14,
    g : 1,
    ptp : 1;
    } s;

    There are 20 padding bits in this declaration. Perhaps that's
    what they're referring to?

    To me it seems clear that the "padding bits" here is meant to refer
    to all of the following:

    unoccupied bytes between members, due to member alignment
    unoccupied bytes at the end of a structure or union
    bits corresponding to unnamed bit-field members
    unoccupied bits or bytes caused by explicit bit-field alignment
    unoccupied bits or bytes caused by other bit-field alignment

    Any member objects may have their own internal padding bits. Any
    assignment of a struct or union follows the usual rule that any
    padding bits that are part of a target member have unspecified
    values (as long as the member doesn't become a trap representation
    as a result).

    Considering all these parts together, I think it makes sense to say
    that the padding bits of an object are those bits that do not
    participate in determining the abstract value of the object (not
    counting that some combination of padding bits might cause the
    object to become a trap representation, which never happens for
    structs or unions).

    (Yes I know that the term "trap representation" has been changed in
    later versions of the C standard. Please make any needed editorial
    changes internally, without having to post a followup.)
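
    A practical consequence of the padding rule, as a minimal sketch: after
    a struct assignment the members compare equal, but the padding bytes
    need not, so a member-wise comparison is the safe way to test equality.
    The type `struct rec` and both helper functions are hypothetical names
    made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record type; on most ABIs there is padding after `tag`. */
struct rec {
    char tag;
    long value;
};

/* Compare member by member, so padding bytes never take part. */
static bool rec_equal(const struct rec *x, const struct rec *y)
{
    assert(x != NULL && y != NULL);
    return x->tag == y->tag && x->value == y->value;
}

/* Copy with plain assignment, then check the copy member by member.
   A memcmp() of the two objects could report a difference here,
   because the padding bytes of `dst` take unspecified values. */
static bool copy_preserves_members(struct rec src)
{
    struct rec dst;
    dst = src;
    return rec_equal(&dst, &src);
}
```

    Calling copy_preserves_members() with any initialized `struct rec`
    should return true, whatever the implementation does with the padding.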

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 21:25:16 2025
    Michael S <[email protected]> writes:

    On Sat, 3 May 2025 21:42:37 -0400
    Richard Damon <[email protected]> wrote:

    Bigger than that, and you likely want to pass the object by address,
    not by value, passing just a pointer to it.

    That sort of thinking is an example of Knuthian premature optimization.

    I don't agree with this assessment. First, the given suggestion is
    a rule of thumb. By their nature rules of thumb offer heuristics
    that give guidelines likely to yield good results, but not
    guaranteed to do so. Second, a decision about whether to pass a
    struct object or a pointer to said object is often one that is a
    fair amount of work to undo, and so tends to be made early during
    the time period of program development. As such, it is useful to
    follow a guideline likely to give good results, even if not always
    optimal, because on average it will mean less work done overall.

    I second Richard Damon's recommendation, with the understanding that
    it is only a guideline, not an absolute, and as always subject to
    later revision should that turn out to be called for (no pun
    intended).
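
    To make the two calling styles concrete, here is a small sketch. The
    types and function names are hypothetical, and the sizes are chosen
    only for illustration; no particular size threshold is implied.

```c
#include <assert.h>
#include <stddef.h>

struct point   { int x, y; };        /* small: cheap to copy         */
struct samples { double v[256]; };   /* large: a copy moves 256 doubles */

/* Small struct passed by value; the callee gets its own copy. */
static int point_sum(struct point p)
{
    return p.x + p.y;
}

/* Large struct passed by address; const documents read-only access. */
static double samples_sum(const struct samples *s)
{
    assert(s != NULL);
    double total = 0.0;
    for (size_t i = 0; i < sizeof s->v / sizeof s->v[0]; i++)
        total += s->v[i];
    return total;
}
```

    Switching samples_sum() to take its argument by value later would mean
    touching every call site, which is the "work to undo" cost described
    above.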

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 22:40:57 2025
    Michael S <[email protected]> writes:

    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <[email protected]> wrote:

    And more obviously, "%p" requires an argument of type void*, not
    int*.

    That part of an otherwise very good comment is unreasonably pedantic.

    I don't have the same reaction. My sense is Keith was just being
    thorough. Speaking for myself his statement wasn't needed, but
    that condition might not hold for other readers. Given that his
    comment is just one not-overly-long sentence, I don't think it's
    too much to ask that readers already familiar with the point
    simply skip over it.
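
    For readers not already familiar with the point, a minimal sketch of
    the habit in question: convert the pointer to `void *` before handing
    it to a "%p" conversion. The helper name format_address() is
    hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format the address held in `p` into `buf`.
   "%p" expects a void *, so the int * is converted explicitly.
   Returns what snprintf() returns: the number of characters that
   the full output needs, or a negative value on error. */
static int format_address(char *buf, size_t size, int *p)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%p", (void *)p);
}
```

    The text produced is implementation-defined, so only its length is
    worth checking in portable code.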

  • From Tim Rentsch@21:1/5 to Michael S on Mon May 5 22:26:22 2025
    Michael S <[email protected]> writes:

    On Mon, 5 May 2025 08:45:09 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 5/5/2025 2:01 AM, Michael S wrote:

    On Mon, 5 May 2025 01:29:47 -0700
    Andrey Tarasevich <[email protected]> wrote:

    On Mon 5/5/2025 1:12 AM, Michael S wrote:

    According to my understanding, you are wrong.
    Taking pointer of non-lvalue is UB, so anything compiler does is
    conforming.

    Er... What? What specifically do you mean by "taking pointers"?

    The whole functionality of `[]` operator in C is based on pointers.
    This expression

    (a = b).a[5]
    [...]
    is already doing your "taking pointers of non-lvalue" (if I
    understood you correctly) as part of array-to-pointer conversion.
    And no, it is not UB.

    This is not UB either

    struct S foo(void) { return (struct S) { 1, 2, 3 }; }
    ...
    int *p;
    p = &foo().a[2], printf("%d\n", *p);

    That is not UB:
    int a5 = (a = b).a[5];

    That is UB:
    int* pa5 = &(a = b).a[5];

    No, it isn't.

    If you read the post of Keith Thompson and it is still not clear to
    you then I cannot help.

    The only valid "UB" claim in Keith's post is my printing the value of
    `pc` pointer, which by that time happens to point nowhere, since the
    lifetime of the temporary is over. (And, of course, lack of
    conversion to `void *` is an issue).

    As for the expressions like

    &(a = b).a[5];

    and

    &foo().a[2]

    Expressions by themselves are valid. But since there is no situation in
    which the value produced by these expressions is valid outside of the
    expressions, the compiler can generate any value it wants, even NULL or
    a value completely outside of the address space of the current process.

    These expressions produce valid values as long as they are used
    before the end of each full expression containing the given
    expression; within that context they may not produce NULL or a
    value outside of the program's address space.

    - these by themselves are perfectly valid. There's no UB in these
    expressions. (And this is not a debate.)

    Here's a version of the same code that corrects the above distracting
    issues

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    It's only not UB in the nasal demons sense.
    It's UB in the sense that we can't predict the values of expressions
    like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
    it is a form of UB.

    The term used in the C standard is "unspecified behavior". If
    this kind of expression is something you don't want to use that
    is understandable, but it would help communication to use the
    appropriate standard-defined term to describe it.

    Essentially all non-trivial programs have unspecified behaviors, and
    plenty of them. Most are benign, some are problematic, but in no
    case does an unspecified behavior, by itself, represent a danger to
    program semantics as severe as executing a construct that has
    undefined behavior.
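
    A familiar example of benign unspecified behavior, offered as a sketch:
    the order in which function arguments are evaluated is unspecified, yet
    the program below has a well-defined result either way. All names here
    are hypothetical.

```c
#include <assert.h>
#include <string.h>

static char order_log[3];   /* records which argument was evaluated first */
static int  order_pos;

/* Log the label, then hand back the value unchanged. */
static int note(char label, int value)
{
    assert(order_pos < 2);
    order_log[order_pos++] = label;
    return value;
}

static int add(int x, int y)
{
    return x + y;
}

/* The two note() calls may run in either order (unspecified), but
   they cannot interleave, and the sum is 3 in both cases. */
static int sum_in_unspecified_order(void)
{
    order_pos = 0;
    memset(order_log, 0, sizeof order_log);
    return add(note('a', 1), note('b', 2));
}
```

    After the call, order_log holds either "ab" or "ba"; a program that
    depends on which one it is has a portability problem, but not
    undefined behavior.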

  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Mon May 5 22:57:14 2025
    Kaz Kylheku <[email protected]> writes:

    On 2025-05-05, Keith Thompson <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Mon, 05 May 2025 01:34:16 -0700
    Keith Thompson <[email protected]> wrote:

    And more obviously, "%p" requires an argument of type void*, not
    int*.

    That part of an otherwise very good comment is unreasonably pedantic.

    I disagree. I suggest it's a bad habit to use "%p" without
    ensuring, by a cast if necessary, that the argument is of type
    void*.

    In most implementations, it's likely that all pointers have the
    same size and representation and are passed as arguments in the
    same way, but getting the types right means one less thing to worry
    about.

    If the codebase assumes all data pointers are the same size, bit
    pattern and are treated the same in the calling conventions / ABI,
    then it is probably moot.

    That code is doomed on a platform where the assumption doesn't
    hold, and the printf statements are probably not independently
    reusable.

    (I mostly put in these casts just to communicate to others that
    an ISO C language lawyer works here, if you happen to need one.)

    Also, it would be amazingly stupid of any such platform not to just
    make those printfs work: to promote variadic arguments of
    pointer-to-object type to a common representation which is the
    same as void *, combined with a matching behavior in the va_arg
    macro for extracting the value back into any pointer-to-object
    type.

    This statement strikes me as would an utterance coming from a
    resident of Fantasyland.

  • From [email protected]@21:1/5 to All on Tue May 6 07:16:24 2025
    On Mon, 05 May 2025 13:53:10 -0700
    Keith Thompson <[email protected]> wibbled:
    [email protected] writes:
    [...]
    If you want to pass an actual array to a function instead of a pointer to it,
    embedding it in a structure is the only way to do it.

    Yes, but that's not necessarily useful. An array that's a member

    Depends what you're doing. Passing an array in a structure will copy the array saving you having to do it yourself if you don't want to work on the original version. Obviously that doesn't happen if you just pass a pointer.
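
    A short sketch of that difference, with hypothetical names: the
    by-value function works on its own copy of the array, while the
    pointer version changes the caller's data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper so that the array is copied when passed. */
struct buf4 { int v[4]; };

/* Receives a copy of the whole array; the caller's array is untouched. */
static int negate_first_by_value(struct buf4 copy)
{
    copy.v[0] = -copy.v[0];
    return copy.v[0];
}

/* Receives only a pointer; the caller's array is modified. */
static void negate_first_in_place(int *v)
{
    assert(v != NULL);
    v[0] = -v[0];
}
```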

  • From David Brown@21:1/5 to Keith Thompson on Tue May 6 11:46:21 2025
    On 05/05/2025 22:53, Keith Thompson wrote:
    [email protected] writes:
    [...]
    If you want to pass an actual array to a function instead of a pointer to it,
    embedding it in a structure is the only way to do it.

    Yes, but that's not necessarily useful. An array that's a member
    of a struct can only be of a constant length (unless it's a flexible
    array member, but that doesn't help). Functions that work with
    arrays typically need to deal with arrays of arbitrary length.


    I regularly use arrays with known fixed sizes. In fact, in my code
    those are absolutely dominant - it is very rare for me to see or use an
    array whose size is /not/ fixed at compile time. Sometimes I will have
    general functions that take parameters that are arrays of arbitrary
    length, but not often.

    So this is very much dependent on the kind of code you are working with,
    and other people will have very different experiences for their own code.

    However, I think it is not unlikely that people will see use of structs
    like:

    struct vector4int { int vs[4]; };
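
    As a sketch of how such a type tends to be used, here it is passed and
    returned by value. The helper functions are hypothetical; only the
    struct itself comes from the line above.

```c
#include <assert.h>

struct vector4int { int vs[4]; };

/* Element-wise sum; both operands and the result travel by value. */
static struct vector4int vector4int_add(struct vector4int x,
                                        struct vector4int y)
{
    struct vector4int r;
    for (int i = 0; i < 4; i++)
        r.vs[i] = x.vs[i] + y.vs[i];
    return r;
}

/* Bounds-checked element access on a by-value argument. */
static int vector4int_get(struct vector4int v, int i)
{
    assert(i >= 0 && i < 4);
    return v.vs[i];
}
```

    The result can be assigned straight into another `struct vector4int`,
    which copies all four elements without any explicit loop or memcpy().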

  • From David Brown@21:1/5 to Keith Thompson on Tue May 6 11:35:30 2025
    On 05/05/2025 22:27, Keith Thompson wrote:
    Andrey Tarasevich <[email protected]> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."


    It seems clear to me that "pc" has an indeterminate value after the
    expression assigning, since it points to an object with temporary lifetime.

    And attempting to use the value of an object with automatic storage
    while it has an indeterminate value is undefined behaviour.

    As far as I can see, simply reading the value in "pc" to print it out is
    UB according to the C standards. It is clearly going to be a harmless
    operation on most hardware, but there are processors where pointer
    registers are more complicated than simple linear addresses - they can
    track some kind of segment structure describing the range of a data
    block, or permissions for access to the data, and such structures could
    have been deactivated or deallocated when the temporary lifetime object
    died. Even attempting to read the value of the pointer, without
    dereferencing it, would then cause some kind of fault or trap.

  • From [email protected]@21:1/5 to All on Tue May 6 10:18:44 2025
    On Tue, 6 May 2025 11:46:21 +0200
    David Brown <[email protected]> wibbled:
    On 05/05/2025 22:53, Keith Thompson wrote:
    [email protected] writes:
    [...]
    If you want to pass an actual array to a function instead of a pointer to it,
    embedding it in a structure is the only way to do it.

    Yes, but that's not necessarily useful. An array that's a member
    of a struct can only be of a constant length (unless it's a flexible
    array member, but that doesn't help). Functions that work with
    arrays typically need to deal with arrays of arbitrary length.


    I regularly use arrays with known fixed sizes. In fact, in my code
    those are absolutely dominant - it is very rare for me to see or use an
    array whose size is /not/ fixed at compile time. Sometimes I will have

    I do a lot of networking code and with packet structures the arrays are
    almost always of fixed size. Also with arrays the data is inline so a simple
    memcpy() can copy the data from the struct to the output buffer. You can't
    do that if you have pointers in the struct. Ditto a simple cast to char * to
    use it directly as the output.
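
    A sketch of the memcpy() approach, under the assumption that the
    struct has no padding (checked at compile time below). The packet
    layout and the function name are hypothetical, not taken from any
    real protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packet with inline, fixed-size payload. */
struct packet {
    unsigned char type;
    unsigned char length;
    unsigned char payload[6];
};

/* The byte-for-byte copy below relies on there being no padding. */
_Static_assert(sizeof(struct packet) == 8,
               "struct packet is assumed to have no padding");

/* Copy the packet into `out`; returns bytes written, or 0 if the
   output buffer is too small. */
static size_t packet_write(unsigned char *out, size_t out_size,
                           const struct packet *p)
{
    assert(out != NULL && p != NULL);
    if (out_size < sizeof *p)
        return 0;
    memcpy(out, p, sizeof *p);
    return sizeof *p;
}
```

    With members wider than a byte, byte order and alignment padding come
    into play, so this shortcut is best kept to layouts that are checked
    the way this one is.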

  • From Michael S@21:1/5 to Keith Thompson on Tue May 6 16:34:56 2025
    On Mon, 05 May 2025 13:53:10 -0700
    Keith Thompson <[email protected]> wrote:

    [email protected] writes:
    [...]
    If you want to pass an actual array to a function instead of a
    pointer to it, embedding it in a structure is the only way to do
    it.

    Yes, but that's not necessarily useful. An array that's a member
    of a struct can only be of a constant length (unless it's a flexible
    array member, but that doesn't help). Functions that work with
    arrays typically need to deal with arrays of arbitrary length.


    It seems the C++ authorities felt that the pattern "struct with an
    array of constant length as its only member" is very common.
    Otherwise they wouldn't have bothered to add <array> to their standard library.

  • From Waldek Hebisch@21:1/5 to Keith Thompson on Tue May 6 17:36:36 2025
    Keith Thompson <[email protected]> wrote:
    Andrey Tarasevich <[email protected]> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.
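
    The same rule can be shown with a function result, as a sketch
    assuming C11 temporary-lifetime semantics. The function names are
    hypothetical. The first reader uses the pointer inside the full
    expression that created the temporary; the second avoids the question
    by copying the struct first.

```c
#include <assert.h>

struct S { int a[10]; };

/* Returns a struct by value; a[i] holds i squared. */
static struct S squares(void)
{
    struct S s;
    for (int i = 0; i < 10; i++)
        s.a[i] = i * i;
    return s;
}

/* The temporary returned by squares() lives until the end of the
   full expression, and the read through `p` is sequenced inside
   that same expression by the comma operator. */
static int read_within_full_expression(int i)
{
    assert(i >= 0 && i < 10);
    int *p;
    int v;
    p = &squares().a[i], v = *p;
    return v;
}

/* Plainer style: copy the struct into a named object, keep no pointer. */
static int read_by_value(int i)
{
    assert(i >= 0 && i < 10);
    struct S s = squares();
    return s.a[i];
}
```

    Splitting the comma expression into two statements would leave `p`
    pointing at an object whose lifetime has ended, which is the case the
    quoted paragraph from the standard covers.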

    --
    Waldek Hebisch

  • From David Brown@21:1/5 to Waldek Hebisch on Tue May 6 20:46:48 2025
    On 06/05/2025 19:36, Waldek Hebisch wrote:
    Keith Thompson <[email protected]> wrote:
    Andrey Tarasevich <[email protected]> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.


    I must admit I had not noticed that detail.

  • From Nick Bowler@21:1/5 to Keith Thompson on Tue May 6 19:06:20 2025
    On Mon, 05 May 2025 13:43:31 -0700, Keith Thompson wrote:
    Tim Rentsch <[email protected]> writes:
    Keith Thompson <[email protected]> writes:
    Andrey Tarasevich <[email protected]> writes:
    [...]

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5];
    pb = &b.a[5];
    pc = &(a = b).a[5];

    printf("%p %p %p\n", pa, pb, pc);
    }

    [...]

    I think that code has undefined behavior.

    Right. [*]
    [...]
    [*] Assuming C11 semantics. At best inadvisable under C99
    semantics, and a constraint violation under C90 semantics.

    What C90 constraint does it violate? Both gcc and clang reject it
    with "-std=c90 -pedantic-errors", with an error message "ISO C90
    forbids subscripting non-lvalue array", but I don't see a relevant
    constraint in the C90 standard.

    I don't know about C90, but in C89 the above code violates the
    constraint on the [] operator that "one of the expressions shall
    have type ``pointer to object type.''" (3.3.2.1, first paragraph)

    C89 (3.2.2.1, third paragraph) only describes conversion of lvalues with
    array type into pointers. No similar rule applies for an expression
    with array type which is not an lvalue, so such expressions are not
    converted to pointers.

    So, given:

    struct { int a[10]; } a, b;
    /* ... */
    (a = b).a[5];

    Since (a = b).a is not an lvalue, it is not converted to a pointer, so
    neither operand of [] has pointer type, so a diagnostic is required.

    I know that C11 introduced "temporary lifetime" to cover cases
    like this. In C99, the wording for the indexing operator implicitly
    assumes that there's an array object; if there isn't, I'd argue the
    behavior is undefined by omission. I'm not aware of any relevant
    change from C90 to C99.

    The rule about conversions from arrays to pointers is different in C99
    (n1124 6.3.2.1, third paragraph) compared to C89. In particular,
    "an lvalue that has type ``array of type'' ..." was changed to
    "an expression that has type ``array of type'' ...".

  • From Scott Lurndal@21:1/5 to David Brown on Tue May 6 19:22:34 2025
    David Brown <[email protected]> writes:
    On 06/05/2025 19:36, Waldek Hebisch wrote:
    Keith Thompson <[email protected]> wrote:
    Andrey Tarasevich <[email protected]> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.


    I must admit I had not noticed that detail.

    That would get an immediate downcheck during review for exactly
    that reason.

  • From David Brown@21:1/5 to Scott Lurndal on Wed May 7 09:37:57 2025
    On 06/05/2025 21:22, Scott Lurndal wrote:
    David Brown <[email protected]> writes:
    On 06/05/2025 19:36, Waldek Hebisch wrote:
    Keith Thompson <[email protected]> wrote:
    Andrey Tarasevich <[email protected]> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.


    I must admit I had not noticed that detail.

    That would get an immediate downcheck during review for exactly
    that reason.

    Of course. In fact, if someone presented such code for review (and
    assuming I noticed the commas!) I'd have to consider whether it was done
    maliciously, intentionally deceptively, due to incompetence, or
    smart-arse coding. In all my C coding experience, I can't recall ever
    coming across a single situation when I thought the use of the comma
    operator was appropriate in the kind of code I work with.

    Other people, projects, and teams work with different standards,
    different requirements, and different kinds of code - there are a lot of
    things that are common practice in some C coding that are strongly
    rejected in my field (and perhaps vice versa). So I am not suggesting
    that the comma operator is always bad in C - just that it is pretty much
    always bad in my line of work.

    And of course Andrey was using it here to make a specific point in a
    discussion about C details, rather than real-life code.

  • From Nick Bowler@21:1/5 to Keith Thompson on Wed May 7 19:09:40 2025
    On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:
    Nick Bowler <[email protected]> writes:
    The rule about conversions from arrays to pointers is different in C99
    (n1124 6.3.2.1, third paragraph) compared to C89. In particular,
    "an lvalue that has type ``array of type'' ..." was changed to
    "an expression that has type ``array of type'' ...".
    [...]
    The change from "lvalue" to "expression" was made in C99. I wonder why
    that was done.

    It's not mentioned in the rationale, so we can only guess. But it is
    called out in the list of major changes in the C99 foreword.

    BTW, you have a copy of ANSI C89? Hard or soft copy? Do you know if
    it's still available in some form?

    Hint: look for FIPS 160 on the NIST website. This is the same standard
    as ANSI X3.159-1989 Programming Language - C.

  • From Tim Rentsch@21:1/5 to Nick Bowler on Wed May 7 21:17:17 2025
    Nick Bowler <[email protected]> writes:

    On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:

    Nick Bowler <[email protected]> writes:

    The rule about conversions from arrays to pointers is different
    in C99 (n1124 6.3.2.1, third paragraph) compared to C89. In
    particular, "an lvalue that has type ``array of type'' ..." was
    changed to "an expression that has type ``array of type'' ...".

    [...]

    The change from "lvalue" to "expression" was made in C99. I
    wonder why that was done.

    It's not mentioned in the rationale, so we can only guess. [...]

    To me it seems obvious. The change in C99 was meant to allow
    access to an array inside a non-lvalue struct. When C99 was
    done the committee didn't realize all the ramifications of
    accessing non-lvalue structs (which apparently has problems
    even for scalar members, not just array members). Later, when
    they did realize the resulting problems, they fixed things up
    in C11.

    See also n1253.htm, by Clark Nelson.

  • From Nick Bowler@21:1/5 to Keith Thompson on Thu May 8 12:58:56 2025
    On Wed, 07 May 2025 14:23:57 -0700, Keith Thompson wrote:
    Nick Bowler <[email protected]> writes:
    On Tue, 06 May 2025 13:21:38 -0700, Keith Thompson wrote:
    The change from "lvalue" to "expression" was made in C99. I wonder why
    that was done.

    It's not mentioned in the rationale, so we can only guess. But it is
    called out in the list of major changes in the C99 foreword.

    I've just looked at the foreword of the C99 standard and the n1256
    draft, and I couldn't find it. Can you quote the precise wording?

    N1256 page xiii. Fourth to last bullet point:

    "-- conversion of array to pointer not limited to lvalues"

  • From Tim Rentsch@21:1/5 to Andrey Tarasevich on Thu May 8 12:45:58 2025
    Andrey Tarasevich <[email protected]> writes:

    On Mon 5/5/2025 1:26 AM, Keith Thompson wrote:

    I'm wondering what the last sentence is intended to mean ("... need not
    have a unique address"). At first sight, the intent seems to be
    obvious: it simply says that such temporary objects might repeatedly
    appear (and disappear) at the same location in storage, which is a
    natural thing to expect.

    You snipped this: "Any attempt to modify an object with temporary
    lifetime results in undefined behavior.". Which means, I think,
    that an implementation that shared storage for "such an object"
    with something else probably isn't going to cause problems for any
    code with defined behavior.

    It is going to cause problems, if the code relies on the address
    identity of the object, assuming the standard intends to provide such
    guarantees.

    Though I can imagine the possibility of code that modifies `a` and
    reads via `pc` within the same full expression.

    That's easy (in the context of declarations from my previous example):

    pc = &(a = b).a[5], a.a[5] = 42, printf("%d\n", *pc);

    As one would expect, this produces different output in GCC and Clang
    for the reasons I already described.

    But unless I've somehow missed it, the "Such an object need not
    have a unique address." wording doesn't appear on that web page or
    in my copy of n1570.pdf. C17 does add these two sentences:

    An object with temporary lifetime behaves as if it were declared
    with the type of its value for the purposes of effective type. Such
    an object need not have a unique address.

    Normally any two objects with overlapping lifetime must have distinct
    addresses. This addition, I think, gives compilers permission to have
    temporary lifetime objects overlap with other existing objects, but not
    to have a modification to one object affect the value of the other
    (unless the modification invokes UB, of course).

    If so, that would be extremely underspecified. A mere "such an object
    need not have a unique address" is insufficient to fully convey the permission to overlap existing named objects.

    I don't see why you say that. The statement says objects with
    temporary lifetime need not have a unique address. In the absence
    of any other statement on the subject, this statement admits the
    inference that an object with temporary lifetime might have the same
    address as any other object. Removing the constraint (that the
    addresses of those objects must be distinct from the addresses
    of all other objects), /and doing nothing else/, can only mean that
    the addresses of such objects might match the address of any other
    object in the environment.

    If you think there should be a non-normative footnote explaining
    that point, I expect I would vote in favor of that, but as far
    as normative text goes I don't see any fuzziness about what is
    allowed under the existing wording.

    And that's probably what led to the difference in interpretation
    between GCC and Clang.

    I suspect the implication actually goes the other way. It is
    because what gcc has done (past tense) violates the rules of the C11
    standard that someone had the bright idea that the C standard should
    be changed to allow this stupidity.

    Modification of the temporary is "prohibited" (as UB), but
    modification of the overlapped named object is not. The
    consequences can be quite surprising.

    In my view the problem is not that what is allowed is unclear, but
    that the whole idea of possibly overlapping objects is a crock.
    It's a sad statement on the quality of gcc that it does the wrong
    thing even when -std=c11 and -pedantic are given as compilation
    options. Bleah.

  • From David Brown@21:1/5 to Tim Rentsch on Thu May 8 22:20:02 2025
    On 08/05/2025 21:45, Tim Rentsch wrote:
    Andrey Tarasevich <[email protected]> writes:



    And that's probably what led to the difference in interpretation
    between GCC and Clang.

    I suspect the implication actually goes the other way. It is
    because what gcc has done (past tense) violates the rules of the C11
    standard that someone had the bright idea that the C standard should
    be changed to allow this stupidity.

    Modification of the temporary is "prohibited" (as UB), but
    modification of the overlapped named object is not. The
    consequences can be quite surprising.

    In my view the problem is not that what is allowed is unclear, but
    that the whole idea of possibly overlapping objects is a crock.
    It's a sad statement on the quality of gcc that it does the wrong
    thing even when -std=c11 and -pedantic are given as compilation
    options. Bleah.

    While I think it is important that compilers try to follow the C
    standards (at least when you specify conforming modes), are there any
    potential realistic consequences of this?

    Posters here have gone far out of their way to make hypothetical code
    that demonstrates this flaw in gcc without invoking undefined behaviour.
    Is there any risk that anyone would come across this in real code?

    In addition, is it reasonable to suppose that C programmers that have
    not studied the C standards here would be expecting the behaviour of
    gcc, or the behaviour of clang here? Certainly if /I/ saw "pc = &(a =
    b).a[5]" prior to this thread, I would expect the contents of the struct
    "b" to be copied to the memory of the struct "a", and "pc" set to point
    to the member of the array within "a". I would expect the code to work
    as gcc works, and would find clang's behaviour completely unexpected. I
    would be surprised if I were alone in that.

    So to me, it makes sense that the C standard has changed to support a
    more sane approach to such situations. It would be unreasonable to
    change it to guarantee the sensible behaviour - that would mean
    compilers like clang that generated technically correct but surprising
    (to many) code would now be wrong.

    (gcc's behaviour is also more efficient, but of course correctness
    trumps efficiency every time.)

  • From Rosario19@21:1/5 to Michael S on Mon May 12 11:23:02 2025
    On Sun, 4 May 2025 11:01:17 +0300, Michael S wrote:

    On Sat, 3 May 2025 21:42:37 -0400
    Richard Damon <[email protected]> wrote:


    Bigger than that, and you likely want to pass the object by address,
    not by value, passing just a pointer to it.

    That sort of thinking is an example of Knuthian premature optimization.

    I prefer to pass memory (if it is big enough) by a single address or
    reference.

  • From Andrey Tarasevich@21:1/5 to Michael S on Thu May 29 05:11:01 2025
    On Mon 5/5/2025 10:20 AM, Michael S wrote:

    Here's a version of the same code that corrects the above distracting
    issues

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.


    It's only not UB in the nasal demons sense.
    It's UB in the sense that we can't predict the values of expressions
    like (pa==pc) and (pb==pc). I.e. pc is completely useless. In my book
    it is a form of UB.

    Whether we can or cannot predict the values of `(pa==pc)` and `(pb==pc)`
    has very little impact on the usability of such expressions. The
    practical usability of such expressions is very high without relying on `(pa==pc)` and `(pb==pc)`.

    --
    Best regards,
    Andrey

  • From Andrey Tarasevich@21:1/5 to Keith Thompson on Thu May 29 05:14:37 2025
    On Mon 5/5/2025 1:27 PM, Keith Thompson wrote:
    Andrey Tarasevich <[email protected]> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    Nope. Nowhere in this code is the value of `pc` used beyond the lifetime
    of the object with temporary lifetime.

    Pay attention to the fact that the last 4 lines in the above code are a
    single expression joined by comma operators, which is the whole point
    of the corrections that differentiate it from the original version.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Not applicable in this case.

    --
    Best regards,
    Andrey

  • From Andrey Tarasevich@21:1/5 to Waldek Hebisch on Thu May 29 05:21:04 2025
    On Tue 5/6/2025 10:36 AM, Waldek Hebisch wrote:
    Keith Thompson <[email protected]> wrote:
    Andrey Tarasevich <[email protected]> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.


    Exactly. I thought the nature of the corrections I made (i.e. the
    deliberate usage of the comma operator) would be strikingly obvious to
    the participants of the thread. But alas...

    --
    Best regards,
    Andrey

  • From Andrey Tarasevich@21:1/5 to David Brown on Thu May 29 05:19:13 2025
    On Tue 5/6/2025 2:35 AM, David Brown wrote:

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."


    It seems clear to me that "pc" has an indeterminate value after the
    expression assigning, since it points to an object with temporary
    lifetime.

    And attempting to use the value of an object with automatic storage
    while it has an indeterminate value is undefined behaviour.

    Again, this makes no sense. Please, pay attention to the code and the
    corrections made after the initial version (e.g. usage of the comma
    operator). No, the value of `pc` is not indeterminate, and no, there's
    no undefined behavior in the above version of the code.

    As far as I can see, simply reading the value in "pc" to print it out
    is UB according to the C standards. It is clearly going to be a
    harmless operation on most hardware, but there are processors where
    pointer registers are more complicated than simple linear addresses -
    they can track some kind of segment structure describing the range of
    a data block, or permissions for access to the data, and such
    structures could have been deactivated or deallocated when the
    temporary lifetime object died. Even attempting to read the value of
    the pointer, without dereferencing it, would then cause some kind of
    fault or trap.

    Again, irrelevant. In the above code the temporary object does not die
    during the entire period when pointer `pc` is used in any way.

    --
    Best regards,
    Andrey

  • From Andrey Tarasevich@21:1/5 to Keith Thompson on Thu May 29 05:36:34 2025
    On Mon 5/5/2025 1:43 PM, Keith Thompson wrote:

    What C90 constraint does it violate? Both gcc and clang reject it
    with "-std=c90 -pedantic-errors", with an error message "ISO C90
    forbids subscripting non-lvalue array", but I don't see a relevant
    constraint in the C90 standard.


    The "constraint" in C89/90 is simply the fact that C89/90 _requires_ an
    lvalue (of array type) in order to apply array to pointer conversion.
    Here is the original wording:

    Except when it is the operand of the sizeof operator or the unary &
    operator, or is a character string literal used to initialize an array
    of character type, or is a wide string literal used to initialize an
    array with element type compatible with wchar_t, an *lvalue* that has
    type “array of type” is converted to an expression that has type
    “pointer to type” that points to the initial element of the array
    object and is not an lvalue.

    The presence of that "*lvalue*" requirement is what prevented us from
    using `[]` operator on non-lvalue arrays in C89/90, because `[]`
    critically relies on that conversion.

    In C11 the wording has changed:

    Except when it is the operand of the sizeof operator, the _Alignof
    operator, or the unary & operator, or is a string literal used to
    initialize an array, an expression that has type “array of type” is
    converted to an expression with type “pointer to type” that points to
    the initial element of the array object and is not an lvalue. If the
    array object has register storage class, the behavior is undefined.

    Note that the "lvalue" requirement has disappeared from this wording.
    That is exactly why since C99 we can apply `[]` to non-lvalue arrays.

    --
    Best regards,
    Andrey

  • From Andrey Tarasevich@21:1/5 to David Brown on Thu May 29 05:49:00 2025
    On Wed 5/7/2025 12:37 AM, David Brown wrote:

    That would get an immediate downcheck during review for exactly
    that reason.

    Of course. In fact, if someone presented such code for review (and
    assuming I noticed the commas!) I'd have to consider whether it was
    done maliciously, intentionally deceptively, due to incompetence, or
    smart-arse coding. In all my C coding experience, I can't recall
    ever coming across a single situation when I thought the use of the
    comma operator was appropriate in the kind of code I work with.

    Wow! That's catastrophically bad.

    As it has been stated many times before, both C and C++ are programming
    languages that embrace both statement-level and expression-level
    programming. Expression-level programming (e.g. where `?:` is used for
    branching and `,` for sequencing) is a very valuable and massively
    important programming paradigm in these languages. The fact that
    elaborate expression-level programming is not in any way abandoned or
    shunned today is pretty obvious in C++, since C++ took major steps
    lately to develop its expression-level capabilities. But it has always
    been and will always remain important in C as well.

    The proclivity to stick exclusively to statement-level programming in C
    and, God forbid, impose it on others through so-called "code reviews"...
    that would be a trait specific to "sweatshop" development outfits, which
    strive to replace quality with quantity. I'd agree that in a revolving
    door employment environment relying on a large number of low-competence
    developers such code might be seen as "too confusing". But I don't see
    why we should set our standards that low here, in `comp.lang.c`.

    --
    Best regards,
    Andrey

  • From Janis Papanagnou@21:1/5 to Andrey Tarasevich on Thu May 29 16:43:24 2025
    On 29.05.2025 14:21, Andrey Tarasevich wrote:
    On Tue 5/6/2025 10:36 AM, Waldek Hebisch wrote:
    Keith Thompson <[email protected]> wrote:
    Andrey Tarasevich <[email protected]> writes:
    [...]
    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.


    Exactly. I thought the nature of the corrections I made (i.e. the
    deliberate usage of the comma operator) would be strikingly obvious to
    the participants of the thread. But alas...

    I wouldn't call it "strikingly obvious". Typically programmers take an
    abstracting look at code, and if they're used to semicolon-separated
    commands the small difference between ';' and ',' may get missed (as
    some replies also indicated). Myself, as I recall from that older post,
    I also missed it at first glance. Only after I asked myself what the
    _intentions_ of the post had been did I take a closer look at all the
    inconspicuous "details" and notice the subtle difference.

    I don't think there's anything wrong with it, to be sure. But if I
    were to post such subtle differences I'd have added a comment or used
    formatting to hint at that subtle but crucial difference.

    Janis

  • From Janis Papanagnou@21:1/5 to Andrey Tarasevich on Thu May 29 16:33:01 2025
    On 29.05.2025 14:49, Andrey Tarasevich wrote:
    On Wed 5/7/2025 12:37 AM, David Brown wrote:

    That would get an immediate downcheck during review for exactly
    that reason.

    Of course. In fact, if someone presented such code for review (and
    assuming I noticed the commas!) I'd have to consider whether it was
    done maliciously, intentionally deceptively, due to incompetence, or
    smart-arse coding. In all my C coding experience, I can't recall
    ever coming across a single situation when I thought the use of the
    comma operator was appropriate in the kind of code I work with.

    Wow! That's catastrophically bad.

    As it has been stated many times before, both C and C++ are programming languages that embrace both statement-level and expression-level
    programming. Expression-level programming (e.g. where `?:` is used for branching and `,` for sequencing) is a very valuable and massively
    important programming paradigm in these languages. The fact that
    elaborate expression-level programming is not in any way abandoned or
    shunned today is pretty obvious in C++, since C++ took major steps
    lately to develop its expression-level capabilities. But it has always
    been and will always remain important in C as well.

    The proclivity to stick exclusively to statement-level programming in C
    and, God forbid, impose it on others through so-called "code reviews"...
    that would be a trait specific to "sweatshop" development outfits, which strive to replace quality with quantity. I'd agree that in a revolving
    door employment environment relying on a large number of low-competence developers such code might be seen as "too confusing". But I don't see
    why we should set our standards that low here, in `comp.lang.c`.

    Well said.

    Janis

  • From James Kuyper@21:1/5 to Michael S on Thu May 29 12:57:06 2025
    On Mon 5/5/2025 10:20 AM, Michael S wrote:

    Here's a version of the same code that corrects the above distracting
    issues

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
    struct S a, b = { 0 };
    int *pa, *pb, *pc;

    pa = &a.a[5],
    pb = &b.a[5],
    pc = &(a = b).a[5],
    printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.


    It's only not UB in the nasal demons sense.

    UB is UB only in the nasal demons sense. UB means that "This
    international standard imposes no requirements on the behavior". That
    is, anything could happen, at least as far as the standard is concerned.
    Code with undefined behavior should not be able to produce nasal demons
    because there's no such thing as nasal demons (I think). However, if
    they did exist, producing them would not violate any requirements
    imposed by the standard, because it quite explicitly imposes none on
    such code.

    It's UB in the sense that we can't predict the values of expressions
    like (pa==pc) and (pb==pc). ...

    Why not? Because of the comma operators, the lifetime of the temporary
    extends all the way till the end of the printf() call, long enough to
    make use of pc in that call safe.

    ... I.e. pc is completely useless. In my book
    it is a form of UB.

    If the problem were only that there are no restrictions on the value of
    an expression, but that the code is otherwise safe to use, that would be
    indicated by a much weaker term: "unspecified value". Calling it "a
    form of UB" would serve no useful purpose.

  • From David Brown@21:1/5 to Andrey Tarasevich on Thu May 29 21:05:37 2025
    On 29/05/2025 14:19, Andrey Tarasevich wrote:
    On Tue 5/6/2025 2:35 AM, David Brown wrote:

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."


    It seems clear to me that "pc" has an indeterminate value after the
    expression assigning, since it points to an object with temporary
    lifetime.

    And attempting to use the value of an object with automatic storage
    while it has an indeterminate value is undefined behaviour.

    Again, this makes no sense. Please, pay attention to the code and the corrections made after the initial version (e.g. usage of comma
    operator). No, the value of `pc` is not indeterminate, and no, there's
    no undefined behavior in the above version of the code.

    As far as I can see, simply reading the value in "pc" to print it out
    is UB according to the C standards.  It is clearly going to be a
    harmless operation on most hardware, but there are processors where
    pointer registers are more complicated than simple linear addresses -
    they can track some kind of segment structure describing the range of
    a data block, or permissions for access to the data, and such
    structures could have been deactivated or deallocated when the
    temporary lifetime object died.  Even attempting to read the value of
    the pointer, without dereferencing it, would then cause some kind of
    fault or trap.

    Again, irrelevant. In the above code the temporary object does not die
    during the entire period when pointer `pc` is used in any way.


    I posted later that I had made a mistake, and not noticed the use of the
    comma instead of a semicolon.

    Why are you dredging up an outdated thread? I made an error three weeks
    ago, and realised the mistake shortly afterwards. What do you expect me
    or anyone else to learn from that now, after all this time?

  • From David Brown@21:1/5 to Andrey Tarasevich on Thu May 29 21:20:59 2025
    On 29/05/2025 14:49, Andrey Tarasevich wrote:
    On Wed 5/7/2025 12:37 AM, David Brown wrote:

    That would get an immediate downcheck during review for exactly
    that reason.

    Of course.  In fact, if someone presented such code for review (and
    assuming I noticed the commas!) I'd have to consider whether it was
    done maliciously, intentionally deceptively, due to incompetence, or
    smart-arse coding. In all my C coding experience, I can't recall
    ever coming across a single situation when I thought the use of the
    comma operator was appropriate in the kind of code I work with.

    Wow! That's catastrophically bad.

    As it has been stated many times before, both C and C++ are programming languages that embrace both statement-level and expression-level
    programming. Expression-level programming (e.g. where `?:` is used for branching and `,` for sequencing) is a very valuable and massively
    important programming paradigm in these languages. The fact that
    elaborate expression-level programming is not in any way abandoned or
    shunned today is pretty obvious in C++, since C++ took major steps
    lately to develop its expression-level capabilities. But it has always
    been and will always remain important in C as well.

    No, expression-level programming has always been and will likely always
    remain a very minor part of C programming. Yes, some people make use of
    the comma operator. Some people do so extensively - and they are often,
    but not necessarily, considered "smart-arse" programmers rather than
    "smart" programmers. If the comma operator were removed from the C
    language, I guess some 95% of programmers would barely notice - at
    worst, they would have to add an extra line inside an occasional "for"
    loop. (The ternary operator is used much more.)

    I did not say that the use of comma operators is always bad - I said I
    do not recall seeing it in the kind of code I work with in a situation
    where I thought it was a good way to write the code. A significant part
    of that is the kind of code I work with - in code for small systems
    where high reliability and safety are vital, code clarity is of utmost
    importance. Code that does not do what it first appears to do is
    severely frowned upon. Code is written in a very imperative style.

    In my world, code that uses "malloc" is rarely acceptable, and for most
    programs, "double" is very seldom an appropriate choice of type. But
    that does not mean these are not usable for other kinds of C
    programming. There are many reasons why different styles of coding are
    used in different circumstances.

    Even when C++ is used, with its significantly broader support for a
    variety of programming paradigms, I do not recall seeing the comma
    operator used.


    The proclivity to stick exclusively to statement-level programming in C
    and, God forbid, impose it on others through so-called "code reviews"...
    that would be a trait specific to "sweatshop" development outfits, which
    strive to replace quality with quantity. I'd agree that in a revolving
    door employment environment relying on a large number of low-competence
    developers such code might be seen as "too confusing". But I don't see
    why we should set our standards that low here, in `comp.lang.c`.


    I don't quite see how you are in any position to judge the coding styles
    used by people you know nothing about, working in fields that you know
    nothing about.

    I am happy that different types of programming styles and paradigms are
    used for different purposes - imperative C is not suitable for most
    coding tasks. Equally, expression-style programming is not appropriate
    for all coding tasks.

    However, one thing that is never suitable for any real-world programming
    is deceptive code that is not what it appears to be.

  • From Scott Lurndal@21:1/5 to David Brown on Thu May 29 21:15:56 2025
    David Brown <[email protected]> writes:
    On 29/05/2025 14:49, Andrey Tarasevich wrote:
    On Wed 5/7/2025 12:37 AM, David Brown wrote:

    That would get an immediate downcheck during review for exactly
    that reason.

    Of course.  In fact, if someone presented such code for review (and
    assuming I noticed the commas!) I'd have to consider whether it was
    done maliciously, intentionally deceptively, due to incompetence, or
    smart-arse coding. In all my C coding experience, I can't recall
    ever coming across a single situation when I thought the use of the
    comma operator was appropriate in the kind of code I work with.

    Wow! That's catastrophically bad.

    As it has been stated many times before, both C and C++ are programming
    languages that embrace both statement-level and expression-level
    programming. Expression-level programming (e.g. where `?:` is used for
    branching and `,` for sequencing) is a very valuable and massively
    important programming paradigm in these languages. The fact that
    elaborate expression-level programming is not in any way abandoned or
    shunned today is pretty obvious in C++, since C++ took major steps
    lately to develop its expression-level capabilities. But it has always
    been and will always remain important in C as well.

    No, expression-level programming has always been and will likely always
    remain a very minor part of C programming. Yes, some people make use of
    the comma operator. Some people do so extensively - and they are often,
    but not necessarily, considered "smart-arse" programmers rather than
    "smart" programmers. If the comma operator were removed from the C
    language, I guess some 95% of programmers would barely notice - at
    worst, they would have to add an extra line inside an occasional "for"
    loop. (The ternary operator is used much more.)

    And sometimes, excessive use of the comma operator causes
    compiler failures.

    cfront generated the comma operator extensively, and expression trees
    would grow to very large sizes. There was a bug in PCC (for the
    88100) where it would run out of temporary registers while generating
    code for some cfront generated comma expressions (which were -far- from
    human readable). I had to fix the temporary register allocation
    code in PCC to spill registers when the sethi-ullman number for an
    expression exceeded the number of registers.

    That was circa 1990, and I've generally not found any arguments
    favoring their general use persuasive in the years since, including
    Andrey's and Kaz's responses recently posted here.

    The simple fact that experienced programmers that read this usenet
    newsgroup missed the comma operators in the original example speaks
    volumes.

  • From David Brown@21:1/5 to Scott Lurndal on Fri May 30 10:50:56 2025
    On 29/05/2025 23:15, Scott Lurndal wrote:
    David Brown <[email protected]> writes:
    On 29/05/2025 14:49, Andrey Tarasevich wrote:
    On Wed 5/7/2025 12:37 AM, David Brown wrote:

    That would get an immediate downcheck during review for exactly
    that reason.

    Of course.  In fact, if someone presented such code for review (and
    assuming I noticed the commas!) I'd have to consider whether it was
    done maliciously, intentionally deceptively, due to incompetence, or
    smart-arse coding. In all my C coding experience, I can't recall
    ever coming across a single situation when I thought the use of the
    comma operator was appropriate in the kind of code I work with.

    Wow! That's catastrophically bad.

    As it has been stated many times before, both C and C++ are programming
    languages that embrace both statement-level and expression-level
    programming. Expression-level programming (e.g. where `?:` is used for
    branching and `,` for sequencing) is a very valuable and massively
    important programming paradigm in these languages. The fact that
    elaborate expression-level programming is not in any way abandoned or
    shunned today is pretty obvious in C++, since C++ took major steps
    lately to develop its expression-level capabilities. But it has always
    been and will always remain important in C as well.

    No, expression-level programming has always been and will likely always
    remain a very minor part of C programming. Yes, some people make use of
    the comma operator. Some people do so extensively - and they are often,
    but not necessarily, considered "smart-arse" programmers rather than
    "smart" programmers. If the comma operator were removed from the C
    language, I guess some 95% of programmers would barely notice - at
    worst, they would have to add an extra line inside an occasional "for"
    loop. (The ternary operator is used much more.)
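
The "extra line inside an occasional for loop" can be made concrete with a small sketch. The two functions below are hypothetical illustrations (the names `reverse_with_comma` and `reverse_without_comma` are not from the thread) and are intended to behave identically.

```c
#include <stddef.h>

/* Uses the comma operator in both the init and the step clause. */
static void reverse_with_comma(int *v, size_t n)
{
    size_t i, j;

    if (n == 0)
        return;
    for (i = 0, j = n - 1; i < j; i++, j--) {
        int t = v[i];
        v[i] = v[j];
        v[j] = t;
    }
}

/* Same loop without the comma operator: the second index is set up
 * before the loop and stepped in the body. */
static void reverse_without_comma(int *v, size_t n)
{
    size_t i = 0;
    size_t j;

    if (n == 0)
        return;
    j = n - 1;
    for (; i < j; i++) {
        int t = v[i];
        v[i] = v[j];
        v[j] = t;
        j--;    /* the extra line that replaces "i++, j--" */
    }
}
```

Note that a comma between declarators, as in `size_t i, j;`, is a separator rather than the comma operator, so it would be unaffected either way.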

    And sometimes, excessive use of the comma operator causes
    compiler failures.

    That is also an issue in the world of small-systems embedded programming.

    While a lot of it these days is on ARM, and most of that is done using
    gcc, there are hundreds of C compilers of varying quality (and price,
    which is no indication of quality) for embedded systems. Many of these
    other toolchains have bugs, non-conformities, inconsistencies and
    weaknesses. (gcc is not perfect either!) People programming for 8-bit
    and 16-bit microcontrollers using such tools will - and should - use a
    conservative subset of the C language. Obscure and rarely used features
    of the language will be avoided, and code will be written in a simpler
    manner. You don't write code in a way that increases the risk of
    catching a bug in a poorly tested part of the compiler, or in a way that
    might lead to unexpectedly inefficient results.

    With 32-bit ARM now dominating the industry, along with more reliable
    tools (primarily gcc, with clang a distant second), there is much less
    need to pander to flaws in the toolchain but you still need to consider
    weaknesses in humans. When you are writing software that is intended to
    run continuously without problems for years, or where a misunderstanding
    by a new maintainer a decade later can lead to safety risks, you don't
    write "cool" code!

    It is well known that "Debugging is twice as hard as writing the code in
    the first place. Therefore, if you write the code as cleverly as
    possible, you are, by definition, not smart enough to debug it." I
    would also say that understanding and maintaining other people's code is
    often a lot more than twice as hard as writing the code yourself. I aim
    to code accordingly.


    cfront generated the comma operator extensively, and expression trees
    would grow to very large sizes. There was a bug in PCC (for the
    88100) where it would run out of temporary registers while generating
    code for some cfront generated comma expressions (which were -far- from
    human readable). I had to fix the temporary register allocation
    code in PCC to spill registers when the Sethi-Ullman number for an
    expression exceeded the number of registers.

    That was circa 1990, and I've generally not found any arguments
    favoring their general use persuasive in the years since, including
    Andrey's and Kaz's responses recently posted here.

    The simple fact that experienced programmers that read this usenet
    newsgroup missed the comma operators in the original example speaks
    volumes.


    Of course people are experienced in different things - "programming",
    even when limited to a single language, is a broad field.

  • From Scott Lurndal@21:1/5 to Keith Thompson on Fri May 30 14:29:22 2025
    Keith Thompson <[email protected]> writes:
    [email protected] (Scott Lurndal) writes:
    [...]
    And sometimes, excessive use of the comma operator causes
    compiler failures.

    cfront generated the comma operator extensively, and expression trees
    would grow to very large sizes. There was a bug in PCC (for the
    88100) where it would run out of temporary registers while generating
    code for some cfront generated comma expressions (which were -far- from
    human readable). I had to fix the temporary register allocation
    code in PCC to spill registers when the Sethi-Ullman number for an
    expression exceeded the number of registers.

    That was circa 1990, and I've generally not found any arguments
    favoring their general use persuasive in the years since, including
    Andrey's and Kaz's responses recently posted here.

    So a compiler you used circa 1990 had problems with comma expressions.

    That's hardly an argument against using comma operators today.

    True, it was an anecdote, not an argument, which conditioned
    my opinion of comma operators.


    The simple fact that experienced programmers that read this usenet
    newsgroup missed the comma operators in the original example speaks
    volumes.

    *That's* a valid argument.

    Indeed.

  • From Tim Rentsch@21:1/5 to Andrey Tarasevich on Fri Jun 6 17:44:14 2025
    Andrey Tarasevich <[email protected]> writes:

    On Tue 5/6/2025 10:36 AM, Waldek Hebisch wrote:

    Keith Thompson <[email protected]> wrote:

    Andrey Tarasevich <[email protected]> writes:
    [...]

    #include <stdio.h>

    struct S { int a[10]; };

    int main()
    {
        struct S a, b = { 0 };
        int *pa, *pb, *pc;

        pa = &a.a[5],
        pb = &b.a[5],
        pc = &(a = b).a[5],
        printf("%p %p %p\n", (void *) pa, (void *) pb, (void *) pc);
    }

    This version has no UB.

    I believe it does. pc points to an element of an object with
    temporary lifetime. The value of pc is then used after the object
    it points to has reached the end of its lifetime. At that point,
    pc has an indeterminate value.

    N3096 6.2.4p2: "If a pointer value is used in an evaluation after
    the object the pointer points to (or just past) reaches the end of
    its lifetime, the behavior is undefined. The representation of a
    pointer object becomes indeterminate when the object the pointer
    points to (or just past) reaches the end of its lifetime."

    Note commas above. Assignment to pc and call to printf are parts
    of a single expression, so use of pc is within lifetime of the
    temporary object.

    Exactly. I thought the nature of the corrections I made (i.e. the
    deliberate usage of comma operator) would be strikingly obvious to the
    participants of the thread. But alas...

    My own reaction is that the changes were not by themselves
    strikingly obvious. But in combination with the explicit
    statement that "This version has no UB" it seems obvious
    enough.
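
The lifetime rule at issue can be sketched in isolation. In C11 and later, a non-lvalue expression of struct type that contains an array member (such as the result of `a = b` or of a function call) refers to an object with temporary lifetime, which ends when the enclosing full expression ends. The sketch below assumes that reading; `make`, `sum`, `sum_of_assigned`, and `sum_of_named` are hypothetical names used only for illustration.

```c
struct S { int a[10]; };

/* Returns a struct whose array holds base, base + 1, ..., base + 9. */
static struct S make(int base)
{
    struct S s;
    for (int i = 0; i < 10; i++)
        s.a[i] = base + i;
    return s;
}

static int sum(const int *p, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* The pointer into the temporary produced by (a = b) is consumed by
 * sum() inside the same full expression, so the temporary is still
 * alive when it is read. */
static int sum_of_assigned(void)
{
    struct S a, b = make(1);
    return sum((a = b).a, 10);
}

/* Conservative alternative: assign first, then point into the named
 * object, whose lifetime does not depend on expression boundaries. */
static int sum_of_named(void)
{
    struct S a, b = make(1);
    a = b;
    return sum(a.a, 10);
}
```

Storing `(a = b).a` in a pointer variable and dereferencing it in a later statement would use the pointer after the temporary's lifetime has ended, which is the case the quoted standard text warns about.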
