• Re: "The provenance memory model for C", by Jens Gustedt

    From Kaz Kylheku@21:1/5 to Alexis on Wed Jul 2 13:10:05 2025
    On 2025-07-02, Alexis <[email protected]> wrote:

    "This work has finally resulted in the publication of an international standard, Technical Specification ISO/IEC TS 6010 (edited by Henry
    Kleynhans, Bloomberg, UK) ...

    OMG, it's a completely idiotic document. What it is is a kind of patch
    against a specific version of ISO C, written in plain language rather
    than in diff format. Like "replace this paragraph with this one, add
    this sentence after that one, ...".

    What the actual fuck? How will that be maintainable going forward, first
    of all.

    You can't follow what this is without applying the patch: obtaining
    the exact ISO C standard that it targets and performing the edits.

    Almost nobody is going to do that.

    Right off the bat I spotted pointless shit in it that has nothing to do
    with provenance:

    6.4.5 Equality operators

    1 In section 6.5.9 Equality operators, add the following after the first
    sentence of paragraph 3:

    2 None of the operands shall be an invalid pointer value.

    I don't have confidence in an author's understanding of C, if they
    believe that ISO C defines the behavior of invalid pointers being
    compared, such that this needs to be rectified by a private "patch"
    of the text.

    The concept of pointer provenance can be expressed other than
    as a textual patch against ISO C.

    It can be regarded as a language extension and documented similarly
    to how a sane compiler documentor would do it.

    "In this article, I will try to explain what this is all about, namely
    on how a provenance model for pointers interferes with alias analysis of modern compilers.

    Well, no shit; provenance is often dynamic; whereas aliasing analysis
    wants to be static.

    For those that are not fluent with the terminology or
    the concept we have a short intro what pointer aliasing is all about, a review of existing tools to help the compiler and inherent difficulties
    and then the proposed model itself. At the end there is a brief takeaway
    that explains how to generally avoid complications and loss of
    optimization opportunities that could result from mis-guided aliasing analysis."

    If you think that certain code could go faster because certain suspected aliasing isn't actually taking place, then since C99 you were able to
    spin the roulette wheel and use "restrict".

    So the aliasing analysis and its missed opportunities are the
    programmer's responsibility.

    It's always better for the machine to miss opportunities than to miss
    compile. :)

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to BGB on Wed Jul 9 11:41:28 2025
    On 09/07/2025 04:39, BGB wrote:
    On 7/2/2025 8:10 AM, Kaz Kylheku wrote:
    On 2025-07-02, Alexis <[email protected]> wrote:

    ...


    I don't have confidence in an author's understanding of C, if they
    believe that ISO C defines the behavior of invalid pointers being
    compared, such that this needs to be rectified by a private "patch"
    of the text.


    You might not be aware of it, but the author Jens Gustedt is a member of
    the C standards committee, and has been for some time. He is the most
    vocal, public and active member. I think that suggests he has quite a
    good understanding of C and the ISO standards! Not everyone agrees
    about his ideas and suggestions about how to move C forward - but that's
    fine (and it's fine by Jens, from what I have read). That's why there
    is a standards committee, with voting, rather than a BDFL.

    The concept of pointer provenance can be expressed other than
    as a textual patch against ISO C.


    There have been plenty of papers and blogs written about pointer
    provenance (several by Gustedt) and how it could work. It's not a very
    easy thing to follow in any format. A patch to current C standards is
    perhaps the least easy to follow, but it is important for how the
    concept could be added to C.

    It can be regarded as a language extension and documented similarly
    to how a sane compiler documentor would do it.

    "In this article, I will try to explain what this is all about, namely
    on how a provenance model for pointers interferes with alias analysis of
    modern compilers.

    Well, no shit; provenance is often dynamic; whereas aliasing analysis
    wants to be static.

    For those that are not fluent with the terminology or
    the concept we have a short intro what pointer aliasing is all about, a
    review of existing tools to help the compiler and inherent difficulties
    and then the proposed model itself. At the end there is a brief takeaway
    that explains how to generally avoid complications and loss of
    optimization opportunities that could result from mis-guided aliasing
    analysis."

    If you think that certain code could go faster because certain suspected
    aliasing isn't actually taking place, then since C99 you were able to
    spin the roulette wheel and use "restrict".


    "restrict" can certainly be useful in some cases. There are also dozens
    of compiler extensions (such as gcc attributes) for giving the compiler
    extra information about aliasing.
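
    A minimal sketch of what the "restrict" promise looks like in practice. The function name `scale` and its parameters are made up for illustration. The qualifier tells the compiler that, inside the function, `dst` and `src` are not used to reach the same elements, so it is free to keep values in registers or vectorise the loop. If a caller breaks that promise the behaviour is undefined, which is the roulette-wheel part.

```c
#include <stddef.h>

/* Hypothetical helper: dst[i] = k * src[i] for i in [0, n).
   The restrict qualifiers are a promise made by the caller that the
   two arrays do not overlap; the compiler does not check it. */
void scale(float *restrict dst, const float *restrict src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

    Calling it with two distinct arrays is fine; calling `scale(a + 1, a, k, n)` on overlapping storage is exactly the kind of over-use that makes the code risky.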

    So the aliasing analysis and its missed opportunities are the
    programmer's responsibility.

    It's always better for the machine to miss opportunities than to miss
    compile. :)


    Agreed.

    It is always better for the toolchain to be able to optimise
    automatically than to require manual intervention by the programmer.
    (It should go without saying that optimisations are only valid if they
    do not affect the observable behaviour of correct code.) Programmers
    are notoriously bad at figuring out what will affect their code
    efficiency, and will either under-use "restrict" where it could clearly
    be safely used to speed up code, or over-use it resulting in risky code.

    If the compiler can't be sure that accesses don't alias, then of course
    it should assume that aliasing is possible.

    The idea of pointer provenance is to let compilers (and programmers!)
    have a better understanding of when accesses are guaranteed to be
    alias-free, when they are guaranteed to be aliasing, and when there are
    no guarantees. This is useful for optimisation and program analysis
    (including static error checking). The more information the compiler
    has, the better.
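
    One small case where the alias-free guarantee is easy to see, as a sketch (the function name `store_then_read` is invented here): a local object whose address is never taken cannot be reached through any pointer that comes in from outside, whatever numeric address that pointer happens to hold.

```c
/* Hypothetical example: `local` never has its address taken, so no
   pointer passed in by the caller can have been derived from it. */
int store_then_read(int *p)
{
    int local = 1;
    *p = 2;        /* cannot modify `local` in any valid program */
    return local;  /* so a compiler may treat this as `return 1;` */
}
```

    The harder cases that a provenance model has to settle are the ones where a pointer is rebuilt from an integer or from bytes, and it is no longer obvious which object it was derived from.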


    In my compiler, the default was to use a fairly conservative aliasing strategy.

    ...
    With pointer operations, all stores can be assumed potentially aliasing unless restrict is used, regardless of type.


    C does not require that. And it is rare in practice, IME, for code to
    actually need to access the same data through different lvalue types
    (other than unsigned char). It is rarer still for it not to be handled
    better using type-punning unions or memcpy() - assuming the compiler
    handles memcpy() decently.

    Equally, this means that using type-based alias analysis generally gives
    only small efficiency benefits in C code (but more in C++). The
    majority of situations where alias analysis and a compiler knowledge of
    no aliasing (or always aliasing) would make a difference, are between
    pointers or other lvalues of compatible types. That is why provenance
    tracking can have potentially significant benefits.

  • From David Brown@21:1/5 to BGB on Thu Jul 10 11:34:26 2025
    On 10/07/2025 04:28, BGB wrote:
    On 7/9/2025 4:41 AM, David Brown wrote:
    On 09/07/2025 04:39, BGB wrote:
    On 7/2/2025 8:10 AM, Kaz Kylheku wrote:
    On 2025-07-02, Alexis <[email protected]> wrote:

    ...

    There have been plenty of papers and blogs written about pointer
    provenance (several by Gustedt) and how it could work.  It's not a
    very easy thing to follow in any format.  A patch to current C
    standards is perhaps the least easy to follow, but it is important for
    how the concept could be added to C.


    Admittedly, as of yet, I haven't quite figured out what exactly
    provenance is supposed to be, or how it is supposed to work in practice.


    I've read a bit, but I think it would take quite an effort to understand
    the details.

    As a compiler user (albeit one with an interest in compilers and code generation), rather than a compiler developer, my attitude to writing C
    code will be the same if and when pointer provenance becomes part of the
    C model and C compiler optimisations - don't lie to your compiler. If
    you want to do weird stuff behind the compiler's back (and that is
    certainly possible in embedded development), use "volatile" accesses in
    the right places. So for me, in practical use, pointer provenance will
    simply mean that the compiler can do a bit more optimisation with less
    manual work - and that's a nice thing. (I'll still be interested in how
    it works, but that's for fun, not for real work.)


    If you think that certain code could go faster because certain
    suspected
    aliasing isn't actually taking place, then since C99 you were able to
    spin the roulette wheel and use "restrict".


    "restrict" can certainly be useful in some cases.  There are also
    dozens of compiler extensions (such as gcc attributes) for giving the
    compiler extra information about aliasing.


    And, the annoyance of them being compiler dependent...

    Sure. "restrict" is, of course, not compiler dependent - but the effect
    it has on optimisation is compiler dependent.

    Often you can also get improved results by manually "caching" data in
    local variables, instead of using pointer or array access directly, thus avoiding any extra memory accesses the compiler has to put in just in
    case pointers alias. But code is neater if you don't have to do that
    kind of thing.
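
    A sketch of the manual "caching" idea, using a made-up function `sum_into`. Accumulating directly into `*total` inside the loop can force a load and a store per iteration, because the compiler has to allow for `total` pointing into `data`. Copying to a local first removes that doubt.

```c
#include <stddef.h>

/* Hypothetical helper: add n elements of data to *total. */
void sum_into(int *total, const int *data, size_t n)
{
    int acc = *total;              /* read the target once */
    for (size_t i = 0; i < n; i++)
        acc += data[i];            /* the loop touches only data[] and a local */
    *total = acc;                  /* write the target once */
}
```

    Note that this changes the result in the (presumably unintended) case where `total` really does point at an element of `data`, which is the same assumption "restrict" would express.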



    So the aliasing analysis and its missed opportunities are the
    programmer's responsibility.

    It's always better for the machine to miss opportunities than to miss
    compile. :)


    Agreed.

    It is always better for the toolchain to be able to optimise
    automatically than to require manual intervention by the programmer.
    (It should go without saying that optimisations are only valid if they
    do not affect the observable behaviour of correct code.)  Programmers
    are notoriously bad at figuring out what will affect their code
    efficiency, and will either under-use "restrict" where it could
    clearly be safely used to speed up code, or over-use it resulting in
    risky code.

    If the compiler can't be sure that accesses don't alias, then of
    course it should assume that aliasing is possible.

    The idea of pointer provenance is to let compilers (and programmers!)
    have a better understanding of when accesses are guaranteed to be
    alias- free, when they are guaranteed to be aliasing, and when there
    are no guarantees.  This is useful for optimisation and program
    analysis (including static error checking).  The more information the
    compiler has, the better.


    That is the idea at least.

    Though, if one assumes the compiler has non-local visibility, this is a problem.

    Granted, as long as one can keep using more traditional semantics,
    probably OK.

    Of course compilers can (and must!) fall back to the "assume accesses
    might alias" approach when they don't have the extra information. But
    at least for code in the same compilation, they can do better.

    And there is a trend amongst those wanting higher performance to use
    link-time optimisation, whole-program optimisation, or similarly named techniques to share information across units. Traditional separate
    compilation to object files then linking by identifier name only is a
    nice clear model, but hugely limiting for both optimisation and static
    error checking.




    In my compiler, the default was to use a fairly conservative aliasing
    strategy.

    ...
    With pointer operations, all stores can be assumed potentially
    aliasing unless restrict is used, regardless of type.


    C does not require that.  And it is rare in practice, IME, for code to
    actually need to access the same data through different lvalue types
    (other than unsigned char).  It is rarer still for it not to be
    handled better using type-punning unions or memcpy() - assuming the
    compiler handles memcpy() decently.


    I take a conservative approach because I want the compiler to be able to
    run code that assumes traditional behavior (like that typical of 1990s
    era compilers, or MSVC).

    Please don't call this "traditional behaviour" of compilers - be honest,
    and call it limited optimisation and dumb translation. And don't call
    it "code that assumes traditional behaviour" - call it "code written by
    people who don't really understand the language". Code which assumes
    you can do "extern float x; unsigned int * p = (unsigned int *) &x;" is
    broken code. It always has been, and always will be - even if it does
    what the programmer wanted on old or limited compilers.
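
    For comparison, a sketch of the union form of the same reinterpretation. The name `float_bits` is made up, and the code assumes a 32-bit float (checked at compile time). Reading a union member other than the one last stored reinterprets the bytes in C; this is not valid in C++.

```c
#include <stdint.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

/* Hypothetical helper: return the object representation of f as an
   unsigned 32-bit integer, using a type-punning union rather than a
   pointer cast. */
static inline uint32_t float_bits(float f)
{
    union { float f; uint32_t u; } pun = { .f = f };
    return pun.u;
}
```

    On an IEEE 754 target `float_bits(1.0f)` gives 0x3f800000, and no address of `f` ever needs to be taken.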

    There were compilers in the 1990's that did type-based alias analysis,
    and many other "modern" optimisations - I have used at least one.

    It's okay to be conservative in a compiler (especially when high
    optimisation is really difficult!). It's okay to have command-line
    switches or pragmas to support additional language semantics such as
    supporting access via any lvalue type, or giving signed integer
    arithmetic two's complement wrapping behaviour. It's okay to make these
    the defaults.

    But it is not okay to encourage code to make these compiler-specific assumptions without things like a pre-processor check for the specific
    compiler and pragmas to explicitly set the required compiler switches.
    It is not okay to excuse bad code as "traditional style" - that's an
    insult to people who have been writing good C code for decades.



    Granted, it is a tradeoff that a lot of this code needs to be modified
    to work on GCC and Clang (absent the usual need for "-fwrapv -fno-strict-aliasing" options).

    Granted, there is a command-line option to enable TBAA semantics, just
    it is not the default option in this case (so, in BGBCC, TBAA is opt-in; rather than opt-out in GCC and Clang).

    BGBCC's handling of memcpy is intermediate:
    It can turn it into loads and stores;
    But, it can't turn it into a plain register move;
    Taking the address of a variable will also cause the variable to be loaded/stored every time it is accessed in this function (regardless of
    where it is accessed in said function).

    So:
      memcpy(&i, &f, 8);
    Will still use memory ops and wreck the performance of both the i and f variables.

    Well, there you have scope for some useful optimisations (more useful
    than type-based alias analysis). memcpy does not need to use memory
    accesses unless real memory accesses are actually needed to give the
    observable effects specified in the C standards.

    unsigned int f_to_u(float f) {
        unsigned int u;
        memcpy(&u, &f, sizeof(f));
        return u;
    }

    gcc compiles that to:

    f_to_u:
        movd eax, xmm0
        ret

    Meanwhile:
      i=*(uint64_t *)(&f);
    Will only wreck the performance of 'f'.


    The best option for performance in BGBCC is one of either:
      i=__float64_getbits(f);  //compiler intrinsic
      i=(__m64)f;              //__m64 and __m128 do a raw-bits cast.

    Though, these options don't exist in the other compilers.

    Such compiler extensions can definitely be useful, but it's even better
    if a compiler can optimise standard code - that way, programmers can
    write code that works correctly on any compiler and is efficient on the compilers that they are most interested in.


    Implicitly, casting via __m64 or __m128 is a double-cast though. In
    BGBCC, these types don't natively support any operators (so, they are basically sort of like the value-equivalents of "void *").


    So:
      memcpy(&i, &f, 8);      //best for GCC and Clang
      i=*(uint64_t *)(&f);   //best for MSVC, error-prone in GCC
      i=(__m64)f;             //best for BGBCC, N/A for MSVC or GCC

    In a lot of cases, these end up with wrappers.

    GCC:
      static inline uint64_t getU64(void *ptr)
      {
        uint64_t v;
        memcpy(&v, ptr, 8);
        return(v);
      }
    MSVC or BGBCC:
      #define getU64(ptr)  (*((volatile uint64_t *)(ptr)))

    Though, have noted that volatile usually works in GCC as well, though in
    GCC there is no obvious performance difference between volatile and
    memcpy, whereas in MSVC the use of a volatile cast is faster.

    In gcc, a memcpy here will need to use a single memory read unless
    "getU64" is called with the address of a variable that is already in a
    register (in which case you get a single register move instruction). A volatile read will also do a single memory read - but it might hinder
    other optimisations by limiting the movement of code around.

    On MSVC, last I saw (which is a long time ago), any use of "memcpy" will
    be done using an external library function (in a DLL) for generic
    memcpy() use - clearly that will have /massive/ overhead in comparison
    to the single memory read needed for a volatile access.



    Don't want to use static inline functions in BGBCC though, as it still doesn't support inline functions in the general case.


  • From David Brown@21:1/5 to BGB on Fri Jul 11 10:48:23 2025
    On 11/07/2025 04:09, BGB wrote:
    On 7/10/2025 4:34 AM, David Brown wrote:
    On 10/07/2025 04:28, BGB wrote:
    On 7/9/2025 4:41 AM, David Brown wrote:
    On 09/07/2025 04:39, BGB wrote:
    On 7/2/2025 8:10 AM, Kaz Kylheku wrote:
    On 2025-07-02, Alexis <[email protected]> wrote:

    ...


    Please don't call this "traditional behaviour" of compilers - be
    honest, and call it limited optimisation and dumb translation.  And
    don't call it "code that assumes traditional behaviour" - call it
    "code written by people who don't really understand the language".
    Code which assumes you can do "extern float x; unsigned int * p =
    (unsigned int *) &x;" is broken code.  It always has been, and always
    will be - even if it does what the programmer wanted on old or limited
    compilers.

    There were compilers in the 1990's that did type-based alias analysis,
    and many other "modern" optimisations - I have used at least one.


    Either way, MSVC mostly accepts this sorta code.

    I remember reading in a MSVC blog somewhere that they had no plans to
    introduce type-based alias analysis in the compiler. The same blog
    article announced their advanced new optimisations that treat signed
    integer overflow as undefined behaviour and explained that they'd been
    doing that for years in a few specific cases. I think it is fair to
    assume there is a strong overlap between the programmers who think MSVC,
    or C and C++ in general, have two's complement wrapping of signed
    integers when the hardware supports it, and those who think pointer casts
    let you access any data.
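
    For code that genuinely wants wrapping, a sketch of how to ask for it without relying on signed overflow (the name `add_wrap` is made up). Unsigned arithmetic wraps by definition; the conversion of the result back to int32_t is implementation-defined in ISO C, and gcc documents it as reduction modulo 2^N.

```c
#include <stdint.h>

/* Hypothetical helper: two's complement wrapping addition.
   The addition is done in uint32_t, where wrap-around is defined;
   the final conversion relies on the implementation reducing the
   value modulo 2^32 (an assumption, true for gcc). */
int32_t add_wrap(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}
```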

    And despite the blog, I don't believe MSVC will be restricted that way
    indefinitely. After all, they encourage the use of clang/llvm for C
    programming, and that does do type-based alias analysis and optimisation.

    The C world is littered with code that "used to work" or "works when optimisation is not used" because it relied on shite like this -
    unwarranted assumptions about limitations in compiler technology.


    Also I think a lot of this code was originally written for compilers
    like Watcom C and similar.


    Have noted that there are some behavioral inconsistencies, for example:
    Some old code seems to assume that in x<<y, y always shifts left, but
    modulo the width of the type. Except, when both x and y are constant,
    code seems to expect it as if it were calculated with a wider type, and
    where negative shifts go in the opposite direction, ... with the result
    then being converted to the final type.

    Meanwhile, IIRC, GCC and Clang raise an error if trying to do a large or negative shift. MSVC will warn if the shift is large or negative.

    Though, in most cases, if the shift is larger than the width of the
    type, or negative, it is usually a programming error.
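
    In standard C a shift by a negative count, or by a count greater than or equal to the width of the promoted type, is undefined, so code that wants one particular behaviour has to spell it out. A sketch of the "as if computed in a wider type, negative counts shift the other way" reading described above, with a made-up name `shl32`:

```c
#include <stdint.h>

/* Hypothetical helper: shift x left by n, where n may be any int.
   Counts of 32 or more (in either direction) give 0; negative counts
   shift right instead. Every shift actually executed has a count in
   the range 0..31, so the behaviour is always defined. */
uint32_t shl32(uint32_t x, int n)
{
    if (n <= -32 || n >= 32)
        return 0;
    return n >= 0 ? x << n : x >> -n;
}
```

    A "modulo the width" variant would instead mask the count with `n & 31`; which of the two old code expects has to be checked case by case.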


    It's okay to be conservative in a compiler (especially when high
    optimisation is really difficult!).  It's okay to have command-line
    switches or pragmas to support additional language semantics such as
    supporting access via any lvalue type, or giving signed integer
    arithmetic two's complement wrapping behaviour.  It's okay to make
    these the defaults.

    But it is not okay to encourage code to make these compiler-specific
    assumptions without things like a pre-processor check for the specific
    compiler and pragmas to explicitly set the required compiler switches.
    It is not okay to excuse bad code as "traditional style" - that's an
    insult to people who have been writing good C code for decades.


    A lot of the code I have seen from the 90s was written this way.


    Yes. A lot of code from the 90's was written badly. A lot of code today
    is written badly. Just because a lot of code was, and still is, written
    that way does not stop it being bad code.


    Though, a lot of it comes from a few major sources:
      id Software;
        Can mostly be considered "standard" practice,
        along with maybe Linux kernel, ...
      Apogee Software
        Well, some of this code is kinda bad.
        This code tends to be dominated by global variables.
        Also treating array bounds as merely a suggestion.
      Raven Software
        Though, most of this was merely modified ID Software code.

    Early on, I think I also looked a fair bit at the Linux kernel, and also
    some of the GNU shell utilities and similar (though, the "style" was
    very different vs either the Linux kernel or ID code).


    The Linux kernel is not a C style to aspire to. But they do at least
    try to make such assumptions explicit - the kernel build process makes
    it very clear that it requires the "-fno-strict-aliasing" flag and can
    only be correctly compiled by a specific range of gcc versions (and I
    think experimentally, icc and clang). Low-level and systems programming
    is sometimes very dependent on the details of the targets, or the
    details of particular compilers - that's okay, as long as it is clear in
    the code and the build instructions. Then the code (or part of it at
    least) is not written in standard C, but in gcc-specific C or some other
    non-standard dialect. It is not, however, "traditional C".



    Early on, I had learned C partly by tinkering around with id's code and trying to understand what secrets it contained.


    But, alas, an example from Wikipedia shows a relevant aspect of id's style: https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code

    Which is, at least to me, what I consider "traditional".

    The declaration of all the variables at the top of the function is
    "traditional". The reliance on a specific format for floating point is
    system-dependent code (albeit one that works on a great many systems).
    The use of "long" for a 32-bit integer is both "traditional" /and/
    system-dependent. (Though it is possible that earlier in the code there
    are pre-processor checks on the size of "long".) The use of signed
    integer types for bit manipulation is somewhere between "traditional"
    and "wrong". The use of pointer casts instead of a type-punning union
    is wrong. The lack of documentation and comments, use of an unexplained
    magic number, and failure to document or comment the range for which the
    algorithm works and its accuracy limitations are also very traditional -
    a programming tradition that remains strong today.
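
    As a sketch of how the same trick can be written without those problems (this is a rewrite for illustration, not the original id code; the name `rsqrt_approx` is made up): a fixed-width unsigned type for the bit manipulation, memcpy for the reinterpretation, and the assumptions stated up front.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

/* Approximates 1/sqrt(x) for positive, normal x.
   Assumes float is IEEE 754 binary32. The constant 0x5f3759df gives an
   initial guess by manipulating the exponent and mantissa bits; one
   Newton-Raphson step then refines it. The result is an approximation
   only, typically good to a fraction of a percent. */
float rsqrt_approx(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);              /* float -> bits */
    bits = UINT32_C(0x5f3759df) - (bits >> 1);   /* initial guess */
    float y;
    memcpy(&y, &bits, sizeof y);                 /* bits -> float */
    return y * (1.5f - 0.5f * x * y * y);        /* one Newton step */
}
```

    With a compiler that handles small fixed-size memcpy well, the two copies should reduce to register moves, so nothing is lost compared with the pointer-cast version.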

    It is worth remembering that game code (especially commercial game code)
    is seldom written with a view to portability, standard correctness, or
    future maintainability. It is written to be as fast as possible using
    the compiler chosen at the time, to be built and released as a binary in
    the shortest possible time-to-market.


    So:
       memcpy(&i, &f, 8);
    Will still use memory ops and wreck the performance of both the i and
    f variables.

    Well, there you have scope for some useful optimisations (more useful
    than type-based alias analysis).  memcpy does not need to use memory
    accesses unless real memory accesses are actually needed to give the
    observable effects specified in the C standards.


    Possibly, but by the stage we know that it could be turned into a
    reg-reg move (in the final code generation), most of the damage has
    already been done.

    Basically, it would likely be necessary to detect and special case this scenario at the AST level(probably by turning it into a cast or
    intrinsic). But, usually one doesn't want to add too much of this sort
    of cruft to the AST walk.


    One thing to remember is that functions like "memcpy" don't have to be
    treated as normal functions. You can handle it as a keyword in your
    compiler if that's easiest. You can declare it as a macro in your
    <string.h>. You can combine these, and have compiler-specific
    extensions (keywords, attributes, whatever) and have the declaration as
    a function with attributes. Your key aim is to spot cases where there
    is a small compile-time constant on the size of the memcpy.


    But, then, apart from code written to assume GCC or similar, most of the
    code doesn't use memcpy in this way.

    So, it would mostly only bring significant advantage if pulling code in
    from GCC land.

    How well do you handle type-punning unions? Do they need to be moved
    out to the stack, or can they be handled in registers?


    unsigned int f_to_u(float f) {
         unsigned int u;
         memcpy(&u, &f, sizeof(f));
         return u;
    }

    gcc compiles that to :

    f_to_u:
         movd eax, xmm0
         ret


    Yeah, it is more clever here, granted.

    Meanwhile:
       i=*(uint64_t *)(&f);
    Will only wreck the performance of 'f'.


    The best option for performance in BGBCC is one of either:
       i=__float64_getbits(f);  //compiler intrinsic
       i=(__m64)f;              //__m64 and __m128 do a raw-bits cast.

    Though, these options don't exist in the other compilers.

    Such compiler extensions can definitely be useful, but it's even
    better if a compiler can optimise standard code - that way,
    programmers can write code that works correctly on any compiler and is
    efficient on the compilers that they are most interested in.


    Possibly.

    For "semi-portable" code, usually used MSVC style, partly as by adding 'volatile' it seemingly also works in GCC. Though, often with macro
    wrappers.

    Code that has to be widely portable, with an aim to being efficient on
    many compilers and correct on all, always ends up with macro wrappers
    for this kind of thing, defined conditionally according to compiler
    detection.
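
    A sketch of what such a conditionally defined wrapper might look like for the getU64 case discussed in this thread. The choice of test macros is an assumption: `_MSC_VER` is also defined by clang-cl, hence the extra `__clang__` check. The volatile-cast branch relies on MSVC accepting that access and on `ptr` being suitably aligned; it is not portable C.

```c
#include <stdint.h>
#include <string.h>

#if defined(_MSC_VER) && !defined(__clang__)
/* MSVC: the volatile cast was reported in this thread to be the
   faster form there. Compiler-specific, so it is fenced off. */
#define getU64(ptr) (*((const volatile uint64_t *)(ptr)))
#else
/* Everything else: memcpy is correct on any conforming compiler and
   is reduced to a single load or register move by gcc and clang. */
static inline uint64_t getU64(const void *ptr)
{
    uint64_t v;
    memcpy(&v, ptr, sizeof v);
    return v;
}
#endif
```

    Callers use `getU64(&d)` in both cases and never see which branch was chosen.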




    Implicitly, casting via __m64 or __m128 is a double-cast though. In
    BGBCC, these types don't natively support any operators (so, they are
    basically sort of like the value-equivalents of "void *").


    So:
       memcpy(&i, &f, 8);      //best for GCC and Clang
       i=*(uint64_t *)(&f);   //best for MSVC, error-prone in GCC
       i=(__m64)f;             //best for BGBCC, N/A for MSVC or GCC

    In a lot of cases, these end up with wrappers.

    GCC:
       static inline uint64_t getU64(void *ptr)
       {
         uint64_t v;
         memcpy(&v, ptr, 8);
         return(v);
       }
    MSVC or BGBCC:
       #define getU64(ptr)  (*((volatile uint64_t *)(ptr)))

    Though, have noted that volatile usually works in GCC as well, though
    in GCC there is no obvious performance difference between volatile
    and memcpy, whereas in MSVC the use of a volatile cast is faster.

    In gcc, a memcpy here will need to use a single memory read unless
    "getU64" is called with the address of a variable that is already in a
    register (in which case you get a single register move instruction).
    A volatile read will also do a single memory read - but it might
    hinder other optimisations by limiting the movement of code around.


    Possibly.

    When I tried benchmarking these before:
      GCC:
        Seemingly no difference between memcpy and volatile;

    As I explained, that is to be expected in cases where you can't get
    other optimisations that "volatile" would block. Usually simple timing benchmarks have fewer optimisation opportunities than real code.

      MSVC:
        Adding or removing volatile made no real difference;

    That will, of course, depend on the benchmark. A volatile access will
    not normally take more time than a non-volatile access. But
    non-volatile accesses can be re-ordered, combined, or omitted in ways
    that volatile accesses cannot.

        Using memcpy is slower.

    As I explained.

      BGBCC: Either memcpy or volatile carries an overhead.
        The use of volatile is basically a shotgun de-optimization;
        It doesn't know what to de-optimize, so goes naive for everything.


    Okay.


    On MSVC, last I saw (which is a long time ago), any use of "memcpy"
    will be done using an external library function (in a DLL) for
    generic memcpy() use - clearly that will have /massive/ overhead in
    comparison to the single memory read needed for a volatile access.


    It is slightly more clever now, but still not great.
      Will not (always) generate a library call.
      Though, in VS2008 or similar, was always still a library call.
        VS2010 and VS2013 IIRC might setup and use "REP MOVSB" instead.

    It will do it inline, but still often:
      Spill variables;
      Load addresses;
      Load from source;
      Store to destination;
      Load value from destination.

    What BGBCC gives here is basically similar.




    Don't want to use static inline functions in BGBCC though, as it
    still doesn't support inline functions in the general case.




  • From Waldek Hebisch@21:1/5 to David Brown on Sun Jul 20 00:21:41 2025
    David Brown <[email protected]> wrote:
    On 09/07/2025 04:39, BGB wrote:
    On 7/2/2025 8:10 AM, Kaz Kylheku wrote:
    On 2025-07-02, Alexis <[email protected]> wrote:

    ...


    I don't have confidence in an author's understanding of C, if they
    believe that ISO C defines the behavior of invalid pointers being
    compared, such that this needs to be rectified by a private "patch"
    of the text.


    You might not be aware of it, but the author Jens Gustedt is a member of
    the C standards committee, and has been for some time. He is the most
    vocal, public and active member. I think that suggests he has quite a
    good understanding of C and the ISO standards! Not everyone agrees
    about his ideas and suggestions about how to move C forward - but that's
    fine (and it's fine by Jens, from what I have read). That's why there
    is a standards committee, with voting, rather than a BDFL.

    The concept of pointer provenance can be expressed other than
    as a textual patch against ISO C.


    There have been plenty of papers and blogs written about pointer
    provenance (several by Gustedt) and how it could work. It's not a very
    easy thing to follow in any format. A patch to current C standards is
    perhaps the least easy to follow, but it is important for how the
    concept could be added to C.

    I looked at the blog post. About two thirds of it is explaining what
    I consider obvious. Later he makes some assumptions/rules and
    claims that they cover the segmented model. But the assumption:

    : Two pointer values are equal if they correspond to the same
    : abstract address.

    is problematic for 8086 segmentation (it would force "huge" style
    pointer comparison). It is probably unworkable for more abstract
    segmentation (like in the 286) when there are overlapping segments.
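To make the 8086 point concrete, here is a minimal sketch in C. The struct and function names are hypothetical and only model real-mode segment:offset arithmetic; this is not any particular compiler's far-pointer representation. Two different segment:offset pairs can name the same byte, so equality by abstract address requires normalizing both operands first, which is what "huge" style comparison does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an 8086 real-mode far pointer. */
struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Linear address: segment * 16 + offset, wrapped to 20 bits as on the 8086. */
static uint32_t far_linear(struct far_ptr p)
{
    return (((uint32_t)p.seg << 4) + p.off) & 0xFFFFFu;
}

/* Cheap comparison: looks at the representation only. */
static bool far_equal_repr(struct far_ptr a, struct far_ptr b)
{
    return a.seg == b.seg && a.off == b.off;
}

/* "huge" style comparison: normalize both operands, then compare.
   This is the comparison the quoted equality rule would require. */
static bool far_equal_abstract(struct far_ptr a, struct far_ptr b)
{
    return far_linear(a) == far_linear(b);
}
```

For example, 1234:0010 and 1235:0000 both map to linear address 0x12350, so they compare equal under the second function but not under the first.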

    He spends time talking about the XOR trick, but leaves a different
    (and IMO much more important) trick in undefined territory.
    Namely, modern ARM and RISC-V embedded processors are 32-bit,
    so need 32-bit pointers. But low-end processors frequently
    have tiny RAM that could be addressed using 16 bits. More
    precisely, one can use a base pointer initialized to the address
    of the start of RAM and access a memory location using a 16-bit
    offset from the start of RAM. AFAICS the definitions in the blog
    post put this strictly into undefined territory, but I expect this
    to work as intended in gcc.
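A minimal sketch of the base-plus-16-bit-offset trick, under a simplifying assumption: the "RAM" here is a single static array, so all pointer arithmetic stays inside one object and is well defined. The names `ram`, `ram_compress` and `ram_expand` are made up for illustration. On a real microcontroller the base would be a fixed address and the offsets would reach into separately declared objects, which is the case described above as falling into undefined territory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a small on-chip RAM (assumption of this sketch). */
static unsigned char ram[4096];

/* Compress a pointer into ram[] to a 16-bit offset from its start.
   The caller must pass a pointer into ram[]; the assert only documents
   that requirement. */
static uint16_t ram_compress(const void *p)
{
    ptrdiff_t d = (const unsigned char *)p - ram;
    assert(d >= 0 && (size_t)d < sizeof ram);
    return (uint16_t)d;
}

/* Expand a 16-bit offset back into a full-width pointer. */
static void *ram_expand(uint16_t off)
{
    assert(off < sizeof ram);
    return ram + off;
}
```

Storing `uint16_t` offsets instead of 32-bit pointers halves the size of every link in a data structure, which is the motivation on parts with only a few kilobytes of RAM.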

    Later he writes about exposure and synthesized pointers.
    That is rather natural, but I did not find an explicit
    statement of how exposure and synthesized pointers are
    related to aliasing. Maybe the intent is something like:
    "access via a synthesized pointer may alias access to
    any exposed storage instance". OTOH in cases like
    conversion to an offset with respect to some base and
    back we deal with synthesized pointers, but in principle
    the compiler could track bases and offsets and come up with
    quite good alias analysis.
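A minimal sketch of what exposure and synthesis look like in source code, using a round trip through `uintptr_t`. The function names are hypothetical, and the comments follow the reading guessed at above (a synthesized pointer may alias any exposed storage instance) rather than quoting the TS.

```c
#include <stdint.h>

/* Round-trips a pointer through an integer. */
static int *roundtrip(int *p)
{
    /* Converting to an integer exposes the storage instance p points to. */
    uintptr_t bits = (uintptr_t)(void *)p;

    /* Converting back synthesizes a pointer from the integer value. */
    return (int *)(void *)bits;
}

/* Stores through the synthesized pointer and reads the original object. */
static int store_through_roundtrip(void)
{
    int x = 1;
    int *q = roundtrip(&x);

    *q = 42;    /* must be assumed to alias x, because x was exposed */
    return x;   /* so the compiler may not fold this to 1 */
}
```

Here the compiler can easily see where `bits` came from; the tracking of bases and offsets suggested above would amount to doing the same when the integer has been through arithmetic in between.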

    More generally, the blog post looks like a very preliminary
    analysis that a compiler should do before further work on
    alias analysis. But the compiler writer presumably knows
    the target, so can make assumptions that better fit the
    actual situation than the assumptions made in the blog post.

    So, ATM it is not clear to me that putting such things in the
    standard adds value. It could if the standard formulated new
    aliasing rules, but I see no new aliasing rule in the blog
    post. And IMO new rules should be related to algorithms:
    without good algorithms, rules must either be conservative
    (disallowing optimizations) or risk breaking code.

    --
    Waldek Hebisch

  • From David Brown@21:1/5 to Waldek Hebisch on Sun Jul 20 05:09:02 2025
    On 20/07/2025 02:21, Waldek Hebisch wrote:
    David Brown <[email protected]> wrote:
    On 09/07/2025 04:39, BGB wrote:
    On 7/2/2025 8:10 AM, Kaz Kylheku wrote:
    On 2025-07-02, Alexis <[email protected]> wrote:

    ...


    I don't have confidence in an author's understanding of C, if they
    believe that ISO C defines the behavior of invalid pointers being
    compared, such that this needs to be rectified by a private "patch"
    of the text.


    You might not be aware of it, but the author Jens Gustedt is a member of
    the C standards committee, and has been for some time. He is the most
    vocal, public and active member. I think that suggests he has quite a
    good understanding of C and the ISO standards! Not everyone agrees
    about his ideas and suggestions about how to move C forward - but that's
    fine (and it's fine by Jens, from what I have read). That's why there
    is a standards committee, with voting, rather than a BDFL.

    The concept of pointer provenance can be expressed other than
    as a textual patch against ISO C.


    There have been plenty of papers and blogs written about pointer
    provenance (several by Gustedt) and how it could work. It's not a very
    easy thing to follow in any format. A patch to current C standards is
    perhaps the least easy to follow, but it is important for how the
    concept could be added to C.

    I looked at the blog post. About two thirds of it is explaining what
    I consider obvious. Later he makes some assumptions/rules and
    claims that they cover the segmented model. But the assumption:

    : Two pointer values are equal if they correspond to the same
    : abstract address.

    is problematic for 8086 segmentation (it would force "huge" style
    pointer comparison). It is probably unworkable for more abstract
    segmentation (like in the 286) when there are overlapping segments.


    Segmented memory models are ancient history, and basically irrelevant to
    new ideas and new C standard versions. Compilers for these models, if
    anyone ever makes new versions, can continue to use the old memory
    models. The provenance memory model is about making more accurate
    analysis for optimisation and static checking - if a compiler can't use
    it, that's okay.

    He spends time talking about the XOR trick, but leaves a different
    (and IMO much more important) trick in undefined territory.
    Namely, modern ARM and RISC-V embedded processors are 32-bit,
    so need 32-bit pointers. But low-end processors frequently
    have tiny RAM that could be addressed using 16 bits. More
    precisely, one can use a base pointer initialized to the address
    of the start of RAM and access a memory location using a 16-bit
    offset from the start of RAM. AFAICS the definitions in the blog
    post put this strictly into undefined territory, but I expect this
    to work as intended in gcc.

    That's not just for low-end microcontrollers. On PowerPC and POWER, it
    is normal to have a register for the "small data segment" pointer -
    small statically allocated data is placed in a 64 KB segment and
    addressed using the base register plus a 16-bit offset.

    More generally, it is perfectly normal for the same data to be accessed
    in different ways - absolute addresses, direct pointer registers,
    pointers to a struct and then with constant offset, and so on. The
    model and its implementations have to deal with that, or they will be
    useless on all targets.


    Later he writes about exposure and synthesized pointers.
    That is rather natural, but I did not find an explicit
    statement of how exposure and synthesized pointers are
    related to aliasing. Maybe the intent is something like:
    "access via a synthesized pointer may alias access to
    any exposed storage instance". OTOH in cases like
    conversion to an offset with respect to some base and
    back we deal with synthesized pointers, but in principle
    the compiler could track bases and offsets and come up with
    quite good alias analysis.

    More generally, the blog post looks like a very preliminary
    analysis that a compiler should do before further work on
    alias analysis. But the compiler writer presumably knows
    the target, so can make assumptions that better fit the
    actual situation than the assumptions made in the blog post.

    So, ATM it is not clear to me that putting such things in the
    standard adds value. It could if the standard formulated new
    aliasing rules, but I see no new aliasing rule in the blog
    post. And IMO new rules should be related to algorithms:
    without good algorithms, rules must either be conservative
    (disallowing optimizations) or risk breaking code.


    I think it will be a while before all this is ready for the standard, or
    for implementation.
