"This work has finally resulted in the publication of an international standard, Technical Specification ISO/IEC TS 6010 (edited by Henry
Kleynhans, Bloomberg, UK) ...
"In this article, I will try to explain what this is all about, namely
on how a provenance model for pointers interferes with alias analysis of modern compilers.
For those that are not fluent with the terminology or
the concept we have a short intro what pointer aliasing is all about, a review of existing tools to help the compiler and inherent difficulties
and then the proposed model itself. At the end there is a brief takeaway
that explains how to generally avoid complications and loss of
optimization opportunities that could result from mis-guided aliasing analysis."
On 7/2/2025 8:10 AM, Kaz Kylheku wrote:...
On 2025-07-02, Alexis <[email protected]> wrote:
I don't have confidence in an author's understanding of C, if they
believe that ISO C defines the behavior of invalid pointers being
compared, such that this needs to be rectified by a private "patch"
of the text.
The concept of pointer provenance can be expressed other than
as a textual patch against ISO C.
It can be regarded as a language extension and documented similarly
to how a sane compiler documentor would do it.
"In this article, I will try to explain what this is all about, namely
on how a provenance model for pointers interferes with alias analysis of
modern compilers.
Well, no shit; provenance is often dynamic; whereas aliasing analysis
wants to be static.
For those that are not fluent with the terminology or
the concept we have a short intro what pointer aliasing is all about, a
review of existing tools to help the compiler and inherent difficulties
and then the proposed model itself. At the end there is a brief takeaway
that explains how to generally avoid complications and loss of
optimization opportunities that could result from mis-guided aliasing
analysis."
If you think that certain code could go faster because certain suspected
aliasing isn't actually taking place, then since C99 you were able to
spin the roulette wheel and use "restrict".
So the aliasing analysis and its missed opportunities are the
programmer's responsibility.
It's always better for the machine to miss opportunities than to miss
compile. :)
Agreed.
In my compiler, the default was to use a fairly conservative aliasing strategy.
With pointer operations, all stores can be assumed potentially aliasing unless restrict is used, regardless of type.
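As an illustration of what "restrict" changes, here is a minimal sketch; the function names are made up for this example and do not come from any compiler discussed in the thread. In the first function a conservative compiler has to assume that a store to dst[i] may modify *scale, since both are float, so it reloads *scale on every iteration. In the second, the programmer promises that the objects do not overlap.

```c
#include <stddef.h>

/* Without restrict: a store to dst[i] may change *scale (same type, possibly
   the same object), so *scale has to be reloaded in each iteration. */
void scale_may_alias(float *dst, const float *src, const float *scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *scale;
}

/* With restrict: the caller promises that dst, src and scale refer to
   non-overlapping objects during the call, so *scale can be kept in a
   register. Breaking the promise is undefined behaviour. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    const float *restrict scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *scale;
}
```

Both functions give the same results when the arguments really are distinct; they can differ only when the caller passes overlapping objects, which is exactly the case the second version forbids.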
On 7/9/2025 4:41 AM, David Brown wrote:
On 09/07/2025 04:39, BGB wrote:
On 7/2/2025 8:10 AM, Kaz Kylheku wrote:...
On 2025-07-02, Alexis <[email protected]> wrote:
There have been plenty of papers and blogs written about pointer
provenance (several by Gustedt) and how it could work. It's not a
very easy thing to follow in any format. A patch to current C
standards is perhaps the least easy to follow, but it is important for
how the concept could be added to C.
Admittedly, as of yet, I haven't quite figured out what exactly
provenance is supposed to be, or how it is supposed to work in practice.
If you think that certain code could go faster because certain suspected
aliasing isn't actually taking place, then since C99 you were able to
spin the roulette wheel and use "restrict".
"restrict" can certainly be useful in some cases. There are also
dozens of compiler extensions (such as gcc attributes) for giving the
compiler extra information about aliasing.
And, the annoyance of them being compiler dependent...
So the aliasing analysis and its missed opportunities are the
programmer's responsibility.
It's always better for the machine to miss opportunities than to miss
compile. :)
Agreed.
It is always better for the toolchain to be able to optimise
automatically than to require manual intervention by the programmer.
(It should go without saying that optimisations are only valid if they
do not affect the observable behaviour of correct code.) Programmers
are notoriously bad at figuring out what will affect their code
efficiency, and will either under-use "restrict" where it could
clearly be safely used to speed up code, or over-use it resulting in
risky code.
If the compiler can't be sure that accesses don't alias, then of
course it should assume that aliasing is possible.
The idea of pointer provenance is to let compilers (and programmers!)
have a better understanding of when accesses are guaranteed to be
alias-free, when they are guaranteed to be aliasing, and when there
are no guarantees. This is useful for optimisation and program
analysis (including static error checking). The more information the
compiler has, the better.
That is the idea at least.
Though, if one assumes the compiler has non-local visibility, this is a problem.
Granted, as long as one can keep using more traditional semantics,
probably OK.
...
In my compiler, the default was to use a fairly conservative aliasing
strategy.
With pointer operations, all stores can be assumed potentially
aliasing unless restrict is used, regardless of type.
C does not require that. And it is rare in practice, IME, for code to
actually need to access the same data through different lvalue types
(other than unsigned char). It is rarer still for it not to be
handled better using type-punning unions or memcpy() - assuming the
compiler handles memcpy() decently.
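For readers who have not seen the two alternatives mentioned here, a minimal sketch follows. The function names are hypothetical, and the code assumes that float and uint32_t have the same size (checked at compile time).

```c
#include <stdint.h>
#include <string.h>

/* memcpy copies the object representation, so no lvalue of the "wrong"
   type is ever used to access the float. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t u;
    _Static_assert(sizeof u == sizeof f, "float is assumed to be 32 bits");
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Type-punning union: write one member, read the other. C permits this;
   the bytes are reinterpreted as the type of the member that is read. */
uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

On a target with IEEE 754 single-precision floats, both functions would return 0x3F800000 for 1.0f.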
I take a conservative approach because I want the compiler to be able to
run code that assumes traditional behavior (like that typical of 1990s
era compilers, or MSVC).
Granted, it is a tradeoff that a lot of this code needs to be modified
to work on GCC and Clang (absent the usual need for "-fwrapv -fno-strict-aliasing" options).
Granted, there is a command-line option to enable TBAA semantics, just
it is not the default option in this case (so, in BGBCC, TBAA is opt-in; rather than opt-out in GCC and Clang).
BGBCC's handling of memcpy is intermediate:
It can turn it into loads and stores;
But, it can't turn it into a plain register move;
Taking the address of a variable will also cause the variable to be loaded/stored every time it is accessed in this function (regardless of
where it is accessed in said function).
So:
memcpy(&i, &f, 8);
Will still use memory ops and wreck the performance of both the i and f variables.
Meanwhile:
i=*(uint64_t *)(&f);
Will only wreck the performance of 'f'.
The best option for performance in BGBCC is one of either:
i=__float64_getbits(f); //compiler intrinsic
i=(__m64)f; //__m64 and __m128 do a raw-bits cast.
Though, these options don't exist in the other compilers.
Implicitly, casting via __m64 or __m128 is a double-cast though. In
BGBCC, these types don't natively support any operators (so, they are basically sort of like the value-equivalents of "void *").
So:
memcpy(&i, &f, 8); //best for GCC and Clang
i=*(uint64_t *)(&f); //best for MSVC, error-prone in GCC
i=(__m64)f; //best for BGBCC, N/A for MSVC or GCC
In a lot of cases, these end up with wrappers.
GCC:
static inline uint64_t getU64(void *ptr)
{
    uint64_t v;
    memcpy(&v, ptr, 8);
    return(v);
}
MSVC or BGBCC:
#define getU64(ptr) (*((volatile uint64_t *)(ptr)))
Though, have noted that volatile usually works in GCC as well, though in
GCC there is no obvious performance difference between volatile and
memcpy, whereas in MSVC the use of a volatile cast is faster.
Don't want to use static inline functions in BGBCC though, as it still doesn't support inline functions in the general case.
On 7/10/2025 4:34 AM, David Brown wrote:
On 10/07/2025 04:28, BGB wrote:
On 7/9/2025 4:41 AM, David Brown wrote:
On 09/07/2025 04:39, BGB wrote:
On 7/2/2025 8:10 AM, Kaz Kylheku wrote:...
On 2025-07-02, Alexis <[email protected]> wrote:
Please don't call this "traditional behaviour" of compilers - be
honest, and call it limited optimisation and dumb translation. And
don't call it "code that assumes traditional behaviour" - call it
"code written by people who don't really understand the language".
Code which assumes you can do "extern float x; unsigned int * p =
(unsigned int *) &x;" is broken code. It always has been, and always
will be - even if it does what the programmer wanted on old or limited
compilers.
There were compilers in the 1990's that did type-based alias analysis,
and many other "modern" optimisations - I have used at least one.
Either way, MSVC mostly accepts this sorta code.
Also I think a lot of this code was originally written for compilers
like Watcom C and similar.
Have noted that there are some behavioral inconsistencies, for example:
Some old code seems to assume that x<<y always shifts left, but with y
taken modulo the width of the type. Except, when both x and y are constant,
code seems to expect it as if it were calculated with a wider type, and
where negative shifts go in the opposite direction, ... with the result
then being converted to the final type.
Meanwhile, IIRC, GCC and Clang raise an error if trying to do a large or negative shift. MSVC will warn if the shift is large or negative.
Though, in most cases, if the shift is larger than the width of the
type, or negative, it is usually a programming error.
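In ISO C a shift by a negative count, or by a count greater than or equal to the width of the promoted type, is undefined. Code that relies on one of the behaviours described above can spell it out instead. The helpers below are a sketch with made-up names, assuming a 32-bit operand and an int of at most 32 bits.

```c
#include <stdint.h>

/* Count taken modulo the width of the type: shl_masked(1, 33) shifts by 1. */
uint32_t shl_masked(uint32_t x, unsigned n)
{
    return x << (n & 31u);
}

/* "Wide" semantics: a negative count shifts right, and any count whose
   magnitude is 32 or more shifts every bit out, giving 0. */
uint32_t shift_signed(uint32_t x, int n)
{
    if (n <= -32 || n >= 32)
        return 0;
    return n >= 0 ? x << n : x >> -n;
}
```

Both helpers keep the actual shift count in the range 0..31, so the result no longer depends on what a particular compiler or CPU does with out-of-range counts.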
It's okay to be conservative in a compiler (especially when high
optimisation is really difficult!). It's okay to have command-line
switches or pragmas to support additional language semantics such as
supporting access via any lvalue type, or giving signed integer
arithmetic two's complement wrapping behaviour. It's okay to make
these the defaults.
But it is not okay to encourage code to make these compiler-specific
assumptions without things like a pre-processor check for the specific
compiler and pragmas to explicitly set the required compiler switches.
It is not okay to excuse bad code as "traditional style" - that's an
insult to people who have been writing good C code for decades.
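One way to make such a compiler-specific assumption explicit is sketched below. The macro name GET_U64 and the helper get_u64_memcpy are hypothetical; _MSC_VER and __clang__ are the usual predefined macros. The cast branch relies on the behaviour reported for MSVC elsewhere in this thread and on the target tolerating the access, both of which are assumptions rather than guarantees of the C standard.

```c
#include <stdint.h>
#include <string.h>

#if defined(_MSC_VER) && !defined(__clang__)
/* Assumption: this compiler accepts access through a cast pointer.
   The dependence on the compiler is visible here instead of being
   silently baked into every call site. */
#define GET_U64(ptr) (*(const volatile uint64_t *)(ptr))
#else
/* Portable fallback: copy the bytes; well defined on any conforming
   compiler and typically reduced to a single load by GCC and Clang. */
static inline uint64_t get_u64_memcpy(const void *ptr)
{
    uint64_t v;
    memcpy(&v, ptr, sizeof v);
    return v;
}
#define GET_U64(ptr) get_u64_memcpy(ptr)
#endif
```

Call sites use GET_U64(p) either way, so only this one block has to be revisited when a new compiler is added.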
A lot of the code I have seen from the 90s was written this way.
Though, a lot of it comes from a few major sources:
id Software;
Can mostly be considered "standard" practice,
along with maybe Linux kernel, ...
Apogee Software
Well, some of this code is kinda bad.
This code tends to be dominated by global variables.
Also treating array bounds as merely a suggestion.
Raven Software
Though, most of this was merely modified ID Software code.
Early on, I think I also looked a fair bit at the Linux kernel, and also
some of the GNU shell utilities and similar (though, the "style" was
very different vs either the Linux kernel or ID code).
Early on, I had learned C partly by tinkering around with id's code and trying to understand what secrets it contained.
But, alas, an example from Wikipedia shows a relevant aspect of id's style: https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code
Which is, at least to me, what I consider "traditional".
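For comparison, here is a sketch of the same well-known algorithm written without pointer casts. It is not id's code: the function name is made up, uint32_t replaces long, and it assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(number) for positive, finite inputs. */
float rsqrt_approx(float number)
{
    const float half = number * 0.5f;
    float y = number;
    uint32_t i;

    memcpy(&i, &y, sizeof i);        /* read the float's bits          */
    i = 0x5f3759dfu - (i >> 1);      /* the classic initial estimate   */
    memcpy(&y, &i, sizeof y);        /* turn the bits back into float  */

    y = y * (1.5f - half * y * y);   /* one Newton-Raphson iteration   */
    return y;
}
```

The arithmetic is unchanged; only the way the bits move between the float and the integer differs, and this form does not depend on the compiler tolerating the cast.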
So:
memcpy(&i, &f, 8);
Will still use memory ops and wreck the performance of both the i and
f variables.
Well, there you have scope for some useful optimisations (more useful
than type-based alias analysis). memcpy does not need to use memory
accesses unless real memory accesses are actually needed to give the
observable effects specified in the C standards.
Possibly, but by the stage we know that it could be turned into a
reg-reg move (in the final code generation), most of the damage has
already been done.
Basically, it would likely be necessary to detect and special case this scenario at the AST level (probably by turning it into a cast or
intrinsic). But, usually one doesn't want to add too much of this sort
of cruft to the AST walk.
But, then, apart from code written to assume GCC or similar, most of the
code doesn't use memcpy in this way.
So, it would mostly only bring significant advantage if pulling code in
from GCC land.
unsigned int f_to_u(float f) {
unsigned int u;
memcpy(&u, &f, sizeof(f));
return u;
}
gcc compiles that to :
f_to_u:
movd eax, xmm0
ret
Yeah, it is more clever here, granted.
Meanwhile:
i=*(uint64_t *)(&f);
Will only wreck the performance of 'f'.
The best option for performance in BGBCC is one of either:
i=__float64_getbits(f); //compiler intrinsic
i=(__m64)f; //__m64 and __m128 do a raw-bits cast.
Though, these options don't exist in the other compilers.
Such compiler extensions can definitely be useful, but it's even
better if a compiler can optimise standard code - that way,
programmers can write code that works correctly on any compiler and is
efficient on the compilers that they are most interested in.
Possibly.
For "semi-portable" code, usually used MSVC style, partly as by adding 'volatile' it seemingly also works in GCC. Though, often with macro
wrappers.
Implicitly, casting via __m64 or __m128 is a double-cast though. In
BGBCC, these types don't natively support any operators (so, they are
basically sort of like the value-equivalents of "void *").
So:
memcpy(&i, &f, 8); //best for GCC and Clang
i=*(uint64_t *)(&f); //best for MSVC, error-prone in GCC
i=(__m64)f; //best for BGBCC, N/A for MSVC or GCC
In a lot of cases, these end up with wrappers.
GCC:
static inline uint64_t getU64(void *ptr)
{
    uint64_t v;
    memcpy(&v, ptr, 8);
    return(v);
}
MSVC or BGBCC:
#define getU64(ptr) (*((volatile uint64_t *)(ptr)))
Though, have noted that volatile usually works in GCC as well, though
in GCC there is no obvious performance difference between volatile
and memcpy, whereas in MSVC the use of a volatile cast is faster.
In gcc, a memcpy here will need to use a single memory read unless
"getU64" is called with the address of a variable that is already in a
register (in which case you get a single register move instruction).
A volatile read will also do a single memory read - but it might
hinder other optimisations by limiting the movement of code around.
Possibly.
When I tried benchmarking these before:
GCC:
Seemingly no difference between memcpy and volatile;
MSVC:
Adding or removing volatile made no real difference;
Using memcpy is slower.
BGBCC: Either memcpy or volatile carries an overhead.
The use of volatile is basically a shotgun de-optimization;
It doesn't know what to de-optimize, so it goes naive for everything.
On MSVC, last I saw (which is a long time ago), any use of "memcpy"
will be done using an external library function (in a DLL) for
generic memcpy() use - clearly that will have /massive/ overhead in
comparison to the single memory read needed for a volatile access.
It is slightly more clever now, but still not great.
Will not (always) generate a library call.
Though, in VS2008 or similar, was always still a library call.
VS2010 and VS2013 IIRC might setup and use "REP MOVSB" instead.
It will do it inline, but still often:
Spill variables;
Load addresses;
Load from source;
Store to destination;
Load value from destination.
What BGBCC gives here is basically similar.
Don't want to use static inline functions in BGBCC though, as it
still doesn't support inline functions in the general case.
David Brown <[email protected]> wrote:
On 09/07/2025 04:39, BGB wrote:
On 7/2/2025 8:10 AM, Kaz Kylheku wrote:...
On 2025-07-02, Alexis <[email protected]> wrote:
I don't have confidence in an author's understanding of C, if they
believe that ISO C defines the behavior of invalid pointers being
compared, such that this needs to be rectified by a private "patch"
of the text.
You might not be aware of it, but the author Jens Gustedt is a member of
the C standards committee, and has been for some time. He is the most
vocal, public and active member. I think that suggests he has quite a
good understanding of C and the ISO standards! Not everyone agrees
about his ideas and suggestions about how to move C forward - but that's
fine (and it's fine by Jens, from what I have read). That's why there
is a standards committee, with voting, rather than a BDFL.
The concept of pointer provenance can be expressed other than
as a textual patch against ISO C.
There have been plenty of papers and blogs written about pointer
provenance (several by Gustedt) and how it could work. It's not a very
easy thing to follow in any format. A patch to current C standards is
perhaps the least easy to follow, but it is important for how the
concept could be added to C.
I looked at the blog post. About two thirds of it is explaining what
I consider obvious. Later he makes some assumptions/rules and
claims that they cover the segmented model. But the assumption:
: Two pointer values are equal if they correspond to the same
: abstract address.
is problematic for 8086 segmentation (it would force "huge" style
pointer comparison). It is probably unworkable for more abstract
segmentation (like in the 286) when there are overlapping segments.
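To see why, consider a sketch of 8086 real-mode addressing. The far_ptr type and the function names are made up for this illustration. Several segment:offset pairs name the same byte, so comparing the stored bits and comparing the abstract address give different answers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an 8086 far pointer. */
typedef struct {
    uint16_t seg;
    uint16_t off;
} far_ptr;

/* Real-mode address: segment * 16 + offset, wrapped to 20 bits. */
uint32_t linear_address(far_ptr p)
{
    return (((uint32_t)p.seg << 4) + p.off) & 0xFFFFFu;
}

/* Cheap comparison of the representation only. */
bool far_equal_bitwise(far_ptr a, far_ptr b)
{
    return a.seg == b.seg && a.off == b.off;
}

/* "Huge"-style comparison: normalise both pointers to the address first. */
bool far_equal_abstract(far_ptr a, far_ptr b)
{
    return linear_address(a) == linear_address(b);
}
```

For example, 1000:0010 and 1001:0000 both refer to linear address 0x10010: the bitwise comparison says they differ, while the abstract one says they are equal, at the cost of the extra arithmetic on every comparison.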
He spends time talking about the XOR trick, but leaves a different
(and IMO much more important) trick in undefined territory.
Namely, modern ARM and RISC-V embedded processors are 32-bit,
so they need 32-bit pointers. But low-end processors frequently
have tiny RAM that could be addressed using 16 bits. More
precisely, one can use a base pointer initialized to the address
of the start of RAM and access a memory location using a 16-bit offset
from the start of RAM. AFAICS the definitions in the blog post
put this strictly into undefined territory, but I expect this
to work as intended in gcc.
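A sketch of the trick, with hypothetical names, might look as follows. It goes through uintptr_t, so expand() produces a pointer from an integer; whether that round trip is covered by the model in the blog post is exactly the open question here, and this code takes no position on it. It assumes all compressed objects live within 64 KiB of the base.

```c
#include <stdint.h>

/* Address of the start of RAM, set once at startup (hypothetical). */
static uintptr_t ram_base;

void set_ram_base(const void *start)
{
    ram_base = (uintptr_t)start;
}

/* Store a pointer as a 16-bit offset from the start of RAM. */
uint16_t compress(const void *p)
{
    return (uint16_t)((uintptr_t)p - ram_base);
}

/* Rebuild the full pointer from base plus offset. */
void *expand(uint16_t off)
{
    return (void *)(ram_base + off);
}
```

A linked structure can then hold uint16_t links instead of full pointers, halving their size on a 32-bit target.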
Later he writes about exposure and synthesized pointers.
That is rather natural, but I did not find an explicit
statement of how exposure and synthesized pointers are
related to aliasing. Maybe the intent is something like:
"an access via a synthesized pointer may alias an access to
any exposed storage instance". OTOH in cases like
conversion to an offset with respect to some base and
back we deal with synthesized pointers, but in principle
the compiler could track bases and offsets and come up with
quite good alias analysis.
More generally, the blog post looks like a very preliminary
analysis that a compiler should do before further work on
alias analysis. But a compiler writer presumably knows
the target, so can make assumptions that fit the
actual situation better than the assumptions made in the blog post.
So, ATM it is not clear to me that putting such things in the
standard adds value. It could if the standard formulated new
aliasing rules, but I see no new aliasing rule in the blog
post. And IMO new rules should be related to algorithms:
without good algorithms, rules must either be conservative
(disallowing optimizations) or risk breaking code.