MitchAlsup wrote:
BGB <[email protected]> posted:
Well, idea here is that sometimes one wants to be able to do
floating-point math where accuracy is a very low priority.
Say, the sort of stuff people might use FP8 or BF16 or maybe Binary16
for (though, what I am thinking of here is low-precision even by
Binary16 standards).
For 8-bit stuff, just use 5 memory tables [256×256]
They don't even need to be full 8-bit: With a tiny amount of logic to
handle the signs you are already down to 128x128, right?
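
A rough sketch of the sign-stripping idea for multiplication, where the result sign is just the XOR of the input signs and the table only has to cover magnitudes. The 8-bit layout used here (1 sign, 4 exponent bits with bias 7, 3 mantissa bits, no Inf/NaN) and all function names are assumptions made up for illustration, and the table is filled by a brute-force nearest-value search rather than any particular rounding rule:

```c
#include <stdint.h>

/* Hypothetical FP8 layout: 1 sign, 4 exponent (bias 7), 3 mantissa bits.
   Every code is treated as a finite number to keep the sketch short. */
static float   fp8_val[128];        /* value of each 7-bit magnitude   */
static uint8_t mul_mag[128][128];   /* 128x128 magnitude product table */

static float fp8_mag_value(unsigned m7)
{
    unsigned e = m7 >> 3, f = m7 & 7;
    float v = (e == 0) ? (float)f / 8.0f : 1.0f + (float)f / 8.0f;
    int k = (e == 0) ? 1 - 7 : (int)e - 7;   /* unbiased exponent */
    for (; k > 0; k--) v *= 2.0f;
    for (; k < 0; k++) v *= 0.5f;
    return v;
}

static void fp8_mul_init(void)
{
    for (unsigned c = 0; c < 128; c++)
        fp8_val[c] = fp8_mag_value(c);

    for (unsigned a = 0; a < 128; a++)
        for (unsigned b = 0; b < 128; b++) {
            float p = fp8_val[a] * fp8_val[b];
            unsigned best = 0;
            float best_err = p;              /* distance to code 0 (0.0) */
            for (unsigned c = 1; c < 128; c++) {
                float d = fp8_val[c] - p;
                if (d < 0) d = -d;
                if (d < best_err) { best = c; best_err = d; }
            }
            mul_mag[a][b] = (uint8_t)best;   /* overflow saturates to max */
        }
}

/* The only "logic" outside the table: XOR the signs, mask them off. */
static uint8_t fp8_mul(uint8_t a, uint8_t b)
{
    uint8_t sign = (uint8_t)((a ^ b) & 0x80);
    return (uint8_t)(sign | mul_mag[a & 0x7F][b & 0x7F]);
}
```

After calling fp8_mul_init() once, fp8_mul(0x3C, 0x40) looks up 1.5 * 2.0 in this layout. Addition and subtraction need more than an XOR on the signs, so this shortcut as written only covers multiply (and, analogously, divide).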
Then again, probably other people know about all of this and might know
what I am missing.
The infamous invsqrt() trick is the canonical example of where all the
quirks of the IEEE 754 format work just right to get you to 10+ bits
with a single NR iteration.
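
For reference, a minimal sketch of that trick for binary32, using memcpy for the bit reinterpretation. The magic constant is the widely circulated one, and the function name is just a label chosen here; other constants and NR formulations exist and give somewhat different error:

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, normal x. */
static float fast_rsqrt(float x)
{
    uint32_t i;
    float y;

    memcpy(&i, &x, sizeof i);            /* view the float as an integer   */
    i = 0x5f3759df - (i >> 1);           /* halve and negate the exponent, */
                                         /* roughly; mantissa comes along  */
    memcpy(&y, &i, sizeof y);            /* initial guess                  */

    y = y * (1.5f - 0.5f * x * y * y);   /* one Newton-Raphson iteration   */
    return y;
}
```

With this constant and a single iteration the relative error stays at a fraction of a percent across the normal range; zero, negative, denormal, Inf and NaN inputs are not handled.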
Your basic ops examples are a lot more iffy.
I still recommend getting the right answer over getting a close but wrong answer a couple cycles earlier.
Exactly.
I think you showed me the idea of usually getting the correct result in
N cycles, but in a low number of cases, the trailing bits would be too
close to a rounding boundary, so they would add one more NR iteration.
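
One way to picture that "usually done in N cycles, occasionally one more step" scheme in software, as a sketch only: compute the quotient with a bit more precision than the target, and accept it only if the whole error interval rounds to the same float. Here a double-precision reciprocal stands in for the fast approximation, the 2^-50 error bound is an assumption chosen to cover its rounding errors, and all names are made up for this example:

```c
/* Returns 1 and stores the result if every value within rel_err of q
   rounds to the same float; returns 0 if q is too close to a rounding
   boundary to decide. NaN compares unequal, so it also returns 0. */
static int round_if_safe(double q, double rel_err, float *out)
{
    float lo = (float)(q * (1.0 - rel_err));
    float hi = (float)(q * (1.0 + rel_err));
    if (lo != hi)
        return 0;
    *out = lo;
    return 1;
}

static unsigned slow_path_count;   /* how often the extra step was needed */

static float div_checked(float a, float b)
{
    double r = 1.0 / (double)b;    /* stand-in for a fast reciprocal      */
    double q = (double)a * r;      /* two roundings, error below 2^-52    */
    float result;

    if (round_if_safe(q, 0x1p-50, &result))
        return result;             /* common case: already unambiguous    */

    slow_path_count++;             /* rare case: do the extra work        */
    return (float)((double)a / (double)b);
}
```

The point is only the shape of the control flow: the check is cheap, and the slow path runs just for the few inputs whose trailing bits sit next to a rounding boundary.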
I just realized that the code I wrote to fix Pentium FDIV could have
been even more efficient on a proper superscalar OoO CPU:
Start the FDIV immediately, then at the same time do the divisor
mantissa inspection to determine if the workaround would be needed (5
out of 1024 cases), and only if that happens, start the slower path that
takes up to twice as long.
The idea is that for 99.5% of all divisors, the only cost would be a
close-to-zero-cycle, correctly predicted branch, but the remainder
would require two FDIV operations, so 80 instead of 40 cycles.
OTOH, that same Big OoO core can probably predict that the entire
mantissa inspection part will end up with a "skip the workaround" branch
and start the FDIV almost at once. I'm assuming that when the mispredict
turns up, the core can stop a long operation like FDIV more or less immediately and discard the current status.
(From memory)
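#include <stdint.h>
#include <string.h>

/* Bitmap of at-risk divisors, one bit per 10-bit mantissa prefix
   (1024 bits = 128 bytes); the contents are not shown here. */
extern const uint8_t fdiv_table[128];

double fdiv(double a, double b)
{
    uint64_t mant10;
    memcpy(&mant10, &b, sizeof(b));
    mant10 = (mant10 >> 42) & 1023;   /* top 10 bits of the mantissa */
    if (fdiv_table[mant10 >> 3] & (1 << (mant10 & 7))) {
        // set fpu to extended/long double, save previous mode
        b *= 15.0/16.0; // Exact operation!
        a *= 15.0/16.0; // Exact operation!
        // Restore to previous precision mode
    }
    return a / b;
}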
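
The "start the FDIV immediately" variant described earlier could be sketched roughly like this. The table contents are an assumption based on the commonly cited description of the flaw (top four fraction bits 0001, 0100, 0111, 1010 or 1101, followed by six one bits, which gives 5 prefixes out of 1024), and the names fdiv_table_init, divisor_at_risk and fdiv_speculative are invented for the example:

```c
#include <stdint.h>
#include <string.h>

static uint8_t fdiv_table[128];    /* one bit per 10-bit mantissa prefix */

/* Assumed at-risk prefixes: 4 pattern bits followed by six 1 bits. */
static void fdiv_table_init(void)
{
    static const unsigned top4[5] = { 1, 4, 7, 10, 13 };
    for (int i = 0; i < 5; i++) {
        unsigned prefix = (top4[i] << 6) | 0x3F;
        fdiv_table[prefix >> 3] |= (uint8_t)(1u << (prefix & 7));
    }
}

static int divisor_at_risk(double b)
{
    uint64_t bits;
    memcpy(&bits, &b, sizeof bits);
    unsigned mant10 = (unsigned)(bits >> 42) & 1023;
    return (fdiv_table[mant10 >> 3] >> (mant10 & 7)) & 1;
}

static double fdiv_speculative(double a, double b)
{
    double q = a / b;              /* issued unconditionally, up front */

    if (divisor_at_risk(b)) {      /* rare: redo with scaled operands  */
        /* Scaling is exact only where long double is wider than double. */
        long double sa = (long double)a * (15.0L / 16.0L);
        long double sb = (long double)b * (15.0L / 16.0L);
        q = (double)(sa / sb);
    }
    return q;
}
```

In C this only expresses the intent; whether the first divide really overlaps the mantissa test is up to the compiler and the core. After fdiv_table_init(), the divisor 3145727.0 from the well-known 4195835/3145727 example is flagged by divisor_at_risk(), while a divisor such as 3.0 is not.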
Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"