[email protected] (MitchAlsup) writes:
>OoO costs roughly 3× In Order power and provides 1.4× performance (hand
>waving accuracy).
Fortunately, we have measurement data, so we do not need to rely on
handwaving:
<https://images.anandtech.com/doci/14072/Exynos9820-Perf-Estimated_575px.png>
<https://images.anandtech.com/doci/14072/Exynos9820-Perf-Eff-Estimated.png>
from the article
<https://www.anandtech.com/show/14072/the-samsung-galaxy-s10plus-review/4>
In the Exynos 9820, we see at different points of the DVFS curve:
       A55        |        A75
     in-order     |        OoO
perf   mW  pf/mW  | perf    mW  pf/mW
 1.0   22  0.046  |  3.7    88  0.042  highest efficiency point for each core
 1.4   33  0.042  |  3.7    88  0.042  same pf/mW at highest common efficiency
 2.7   90  0.030  |  3.7    88  0.042  same mW at lowest common mW
 5.1  400  0.013  |  5.1   124  0.041  same perf at highest common performance
 5.1  400  0.013  | 10.5   400  0.027  same mW at highest common mW
 5.1  400  0.013  | 17.2  1270  0.013  highest performance point for each core
"perf" is SPEC2006 Int+FP Geomean. "pf/mW" (shown as "Perf/W" in the
second graph) is SPEC Int+FP Geomean/mW (you can confirm this by
computing corresponding numbers from the first graph).
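For those who want to redo that check, here is a small Python sketch that recomputes perf/mW from the (perf, mW) pairs in the table. The names `A55_POINTS`, `A75_POINTS` and `perf_per_mw` are my own, not from the article. The inputs are values read off the graphs, so a recomputed ratio can differ from the tabulated one in the last digit.

```python
# (perf, mW) pairs copied from the table in this post;
# perf is the SPEC2006 Int+FP geomean, mW the estimated power.
A55_POINTS = [(1.0, 22), (1.4, 33), (2.7, 90), (5.1, 400)]
A75_POINTS = [(3.7, 88), (5.1, 124), (10.5, 400), (17.2, 1270)]


def perf_per_mw(perf, mw):
    """Efficiency as plotted in the second graph: performance divided by power."""
    return perf / mw


for name, points in (("A55", A55_POINTS), ("A75", A75_POINTS)):
    for perf, mw in points:
        eff = perf_per_mw(perf, mw)
        print(f"{name}: perf={perf:5.1f} mW={mw:5d} perf/mW={eff:.3f}")
```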
So, at the highest efficiency point for each core, the OoO A75
consumes 4 times the power and delivers 3.7 times the performance of
the A55. As soon as you need a little more performance, the
efficiency of the A55 drops to the same level as the A75 (e.g., 2.6
times the performance at 2.6 times the power), but up until the A55
reaches the lowest power consumption of the A75 at 88mW, the A55 still
fills a niche; at that power consumption, the A75 delivers 1.4 times
the performance of the A55. There is no reason to use the A55 beyond
this point if an A75 is free. And beyond 170mW, even the Exynos M4
outcompetes the A55 in every respect.
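The ratios quoted in this paragraph can be reproduced from the table rows. The following is just a sketch of that arithmetic; the function name `ratio` and the labels are made up for illustration, and the inputs are the rounded values from the table.

```python
def ratio(a75, a55):
    """Return (performance ratio, power ratio) of an A75 point over an A55 point.

    Each point is a (perf, mW) pair taken from the table in this post.
    """
    return a75[0] / a55[0], a75[1] / a55[1]


# (A75 point, A55 point) for the rows discussed in the text.
comparisons = {
    "most efficient point of each core": ((3.7, 88), (1.0, 22)),
    "same perf/mW": ((3.7, 88), (1.4, 33)),
    "same power, about 90 mW": ((3.7, 88), (2.7, 90)),
    "same performance": ((5.1, 124), (5.1, 400)),
    "same power, 400 mW": ((10.5, 400), (5.1, 400)),
}

for label, (a75, a55) in comparisons.items():
    perf_ratio, power_ratio = ratio(a75, a55)
    print(f"{label}: A75/A55 perf x{perf_ratio:.2f}, power x{power_ratio:.2f}")
```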
If there are more threads than A75 and M4 cores, it's an interesting
question if it is beneficial for power consumption to shift some of
the work to A55 cores and run the A75 and M4 at a correspondingly
better efficiency point. As long as the perf/W on the A55 is not
worse than the original perf/W on the other cores, that should help
(at least if the threads don't have to talk to each other too much),
but given the low performance of the A55, especially where it is
efficient, it won't help much.
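To put rough numbers on that, here is a hedged sketch that adds up tabulated operating points. It assumes independent threads whose throughput simply adds across cores, which is my simplification and not something the article measures, and it only uses points from the table, so the configurations do not deliver the same total throughput and the comparison is not like for like. The helper `combined` and the configuration labels are hypothetical.

```python
# Illustrative only: assumes independent threads whose throughput adds
# linearly across cores, and uses only (perf, mW) points from the table.
A75_MAX = (17.2, 1270)   # A75 at its highest performance point
A75_400 = (10.5, 400)    # A75 at 400 mW
A55_MAX = (5.1, 400)     # A55 at its highest performance point
A55_90 = (2.7, 90)       # A55 at about 90 mW


def combined(*points):
    """Sum performance and power over several cores; return (perf, mW, perf/mW)."""
    perf = sum(p for p, _ in points)
    mw = sum(m for _, m in points)
    return perf, mw, perf / mw


configs = {
    "A75 alone at max": (A75_MAX,),
    "A75 at 400 mW + A55 at max": (A75_400, A55_MAX),
    "A75 at 400 mW + A55 at 90 mW": (A75_400, A55_90),
}

for label, points in configs.items():
    perf, mw, eff = combined(*points)
    print(f"{label}: perf={perf:.1f} mW={mw} perf/mW={eff:.4f}")
```

Under these assumptions the mixed configurations come out with a better aggregate perf/mW than the A75 alone at its top point, but at a lower total throughput, which fits the point that the A55 contributes little performance where it is efficient.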
- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <[email protected]>