On 12/16/2021 7:44 AM, [email protected] wrote:
I missed this episode of the Stick-and-Tim show. (Maybe I'll treat
myself to the Netflix box set for Christmas.) Is it really the case
that XG n-ply doesn't always make the play with the highest XG n-ply
evaluation? Could you possibly explain this XGPly Paradox please?
As a bit of background, there are two main ways you'll
encounter XG 3-ply playing. First, if you want to play a match
against the bot and you select "eXtremeGammon" as your opponent,
then the bot will play at the XG 3-ply level. Second, if you
perform a rollout, the default setting is 3-ply/XGR, which means
that during a rollout trial, the bot will play the checkers at
the 3-ply level and the cube at the XGR level.
But what exactly does it mean to play the checkers at the 3-ply
level? The bot first evaluates every legal play at the 1-ply
level, by directly consulting the neural net. Then
it will select some subset of these to evaluate at the 2-ply
level, and finally it will select some subset of *those* to
evaluate at the 3-ply level. In particular, in most positions, it
will *not* actually evaluate every legal play at the 3-ply level.
It first filters out a bunch of moves whose 1-ply evaluation is
too low, and then it filters out more moves whose 2-ply evaluation
is too low. Finally, it evaluates the surviving candidates at the
3-ply level, picks the one with the highest equity, and plays it.
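The procedure just described can be sketched roughly in Python. This is
only an illustration under stated assumptions: the `evaluate` callback,
the `margins` cutoffs, and the rule "keep plays within a margin of the
best" are all hypothetical stand-ins, since XG's real filter settings
and internals are not given here.

```python
from typing import Callable, Sequence

# Hypothetical interface: evaluate(play, ply) returns an equity estimate.
Evaluator = Callable[[str, int], float]


def choose_play(
    plays: Sequence[str],
    evaluate: Evaluator,
    margins: Sequence[float] = (0.08, 0.04),  # assumed cutoffs, not XG's
) -> str:
    """Filter at 1-ply, then at 2-ply, then pick the best survivor at 3-ply."""
    candidates = list(plays)
    for ply, margin in enumerate(margins, start=1):
        scores = {p: evaluate(p, ply) for p in candidates}
        best = max(scores.values())
        # Drop every play whose evaluation at this ply is "too low".
        candidates = [p for p in candidates if scores[p] >= best - margin]
    final_ply = len(margins) + 1
    # Only the surviving candidates are ever evaluated at the final ply.
    return max(candidates, key=lambda p: evaluate(p, final_ply))
```

Note that the final `max` runs over the survivors only, not over every
legal play.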
This procedure is usually quite sensible. Most of the time, if
a play's 1-ply evaluation is extremely far below that of the play
with the highest 1-ply evaluation, a 3-ply evaluation will not
overturn that verdict (it won't come up with the same equity
estimate as 1-ply does, but it will usually agree on which play
is better). So you save a lot of time by quickly filtering out
most of the moves immediately, and only investing further compute
time on the "promising" moves. (If you're worried that the bot
is filtering out moves too aggressively, then one thing you can
do, at least for rollouts, is to increase the size of the move
filter from "Normal" to "Large" or "Huge" or "Gigantic.")
You can see this visually if you set up a position with a lot of
legal plays, and then click on the 3-ply button near the lower
left corner of the screen. You'll see that next to each move is
an icon with 5 bars, and usually you'll see some plays at the
top with 3 bars filled in, followed by some plays with 2 bars
filled in, followed by yet more plays with only 1 bar filled in.
However, once in a while, a move that is filtered out at the 1-ply
or 2-ply level will, if forcibly evaluated at 3-ply, come out ahead
of the move that XG 3-ply plays.
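A toy example of how that can happen, in Python. The equities and the
cutoffs below are made up purely for illustration (they are not real XG
output or real XG filter settings), and the function names are
hypothetical.

```python
# Made-up equities: play -> (1-ply, 2-ply, 3-ply) evaluation.
EQUITY = {
    "play A": (0.10, 0.12, 0.11),
    "play B": (0.09, 0.05, 0.20),
    "play C": (-0.10, 0.00, 0.30),
}
MARGINS = (0.08, 0.04)  # assumed cutoffs after the 1-ply and 2-ply stages


def what_3ply_plays() -> str:
    """The move actually played: best 3-ply equity among filter survivors."""
    survivors = list(EQUITY)
    for stage, margin in enumerate(MARGINS):
        best = max(EQUITY[p][stage] for p in survivors)
        survivors = [p for p in survivors if EQUITY[p][stage] >= best - margin]
    return max(survivors, key=lambda p: EQUITY[p][2])


def highest_3ply_evaluation() -> str:
    """The move with the best 3-ply equity if every play is forcibly evaluated."""
    return max(EQUITY, key=lambda p: EQUITY[p][2])


print(what_3ply_plays())          # play A
print(highest_3ply_evaluation())  # play C
```

Here "play C" is thrown out at the 1-ply stage and "play B" at the 2-ply
stage, so the bot plays "play A" even though both of the others would
have scored higher at 3-ply.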
Clear?
I have always said "XG 3-ply plays..." when I mean that that's
what XG 3-ply plays, either when it's playing at the eXtremeGammon
level against a human opponent, or when it's playing out a rollout
trial under the standard rollout settings. It's hard for me to
imagine a more natural meaning for "XG 3-ply plays." However, Stick
insists that the natural interpretation of "XG 3-ply plays..." is
*not* what XG 3-ply plays, but rather what move has the highest
XG 3-ply evaluation. To me, the natural way to refer to the move
with the highest XG 3-ply evaluation is "the move with the highest
XG 3-ply evaluation." But Stick continues to insist that I'm using
language in a very confusing manner.
---
Tim Chow