Timothy Chow <[email protected]> writes:
If you found these results surprising then you have a long way to go
before you achieve any understanding of this topic. Your results are
completely in line with what the conventional wisdom would predict.
I tend to agree, and I can offer more of my own results that might
give Murat food for thought. I had completed my experiments two months
ago, but did not find time to write up a post. But now it seems like
the perfect moment.
Let me introduce the main characters of my fictitious chouette. All
are expert checker players, but have very poor understanding of cube
skill. They are, however, able to assess who is favourite (> 50 %
winning chances).
- Clarence Careful: He will never double and only take if favourite.
- Danny Dropper: He will always double if favourite, but only take if
favourite.
- Toby Taker: The opposite of Danny, he will never double, but always
take.
- William Wildcube: He will always double if favourite and always take.
The chouette host of these mutants is Edward Equity, who is on expert
level with respect to checker play and has world class cube
handling. In Edward's chouette, the Jacoby rule is used, but Beavers
are forbidden. There is no consulting. The session lasts 3000
games. And of course there was no chouette, that is just the
background story. Instead, I had my simple bot mimic gnubg's checker
play and use one of the 4 mutant cubing strategies in turn against
gnubg set to Expert checker play and World Class cube handling.
The null hypothesis was that the respective mutant's cube strategy is
as good as the world class cube handling by Edward/gnubg. For all four
mutants it could be rejected with a sigma level > 4.5. Surprise,
surprise. (-;
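The post does not say how the sigma level was computed. One common
approach, assumed here and not confirmed by the source, is to take
gnubg's net result per game, and divide the mean by its standard
error. Under the null hypothesis the mean is zero, so the ratio says
how many standard errors the observed mean lies away from it. The
function name and the toy data below are hypothetical:

```python
import math


def sigma_level(net_points_per_game):
    """Mean per-game result divided by its standard error.

    net_points_per_game: one signed number per game, positive when the
    reference player (here: gnubg) won points, negative when it lost.
    ASSUMPTION: this is a plain one-sample z-statistic; the original
    experiment may have used a different test.
    """
    n = len(net_points_per_game)
    if n < 2:
        raise ValueError("need at least two games")
    mean = sum(net_points_per_game) / n
    # Sample variance with n - 1 in the denominator.
    variance = sum((x - mean) ** 2 for x in net_points_per_game) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return mean / standard_error


# Toy data for illustration only, NOT taken from the experiment.
toy_results = [2, -1, 4, 1, -2, 2, 1, -1, 2, 2]
print(sigma_level(toy_results))
```

With real data the list would hold 3000 entries, one per game of the
session.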
After that was done, I did further experiments, generalizing the
mutant's "strategies". Let me introduce Rory Random:
- Never doubles if losing (non-favourite)
- If favourite, doubles with probability d/6 (d from 0 to 6)
- Always takes if winning (favourite)
- If losing (non-favourite), takes with probability t/6 (t from 0 to 6)
So Clarence Careful is a special case of Rory Random, namely d = 0 and
t = 0. Likewise, Danny Dropper uses d = 6 and t = 0, Toby Taker uses d
= 0 and t = 6, and William Wildcube uses d = 6 and t = 6 as a
"strategy".
Overall in this framework, there are 49 mutant strategies, some wild,
some not so wild. It should be clear that the wilder mutants drive the
cube up and thus the average game will have more points at stake than,
say, a session with Clarence Careful. Hence it makes sense to relate
the average loss of the mutant strategies not to the number of games,
but rather to the number of points.
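Rory Random's rules and the grid of 49 mutants can be sketched as
follows. This is only an illustration of the decision rules as stated
above, not the code of the bot used in the experiment; the class and
method names are made up, and the favourite/non-favourite assessment
is simply passed in as a boolean:

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RoryRandom:
    d: int  # doubles with probability d/6 when favourite (0..6)
    t: int  # takes with probability t/6 when not favourite (0..6)

    def doubles(self, is_favourite, rng=random):
        # Never doubles if losing (non-favourite).
        if not is_favourite:
            return False
        # rng.random() lies in [0, 1), so d = 6 always doubles
        # and d = 0 never does.
        return rng.random() < self.d / 6

    def takes(self, is_favourite, rng=random):
        # Always takes if winning (favourite).
        if is_favourite:
            return True
        return rng.random() < self.t / 6


# The four chouette characters are the corner cases.
CLARENCE_CAREFUL = RoryRandom(d=0, t=0)
DANNY_DROPPER = RoryRandom(d=6, t=0)
TOBY_TAKER = RoryRandom(d=0, t=6)
WILLIAM_WILDCUBE = RoryRandom(d=6, t=6)

# All 7 x 7 = 49 combinations of d and t.
ALL_MUTANTS = [RoryRandom(d, t) for d in range(7) for t in range(7)]
print(len(ALL_MUTANTS))
```

Passing a seeded random.Random instance as rng makes a simulated
session reproducible.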
For example, Edward Equity (= gnubg) versus William Wildcube ended
11392 versus 8066 after 3000 games, a net win for gnubg of 3326. This
amounts to more than 1.1 points per game, but a more meaningful number
is (11392-8066)/(11392+8066) = 0.17 points won per points played
(pwppp).
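The arithmetic of this example, as a small sketch (the function and
variable names are mine; the numbers are the ones quoted above):

```python
def pwppp(points_a, points_b):
    """Points won per points played, from player A's point of view."""
    return (points_a - points_b) / (points_a + points_b)


gnubg_points = 11392
mutant_points = 8066
games = 3000

net_win = gnubg_points - mutant_points
print(net_win)                             # net points won by gnubg
print(net_win / games)                     # points won per game
print(pwppp(gnubg_points, mutant_points))  # points won per points played
```

The measure is antisymmetric: swapping the two arguments gives the
same number with the opposite sign, which is William Wildcube's
result.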
Here are these results for all the mutants I tested (the former 4
chouette characters are the "corner cases"). For every mutant tested,
the null hypothesis could be rejected with a sigma level > 2.9.
|---------------+-------------+------+------+------+------+------+------|
| pwppp         | Random take |      |      |      |      |      |      |
|---------------+-------------+------+------+------+------+------+------|
| Random double | 0/6         | 1/6  | 2/6  | 3/6  | 4/6  | 5/6  | 6/6  |
|---------------+-------------+------+------+------+------+------+------|
| 0/6           | 0.38        |      |      | 0.27 |      |      | 0.25 |
| 1/6           |             | 0.15 |      |      |      | 0.16 |      |
| 2/6           |             |      | 0.13 |      | 0.14 |      |      |
| 3/6           | 0.15        |      |      | 0.14 |      |      | 0.13 |
| 4/6           |             |      | 0.17 |      | 0.17 |      |      |
| 5/6           |             | 0.16 |      |      |      | 0.16 |      |
| 6/6           | 0.15        |      |      | 0.14 |      |      | 0.17 |
|---------------+-------------+------+------+------+------+------+------|
The timid non-doubling strategies (first row with "0/6" doubling
probability, d = 0) fare much worse than the mutants who dare to
double at all (d > 0). These latter ones all get roughly similar
results, around 0.15 pwppp. But before anyone concludes from this that
cube strategy does not matter and switches to random doubling and
taking, note that losing 0.15 pwppp is not "quite an achievement"; it
is pretty bad. Here is a table of three different
players (set up by using numerical noise in gnubg for checker play and
cube handling) all achieving roughly the same pwppp of 0.15:
|---------------+------------+----------------+--------------+--------------|
| Checker noise | Cube noise | Checker rating | Cube rating  | Overall      |
|---------------+------------+----------------+--------------+--------------|
| 0.000         | ? (Mutant) | Expert         | Awful        | Intermediate |
| 0.016         | 0.016      | Advanced       | Beginner     | Intermediate |
| 0.022         | 0.000      | Intermediate   | Supernatural | Intermediate |
|---------------+------------+----------------+--------------+--------------|
In my opinion the performance of the mutant cubers is not surprisingly
good, but rather expected and not something to be proud of. Any
ambitious backgammon player should strive for more than "Advanced"
checker play and "Beginner" cube handling (the row with 0.016 noise),
a level that is about as good as the mutants.
So the bottom line is: Much ado about nothing. An interesting study
for me nevertheless, and it might be fun in your next live session to
roll a die before a cube decision. The face of your opponent might be
worth the price you will be paying for that stunt. Which, by the way,
has a prominent precedent in Phil Simborg:
https://www.bkgm.com/articles/Simborg/ACoinToss/
There were two things that puzzled me a bit, but I will address them
in a different post.
Best regards
Axel