-
Notifications
You must be signed in to change notification settings - Fork 1
Analysis
Status: You can use node extras/analysis.js
to run an automated analysis on as many slot/colour combinations as you want, to determine the minimum number of moves necessary to beat CheaterMind. However, CheaterMind does not necessarily recognize certain extremely clever moves. Furthermore, current output does not match the results of a published paper.
- Why does my algorithm differ from Knuth's on analysis of the game starting with 1122? (We agree that 1344 is next, narrowing to 256 possibilities, but then Knuth gives 3526 vs my algorithm giving 1525, even though they both end up having a worst case of 44 possibilities.) I suspect it's because I don't follow the rule of his, saying, "If this minimum can be achieved by a 'valid' pattern (a pattern that makes 'four black hits' possible), a valid one should be used." That would make sense, since 1525 is not one of the 44 possibilities remaining after 1344 is graded W.
- Why does my algorithm differ from the published paper linked to above? E.g. the result for 3 colours and 2 slots is given by my algorithm as 3, but as 4 by theirs. (If anything I would have expected a higher minimum if my algorithm were wrong.)
I created a script to analyze starting moves, to find out the minimum number of moves required to win against CheaterMind.
The starting moves have been reduced from 1296 to 5: the number of partitions of 4 having less than 6 (well, 4) different parts. Each part then becomes the count of a particular colour to use. Beyond this, every other starting move will look identical, except for a change of colours or ordering, but that won't change the underlying truth of how effective it is as a starting move against a worst-case scenario engine.
Comparing notes with Knuth's paper, I found that I had made a mistake somewhere, because my analysis was reporting that 1123 (in Knuth's parlance) was the optimal starting move (needing 5 moves), and all others (including Knuth's proclaimed champion, 1122) needed 6 moves.
I dug a bit and noticed that my algorithm started the same as his when analyzing 1122, but at the stage at which it would narrow down to 46 possibilities, there were many guesses that could produce that result. My algorithm was taking 3211 instead of 2234, because it was taking the first guess it came across with 46 possibilities, and my ordering was apparently different from Knuth's. So I modified it to take last guess instead, but that turned out to be 2344, still not Knuth's choice, and still taking 6 moves to win. It turns out this is because my ordering is not simply a reversal of Knuth's.
Rather than change my ordering completely, which I probably will eventually do, I decided to take a shortcut and make a value function based on Wikipedia's mention that "Knuth follows the convention of choosing the guess with the least numeric value e.g. 2345 is lower than 3456," which is also stated in the paper as, "the first such test pattern in numeric order was selected."
Hooray! My algorithm now reported that an opening move of 1122 did, in fact, allow someone to win within 5 moves. It's an open question, for me, as to why the mere convention followed here makes a difference. But it clearly does--I just don't understand it intuitively quite yet.
Anyway, this is where it got even more interesting. With my modified algorithm in place, I re-ran the analysis...which said that 1122 is a great opener, but equally great are 1123 and 1234. Knuth's paper specifically mentions these as not being able to guarantee 5-move wins, although he admits that "it is probably possible to fix [1123]" and only that "when the first test pattern is 1234 it appears impossible to guarantee a win in five." (Emphasis mine.) I believe my algorithm shows that it's indeed possible to fix 1123, and furthermore that appearances may have been misleading. It's hard to tell, though, because the paper does not go into quite enough detail how he arrived at these latter two analyses.
So, is it possible that Knuth's conclusions weren't complete, and that what my revised algorithm reports is actually true? If not, the next step is finding where exactly my algorithm is incorrect. Specifically, it would seem to me that now my auto-guessing algorithm is correct, since it matches Knuth's, but there must be something wrong in the original CheaterMind algorithm, which should produce a grade for the guess that would leave the highest number of possibilities remaining for the final code.
The revised guessing algorithm, with the same old cheating algorithm, currently gives these moves/grades for each of the different possible starting moves. (They're base-0. The guess is followed by the grade in format [ exactly right, right-but-wrong-place, completely wrong ]
and then the number of possibilities remaining after the move.
$ node extras/analysis.js
1 [ 0, 0, 0, 0 ] [ 0, 0, 4 ] 625
2 [ 1, 1, 2, 2 ] [ 1, 1, 2 ] 120
3 [ 1, 3, 1, 1 ] [ 0, 1, 3 ] 20
4 [ 2, 4, 2, 3 ] [ 1, 2, 1 ] 3
5 [ 2, 2, 3, 2 ] [ 3, 0, 1 ] 1
6 [ 2, 5, 3, 2 ] [ 4, 0, 0 ] 1
1 [ 0, 0, 0, 1 ] [ 1, 0, 3 ] 317
2 [ 0, 2, 3, 2 ] [ 0, 1, 3 ] 55
3 [ 1, 3, 4, 1 ] [ 0, 1, 3 ] 10
4 [ 4, 0, 5, 4 ] [ 1, 3, 0 ] 2
5 [ 4, 4, 0, 5 ] [ 2, 2, 0 ] 1
6 [ 5, 4, 0, 4 ] [ 4, 0, 0 ] 1
1 [ 0, 0, 1, 1 ] [ 0, 0, 4 ] 256
2 [ 2, 2, 3, 4 ] [ 1, 1, 2 ] 46
3 [ 2, 5, 2, 5 ] [ 0, 1, 3 ] 6
4 [ 3, 4, 5, 4 ] [ 1, 0, 3 ] 1
5 [ 3, 3, 3, 2 ] [ 4, 0, 0 ] 1
1 [ 0, 0, 1, 2 ] [ 0, 1, 3 ] 276
2 [ 1, 3, 3, 4 ] [ 1, 1, 2 ] 51
3 [ 3, 3, 2, 5 ] [ 0, 1, 3 ] 6
4 [ 4, 1, 5, 4 ] [ 1, 1, 2 ] 1
5 [ 4, 4, 3, 0 ] [ 4, 0, 0 ] 1
1 [ 0, 1, 2, 3 ] [ 0, 2, 2 ] 312
2 [ 1, 4, 3, 4 ] [ 1, 0, 3 ] 54
3 [ 1, 5, 0, 1 ] [ 0, 1, 3 ] 8
4 [ 5, 3, 3, 2 ] [ 1, 1, 2 ] 1
5 [ 3, 0, 3, 0 ] [ 4, 0, 0 ] 1
For 4 slots and 6 colours, the min moves needed are 5, using starting code(s):
[[0,0,1,1],[0,0,1,2],[0,1,2,3]]
If the grading algorithm is flawed, then the above output does not represent the best-played worst case scenario for each starting move.
So, how is our grader deciding these at each stage? Maybe there are some ties happening somewhere, and the wrong tiebreaker is being chosen. We can confirm that's false by logging to the console if there are any ties.
First, let's confirm Knuth's results at each step for 1122 by comparing with Figure 1 in his paper. His first guess has 3 ways to narrow to 256 possibilities (B, C, or F). Our algorithm chose to grade [ 0, 0, 4 ], i.e., every colour is wrong, corresponding to C. The best guess corresponds there. Then [1, 1, 2] is the correct grade, giving the highest number, 46. The best guess again corresponds. At that point, Knuth gives 4 ways to narrow down to 6 possibilities:
[ 0, 1, 3 ]
[ 0, 0, 4 ]
[ 1, 1, 2 ]
[ 1, 0, 3 ]
Our algorithm grades [ 0, 1, 3 ]--the first on the list, even though at the first grading tiebreaker, the second was taken. But the best guess still corresponds at this point, and the next step is the win as Knuth predicts.
Wait, a grading tiebreaker? I thought we had ruled that out?
Let's log a bit more about the 'grade buckets' (each grade and how many possibilities it leaves) at each decision, then. At the first decision about grading the starting move 1122, no tiebreaker message is given, yet there's clearly a tie for best grade bucket:
0,0,4 would leave 256 possibilities
1,0,3 would leave 256 possibilities
0,1,3 would leave 256 possibilities
1,1,2 would leave 208 possibilities
2,0,2 would leave 114 possibilities
0,2,2 would leave 96 possibilities
1,2,1 would leave 36 possibilities
2,1,1 would leave 32 possibilities
3,0,1 would leave 20 possibilities
0,3,1 would leave 16 possibilities
2,2,0 would leave 4 possibilities
0,4,0 would leave 1 possibilities
Fortunately the 3-way tie does correspond to Knuth's C, B, and F, respectively--but there, the ordering is different. Knuth orders grades "least white pegs first, most black pegs next" where white is "right colour, wrong position" and black is "exactly right."
First, let's check why a tie isn't being reported. Well, that part is easy, I was comparing arrays rather than their lengths (the lengths being the number of possibilities.) OK, bug fixed, and tiebreaker question answered.
Now, let's see what results we get if we use Knuth's grade ordering when there is a tiebreaker. We need a new value function for grades. Least white pegs first: (slotCount + 1 - whitepegs) * (slotCount + 1)
will treat the white peg count as a sort of "10's digit" but in base "slotCount + 1". We then simply add blackpegs
and now we have a value to sort by:
const base = slotCount + 1;
const value = grade => (base - grade[1]) * base + grade[0];
In the case of our tiebreaker just above, this would give the values 25, 26, and 20, respectively, to the first three entries, matching the C, B, and F ordering from Knuth. So instead of C, let's take B, allowing the tie to be broken with the highest value of the above function.
OK, so our old "best worst case play" (with C) was:
1 [ 0, 0, 1, 1 ] [ 0, 0, 4 ] 256
2 [ 2, 2, 3, 4 ] [ 1, 1, 2 ] 46
3 [ 2, 5, 2, 5 ] [ 0, 1, 3 ] 6
4 [ 3, 4, 5, 4 ] [ 1, 0, 3 ] 1
5 [ 3, 3, 3, 2 ] [ 4, 0, 0 ] 1
And with B we get:
1 [ 0, 0, 1, 1 ] [ 1, 0, 3 ] 256
2 [ 0, 2, 3, 3 ] [ 0, 1, 3 ] 44
3 [ 2, 4, 1, 5 ] [ 1, 2, 1 ] 7
4 [ 2, 5, 2, 1 ] [ 1, 1, 2 ] 2
5 [ 3, 4, 5, 1 ] [ 1, 3, 0 ] 1
6 [ 3, 5, 1, 4 ] [ 4, 0, 0 ] 1
Hmm, something's wrong, that took 6 moves. Let's compare with Knuth's paper again. First line is OK, but then Knuth says the next guess should be 2344, meaning [1,2,3,3] above, but instead we find [0, 2, 3, 3]. Both guesses end up with a worst case grade narrowing to 44 possibilities, so they should be considered a tie, and subject to the guess order convention from the paper, which is that the numerically lowest guess is used. It would seem that Knuth is somehow not following his own stated convention, and yet by doing so, is arriving at a better guess to use.
We now take a moment for an interesting tangent. Knuth's next guess isn't among our ties:
1 [ 0, 0, 1, 1 ] [ 1, 0, 3 ] 256
code 5,5,4,1 ties for best count 44
code 5,5,3,1 ties for best count 44
code 5,5,2,1 ties for best count 44
code 5,5,1,4 ties for best count 44
code 5,5,1,3 ties for best count 44
code 5,5,1,2 ties for best count 44
code 5,0,4,4 ties for best count 44
code 5,0,3,3 ties for best count 44
code 5,0,2,2 ties for best count 44
code 4,4,5,1 ties for best count 44
code 4,4,3,1 ties for best count 44
code 4,4,2,1 ties for best count 44
code 4,4,1,5 ties for best count 44
code 4,4,1,3 ties for best count 44
code 4,4,1,2 ties for best count 44
code 4,0,5,5 ties for best count 44
code 4,0,3,3 ties for best count 44
code 4,0,2,2 ties for best count 44
code 3,3,5,1 ties for best count 44
code 3,3,4,1 ties for best count 44
code 3,3,2,1 ties for best count 44
code 3,3,1,5 ties for best count 44
code 3,3,1,4 ties for best count 44
code 3,3,1,2 ties for best count 44
code 3,0,5,5 ties for best count 44
code 3,0,4,4 ties for best count 44
code 3,0,2,2 ties for best count 44
code 2,2,5,1 ties for best count 44
code 2,2,4,1 ties for best count 44
code 2,2,3,1 ties for best count 44
code 2,2,1,5 ties for best count 44
code 2,2,1,4 ties for best count 44
code 2,2,1,3 ties for best count 44
code 2,0,5,5 ties for best count 44
code 2,0,4,4 ties for best count 44
code 2,0,3,3 ties for best count 44
code 0,5,4,4 ties for best count 44
code 0,5,3,3 ties for best count 44
code 0,5,2,2 ties for best count 44
code 0,4,5,5 ties for best count 44
code 0,4,3,3 ties for best count 44
code 0,4,2,2 ties for best count 44
code 0,3,5,5 ties for best count 44
code 0,3,4,4 ties for best count 44
code 0,3,2,2 ties for best count 44
code 0,2,5,5 ties for best count 44
code 0,2,4,4 ties for best count 44
code 0,2,3,3 ties for best count 44
2 [ 0, 2, 3, 3 ] [ 0, 1, 3 ] 44
Why not? Because of the assumptions of my guessing algorithm. The code [0,0,1,1] has one number exactly right, nothing else. The code can't start with 1, because then there would have been a 'right colour, wrong spot' in the grade. So I was considering it illogical to guess a code starting with 1...but that's exactly what Knuth suggests!
So, the guessing algorithm has an interesting logical flaw: apparently, you sometimes have to guess something that you already know cannot possibly be the right answer, because somehow this can more efficiently reduce the remaining number of moves.
I hadn't caught on earlier, but this is exactly what Knuth meant here:
You can see, from the second-last line, that the last line is what I had been considering illogical--and Knuth rightly considers to be brilliant. 3526 is given a grade of BWW, meaning at least three of those numbers must be in the final code. 1462 only shares the 2 and the 6, so 1462 can't possibly be the final code, and yet, it's the best guess in this case, because it eliminates more possibilities than any of the actual final code candidates could.
OK, so if we check every possibility when looking for our best guess, not just ones that could be the answer, we get a different analysis. It's still not the same as Knuth's conclusion, though:
1 [ 0, 0, 0, 0 ] [ 0, 0, 4 ] 625
2 [ 1, 1, 2, 2 ] [ 1, 1, 2 ] 120
3 [ 1, 1, 1, 3 ] [ 0, 0, 4 ] 20
4 [ 2, 4, 2, 4 ] [ 3, 0, 1 ] 3
5 [ 0, 0, 0, 5 ] [ 1, 0, 3 ] 1
6 [ 2, 4, 2, 5 ] [ 4, 0, 0 ] 1
1 [ 0, 0, 0, 1 ] [ 1, 0, 3 ] 317
2 [ 0, 2, 3, 2 ] [ 0, 1, 3 ] 55
3 [ 2, 4, 5, 0 ] [ 1, 0, 3 ] 8
4 [ 0, 1, 5, 5 ] [ 1, 1, 2 ] 2
5 [ 0, 0, 1, 1 ] [ 1, 0, 3 ] 1
6 [ 3, 3, 5, 1 ] [ 4, 0, 0 ] 1
1 [ 0, 0, 1, 1 ] [ 1, 0, 3 ] 256
2 [ 0, 2, 3, 3 ] [ 0, 1, 3 ] 44
3 [ 0, 4, 1, 4 ] [ 2, 0, 2 ] 7
4 [ 3, 4, 0, 5 ] [ 3, 0, 1 ] 1
5 [ 3, 4, 1, 5 ] [ 4, 0, 0 ] 1
1 [ 0, 0, 1, 2 ] [ 0, 1, 3 ] 276
2 [ 0, 3, 3, 4 ] [ 0, 1, 3 ] 51
3 [ 4, 1, 5, 1 ] [ 1, 1, 2 ] 6
4 [ 2, 2, 4, 4 ] [ 0, 0, 4 ] 1
5 [ 1, 5, 5, 3 ] [ 4, 0, 0 ] 1
1 [ 0, 1, 2, 3 ] [ 0, 2, 2 ] 312
2 [ 1, 4, 3, 4 ] [ 1, 0, 3 ] 54
3 [ 1, 0, 4, 5 ] [ 2, 0, 2 ] 8
4 [ 0, 0, 0, 1 ] [ 1, 0, 3 ] 2
5 [ 0, 0, 3, 3 ] [ 2, 0, 2 ] 1
6 [ 5, 0, 3, 5 ] [ 4, 0, 0 ] 1
For 4 slots and 6 colours, the min moves needed are 5, using starting code(s):
[[0,0,1,1],[0,0,1,2]]
On the surface, it's a bit closer, though: Knuth's declaration of 1234 being impossible to guarantee a 5-move win now holds up; and my algorithm now gives the proper result of 5 for 1122. I guess it remains to be seen whether 1123 is as good as 1122.
http://www-cs-faculty.stanford.edu/~uno/fg.html links to some analyses like that of Figure 1 in the original paper, but done by Tom Nestor in a more extensive strategy analysis. In every one of the files linked to from there that show an analysis by Nestor, 1123 is actually the recommended starting move! Since I don't have access to the book yet, I can't say further why my algorithm might correspond in this way, but meanwhile, we could do a comparison between Nestor's various files and my algorithm, to find out which (if any) it corresponds to.
Meanwhile, my algorithm still must be missing something, because in some cases I am getting higher "minimum number of turns in the worst case" values than given here. But it's a bit hard to tell, since "these were generated by computer search" doesn't give enough detail to analyze like we had with Knuth's paper and now have with Nestor's strategies.