Minimaxer Part 4 - Code optimisation
Table of Contents
- Part 1 - Minimax and negamax
- Part 2 - Alpha-beta pruning and iterative deepening
- Part 3 - Mancala AI and benchmarking
- Part 4 - Code optimisation
Intro #
So far on the development journey: the core minimax algorithm works, and some very effective improvements have been implemented, with alpha-beta pruning, iterative deepening and pre-sorting child nodes.
Next up is tweaking the code to try to extract even more performance, and crush any puny humans that dare take on the machines.
The Node generator #
In the current implementation, each call to the minimax or negamax functions (depth > 1) requires all children to be created before they are iterated over. When searching the tree in its entirety, this is a perfectly reasonable approach.
However, with alpha-beta pruning, it may not be necessary to search every possible child of a node. The operations required to create a child involve cloning the gamestate, playing a move and possibly evaluating the new gamestate, which for some games could be very costly.
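To make that cost concrete, here is a hypothetical sketch of what creating a child involves. The Gamestate shape and function names are invented for illustration; the real operations are game-specific:

```typescript
// Hypothetical sketch of the per-child cost described above: clone the
// gamestate, play the move, then (optionally) evaluate the result.
interface Gamestate {
    board: number[];
}

function cloneGamestate(gs: Gamestate): Gamestate {
    // Copying the whole state can be expensive for large games
    return { board: [...gs.board] };
}

function playMove(gs: Gamestate, move: number): void {
    // Stand-in for the real game rules
    gs.board[move] += 1;
}

function createChild(gs: Gamestate, move: number): Gamestate {
    const child = cloneGamestate(gs); // clone
    playMove(child, move);            // play the move
    return child;                     // evaluation of `child` would go here
}

const parent: Gamestate = { board: [0, 0, 0] };
const child = createChild(parent, 1);
console.log(child.board);  // [0, 1, 0]
console.log(parent.board); // [0, 0, 0] - parent untouched
```

If every one of these children is then pruned away without being visited, all of that cloning work was wasted.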
A better alternative is to create the child nodes only when they are needed. The child creation is moved inside the main loop, which can be broken early if the pruning conditions occur:
class Tree<GS, M> {
    ...
    negamax(node: Node<GS, M>, depth: number, aim: NodeAim) {
        ...
        for (const move of node.moves) {
            // This function creates the child node, adds it to
            // node.children and returns the child
            const child = this.createChild(node, move);
            // evaluate child, alpha-beta prune etc.
            ...
        }
        ...
    }
}
This is fine for a single search, but when using iterative deepening it is a lot of duplicated work and extra child nodes will be created. Ideally we first want to search through the children that have previously been created, then create new ones if required, which are saved for any further searches.
Generators #
To achieve this goal, I shall use a generator function.
A generator function looks like a normal function, but with an asterisk preceding its name and the use of the yield keyword. Calling the function does not execute the code inside; instead it returns a generator object that can be iterated over.
Each call to the next method of the generator object runs the generator function until it reaches a yield statement, returning the yielded value. The function's state is saved, and execution resumes from that point on the next call to next, again running until the next yield.
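A minimal example (unrelated to the tree code) shows the mechanics:

```typescript
// A generator pauses at each yield; state is kept between next() calls
function* countTo(n: number): Generator<number> {
    for (let i = 1; i <= n; i++) {
        yield i; // execution pauses here until the next next() call
    }
}

const gen = countTo(3);
console.log(gen.next().value); // 1
console.log(gen.next().value); // 2
console.log(gen.next().value); // 3
console.log(gen.next().done);  // true - no more values
```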
The generator function for lazily returning a node's children looks like this:
// Method on Tree class
protected *childGen(node: Node<GS, M>): Generator<Node<GS, M>> {
// Yield children already created first
for (const child of node.children) {
yield child;
}
// Then create new ones from moves
const nMoves = node.moves.length;
while (node.moveInd < nMoves) {
const move = node.moves[node.moveInd];
const child = this.CreateChildNode(node, move);
child.parent = node;
child.move = move;
node.children.push(child);
node.moveInd++;
this.nodeCount++;
yield child;
}
}
Basically, it first yields all the children already attached to the node, then creates new ones until there are no moves remaining. The index of the next move to use for child creation is tracked with the node.moveInd property.
It is not necessary to call the next method of a generator object over and over; instead, the object can be iterated over naturally with a for-of loop.
class Tree<GS, M> {
    ...
    negamax(node: Node<GS, M>, depth: number, aim: NodeAim) {
        ...
        const childGen = this.childGen(node);
        for (const child of childGen) {
            // evaluate child, alpha-beta prune etc.
            ...
        }
        ...
    }
}
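To see why this combination suits iterative deepening, here is a standalone sketch of the same pattern, with a simplified node shape and plain numbers standing in for child nodes:

```typescript
// Simplified stand-in for the real Node class
interface SimpleNode {
    moves: number[];
    children: number[]; // children represented by their move for brevity
    moveInd: number;
}

function* lazyChildren(node: SimpleNode): Generator<number> {
    // Yield children already created first
    for (const child of node.children) {
        yield child;
    }
    // Then create new ones from the remaining moves
    while (node.moveInd < node.moves.length) {
        const child = node.moves[node.moveInd];
        node.children.push(child);
        node.moveInd++;
        yield child;
    }
}

const node: SimpleNode = { moves: [10, 20, 30, 40], children: [], moveInd: 0 };

// A first search prunes after two children...
let seen = 0;
for (const child of lazyChildren(node)) {
    if (++seen === 2) break; // simulate an alpha-beta cutoff
}
console.log(node.children); // [10, 20] - only two children were created

// ...and a later, deeper search reuses them before creating the rest
console.log([...lazyChildren(node)]); // [10, 20, 30, 40]
```

Breaking out of the loop simply abandons the generator; the children created so far persist on the node, ready for the next, deeper search.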
How does it perform? #
Stunningly well.
This is mancala searched to depths 5 and 6, as per the previous post:
Config | Depth 5 (op/s) | Depth 6 (op/s) |
---|---|---|
Standard | 320 | 15 |
Alpha-beta deepening with presort | 4664 | 1015 |
Alpha-beta deepening with presort + generator | 5611 | 1991 |
That's a 20% and 96% improvement for depths 5 and 6 respectively. Not only that, it also consumes less memory, as fewer nodes are created.
Child sorting #
The fastest configuration for the minimax search relies on alpha-beta pruning run at successively deeper depths, sorting the list of children before each node search to maximise pruning chances.
Due to iterative deepening, it is possible for a node to have children with values assigned from searches of different depths. The depth at which a value was assigned to a node is tracked with the inheritedDepth property on the node, and the algorithm ensures that when sorting, only the most recently assigned values are considered.
Since the children with the most recently assigned values will appear first in the list, only this subset of the list actually requires sorting.
This is achieved using bubble sort where the inner loop is stopped whenever a node with a smaller inheritedDepth
is reached:
// Sort using bubble sort, stopping at nodes that haven't been updated
export function bubbleSort(
list: Node<unknown, unknown>[],
reverse = false,
) {
for (;;) {
// Mark as not changed for this pass
let changed = false;
// Iterate through list
for (let i = 1; i < list.length; i++) {
// Get a and b children
const a = list[i - 1];
const b = list[i];
// Check if the not-yet-updated nodes have been reached
if (b.inheritedDepth > a.inheritedDepth) {
// Stop this iteration
break;
}
// Switch a and b if condition true
if ((!reverse && b.inheritedValue > a.inheritedValue) ||
(reverse && b.inheritedValue < a.inheritedValue)) {
list[i - 1] = b;
list[i] = a;
changed = true;
}
}
// Stop if nothing changed this iteration
if (!changed) {
break;
}
}
}
Performance of bubble sort #
Config | Depth 5 (op/s) | Depth 6 (op/s) |
---|---|---|
Standard | 320 | 15 |
Alpha-beta deepening with presort | 4664 | 1015 |
Alpha-beta deepening with presort + generator | 5611 | 1991 |
As above + bubble sort | 5890 | 1456 |
Interestingly, the efficient bubble sort is marginally quicker at depth 5, but slower at depth 6. This is likely because more sorts were required with all the children at depth 6, and bubble sort is slower than the built-in sort. Maybe a quicker sorting algorithm capable of early exit could improve here.
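As an untested sketch of that idea, insertion sort is fast on nearly sorted lists and can use the same early-exit condition as the bubble sort above; the node shape below is trimmed to just the two fields the sort reads:

```typescript
// Untested sketch: insertion sort with the same early-exit condition as
// the bubble sort above. Node shape trimmed to the fields the sort reads.
interface SortNode {
    inheritedDepth: number;
    inheritedValue: number;
}

function insertionSort(list: SortNode[], reverse = false): void {
    for (let i = 1; i < list.length; i++) {
        const current = list[i];
        // Stop once the not-yet-updated tail of the list is reached
        if (current.inheritedDepth > list[i - 1].inheritedDepth) {
            break;
        }
        // Shift out-of-place elements up and drop current into position
        let j = i - 1;
        while (
            j >= 0 &&
            ((!reverse && list[j].inheritedValue < current.inheritedValue) ||
                (reverse && list[j].inheritedValue > current.inheritedValue))
        ) {
            list[j + 1] = list[j];
            j--;
        }
        list[j + 1] = current;
    }
}
```

Unlike bubble sort's repeated passes, this moves each out-of-place element in a single pass, so it may fare better in the depth 6 case where more reordering is needed.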
Optimal function #
The algorithm as currently implemented is very configurable, with optional pruning, sorting and generators, and even different sorting algorithms. But every recursive call to the minimax/negamax function contains branches and function calls to check for and apply the different options. What if there were a function with all of the most performant options baked in and no configuration at all?
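As a miniature illustration of the idea (not the library's actual function), here is a self-contained negamax with alpha-beta pruning hard-coded into the loop, operating on a toy tree whose leaf values are stored from the root player's perspective:

```typescript
// Toy node: explicit children, leaf value baked in. Invented for this
// sketch; the real library builds children from gamestates and moves.
interface ToyNode {
    value: number; // leaf evaluation, from the root player's perspective
    children: ToyNode[];
}

// Pruning is baked into the loop - no option checks on the hot path
function negamaxOptimal(
    node: ToyNode,
    depth: number,
    alpha: number,
    beta: number,
    colour: number,
): number {
    if (depth === 0 || node.children.length === 0) {
        return colour * node.value;
    }
    let best = -Infinity;
    for (const child of node.children) {
        best = Math.max(best, -negamaxOptimal(child, depth - 1, -beta, -alpha, -colour));
        alpha = Math.max(alpha, best);
        if (alpha >= beta) break; // alpha-beta cutoff
    }
    return best;
}

const leaf = (value: number): ToyNode => ({ value, children: [] });
const root: ToyNode = {
    value: 0,
    children: [
        { value: 0, children: [leaf(3), leaf(5)] },
        { value: 0, children: [leaf(2), leaf(9)] }, // leaf(9) gets pruned
    ],
};
console.log(negamaxOptimal(root, 2, -Infinity, Infinity, 1)); // 3
```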
Config | Depth 5 (op/s) | Depth 6 (op/s) |
---|---|---|
Standard | 320 | 15 |
Alpha-beta deepening with presort | 4664 | 1015 |
Alpha-beta deepening with presort + generator | 5611 | 1991 |
As above + bubble sort | 5890 | 1456 |
Optimal function | 6453 | 1765 |
The optimal function uses alpha-beta pruning, iterative deepening and pre-sorting with the efficient bubble sort, the same configuration as the row above it. Without all the extra branching and function calls, it shows a 10 to 21% improvement.
Conclusion #
In the two examples I have been using, the fastest version of the negamax algorithm is now 20x and over 100x faster than the standard implementation. Most of the improvements are due to effective alpha-beta pruning, but the code changes have made a significant impact as well.
The next post will look at expanding the minimax algorithm to games with more than 2 players.