Minimaxer Part 4 - Code optimisation

Table of Contents

Making minimax run Blazingly Fast ™ with generators, faster loops and smarter sorting.

Part 1 - Minimax and negamax
Part 2 - Alpha-beta pruning and iterative deepening
Part 3 - Mancala AI and benchmarking
Part 4 - Code optimisation

Intro #

So far on the development journey: the core minimax algorithm works, and some very effective improvements have been implemented, with alpha-beta pruning, iterative deepening and pre-sorting child nodes.

Next up is tweaking the code to try extract even more performance, and crush any puny humans that dare take on the machines.

Minimaxer

The Node generator #

In the current implementation, each call to the minimax or negamax functions (depth > 1) requires all children to be created before they are iterated over. When searching the tree in its entirety, this is a perfectly reasonable approach.

However, with alpha-beta pruning, it may not be unnecessary to search every possible child of a node. The operations required to create a child involve cloning the gamestate, playing a move and possibly evaluating the new gamestate, which for some games could be very costly.

A better alternative is to create the child nodes only when they are needed. The child creation is moved inside the main loop, which can be broken early if the pruning conditions occur:

class Tree<GS, M> {
    ...
    negamax (node: Node<GS, M>, depth:number, aim:NodeAim ) {
        ...
            for (const move of node.moves){
                // This function creates child node, adds it to
                // node.children and returns the child
                const child = this.createChild(node, move);
                // evaluate child, alpha-beta prune etc.
                ...
            }
        ...
        }
    }
}

This is fine for a single search, but when using iterative deepening it is a lot of duplicated work and extra child nodes will be created. Ideally we first want to search through the children that have previously been created, then create new ones if required, which are saved for any further searches.

Generators #

To achieve this goal, I shall use a generator function.

A generator function looks like a normal function, but with an asterisk proceeding its name, and the use of the yield keyword. Calling the function does not execute the code inside, but returns a generator object that can be iterated over.

Each call to the next method of the generator object runs the generator function, returning the variable referred to by the yield statement. The function state is saved, and then continued on the next call to next, again until yielding.

The generator function for lazily returning a nodes children looks like this:

// Method on Tree class
protected *childGen(node: Node<GS, M>): Generator<Node<GS, M>> {
        // Yield children already created first
        for (const child of node.children) {
            yield child;
        }
        // Then create new ones from moves
        const nMoves = node.moves.length;
        while (node.moveInd < nMoves) {
            const move = node.moves[node.moveInd];
            const child = this.CreateChildNode(node, move);
            child.parent = node;
            child.move = move;
            node.children.push(child);
            node.moveInd++;
            this.nodeCount++;
            yield child;
        }
    }

Basically it returns all the children already attached to the node, then creates new ones until there are no moves remaining. The index of the next move to use to create a child is tracked with the node.moveInd property.

It is not required to call the next method of a generator object over and over, instead it can be iterated over naturally with a for-of loop.

class Tree<GS, M> {
    ...
    negamax (node: Node<GS, M>, depth:number, aim:NodeAim ) {
        ...
            const childGen = this.childGen(node);
            for (const child of childGen){
                // evaluate child, alpha-beta prune etc.
                ...
            }
        ...
        }
    }
}

How does it perform? #

Stunningly well.

This is mancala to depth 5 and 6 as per previous post:

Config	Depth 5 (op/s)	Depth 6 (op/s)
Standard	320	15
Alpha-beta deepening with presort	4664	1015
Alpha-beta deepening with presort + generator	5611	1991

Thats a 20% and 96% improvement for each depth respectively. Not only that, it will also consume less memory as fewer nodes have been created.

Child sorting #

The fastest configuration for the minimax search relies on alpha-beta pruning run at successively deeper depths, sorting the list of children before each node search to maximise pruning chances.

Due to iterative deepening, it is possible for a node to have children with values assigned to them from different depth searches. The depth a value is assigned to a node is tracked with the inheritedDepth property on the node, and the algorithm ensures that when sorting, only the most recently assigned values are considered.

Since the children with the most recently assigned values will appear first in the list, only this subset of the list actually requires sorting.

This is achieved using bubble sort where the inner loop is stopped whenever a node with a smaller inheritedDepth is reached:

// Sort using bubblesort, stopping at nodes that havent been updated
export function bubbleSort(
    list: Node<unknown, unknown>[],
    reverse = false,
) {
    for (;;) {
        // Mark as not changed for this pass
        let changed = false;
        // Iterate through list
        for (let i = 1; i < list.length; i++) {
            // Get a and b children
            const a = list[i - 1];
            const b = list[i];
            // Check if reached un updated nodes
            if (b.inheritedDepth > a.inheritedDepth) {
                // Stop this iteration
                break;
            }
            // Switch a and b if condition true
            if ((!reverse && b.inheritedValue > a.inheritedValue) ||
            (reverse && b.inheritedValue < a.inheritedValue)) {
                list[i - 1] = b;
                list[i] = a;
                changed = true;
            }
        }
        // Stop if nothing changed this iteration
        if (!changed) {
            break;
        }
    }
}

Performance of bubble sort #

Config	Depth 5 (op/s)	Depth 6 (op/s)
Standard	320	15
Alpha-beta deepening with presort	4664	1015
Alpha-beta deepening with presort + generator	5611	1991
As above + bubble sort	5890	1456

Interestingly, the efficient bubble sort is marginally quicker at depth 5, but slower at depth 6. This is likely because more sorts were required with all the children at depth 6, and bubble sort is slower than the built-in sort. Maybe a quicker sorting algorithm capable of early exit could improve here.

Optimal function #

The algorithm as it is currently implemented is very configurable, with optional pruning, sorting and generators, and even different sorting algorithms. But every recursive call to the minimax/negamax function contains branches and function calls to check for and implement the different options. What if there is a function with all the most performant options but no configuration available?

Config	Depth 5 (op/s)	Depth 6 (op/s)
Standard	320	15
Alpha-beta deepening with presort	4664	1015
Alpha-beta deepening with presort + generator	5611	1991
As above + bubble sort	5890	1456
Optimal function	6453	1765

The optimal function uses alpha-beta pruning, iterative deepening and pre sort with efficient bubble sort, the same as the test above it. Without all the extra branching and function calls, it shows a 10 to 21% improvement.

Conclusion #

In the 2 examples I have been using, the fastest version of the negamax algorithm is now 20x and over 100x faster than the standard implementation. Most of the improvements are due to effective alpha-beta pruning, but the code changes have made a significant impact as well.

The next post will look at expanding the minimax algorithm to games with more than 2 players.