
Minimaxer Part 2 - Alpha-beta pruning and iterative deepening

Improving on the minimax algorithm with alpha-beta pruning, iterative deepening, time-based searches and more.

This is the second post in a series about the minimax TypeScript library I have created. Part 1 looked at representing the game tree, making it generic to work with any game, and the core minimax algorithm.

This post will look at the most common improvement to minimax: alpha-beta pruning. Also included are some other features that make the search for the best move in a game more useful and more efficient.


Iterative deepening #

At first glance iterative deepening seems quite useless. The idea is that instead of immediately searching to a certain depth, e.g. 3 moves ahead, the algorithm will first search to depth 1, then 2 and finally 3. A lot of extra work for exactly the same result. Here is what it may look like as a method on the game tree:

class Tree<GS, M> {
    ...
    minimax (node: Node<GS, M>, depth:number ) {
        ...
    }

    deepening(max_depth:number) {
        // iterate through depths
        for (let depth = 1; depth <= max_depth; depth++){
            this.minimax(this.root, depth);
        }
    }

}

As before, the value and best move are available at tree.root.child.value and tree.root.child.move respectively.
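As a quick sketch of how this might be used (assuming `tree` is a Tree instance constructed as described in Part 1):

// Hypothetical usage, assuming `tree` was created as described in Part 1
tree.deepening(3);                  // searches to depth 1, then 2, then 3
console.log(tree.root.child.move);  // best move found
console.log(tree.root.child.value); // value of that move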

But why bother? Read on and find out…

Imagine using the minimax algorithm to create a game-playing AI. How do you decide the search depth when picking a move? At the start of the game it may take a few seconds to search 2 moves ahead, but towards the end it may be able to search 5-6 moves. How do you find a balance? And what about the performance difference between different devices?

The solution is to limit the time taken to search, rather than the depth. And this is where iterative deepening comes in. The tree is searched progressively deeper until a time limit expires, and the result from the previous depth is used.

First I shall define a new enum to track the reason for a search exiting:

export const enum SearchExit {
    /** At least 1 path did not reach a leaf node */
    DEPTH,
    /** All paths reached leaf nodes */
    FULL_DEPTH,
    /** Search concluded because of timeout */
    TIME,
}

Then modify the minimax algorithm a little bit:

class Tree<GS, M> {
    fulldepth = true;
    expireTime = 0;
    ...
    minimax (node: Node<GS, M>, depth:number ): SearchExit {
        // Check if node is LEAF or reached depth
        if (node.type == NodeType.LEAF || depth == 0){
            node.value = this.EvaluateNode(node)
            if (!this.fulldepth){
                return SearchExit.DEPTH;
            } else if (node.type != NodeType.LEAF){
                this.fulldepth = false
                return SearchExit.DEPTH;
            } else {
                return SearchExit.FULL_DEPTH;
            }
        } else {
            // Check for expiry time
            if (this.expireTime && Date.now() >= this.expireTime){
                return SearchExit.TIME;
            }
            // Create all the children for the node
            this.createChildren(node)
            // Iterate through children to find the best (aim dependant)
            let best = node.children[0];
            for (const child of node.children){
                // Recursive minimax call at depth - 1
                if (this.minimax(child, depth-1) == SearchExit.TIME){
                    return SearchExit.TIME;
                }
                // Check if child is better
                if (node.type == NodeType.MAX){
                    if (child.value > best.value){
                        best = child
                    }
                } else {
                    if (child.value < best.value){
                        best = child
                    }
                }
            }
            // Assign the best child and value to parent
            node.child = best
            node.value = best.value
            return this.fulldepth ? SearchExit.FULL_DEPTH : SearchExit.DEPTH;
        }
    }

}

The minimax function now checks for a timeout before searching child nodes. If a timeout occurs, this is rapidly passed up through the recursive function calls and the search exits at that depth without assigning a new value or best move to the root.

Also added is a way of marking a search as full-depth, i.e. that no further moves can be analysed. This is useful for stopping the search from running longer than necessary.

Finally, a function is added to perform a time-limited search:

class Tree<GS, M> {
    ...

    deepening_time(timeout:number) {
        this.expireTime = Date.now() + timeout;
        // iterate through depths
        for (let depth = 1; ; depth++){
            // Reset the full-depth flag for each new depth
            this.fulldepth = true;
            const exit = this.minimax(this.root, depth);
            if (exit == SearchExit.TIME || exit == SearchExit.FULL_DEPTH){
                return;
            }
        }
    }

}

This first sets up the expiry time, then iterates deeper and deeper (resetting the full depth flag for each pass) until a timeout or a full-depth search occurs.

The way it is implemented, at least a depth 1 search is guaranteed, so an evaluation and move are always assigned to root.
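Usage might look something like this (again a hypothetical sketch, assuming a tree built as in Part 1):

// Hypothetical usage: search as deep as possible within 100 milliseconds
tree.deepening_time(100);
console.log(tree.root.child.move); // best move from the deepest completed search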

Alpha-beta pruning #

The minimax algorithm requires searching through every node in the tree. For each extra turn that is considered, the number of nodes increases exponentially. Clearly reducing the number of nodes to search could have a big impact on performance, which is exactly what alpha-beta pruning does.

The principle behind it is to keep track of the best value that can be achieved by each player: alpha is the minimum score the maximising player is guaranteed, beta is the maximum score the minimising player is guaranteed. The initial values are -infinity and infinity respectively.

If at any point during the tree search the maximising player finds a move/child that is valued greater than beta, the search from that parent is stopped. This is because the minimising player on the previous move already has a better option, so will not select that node as a child node during its search. Equally, if the minimising player ever finds a node worth less than alpha, it will stop its search.

It is important to point out that this has no effect on the final result. The best move and associated value will be exactly the same as if doing a full minimax search.
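To make the principle concrete before looking at the library code, below is a minimal, standalone sketch of alpha-beta pruning over a toy game tree represented as nested arrays of leaf values. This is purely illustrative and is not the library's API or implementation.

// Minimal standalone alpha-beta sketch (not the library's code).
// The game tree is a nested array: leaves are numbers, interior nodes are arrays.
type GameTree = number | GameTree[];

function alphabeta(
    node: GameTree,
    maximising: boolean,
    alpha = -Infinity,
    beta = Infinity,
): number {
    // Leaf node: return its static evaluation
    if (typeof node === "number") {
        return node;
    }
    let best = maximising ? -Infinity : Infinity;
    for (const child of node) {
        const value = alphabeta(child, !maximising, alpha, beta);
        if (maximising) {
            best = Math.max(best, value);
            alpha = Math.max(alpha, best);
        } else {
            best = Math.min(best, value);
            beta = Math.min(beta, best);
        }
        // Window closed: the other player already has a better option elsewhere
        if (beta <= alpha) {
            break;
        }
    }
    return best;
}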

Below is the minimax function with alpha-beta pruning. I have removed some parts of the function to focus on the changes.

class Tree<GS, M> {
    ...
    minimax (node: Node<GS, M>, depth:number, alpha:number, beta:number): SearchExit {
        // Check if node is LEAF or reached depth
        if (node.type == NodeType.LEAF || depth == 0){
            ...
        } else {
            ...
            // Iterate through children to find the best (aim dependant)
            let best = node.children[0];
            for (const child of node.children){
                // Recursive minimax call at depth - 1
                if (this.minimax(child, depth-1, alpha, beta) == SearchExit.TIME){
                    return SearchExit.TIME;
                }
                // Check if child is better
                if (node.type == NodeType.MAX){
                    if (child.value > best.value){
                        best = child
                        if (best.value > beta){
                            break;
                        }
                        alpha = Math.max(alpha, best.value);
                    }
                } else {
                    if (child.value < best.value){
                        best = child
                        if (best.value < alpha){
                            break;
                        }
                        beta = Math.min(beta, best.value);
                    }
                }
            }
            // Assign the best child and value to parent
            node.child = best
            node.value = best.value
            return this.fulldepth ? SearchExit.FULL_DEPTH : SearchExit.DEPTH;
        }
    }

}

Alpha-beta pruning example #

Consider the game tree below. The first player is the maximiser, the second is the minimiser. Highlighted in yellow is the best move that the minimising player would choose after searching the tree below the first move.

graph TB; A((" ")) A-->E(("3")) A-->B((" ")) A-->D((" ")) E-->H((4)) E-->F((5)) E-->G((3)):::min B-->I((1)) B-->J(("-1")) D-->L(("-8")) D-->M(("-4")) D-->N(("-5")) classDef max fill:#f9f classDef min fill:#ff9 classDef prune fill:#666 linkStyle 5 stroke:#ff9

Now when searching the first player’s second move, the value of alpha is 3, since that is the minimum they can guarantee scoring. When the moves below the second move are searched, the move resulting in a value of -1 is not considered, as 1 is less than 3.

graph TB;
    A((" "))
    A-->E(("3"))
    A-->B(("1"))
    A-->D((" "))
    E-->H((4))
    E-->F((5))
    E-->G((3)):::min
    B-->I((1)):::min
    B-->J(("-1")):::prune
    D-->L(("-8"))
    D-->M(("-4"))
    D-->N(("-5"))
    classDef max fill:#f9f,stroke:#333,stroke-width:1px;
    classDef min fill:#ff9,stroke:#333,stroke-width:1px;
    classDef prune fill:#666,stroke:#333,stroke-width:1px;
    linkStyle 5 stroke:#ff9
    linkStyle 6 stroke:#ff9

Then during the rest of the search, 2 more nodes can be pruned as alpha is still 3, and -8 is less than 3. The final tree, with the maximising player’s selection in pink, is shown below.

graph TB;
    A(("3"))
    A-->E(("3")):::max
    A-->B(("1"))
    A-->D(("-8"))
    E-->H((4))
    E-->F((5))
    E-->G((3)):::min
    B-->I((1)):::min
    B-->J(("-1")):::prune
    D-->L(("-8")):::min
    D-->M(("-4")):::prune
    D-->N(("-5")):::prune
    classDef max fill:#f9f,stroke:#333,stroke-width:1px;
    classDef min fill:#ff9,stroke:#333,stroke-width:1px;
    classDef prune fill:#666,stroke:#333,stroke-width:1px;
    linkStyle 0 stroke:#f9f
    linkStyle 5 stroke:#ff9
    linkStyle 6 stroke:#ff9
    linkStyle 8 stroke:#ff9
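As a quick check, running the standalone sketch from earlier on this example tree (with the children in the same left-to-right order as the diagram) returns the same value as a full minimax search, and the leaves valued -1, -4 and -5 are never evaluated:

// Example tree as nested arrays, maximiser to move at the root
// (uses the alphabeta() sketch defined above)
const exampleTree = [[4, 5, 3], [1, -1], [-8, -4, -5]];
console.log(alphabeta(exampleTree, true)); // 3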

Alpha-beta pruning with negamax #

Alpha-beta pruning can be used with negamax fairly easily. Since negamax always considers a maximising player, it is always looking to increase alpha and check against beta for pruning. Since the maximising and minimising players alternate turns, this means swapping the roles of alpha and beta and negating their signs at each level of the search:

class Tree<GS, M> {
    ...
    negamax (node: Node<GS, M>, depth:number, aim:NodeAim, alpha:number, beta:number) {
        // Check if node is LEAF or reached depth
        if (node.type == NodeType.LEAF || depth == 0){
            node.value = - aim * this.EvaluateNode(node)
        } else {
            // Create all the children for the node
            this.createChildren(node)
            // Iterate through children to find the best (aim dependant)
            let best = node.children[0];
            for (const child of node.children){
                // Recursive minimax call at depth - 1
                this.negamax(child, depth-1, -aim, -beta, -alpha)
                // Check if child is better
                if (child.value > best.value){
                    best = child
                    // alpha-beta check
                    if (best.value > beta){
                        break;
                    }
                    alpha = Math.max(alpha, best.value);
                }
            }
            // Assign the best child and value to parent
            node.child = best
            node.value = -best.value
        }
    }

}

It’s a bit of a mind-bender to think about, but it must work, it’s on Wikipedia.

Here is how the previous example would look with negamax alpha-beta pruning:

graph TB;
    A(("-3"))
    A-->E(("3")):::max
    A-->B(("1"))
    A-->D(("-8"))
    E-->H(("-4"))
    E-->F(("-5"))
    E-->G(("-3")):::min
    B-->I(("-1")):::min
    B-->J(("1")):::prune
    D-->L(("8")):::min
    D-->M(("4")):::prune
    D-->N(("5")):::prune
    classDef max fill:#f9f,stroke:#333,stroke-width:1px;
    classDef min fill:#ff9,stroke:#333,stroke-width:1px;
    classDef prune fill:#666,stroke:#333,stroke-width:1px;
    linkStyle 0 stroke:#f9f
    linkStyle 5 stroke:#ff9
    linkStyle 6 stroke:#ff9
    linkStyle 8 stroke:#ff9

Note how the nodes on the 1st and 3rd level are the negative of those in the previous example, due to the alternating sign of the negamax search.

Pre-sorting #

One insight you may have spotted when looking at alpha-beta pruning is how important move ordering is. In the examples, the tree is being searched left to right, in code this is from the start of an array to the end. If the best moves are searched first, the algorithm will have more opportunities to prune nodes, potentially improving performance.
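As a rough illustration of the effect, here is the standalone sketch from earlier with a simple leaf counter added (hypothetical code, not part of the library): with the strongest move searched first, only 5 of the example tree’s 8 leaves are evaluated, while with the weakest move searched first all 8 are.

// Same alpha-beta sketch as before, but counting how many leaves are evaluated
// (uses the GameTree type from the earlier sketch)
let leafCount = 0;

function alphabetaCounted(node: GameTree, maximising: boolean, alpha = -Infinity, beta = Infinity): number {
    if (typeof node === "number") {
        leafCount++; // leaf reached: count its evaluation
        return node;
    }
    let best = maximising ? -Infinity : Infinity;
    for (const child of node) {
        const value = alphabetaCounted(child, !maximising, alpha, beta);
        if (maximising) {
            best = Math.max(best, value);
            alpha = Math.max(alpha, best);
        } else {
            best = Math.min(best, value);
            beta = Math.min(beta, best);
        }
        if (beta <= alpha) break; // prune remaining siblings
    }
    return best;
}

leafCount = 0;
alphabetaCounted([[4, 5, 3], [1, -1], [-8, -4, -5]], true); // best move first
console.log(leafCount); // 5 leaves evaluated

leafCount = 0;
alphabetaCounted([[-8, -4, -5], [1, -1], [4, 5, 3]], true); // worst move first
console.log(leafCount); // 8 leaves evaluated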

So it is very important to have the moves sorted well before searching the tree, but I consider it out of scope for the library to get involved with move sorting; the users can do that.

But for some games move sorting isn’t possible, or might take so long that it actually performs worse than searching more nodes. Is there something else the library can do to help?

Iterative Deepening strikes again #

With iterative deepening it is possible to take advantage of the fact that nodes already have values assigned to them from the previous depth’s search, and sort them based on those values.

For example after a depth 1 search, the evaluations for each move may look like this:

graph TB;
    A((4))
    A-->B((2))
    A-->C(("-1"));
    A-->D(("-5"))
    A-->E((4)):::highlight
    classDef highlight fill:#f9f,stroke:#333,stroke-width:1px;
    linkStyle 3 stroke:#f9f

Since it’s the maximising player’s move, the move with a value of 4 is best. Then after sorting the root’s children, and searching to depth 2, the tree may look like this:

graph TB;
    A((3))
    A-->E((3)):::highlight
    A-->B(("1"))
    A-->C(("2"));
    A-->D(("-8"))
    E-->F((5))
    E-->G((3)):::min
    E-->H((4))
    B-->I((1)):::min
    B-->J(("-1")):::prune
    C-->K((2)):::min
    D-->L(("-8")):::min
    D-->M(("-4")):::prune
    D-->N(("-5")):::prune
    classDef highlight fill:#f9f,stroke:#333,stroke-width:1px;
    classDef min fill:#ff9,stroke:#333,stroke-width:1px;
    classDef prune fill:#666,stroke:#333,stroke-width:1px;
    linkStyle 0 stroke:#f9f
    linkStyle 5 stroke:#ff9

3 nodes have been pruned, the most possible with this game tree.

For the depth 3 search, both the root’s children and all of their children will be sorted. This is what the tree looks like after sorting, but before searching to depth 3.

graph TB;
    A((3))
    A-->E((3))
    A-->C(("2"));
    A-->B(("1"))
    A-->D(("-8"))
    E-->G((3))
    E-->H((4))
    E-->F((5))
    B-->I((1))
    B-->J(("-1")):::prune
    C-->K((2))
    D-->L(("-8"))
    D-->M(("-4")):::prune
    D-->N(("-5")):::prune
    classDef prune fill:#666,stroke:#333,stroke-width:1px;

Notice how the nodes valued at -4 and -5 have not been sorted. This is because the search never reached them at the previous depth, so their value is effectively unknown.
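Here is a minimal sketch of what such a pre-sort might look like, using the Node and NodeType types from earlier. This is hypothetical and not necessarily how the library implements it; in particular it assumes children that were never reached have no value assigned, and simply leaves them at the end.

// Hypothetical helper: order children using values from the previous,
// shallower search so the most promising moves are visited first.
function presortChildren<GS, M>(node: Node<GS, M>): void {
    node.children.sort((a, b) => {
        // Children never reached at the previous depth have no useful value,
        // so they sort after the evaluated ones (assumption: value is undefined)
        if (a.value == undefined && b.value == undefined) return 0;
        if (a.value == undefined) return 1;
        if (b.value == undefined) return -1;
        // Maximising player wants the highest value first, minimising the lowest
        return node.type == NodeType.MAX ? b.value - a.value : a.value - b.value;
    });
}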

Conclusion #

Alpha-beta pruning promises improvements over standard minimax by reducing the number of nodes searched. With iterative deepening and node sorting, it may be even better. Combined with time-based searches instead of depth-based ones, you have all the makings of an ‘AI’ game player. But how do you actually implement one, and how well does it perform?

Part 3 will have an AI that can play the ancient and popular game mancala, also with performance benchmarking!