LUDWIG FRANKLIN
HUGO MALMBERG
Abstract
Artificial intelligence has become more prevalent during the last few years,
revolutionizing the field of computer game-playing. By incorporating artificial
intelligence as a computerized opponent, games can become more engaging
and challenging for human players.
The minimax algorithm, when applied to a two-player turn-based game,
looks a given number of steps into the future and determines which
move leads to the best scenario, given that the opponent plays optimally. The
algorithm can be applied to a wide range of games, it is itself quite
simple, and depending on the type of game it is often hard
to outplay an AI using this algorithm. One big
disadvantage of the algorithm, however, is that it is relatively slow, since the
number of possible scenarios it has to consider often grows exponentially with
each turn.
This thesis presents an analysis of various strategies to accelerate the
performance of the minimax algorithm, with a particular focus on versions
with and without alpha-beta pruning. The study further explores the
performance implications of parallelization, examines how the optimal move
selection rate is influenced by search depth, and assesses whether dynamic
search depth offers a viable means of balancing optimal move selection rate
and execution speed.
The results derived from the study indicate that alpha-beta pruning is
more effective at higher search depths. They also indicate that alpha-beta
pruning allowed searches one step deeper than the standard
minimax algorithm at almost no added computational cost. The impact
of parallelization varied: it proved beneficial for deeper searches in the non-
pruned version but had almost no impact on the alpha-beta pruned version.
The optimal move selection rate did increase with added depth, but more data
is needed for conclusive results. Regarding dynamic search depth, we found
it only proved effective for the standard minimax algorithm, and that it is better
to use alpha-beta pruning.
Sammanfattning
Artificial intelligence has become more common in recent years, and it
has had a major impact on the computer games industry. By letting artificial
intelligence control opponents in games, the games can become more engaging
and challenging for the players.

The minimax algorithm, when applied to a two-player turn-based game,
can examine a given number of steps into the future and determine which
move leads to the best outcome, given that the opponent plays optimally.
The algorithm can be applied to a wide range of games, it is itself quite
simple, and depending on the type of game it can often be hard to
outplay an AI that uses this algorithm. One major drawback
of the algorithm, however, is that it is relatively slow, since the number of
possible scenarios it must consider often grows exponentially with each turn.

This report presents an analysis of various strategies for improving
the performance of the minimax algorithm, with a particular focus on versions
with and without alpha-beta pruning. The study further examines the
performance impact of parallelization, investigates how the selection of
optimal moves is affected by the search depth, and assesses whether dynamic
search depth can be used to balance optimal move selection against speed.

The results derived from the study indicate that alpha-beta pruning is
more effective at higher search depths. They also indicate that alpha-beta
pruning allowed searches one step deeper than the standard minimax algorithm,
at almost no extra computational cost. The impact of parallelization varied
between the versions: it was beneficial for deeper searches in the non-pruned
version but had almost no effect on the alpha-beta pruned version. The
selection of optimal moves improved with greater depth, but more data is
needed for a definitive result. Regarding dynamic search depth, we found that
it was only effective for the standard minimax algorithm and that it is better
to use alpha-beta pruning instead.
Contents
1 Introduction
  1.1 Problem Statement and Purpose
  1.2 Scope and Limitations
2 Background
  2.1 Game Theory in Artificial Intelligence
  2.2 The Minimax Algorithm
    2.2.1 α-β Pruning
    2.2.2 Variable Search Depth
    2.2.3 Parallelizing the Minimax Algorithm
  2.3 Fox-Game
    2.3.1 The Game
  2.4 Related Work
3 Method
  3.1 The Fox-game
  3.2 Minimax Algorithm
    3.2.1 Heuristics
    3.2.2 Nodes
    3.2.3 Parallelizing the Minimax Algorithm
  3.3 Tests
    3.3.1 Testing for Computation Required
    3.3.2 Testing for Time Required
    3.3.3 Testing for Optimal Move Selection Rate
  3.4 Hardware
4 Results and Analysis
5 Discussion
  5.1 Discussing the Results
    5.1.1 Looking at Node Count
    5.1.2 Parallelization
    5.1.3 Information Gain of Each Added Depth
    5.1.4 Analysis of Variable Depth
  5.2 How Our Methods Affect the Results
  5.3 Future Works
6 Conclusion
References
List of Figures
List of Tables
Chapter 1
Introduction
Artificial intelligence (AI) has become more prevalent during the last few
years, revolutionizing the field of computer game-playing. By incorporating
AI as a computerized opponent, games can become more engaging and
challenging for human players. Typically, a computerized opponent consists
of various logical functional modules, with look-ahead being one of the most
commonly used. Look-ahead is a type of AI mostly used for two-player board
games; it is, for example, used in the famous chess engine Stockfish. The
algorithm analyzes each possible future outcome and returns the optimal one
with regard to predetermined heuristics, given that the opponent plays optimally.
In this thesis, we apply the look-ahead algorithm to the Fox-game, an
old Scandinavian board game in which one player controls the hens, whose
goal is to move from one end of the board to the other, while the other player
controls the foxes and aims to stop the hens from achieving their goal by
jumping over them, thus removing them from the board.
Chapter 2
Background
may grow rapidly. The algorithm has a time complexity of O(b^m), where b is
the game tree's branching factor and m is the maximum depth [5].
The minimax algorithm is a recursive algorithm used in game theory and
artificial intelligence to calculate the optimal move for a player in a two-player,
zero-sum game. It searches through a tree of nodes, where each node
represents a possible game board and its pieces. The player controlling the
foxes wants to minimize the score, while the hens want to maximize it;
hence the name "minimax". The algorithm assumes that
both players will always make the best move for themselves [6].
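The recursion described above can be sketched in Python as follows; the `is_terminal()`, `score()` and `children()` node interface is a hypothetical abstraction for illustration, not the thesis's actual code.

```python
def minimax(node, depth, maximizing):
    """Return the best achievable score for the player to move.

    `node` is assumed to expose `is_terminal()`, `score()` and
    `children()`; these names are illustrative, not the thesis code.
    """
    if depth == 0 or node.is_terminal():
        return node.score()
    if maximizing:  # the hens maximize the score
        return max(minimax(c, depth - 1, False) for c in node.children())
    # the foxes minimize the score
    return min(minimax(c, depth - 1, True) for c in node.children())
```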
deep into the tree would require immense computational resources; therefore,
a shallower search depth may be required to limit the computation time. On
the other hand, if the game is in a state with only a few possible next
states, a deeper search becomes computationally feasible and allows the
algorithm to give more accurate results.
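A minimal sketch of such a dynamic depth rule, assuming the depth is picked from the number of possible next states; the thresholds below are illustrative, not the thesis's values.

```python
def choose_depth(num_moves, base_depth=3):
    """Pick a search depth from the current branching factor.

    Illustrative thresholds: widen the search when few moves exist,
    shrink it when the branching factor is large.
    """
    if num_moves <= 5:
        return base_depth + 2
    if num_moves <= 15:
        return base_depth + 1
    if num_moves >= 40:
        return base_depth - 1
    return base_depth
```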
2.3 Fox-Game
2.3.1 The Game
Fox-game is an old Scandinavian two-player board game, where one player plays
as the foxes and the other as the hens.
The board layout consists of five three-by-three squares, where each edge
point is connected vertically or horizontally to its closest edge-point
neighbors and the middle point is connected to all other points in the square.
These squares are combined to form a plus-sign-shaped board layout. One
of these squares makes up the coop.
There are two fox pieces and 20 hen pieces, and the objective for the hens
is to reach their coop on the other side of the board while the foxes try to stop
them. The foxes start from the two bottom corners of the coop, while the
hens start from the opposite side of the board, filling the top four rows. The
hens may only move forward or sideways, while the foxes may move
in any direction.
If a hen stands between a fox and an empty spot, the fox may choose
to capture that hen by jumping over it. The hens can in turn capture the foxes
by positioning themselves so that the foxes are unable to move.
The hens win if they manage to fill the whole coop or capture both foxes,
and the foxes win if they manage to prevent the hens from doing so, meaning
only eight hens remain.
Chapter 3
Method
As seen in fig. 3.1, we use yellow circles to represent hens, red circles
to represent foxes and a green circle to represent the selected position on the
board.
The algorithm works by recursively calling itself to evaluate each possible
move the current player can make, until it either reaches a position where one
player has won or has looked as many steps into the future as its depth
parameter allows, at which point it assigns the current position a
score based on the given heuristics, see section 3.2.1. The algorithm then chooses
the best move for the side it is controlling, while assuming the
opponent will play their most optimal moves in the upcoming turns.
3.2.1 Heuristics
Each board state is given a score based on certain heuristics. Since the hens
want to maximize points, each time we award points to the hens, we add that
number to the total score, while points awarded to
the foxes are subtracted from the total. These heuristics should not
affect the results of this report, since we only analyze the performance of the
algorithms, not how well they play. The reason for having any heuristic at all
is that the computer must be able to differentiate between moves.
The most important factor is whether someone has won. Winning a game is
awarded one thousand points. For the hens to win they need to fill certain spots
on the board; we call these positions the hen's coop. Three points are awarded
for each hen in the coop. Getting hens closer to their coop is also important,
therefore one point is awarded for moving a hen towards the coop. Hens are
also rewarded for staying closer to the edges.

Both players are rewarded for keeping their own pieces alive and capturing
the other player's pieces: one point is awarded for each hen alive and ten points
for each fox alive.
Item              Points
Hen alive         1
Fox alive         -10
Hen in coop       3
Hen on row R      R
Hen on column C   ⌊|3.5 − C|⌋
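Putting the table together, a scoring function might look as follows; the `hens`/`foxes`/`hen_coop` attribute names, and the orientation that a higher row index means closer to the coop, are assumptions for illustration only.

```python
import math

def evaluate(board):
    """Score a board state using the heuristics from the table above.

    `board` is assumed to expose `hens` and `foxes` as lists of
    (row, col) pairs and `hen_coop` as a set of coop positions --
    illustrative names, not the thesis code.
    """
    score = 0
    for row, col in board.hens:
        score += 1                            # each hen alive
        score += row                          # progress towards the coop
        score += math.floor(abs(3.5 - col))   # reward staying near the edges
        if (row, col) in board.hen_coop:
            score += 3                        # hen inside the coop
    score -= 10 * len(board.foxes)            # each fox alive counts against the hens
    return score
```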
3.2.2 Nodes
A node has information about a specific state of the game including the board.
Each node also contains a reference to all its own children, where a child is
a node based on a game board one valid move away from the current node’s
game board. See fig. 3.1.
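A minimal sketch of such a node, assuming Python and illustrative field names:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """One game state in the search tree (field names are illustrative)."""
    board: tuple                  # immutable snapshot of the board
    foxes_to_move: bool           # whose turn it is in this state
    children: List["Node"] = field(default_factory=list)

    def expand(self, next_boards):
        """Attach one child per valid move from this state."""
        for new_board in next_boards:
            self.children.append(Node(new_board, not self.foxes_to_move))
```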
3.3 Tests
We conducted multiple tests on the minimax algorithm when applied
to the Fox-game in order to collect data and measure its performance. The
tests were conducted by simulating games and performing independent tests
on every different game state.
variant to find the optimal move was measured with Python's time module [13].
The tested variants were: depths one to four, with and without α-β pruning,
and with sequential and parallel implementations. The results are displayed in
fig. 4.5 to fig. 4.8.
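A measurement of this kind might look as follows; `find_best_move` is a stand-in for whichever search variant is being timed, and `time.perf_counter` is used here because it is the time-module clock best suited to short benchmarks.

```python
import time

def time_search(find_best_move, state, depth):
    """Measure the wall-clock time of a single search.

    perf_counter is a monotonic, high-resolution clock, which makes it
    appropriate for timing short runs.
    """
    start = time.perf_counter()
    move = find_best_move(state, depth)
    elapsed = time.perf_counter() - start
    return move, elapsed
```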
3.4 Hardware
All tests were run on an AMD Ryzen 5 7600X CPU, with 6 cores and 12
threads, boosting to 5.4 GHz [14]. Since most of the tests only measure the total
amount of computation needed, the hardware used does not matter for them.
However, the test that measured time, section 3.3.2, is highly dependent on the
hardware, and different results might be obtained on other hardware.
Chapter 4
Results and Analysis
This chapter presents data acquired from tests conducted on the minimax
algorithm, highlighting various features of the algorithm. Section 4.1 presents
data collected from a single game, with 91 game states, highlighting the
computation and time required for the minimax algorithm to find the optimal
move at different depths. Section 4.2 presents data collected from 9 games,
with a total of 888 game states, showing how the optimal move selection
rate changes between depths.
level of depth. This is because α-β pruning can prune away greater parts
of the tree when the tree is bigger, and therefore the efficiency of α-β pruning
is higher.
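For reference, the pruning discussed here can be sketched as follows, using a hypothetical node interface (`is_terminal()`, `score()`, `children()`); this is an illustration of the standard technique, not the thesis's code.

```python
def alphabeta(node, depth, alpha, beta, maximizing):
    """Minimax with α-β pruning; alpha/beta carry the best scores
    either side is already guaranteed elsewhere in the tree."""
    if depth == 0 or node.is_terminal():
        return node.score()
    if maximizing:
        best = float("-inf")
        for child in node.children():
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if beta <= alpha:   # opponent would never allow this branch
                break           # prune the remaining children
        return best
    best = float("inf")
    for child in node.children():
        best = min(best, alphabeta(child, depth - 1, alpha, beta, True))
        beta = min(beta, best)
        if beta <= alpha:
            break
    return best
```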
Figure 4.5 shows that the parallel versions are slower than the sequential
ones. This is not surprising, as it is due to the extra overhead required when
creating the multithreading environment. In fig. 4.6, fig. 4.7 and fig. 4.8, the
two parallel versions perform about the same as the sequential α-β pruned
version. Figure 4.8 does, however, show that the non-pruned parallel version
is slower than the α-β pruned versions, indicating that for greater depths,
α-β pruning becomes efficient enough that more computing power cannot
outperform it.
Table 4.1: Average probability of depths 1 to 4 choosing the same next move
as depths 2 to 5 (standard deviation in parentheses).
Table 4.1 indicates that a higher depth results in a higher optimal move
pick rate. When compared to depth 5, the rate of choosing the same next move
is around 40.8% for depth 1 and 42.1% for depth 2. For depth 3, this
percentage grows to 51.6%, and for depth 4 it grows to 61.0%.
These observations suggest that although the growth is generally modest with
each added depth level, the increase becomes more pronounced when the
difference between the compared depths is smaller.
Chapter 5
Discussion
5.1 Discussing the Results
5.1.1 Looking at Node Count
The standard version of the minimax algorithm exhibits greater node count
fluctuation due to its exhaustive exploration of all possible branches and
nodes, which can vary greatly in complexity from move to move. The α-β
pruned version on the other hand maintains a more constant node count, as it
consistently eliminates unnecessary branches, ensuring a more efficient and
stable search pattern throughout the game.
5.1.2 Parallelization
The implementation of parallelization substantially boosted the efficiency of
the standard minimax algorithm. As depicted in fig. 4.8, the parallelized
variant demonstrated a three to four-fold increase in speed at depth 4
when compared to the sequential version. This significant leap in speed
can be attributed to the concurrent evaluation of nodes in the game tree,
thereby exploiting the inherent parallelism of the minimax algorithm. The
simultaneous exploration of multiple game states resulted in faster decision-
making, thus improving the overall performance.
In contrast, parallelization did not significantly enhance the
performance of the α-β pruned variant of the algorithm. As reflected in
fig. 4.7 and fig. 4.8, the parallel and sequential versions of the α-β pruned
variant demonstrated similar performance. This is primarily because the α-β
pruning process is fundamentally sequential: the efficiency of pruning relies
on knowledge of prior nodes to decide whether to evaluate a specific branch or
not. Due to this sequential nature, the benefits of parallelization are somewhat
limited in the context of α-β pruning. Our results are somewhat limited by
the fact that we used local α and β values for each thread. Therefore each
thread prunes a subtree independently of all other threads, which is less
efficient than using synchronized α and β values. We do, however, believe
that global α and β values would not improve performance significantly, since
α-β pruning remains largely sequential in nature, which limits the potential
of parallelism.
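The root-splitting scheme with thread-local windows described above might be sketched like this; `search` is a stand-in for a sequential α-β implementation, and a thread pool is used only to keep the sketch simple (CPU-bound Python code would need a process pool for real speedup).

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_root_search(root, depth, maximizing, search):
    """Evaluate each root move in its own worker.

    `search(node, depth, alpha, beta, maximizing)` stands in for a
    sequential alpha-beta search.  Every worker starts from a fresh
    (-inf, +inf) window, i.e. local alpha/beta values: each subtree is
    pruned independently, so cutoffs found in one subtree never help
    another.
    """
    children = root.children()
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(
            lambda c: search(c, depth - 1, float("-inf"), float("inf"),
                             not maximizing),
            children))
    best = max(scores) if maximizing else min(scores)
    return children[scores.index(best)], best
```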
While parallelization led to improvements, it also introduced some
overhead due to the need to set up threads for parallel
computation. As demonstrated in fig. 4.5, the parallel versions of the algorithm
were slower than the sequential ones because of the time penalty of setting up
threads. This extra time penalty is only notable at lower depths, where the
cost of setting up threads outweighs the benefits of parallel processing. The
overhead becomes negligible at higher depths, as the exponential increase
in the number of nodes effectively drowns out the thread setup time.
Chapter 6
Conclusion
References