Using Python’s NumPy To Improve Your Board Game Strategy: Your Odds When Attacking in ‘Risk’

I first played the board game Risk during my doctoral studies. We occasionally stayed up all night playing this game. I hadn’t played it for many years, but I bought it “for the kids” this Christmas, so I got to play it again. And soon, I found myself wondering what the odds are for the various attack configurations in the game. I could have gone down the route of working out the stats. But life’s too short. So I wrote a Python script instead to bring NumPy and board games together!

In this article, you’ll:

  • Learn how to use NumPy arrays instead of for loops to repeat operations
  • Use two-dimensional NumPy arrays
  • Find out more about named tuples

You can refer to Chapter 8 in The Python Coding Book for a more detailed introduction to numerical Python using NumPy.

Attack Configurations in Risk

This article is not about the board game itself. If you’ve played Risk, you may recall some or all of the rules. If you haven’t, you won’t learn all the rules here! I want to focus on the attack part of the game. A player can attack another player’s territory with one, two, or three dice. The player being attacked can defend with either one or two dice. Each die represents a toy troop in a toy army.

Here’s a summary of the rules for how one player attacks another player’s territory:

  • The two players roll one set of dice each. The attacker rolls the attack dice set. They can choose to roll one, two, or three dice. The defender rolls the defence dice set. They can choose to roll one or two dice. The players need sufficient toy troops available, but I won’t focus on this aspect of the gameplay here. So, for the sake of this article, all the combinations mentioned above are available
  • The highest value from the set of attack dice rolled is matched to the higher value from the defence set. If the attacker’s die has a greater value than the defender’s die, the attacker wins the bout, and the defender loses a toy troop from their territory. If the defender’s die has a higher value or if the values are tied, the attacker loses a toy troop
  • If both the attacker and defender rolled more than one die, the second highest values from both sets are matched, and the same rules as above apply

Let’s consider a few scenarios to clarify these rules. Here’s the first one:

  • The attacker chooses to attack with three dice and rolls 2, 6, and 3
  • The defender defends with two dice and rolls 4 and 5

The attacker’s 6, the highest value the attacker got, beats the defender’s higher value of 5. Therefore, the defender loses one toy troop.

The defender’s second value, 4, is higher than the attacker’s next-best value of 3. Therefore the attacker loses one toy troop.

Both players lose one toy troop each in this attack.

Here’s another scenario:

  • Attacker rolls three dice: 5, 5, and 3
  • Defender rolls two dice: 2 and 1

The defender loses two toy troops in this attack since the first 5 beats the 2 and the second 5 beats the 1.

Another attack:

  • Attacker rolls two dice: 4 and 2
  • Defender rolls two dice: 4 and 4

The attacker loses two toy troops in this attack since the defender’s first 4 beats the attacker’s 4 (recall that the defender wins when there’s a tie) and the defender’s second 4 beats the attacker’s 2.
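You can check the three scenarios above with a few lines of plain Python. This is a minimal sketch and not the code you’ll write later in this article. The function name resolve_attack() is a hypothetical helper I’m introducing only for illustration:

```python
def resolve_attack(attack_rolls, defence_rolls):
    """Return (attacker_losses, defender_losses) for one attack.

    Hypothetical helper for illustration only.
    """
    # Sort both sets of dice from highest to lowest
    attack_sorted = sorted(attack_rolls, reverse=True)
    defence_sorted = sorted(defence_rolls, reverse=True)

    attacker_losses = 0
    defender_losses = 0
    # zip() stops at the shorter set, so unmatched dice are ignored
    for attack_die, defence_die in zip(attack_sorted, defence_sorted):
        if attack_die > defence_die:
            defender_losses += 1
        else:
            # The defender wins ties
            attacker_losses += 1
    return attacker_losses, defender_losses


print(resolve_attack([2, 6, 3], [4, 5]))  # (1, 1)
print(resolve_attack([5, 5, 3], [2, 1]))  # (0, 2)
print(resolve_attack([4, 2], [4, 4]))     # (2, 0)
```

Each pair shows the attacker’s losses first and the defender’s losses second, matching the three scenarios described above.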

Writing Python Code To Work Out An Attack’s Winning Odds

You can simulate an attack in Risk using a Python program:

  • Generate one, two, or three random numbers between 1 and 6 for the attacker and sort them in descending order
  • Generate one or two random numbers between 1 and 6 for the defender and sort them in descending order
  • Compare the maximum values from each set and determine whether the attacker or defender won
  • Compare the second value from each set, if present, and determine whether the attacker or defender won

You can simulate several attacks and keep a tally of how many bouts the attacker won and how many the defender won. The attacker’s winning percentage is:

(number of attacker wins / total number of bouts) * 100

To get a reasonable estimate of the probability of winning in each attack scenario, you’d need to run many attacks for each attack configuration and work out the winning percentages.
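As a rough sketch of this idea, here’s one way of estimating the winning percentage for a single attack configuration using the built-in random module and a plain loop. The function name estimate_win_percentage() and its parameters are hypothetical, and the number of attacks is kept small so that the code runs quickly:

```python
import random


def estimate_win_percentage(n_attack_dice, n_defence_dice, n_attacks, seed=None):
    """Estimate the attacker's winning percentage per bout.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    attacker_wins = 0
    total_bouts = 0
    for _ in range(n_attacks):
        # Roll each set of dice and sort from highest to lowest
        attack = sorted(
            (rng.randint(1, 6) for _ in range(n_attack_dice)), reverse=True
        )
        defence = sorted(
            (rng.randint(1, 6) for _ in range(n_defence_dice)), reverse=True
        )
        # zip() pairs highest with highest and drops unmatched dice
        for attack_die, defence_die in zip(attack, defence):
            total_bouts += 1
            if attack_die > defence_die:
                attacker_wins += 1
    # (number of attacker wins / total number of bouts) * 100
    return attacker_wins / total_bouts * 100


print(estimate_win_percentage(3, 2, n_attacks=20_000))
```

The value printed changes a little each time you run the code, and the estimate becomes more stable as you increase n_attacks.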

There are six attack configurations:

Attacker attacks with | Defender defends with
3 | 2
3 | 1
2 | 2
2 | 1
1 | 2
1 | 1

Let’s run each scenario 10 million times to get a reasonable estimate of the winning probabilities.

Using a for loop to simulate repeated attacks

One way of proceeding would be to use nested for loops. You can loop through each of the six attack configurations shown in the table above. For each of these configurations, run another for loop which repeats 10 million times to simulate a large number of attacks. Keep a tally of wins and losses, and you can work out the winning percentage for each scenario.

I’ll show this code in an appendix, but it’s not the version I want to focus on in this article. So I’ll go straight to the second option…

NumPy and Board Games

NumPy is the key library in the Python ecosystem for numerical and scientific applications. The name NumPy stands for Numerical Python. If you’ve not used NumPy before, you can install it using pip by typing the following in the terminal:

$ pip install numpy

If you’ve never used NumPy before, you can still continue reading this article, as I’ll introduce everything I’ll use. However, you can also read a more detailed introduction to NumPy in Chapter 8: Numerical Python for Quantitative Applications Using NumPy.

The key data structure NumPy introduces is the ndarray. The name stands for N-dimensional array. Although you’ll see some similarities between ndarray and Python’s list, there are also many differences between the two data structures.

NumPy is particularly efficient when you need to perform the same operation on each element of the structure. When using lists, you’d need to use a for loop to go through each element of the list. However, this is not required with NumPy arrays because of vectorization. You can perform element by element operations using NumPy arrays very efficiently because of how NumPy is written and implemented.

This feature makes NumPy perfect for testing a large number of runs of a board game such as Risk.
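Here’s a small sketch of what vectorization looks like in practice. The variable names are only for illustration. The same operation is applied to each element, first with a list and then with a NumPy array:

```python
import numpy as np

rolls_list = [2, 6, 3]
# With a list, you need a loop (here, a comprehension) to reach each element
incremented_list = [roll + 1 for roll in rolls_list]

rolls_array = np.array([2, 6, 3])
# With an ndarray, the operation is applied to every element at once
incremented_array = rolls_array + 1

print(incremented_list)   # [3, 7, 4]
print(incremented_array)  # [3 7 4]
```

The array version has no explicit loop in your Python code. NumPy performs the repetition internally.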

Simulating Attacks in Risk

Each attack in Risk can have one of 6 configurations depending on how many dice the attacker and defender choose to use. The table earlier in this article shows these configurations. You could create a list of tuples with all the options:

options = [(x, y) for x in range(1, 4) for y in range(1, 3)]

print(options)

This code gives the following output:

[(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]

The pairs match the six attack configurations shown in the table earlier. In each pair, the first number is the number of attackers (the number of dice the attacker is using), and the second number is the number of defenders.

Using named tuples to store the attack configurations

However, I’ll go further and use a named tuple instead of a standard tuple to create these options. This will improve readability and minimise the risk of making errors in the code if you confuse the number of attackers and defenders.

There is more than one way of creating named tuples in Python. In the Sunrise Animation article I published earlier this year, I focused on using namedtuple from the collections module. In this article, I’ll use the NamedTuple class in the typing module:

from typing import NamedTuple

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

print(options)

The output now shows the same pairs as earlier, but this time as a list of Attack named tuples rather than standard tuples (I’m displaying this as a multi-line output for clarity):

[Attack(n_attackers=1, n_defenders=1),
 Attack(n_attackers=1, n_defenders=2),
 Attack(n_attackers=2, n_defenders=1),
 Attack(n_attackers=2, n_defenders=2),
 Attack(n_attackers=3, n_defenders=1),
 Attack(n_attackers=3, n_defenders=2)]

A named tuple allows you to access the data using the attribute names. You can still use the normal tuple indexing if you wish:

from typing import NamedTuple

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

option = options[-1]
print(option[0], option[1])
print(option.n_attackers, option.n_defenders)

The last two lines, which both call print(), give identical output:

3 2
3 2
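The attribute names also help when you create an instance. The short sketch below is an extra illustration rather than part of the main script. It shows that keyword arguments remove any doubt about which number is which:

```python
from typing import NamedTuple


class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int


# Keyword arguments make the meaning of each number explicit,
# and the order in which you write them no longer matters
option = Attack(n_defenders=2, n_attackers=3)
print(option)  # Attack(n_attackers=3, n_defenders=2)

# A named tuple is still a tuple, so unpacking works as usual
n_attackers, n_defenders = option
print(n_attackers, n_defenders)  # 3 2
```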

Exploring Each of The Attack Configurations

Now, you can loop through the options list and test each of the six attack options. You can start by creating dice rolls for both attacker and defender depending on how many dice they each roll. You’ll use NumPy arrays for these. If you’ve used NumPy and random numbers before, you’re probably familiar with NumPy’s own version of the randint() function. You’ll start by using this function, np.random.randint():

from typing import NamedTuple

import numpy as np

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

for option in options:
    print(option)

    attack = np.random.randint(1, 7, option.n_attackers)
    print(attack)

    defence = np.random.randint(1, 7, option.n_defenders)
    print(defence)

You import NumPy using the conventional alias np. You create NumPy arrays attack and defence with the same length as the number of dice used. The number of dice is the value in either option.n_attackers or option.n_defenders.

This code gives the following output:

Attack(n_attackers=1, n_defenders=1)
[6]
[5]
Attack(n_attackers=1, n_defenders=2)
[3]
[4 3]
Attack(n_attackers=2, n_defenders=1)
[4 4]
[6]
Attack(n_attackers=2, n_defenders=2)
[5 2]
[6 6]
Attack(n_attackers=3, n_defenders=1)
[2 3 5]
[1]
Attack(n_attackers=3, n_defenders=2)
[5 1 6]
[2 5]

Each ndarray generated has random numbers between 1 and 6 representing the numbers on the dice rolled. The for loop runs six times. Therefore, you can see six sections in this output, one for each attack configuration.

Using NumPy’s Newer Generator For Random Numbers

If you’re familiar with the built-in random module, you’ve likely used random.randint() often. Therefore, replacing this with np.random.randint() may seem natural to you.

However, you should bear in mind that there are several differences between the two. A key difference which trips many is that random.randint() includes the endpoint, whereas np.random.randint() does not. Therefore a dice roll is represented as random.randint(1, 6) if using the built-in random module but np.random.randint(1, 7) if using NumPy.
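You can see the difference between the two endpoints by rolling a die many times with each function and checking the smallest and largest values. This is a quick sketch, and n_rolls is an arbitrary number I chose for illustration:

```python
import random

import numpy as np

n_rolls = 10_000

# Built-in random module: both endpoints are included
builtin_rolls = [random.randint(1, 6) for _ in range(n_rolls)]

# NumPy's legacy function: the upper limit is excluded
numpy_rolls = np.random.randint(1, 7, n_rolls)

print(min(builtin_rolls), max(builtin_rolls))
print(numpy_rolls.min(), numpy_rolls.max())
```

With this many rolls, you should almost certainly see 1 and 6 as the smallest and largest values in both cases, and you’ll never see a 7.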

In any case, NumPy has a newer system for generating random numbers and np.random.randint() is now a legacy function. So let’s replace this with NumPy’s preferred way of creating random numbers, which is to create a Generator instance using np.default_rng() and then call its methods:

from typing import NamedTuple

import numpy as np

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(1, 7, option.n_attackers)
    print(attack)

    defence = rng.integers(1, 7, option.n_defenders)
    print(defence)

I’m usually reluctant to use abbreviations such as rng for variable names. However, I’ll use the same variable name used in the NumPy documentation on this occasion. The method integers() is similar to NumPy’s randint(). The first and second arguments are the limits of the range from which the random number is chosen. By default, the high end of the range is exclusive, which is why you had to use 7 for a dice roll.

The third argument is the size of the array created. You’ll read more about this later.
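If you prefer the limits to match the numbers on the dice, integers() also has an endpoint parameter. I’m not using it in this article’s script, so treat the following as an optional sketch:

```python
import numpy as np

rng = np.random.default_rng()

# Default behaviour: the upper limit, 7, is excluded
rolls = rng.integers(1, 7, 10)

# With endpoint=True, the upper limit, 6, is included
rolls_inclusive = rng.integers(1, 6, 10, endpoint=True)

print(rolls)
print(rolls_inclusive)
```

Both calls create arrays of ten dice rolls with values from 1 to 6.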

Sorting the dice in each set in descending order

As you read earlier, you’ll need to match the maximum value from the attacker’s dice rolls with the maximum value from the defender’s dice rolls, and so on. Therefore, we can sort the arrays containing the dice rolls in descending order.

NumPy’s ndarray has a sort() method you can use:

from typing import NamedTuple

import numpy as np

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(1, 7, option.n_attackers)
    attack.sort()
    print(attack)

    defence = rng.integers(1, 7, option.n_defenders)
    defence.sort()
    print(defence)

This code outputs arrays with values sorted in ascending order:

Attack(n_attackers=1, n_defenders=1)
[1]
[4]
Attack(n_attackers=1, n_defenders=2)
[5]
[1 5]
Attack(n_attackers=2, n_defenders=1)
[2 5]
[4]
Attack(n_attackers=2, n_defenders=2)
[1 2]
[1 6]
Attack(n_attackers=3, n_defenders=1)
[2 3 3]
[4]
Attack(n_attackers=3, n_defenders=2)
[2 4 6]
[5 5]

Next, you can reverse the numbers in the arrays using the flip() function. flip() is not a method in the np.ndarray class but a function which returns a new array:

from typing import NamedTuple

import numpy as np

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(1, 7, option.n_attackers)
    attack.sort()
    attack = np.flip(attack)
    print(attack)

    defence = rng.integers(1, 7, option.n_defenders)
    defence.sort()
    defence = np.flip(defence)
    print(defence)

All the arrays are now sorted from highest to lowest:

Attack(n_attackers=1, n_defenders=1)
[5]
[6]
Attack(n_attackers=1, n_defenders=2)
[1]
[5 3]
Attack(n_attackers=2, n_defenders=1)
[6 1]
[4]
Attack(n_attackers=2, n_defenders=2)
[4 2]
[5 3]
Attack(n_attackers=3, n_defenders=1)
[4 1 1]
[2]
Attack(n_attackers=3, n_defenders=2)
[5 2 1]
[3 1]
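As an aside, you can also write the sort-then-flip steps as a single expression. This is an alternative sketch using fixed values so that the output is predictable. It’s not the approach I’ll use in the rest of this article:

```python
import numpy as np

attack = np.array([2, 6, 3])

# The approach used in this article: sort in place, then flip
sorted_then_flipped = attack.copy()
sorted_then_flipped.sort()
sorted_then_flipped = np.flip(sorted_then_flipped)

# Alternative: np.sort() returns a sorted copy, and [::-1] reverses it
one_liner = np.sort(attack)[::-1]

print(sorted_then_flipped)  # [6 3 2]
print(one_liner)            # [6 3 2]
```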

Mid-Article Reflection

Let’s see what we’ve achieved so far and where to go from here. You’ve created all six attack configurations as a list of named tuples. You create a set of dice rolls for the attacker and defender for each one of these configurations. All dice rolls are ordered in descending order.

Here are the things you still need to do:

  • Determine how many wins the attacker has in each attack
  • Simulate 10 million attacks for each configuration
  • Work out the winning percentage for each attack configuration

Let’s use this mid-article reflection to look at how you can simulate 10 million attacks for each configuration.

One way of doing this would be to add another for loop within the one you already wrote. This would iterate 10 million times and generate different dice rolls every time.

However, when using NumPy, we want to learn to think using a different mindset when it comes to iteration. Instead of relying on loops using for or while, we should consider using “batch-processing” techniques with arrays. This process is called vectorization, as we’re using arrays as vectors. Don’t worry if you’re not familiar with vectors. You don’t need to understand vectors to understand vectorization.

Currently, the arrays attack and defence, which you’re creating using the integers() method, are one-dimensional (1D) arrays. They’re a single row of numbers, similar to Python’s list. We can change these arrays to two-dimensional (2D) ones with 10 million rows and one, two, or three columns, depending on how many dice are rolled. The rows represent different sets of dice rolls.

Changing attack and defence from 1D arrays to 2D arrays will affect all subsequent operations on these arrays. Therefore, you’ll need to refactor your code once you switch from 1D to 2D arrays.

When should you make this switch? There’s no right or wrong answer to this question. This depends on your experience with NumPy and your style of exploration when building a Python program. We could have created these arrays as 2D arrays right at the start when we first created them. We could do so now that we’ve made some progress using 1D arrays, or we could leave the switch until the latest point we can.

I’ll pick the third of these options for this article. However, remember there’s no single correct path to writing any program!

Removing The Dice Rolls That Are Not Needed

In most of the attack configurations, some dice rolls are not required. Once you’ve sorted the dice rolls in descending order, you can remove some of the values from the end of the arrays. For example, if the attacker attacks with three dice and the defender defends with two, you can discard the lowest dice roll in the attacker’s set.

The next step is to remove these unused dice rolls from the attack and defence arrays, as required. You’re trimming the arrays to have the same length within each attack configuration. You can remove values from the end of the arrays since you’ve already sorted them in descending order, and it’s the lowest values you need to remove. You can trim the longer one of the attack or defence arrays to the length of the shorter one.

Next, you can compare the trimmed arrays using the comparison operator >:

from typing import NamedTuple

import numpy as np

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(1, 7, option.n_attackers)
    attack.sort()
    attack = np.flip(attack)
    print(attack)

    defence = rng.integers(1, 7, option.n_defenders)
    defence.sort()
    defence = np.flip(defence)
    print(defence)

    min_length = min(len(attack), len(defence))
    result = attack[:min_length] > defence[:min_length]

    print(result)
    print()

You only need to compare attack and defence up to min_length since any values after that are the dice rolls you’re discarding. For example, if the attacker is attacking with three dice and the defender defends with two, then only the first two elements of attack need to be compared to the two values of defence.

You compare the truncated arrays directly using the greater than operator > and assign the returned value to the variable name result. Here’s the output from this code:

Attack(n_attackers=1, n_defenders=1)
[3]
[6]
[False]

Attack(n_attackers=1, n_defenders=2)
[1]
[4 1]
[False]

Attack(n_attackers=2, n_defenders=1)
[5 1]
[1]
[ True]

Attack(n_attackers=2, n_defenders=2)
[6 6]
[5 2]
[ True  True]

Attack(n_attackers=3, n_defenders=1)
[5 2 2]
[2]
[ True]

Attack(n_attackers=3, n_defenders=2)
[6 5 1]
[5 5]
[ True False]

If you’re new to NumPy, you may find the output surprising. The variable result is not a Boolean. Instead, it’s an array of Boolean values. When using comparison operators with NumPy arrays, the comparison operator acts element by element. The first value of one array is compared with the first value of the other array, and so on.

Let’s go through all six attack configurations simulated above. Recall that your output will differ from the version shown here since you’re generating random values each time you run this code.

  1. Attacker: one die (3). Defender: one die (6). The attacker loses since 3 is smaller than 6. result shows a single Boolean value False
  2. Attacker: one die (1). Defender: two dice (4, 1). The defender’s higher number, 4, is larger than the attacker’s only roll, 1. Therefore, the attacker loses their only toy troop and result has a single False value
  3. Attacker: two dice (5, 1). Defender: one die (1). Only the attacker’s higher number is taken into account. Since 5 is greater than 1, the attacker wins this bout, and result has a single True value
  4. Attacker: two dice (6, 6). Defender: two dice (5, 2). This attack has two bouts since both attacker and defender roll two dice each. The attacker wins with both dice and therefore secures two wins out of two. result has two values and both are True
  5. Attacker: three dice (5, 2, 2). Defender: one die (2). There’s only one bout to consider. The attacker’s highest number, 5, is greater than the defender’s die roll, 2. Therefore, the attacker wins the bout, and result has one True value
  6. Attacker: three dice (6, 5, 1). Defender: two dice (5, 5). There are two bouts to consider. The attacker wins the first one since 6 beats 5. The second number for both attacker and defender is 5, which means that the defender wins the bout since defenders win ties. result shows that the attacker had a 50% success rate in this attack with one win and one loss
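You can reproduce the sixth configuration with fixed values to see the element-by-element comparison in isolation. The call to sum() below is one possible way of counting the attacker’s wins, shown here as a sketch:

```python
import numpy as np

# Dice from the sixth configuration above, already sorted and trimmed
attack = np.array([6, 5])
defence = np.array([5, 5])

result = attack > defence
print(result)  # [ True False]

# True counts as 1 and False as 0, so sum() gives the attacker's wins
attacker_wins = result.sum()
print(attacker_wins)  # 1
```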

Repeating The Exercise 10 Million Times

You’ve got code that can simulate any attack using any of the six possible attack configurations. To get a reasonable estimate of the success rate for each type of attack, you’ll need to repeat these attacks many times and get an average of the winning percentage.

Before you repeat each attack option 10 million times, let’s start by repeating them five times each. This will allow you to observe the results as you make changes to ensure you refactor your code correctly. When you’re convinced your code is working fine for five repetitions, you can repeat 10 million times.

2D NumPy arrays

Let’s explore 2D arrays in the REPL/Python Console before refactoring the main script. Let’s start with the 1D array you have so far:

>>> import numpy as np
>>> rng = np.random.default_rng()

>>> attack = rng.integers(1, 7, 3)
>>> attack
array([2, 2, 4])

>>> attack.shape
(3,)
>>> attack.ndim
1

rng.integers(1, 7, 3) creates an array of size 3 with random numbers from 1 to 6. The third argument in rng.integers() is size. When size is an integer, as in this case, the array created is a 1D array of length size.

The shape of this array is (3,). This notation may seem a bit strange, but it will make more sense when you see the shape of a 2D array. You also confirm that attack is a 1D array by using the ndim attribute.

Let’s change the argument assigned to size from a single integer to a tuple of integers. I’ll also explicitly name the argument for clarity:

>>> attack = rng.integers(1, 7, size=(5, 3))
>>> attack
array([[5, 2, 3],
       [6, 6, 2],
       [6, 5, 2],
       [1, 3, 3],
       [2, 3, 5]])

>>> attack.shape
(5, 3)
>>> attack.ndim
2

The size argument is now the tuple (5, 3). This represents the shape of the array, which you confirm when you show the value of attack.shape.

The array is now a 2D array with 5 rows and 3 columns. All elements in the array are random numbers between 1 and 6.

Let’s get some quick practice with manipulating a 2D NumPy array. Let’s assume you want to get the second item in the first row:

>>> attack[0, 1]
2

You use the familiar square brackets to index a NumPy array. However, you now use two index values. The first one represents the row index, and the second is the column index. As these are indices, they start from 0 like all indices in Python. Here are a few more examples:

# Second row. Third column
>>> attack[1, 2]
2

# Fourth row. Last column
>>> attack[3, -1]
3

# All the rows. Second column
>>> attack[:, 1]
array([2, 6, 5, 3, 3])

# Fifth row. All the columns
>>> attack[4, :]
array([2, 3, 5])

You’re now ready to refactor your Python script.

Refactoring code to add multiple attacks

You can add a new variable, n_repeats, and set it to 5. You can then use n_repeats to create 2D attack and defence arrays. Since you’re changing the nature of these arrays, you can comment out the lines with operations on these arrays for now, as you’ll need to check that each one still does what you expect it to do:

from typing import NamedTuple

import numpy as np

n_repeats = 5

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(
        1, 7, size=(n_repeats, option.n_attackers)
    )
    # attack.sort()
    # attack = np.flip(attack)
    print(attack)

    defence = rng.integers(
        1, 7, size=(n_repeats, option.n_defenders)
    )
    # defence.sort()
    # defence = np.flip(defence)
    print(defence)

    # min_length = min(len(attack), len(defence))
    # result = attack[:min_length] > defence[:min_length]

    # print(result)
    print()

The output from this code shows the attack and defence arrays for each of the six attack configurations. Each array has five rows representing the number of attacks you’re simulating for each configuration:

Attack(n_attackers=1, n_defenders=1)
[[1]
 [5]
 [6]
 [3]
 [3]]
[[6]
 [1]
 [5]
 [4]
 [6]]

Attack(n_attackers=1, n_defenders=2)
[[2]
 [6]
 [2]
 [2]
 [5]]
[[4 5]
 [1 4]
 [4 3]
 [1 5]
 [2 4]]

Attack(n_attackers=2, n_defenders=1)
[[6 2]
 [6 3]
 [3 1]
 [6 3]
 [5 1]]
[[3]
 [5]
 [3]
 [2]
 [4]]

Attack(n_attackers=2, n_defenders=2)
[[4 4]
 [1 2]
 [2 4]
 [6 5]
 [3 4]]
[[2 6]
 [2 2]
 [5 5]
 [2 4]
 [2 3]]

Attack(n_attackers=3, n_defenders=1)
[[3 2 3]
 [2 3 3]
 [3 4 6]
 [4 4 3]
 [4 5 6]]
[[4]
 [6]
 [4]
 [2]
 [4]]

Attack(n_attackers=3, n_defenders=2)
[[5 4 4]
 [4 6 1]
 [1 2 4]
 [2 2 5]
 [1 4 3]]
[[2 5]
 [1 4]
 [5 6]
 [5 4]
 [6 4]]

Sort each row in descending order

At the moment, each set of dice rolls is unordered. You can use the sort() method again. However, will this sort along rows or columns? One option is to try it out and see what happens:

from typing import NamedTuple

import numpy as np

n_repeats = 5

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(
        1, 7, size=(n_repeats, option.n_attackers)
    )
    attack.sort()
    # attack = np.flip(attack)
    print(attack)

    defence = rng.integers(
        1, 7, size=(n_repeats, option.n_defenders)
    )
    defence.sort()
    # defence = np.flip(defence)
    print(defence)

    # min_length = min(len(attack), len(defence))
    # result = attack[:min_length] > defence[:min_length]

    # print(result)
    print()

Here’s the output from this code:

Attack(n_attackers=1, n_defenders=1)
[[1]
 [6]
 [5]
 [3]
 [1]]
[[5]
 [6]
 [3]
 [4]
 [6]]

Attack(n_attackers=1, n_defenders=2)
[[3]
 [2]
 [6]
 [6]
 [5]]
[[2 5]
 [1 6]
 [2 3]
 [1 5]
 [1 2]]

Attack(n_attackers=2, n_defenders=1)
[[1 2]
 [1 3]
 [3 6]
 [5 6]
 [3 3]]
[[1]
 [5]
 [4]
 [3]
 [2]]

Attack(n_attackers=2, n_defenders=2)
[[4 4]
 [3 5]
 [2 5]
 [5 6]
 [3 6]]
[[2 6]
 [2 5]
 [4 5]
 [2 2]
 [2 6]]

Attack(n_attackers=3, n_defenders=1)
[[3 5 6]
 [1 5 6]
 [2 4 5]
 [4 4 6]
 [3 4 5]]
[[6]
 [5]
 [3]
 [1]
 [1]]

Attack(n_attackers=3, n_defenders=2)
[[3 5 5]
 [4 5 6]
 [1 2 4]
 [3 5 6]
 [1 2 4]]
[[2 4]
 [1 3]
 [1 2]
 [3 3]
 [1 5]]

There are sufficient rows and columns to let you figure out the answer. sort() has sorted the values in each row and not in each column. You can confirm this by reading the documentation for sort(), which states that by default, the last axis is used. You’ll recall that the shape of the array is (5, 3). The last axis is the one that represents how many columns there are. This is the same as the number of items in each row.
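A small example with fixed values makes the role of the axis easier to see. This sketch uses the np.sort() function, which returns a sorted copy, so that the original array stays unchanged for both calls:

```python
import numpy as np

rolls = np.array([[5, 2, 3],
                  [1, 6, 4]])

# Default: sort along the last axis, so each row is sorted
by_row = np.sort(rolls)
print(by_row)
# [[2 3 5]
#  [1 4 6]]

# axis=0 sorts down each column instead
by_column = np.sort(rolls, axis=0)
print(by_column)
# [[1 2 3]
#  [5 6 4]]
```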

Next, you need to reverse the numbers in each row using flip(). You were lucky with sort() since the default behaviour worked perfectly for this situation. So why not try your luck again with flip() and run the same line you had earlier?

from typing import NamedTuple

import numpy as np

n_repeats = 5

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(
        1, 7, size=(n_repeats, option.n_attackers)
    )
    attack.sort()
    attack = np.flip(attack)
    print(attack)

    defence = rng.integers(
        1, 7, size=(n_repeats, option.n_defenders)
    )
    defence.sort()
    defence = np.flip(defence)
    print(defence)

    # min_length = min(len(attack), len(defence))
    # result = attack[:min_length] > defence[:min_length]

    # print(result)
    print()

The output seems to show that this works as well since the rows are in descending order:

Attack(n_attackers=1, n_defenders=1)
[[3]
 [5]
 [4]
 [1]
 [2]]
[[2]
 [2]
 [6]
 [3]
 [1]]

Attack(n_attackers=1, n_defenders=2)
[[5]
 [4]
 [4]
 [3]
 [1]]
[[5 3]
 [6 1]
 [3 1]
 [3 1]
 [5 3]]

Attack(n_attackers=2, n_defenders=1)
[[6 5]
 [6 3]
 [6 1]
 [4 4]
 [6 1]]
[[4]
 [6]
 [1]
 [6]
 [3]]

Attack(n_attackers=2, n_defenders=2)
[[3 1]
 [6 3]
 [5 2]
 [5 1]
 [6 2]]
[[4 1]
 [6 4]
 [6 5]
 [6 3]
 [5 3]]

Attack(n_attackers=3, n_defenders=1)
[[5 2 2]
 [4 4 2]
 [5 4 1]
 [6 3 1]
 [6 3 1]]
[[2]
 [3]
 [2]
 [6]
 [1]]

Attack(n_attackers=3, n_defenders=2)
[[6 3 2]
 [6 2 1]
 [4 3 2]
 [6 6 2]
 [6 6 2]]
[[5 3]
 [5 3]
 [5 2]
 [6 4]
 [6 3]]

All the rows are in descending order. Can you move on? Let’s look at the documentation for flip() first. This time, the default doesn’t flip the last axis. Instead, it flips over all the axes! In this example, it doesn’t really matter, as the order of the five rows is unimportant. However, this shows us that it’s still important to read the documentation even if we think we’ve reverse-engineered how the function works using trial and error!

Let’s dive a bit deeper into what flip() is doing with more experimentation in the REPL/Console:

>>> import numpy as np
>>> rng = np.random.default_rng()

>>> attack = rng.integers(1, 7, size=(5, 3))
>>> attack
array([[3, 2, 1],
       [5, 1, 4],
       [4, 4, 2],
       [3, 4, 4],
       [1, 3, 1]])

>>> np.flip(attack)
array([[1, 3, 1],
       [4, 4, 3],
       [2, 4, 4],
       [4, 1, 5],
       [1, 2, 3]])

>>> np.flip(attack, axis=0)
array([[1, 3, 1],
       [3, 4, 4],
       [4, 4, 2],
       [5, 1, 4],
       [3, 2, 1]])

>>> np.flip(attack, axis=1)
array([[1, 2, 3],
       [4, 1, 5],
       [2, 4, 4],
       [4, 4, 3],
       [1, 3, 1]])

When you call np.flip(attack) with no additional argument, the original array is flipped along both axes: left to right and top to bottom. Note that you’re not reassigning the array returned by flip() to attack, so attack always keeps its original values in this REPL example.

In the second test, you add axis=0 as an argument to flip(). axis=0 represents the vertical axis, and you’ll note that the array returned is flipped along the vertical axis but not along the horizontal.

In the final call, the argument is axis=1, which means that flip() will act along the horizontal axis. This is the option you need in your code since there’s no need to flip along the vertical axis as well. Therefore, you can add axis=1 to your calls to flip():

from typing import NamedTuple

import numpy as np

n_repeats = 5

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(
        1, 7, size=(n_repeats, option.n_attackers)
    )
    attack.sort()
    attack = np.flip(attack, axis=1)
    print(attack)

    defence = rng.integers(
        1, 7, size=(n_repeats, option.n_defenders)
    )
    defence.sort()
    defence = np.flip(defence, axis=1)
    print(defence)

    # min_length = min(len(attack), len(defence))
    # result = attack[:min_length] > defence[:min_length]

    # print(result)
    print()
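If you’re comfortable with slicing, you can reverse each row with a slice instead of flip(). The sketch below uses fixed values to show that both options give the same result. I’ll keep using flip() in this article:

```python
import numpy as np

# Two rows, each already sorted in ascending order
attack = np.array([[2, 3, 5],
                   [1, 4, 6]])

flipped = np.flip(attack, axis=1)
# Keep all rows, and step backwards through the columns
sliced = attack[:, ::-1]

print(flipped)
# [[5 3 2]
#  [6 4 1]]
print(np.array_equal(flipped, sliced))  # True
```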

Truncate each row using the minimum length

In the previous version of the code, when you had 1D arrays, you used the built-in len() function to find the lengths of the attack and defence arrays. However, since you’re now using 2D arrays, you can use the shape attribute and get its second value which represents how many items there are in each row. Recall that if you have an array with five rows and three columns, its shape is (5, 3).

Therefore, you can replace:

min_length = min(len(attack), len(defence))

with:

min_length = min(attack.shape[1], defence.shape[1])

However, attack and defence are now 2D arrays. Therefore, attack[:min_length] and defence[:min_length] are no longer the expressions you need, since a single slice on a 2D array selects rows rather than truncating each row.

You need to keep all the rows in the arrays but truncate the columns. You can achieve this using the expressions attack[:, :min_length] and defence[:, :min_length]. The colon before the comma represents all the rows. After the comma in the square brackets, you write the slice :min_length. This slice represents all the elements in each row from the beginning up to the element with index min_length - 1.
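
The difference between the two kinds of slice is easier to see with a small example. This is a minimal sketch using fixed arrays with three rows (illustrative values, not output from the script):

```python
import numpy as np

# Three simulated attacks: three attack dice against two defence dice.
attack = np.array([[6, 4, 4],
                   [4, 4, 2],
                   [5, 1, 1]])
defence = np.array([[5, 3],
                    [5, 2],
                    [6, 1]])

# The second value in .shape is the number of items in each row.
min_length = min(attack.shape[1], defence.shape[1])
print(attack.shape, defence.shape, min_length)  # (3, 3) (3, 2) 2

# A single slice acts on the rows: this keeps the first two attacks
# and leaves all three dice in each of them.
rows_only = attack[:min_length]
print(rows_only.shape)  # (2, 3)

# A colon for the rows and a slice for the columns keeps every attack
# but only the first two dice in each one.
truncated = attack[:, :min_length]
print(truncated.shape)  # (3, 2)
```

Only the second form gives an array with the same shape as defence, which is what the element-by-element comparison needs.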

Here’s the updated code:

from typing import NamedTuple

import numpy as np

n_repeats = 5

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(
        1, 7, size=(n_repeats, option.n_attackers)
    )
    attack.sort()
    attack = np.flip(attack, axis=1)
    print(attack)

    defence = rng.integers(
        1, 7, size=(n_repeats, option.n_defenders)
    )
    defence.sort()
    defence = np.flip(defence, axis=1)
    print(defence)

    min_length = min(attack.shape[1], defence.shape[1])
    result = attack[:, :min_length] > defence[:, :min_length]

    print(result)
    print()

Here’s the output from this code:

Attack(n_attackers=1, n_defenders=1)
[[4]
 [2]
 [2]
 [4]
 [4]]
[[3]
 [1]
 [3]
 [4]
 [3]]
[[ True]
 [ True]
 [False]
 [False]
 [ True]]

Attack(n_attackers=1, n_defenders=2)
[[4]
 [6]
 [4]
 [5]
 [1]]
[[5 4]
 [4 1]
 [6 1]
 [5 1]
 [5 5]]
[[False]
 [ True]
 [False]
 [False]
 [False]]

Attack(n_attackers=2, n_defenders=1)
[[6 1]
 [4 4]
 [6 4]
 [5 2]
 [5 1]]
[[4]
 [3]
 [3]
 [5]
 [2]]
[[ True]
 [ True]
 [ True]
 [False]
 [ True]]

Attack(n_attackers=2, n_defenders=2)
[[3 2]
 [3 2]
 [4 2]
 [5 1]
 [4 3]]
[[5 3]
 [6 5]
 [5 4]
 [4 3]
 [6 4]]
[[False False]
 [False False]
 [False False]
 [ True False]
 [False False]]

Attack(n_attackers=3, n_defenders=1)
[[6 5 2]
 [6 6 4]
 [5 4 2]
 [3 2 1]
 [6 4 4]]
[[3]
 [5]
 [2]
 [3]
 [5]]
[[ True]
 [ True]
 [ True]
 [False]
 [ True]]

Attack(n_attackers=3, n_defenders=2)
[[6 4 4]
 [4 4 2]
 [4 4 2]
 [5 1 1]
 [6 4 3]]
[[5 3]
 [5 3]
 [5 2]
 [6 1]
 [5 4]]
[[ True  True]
 [False  True]
 [False  True]
 [False False]
 [ True False]]

Let’s look at the output for the final attack configuration with three attackers and two defenders. attack is an array with five rows and three columns. Each row represents a set of dice rolls. defence also has five rows but only two columns since each set of dice rolls includes two dice.

The final array displayed is result which compares the truncated attack array with defence. Therefore, only the first two elements in each row of attack are compared with defence. The values in result are the outcomes from each comparison, element by element. Let’s look at the five attacks for the configuration with three attackers and two defenders:

  1. In the first attack out of the five simulated, the attacker got 6, 4 and 4. The defender got 5 and 3. Therefore the attacker wins both bouts. The first row in result is [True, True]
  2. Second attack: Attacker has 4, 4, 2. Defender has 5, 3. The defender’s 5 beats the attacker’s first 4. However, the attacker’s second 4 beats the defender’s 3. The defender wins the first bout, but the attacker wins the second: [False, True]
  3. Third attack: Attacker has 4, 4, 2. Defender has 5, 2. Result is similar to second attack. One win and one loss for the attacker
  4. Fourth attack: Attacker has 5, 1, 1. Defender has 6, 1. The defender’s 6 beats the attacker’s 5, and the defender wins the first bout. The defender’s 1 ties with the attacker’s 1 which means the defender also wins the second bout. The corresponding row in result is [False, False]
  5. Fifth attack: Attacker has 6, 4, 3. Defender has 5, 4. The attacker wins the first bout, but the defender wins the second because the defender wins in case of a tie: [True, False]
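
The five attacks described above can be reproduced with fixed arrays. This is a minimal sketch that hard-codes the dice values from the output shown earlier. The per-attack tallies at the end are an addition for illustration and aren’t part of the main script.

```python
import numpy as np

# Dice values copied from the output above (already sorted high to low).
attack = np.array([[6, 4, 4],
                   [4, 4, 2],
                   [4, 4, 2],
                   [5, 1, 1],
                   [6, 4, 3]])
defence = np.array([[5, 3],
                    [5, 3],
                    [5, 2],
                    [6, 1],
                    [5, 4]])

min_length = min(attack.shape[1], defence.shape[1])

# True where the attacker's die is strictly greater. Ties go to the defender.
result = attack[:, :min_length] > defence[:, :min_length]
print(result)

# Summing along axis=1 counts the bouts the attacker won in each attack.
attacker_wins = result.sum(axis=1)
print(attacker_wins)  # [2 1 1 0 1]

# Each bout the attacker didn't win costs the attacker a toy troop.
attacker_losses = (~result).sum(axis=1)
print(attacker_losses)  # [0 1 1 2 1]
```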

Working Out Winning Percentage

When you play Risk, you need to know how many of your toy troops you’re likely to keep when you attack. Therefore, you can find the number of times you win a bout as a percentage of all the bouts.

The number of items in result represents the total number of bouts. The number of True values in result represents the number of times the attacker won a bout. You can use these two numbers to determine the winning percentage for all six attack configurations.

You can start by counting how many True values there are by using np.sum(). Recall that in Python, True has the value of 1:

from typing import NamedTuple

import numpy as np

n_repeats = 5

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(
        1, 7, size=(n_repeats, option.n_attackers)
    )
    attack.sort()
    attack = np.flip(attack, axis=1)
    print(attack)

    defence = rng.integers(
        1, 7, size=(n_repeats, option.n_defenders)
    )
    defence.sort()
    defence = np.flip(defence, axis=1)
    print(defence)

    min_length = min(attack.shape[1], defence.shape[1])
    result = attack[:, :min_length] > defence[:, :min_length]

    print(result)
    print(np.sum(result))
    print()

I’m only showing the output for the final attack configuration below for compactness:

# truncated output
# ...

Attack(n_attackers=3, n_defenders=2)
[[4 4 3]
 [6 4 1]
 [6 4 4]
 [5 2 1]
 [3 3 2]]
[[1 1]
 [4 3]
 [6 5]
 [3 2]
 [6 1]]
[[ True  True]
 [ True  True]
 [False False]
 [ True False]
 [False  True]]
6

There are six True values in result. Therefore, np.sum() returns this value. You can also find the total number of items in result using the attribute result.size. Now, you’re ready to work out and display the winning percentages:

from typing import NamedTuple

import numpy as np

n_repeats = 5

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(
        1, 7, size=(n_repeats, option.n_attackers)
    )
    attack.sort()
    attack = np.flip(attack, axis=1)
    print(attack)

    defence = rng.integers(
        1, 7, size=(n_repeats, option.n_defenders)
    )
    defence.sort()
    defence = np.flip(defence, axis=1)
    print(defence)

    min_length = min(attack.shape[1], defence.shape[1])
    result = attack[:, :min_length] > defence[:, :min_length]

    print(result)
    winning_percentage = np.sum(result) / result.size * 100
    print(f"Winning percentage for {option} is {winning_percentage:0.2f} %")
    print()

The output shows how many times the attacker wins as a percentage:

Attack(n_attackers=1, n_defenders=1)
[[6]
 [1]
 [5]
 [5]
 [5]]
[[5]
 [1]
 [6]
 [5]
 [2]]
[[ True]
 [False]
 [False]
 [False]
 [ True]]
Winning percentage for Attack(n_attackers=1, n_defenders=1) is 40.00 %

Attack(n_attackers=1, n_defenders=2)
[[6]
 [5]
 [3]
 [5]
 [4]]
[[6 1]
 [5 1]
 [5 1]
 [6 1]
 [6 1]]
[[False]
 [False]
 [False]
 [False]
 [False]]
Winning percentage for Attack(n_attackers=1, n_defenders=2) is 0.00 %

Attack(n_attackers=2, n_defenders=1)
[[4 3]
 [3 3]
 [5 5]
 [2 1]
 [6 3]]
[[6]
 [5]
 [2]
 [6]
 [3]]
[[False]
 [False]
 [ True]
 [False]
 [ True]]
Winning percentage for Attack(n_attackers=2, n_defenders=1) is 40.00 %

Attack(n_attackers=2, n_defenders=2)
[[6 3]
 [4 4]
 [6 6]
 [6 1]
 [5 1]]
[[2 1]
 [4 4]
 [3 3]
 [5 5]
 [3 1]]
[[ True  True]
 [False False]
 [ True  True]
 [ True False]
 [ True False]]
Winning percentage for Attack(n_attackers=2, n_defenders=2) is 60.00 %

Attack(n_attackers=3, n_defenders=1)
[[6 6 1]
 [6 5 1]
 [6 6 3]
 [5 4 2]
 [6 2 1]]
[[4]
 [1]
 [2]
 [5]
 [5]]
[[ True]
 [ True]
 [ True]
 [False]
 [ True]]
Winning percentage for Attack(n_attackers=3, n_defenders=1) is 80.00 %

Attack(n_attackers=3, n_defenders=2)
[[4 3 2]
 [4 4 1]
 [6 4 4]
 [6 3 2]
 [6 5 2]]
[[2 1]
 [6 3]
 [3 1]
 [5 4]
 [5 5]]
[[ True  True]
 [False  True]
 [ True  True]
 [ True False]
 [ True False]]
Winning percentage for Attack(n_attackers=3, n_defenders=2) is 70.00 %
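
Since True counts as 1 and False as 0, the mean of a Boolean array is the fraction of True values. As a small aside, this sketch uses the result array from the last configuration above to show that the mean() method gives the same percentage as dividing np.sum() by size. The script in this article uses the np.sum() form; mean() is shown only as an equivalent alternative.

```python
import numpy as np

# The result array shown above for three attackers and two defenders.
result = np.array([[True, True],
                   [False, True],
                   [True, True],
                   [True, False],
                   [True, False]])

# Seven True values out of ten items.
via_sum = np.sum(result) / result.size * 100

# The mean of a Boolean array is the fraction of True values.
via_mean = result.mean() * 100

print(f"{via_sum:0.2f} % and {via_mean:0.2f} %")  # 70.00 % and 70.00 %
```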

Getting an estimate for the likelihood of winning

You’re ready for the final step. You can change from simulating 5 attacks for each configuration to simulating 10 million. However, you probably don’t want to display the arrays this time! Your final step is to change the value of n_repeats and comment out or delete the print() calls which display the arrays:

from typing import NamedTuple

import numpy as np

n_repeats = 10_000_000

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

rng = np.random.default_rng()

for option in options:
    print(option)

    attack = rng.integers(
        1, 7, size=(n_repeats, option.n_attackers)
    )
    attack.sort()
    attack = np.flip(attack, axis=1)
    # print(attack)

    defence = rng.integers(
        1, 7, size=(n_repeats, option.n_defenders)
    )
    defence.sort()
    defence = np.flip(defence, axis=1)
    # print(defence)

    min_length = min(attack.shape[1], defence.shape[1])
    result = attack[:, :min_length] > defence[:, :min_length]

    # print(result)
    winning_percentage = np.sum(result) / result.size * 100
    print(f"Winning percentage for {option} is {winning_percentage:0.2f} %")
    print()

Here’s the final output from this code:

Attack(n_attackers=1, n_defenders=1)
Winning percentage for Attack(n_attackers=1, n_defenders=1) is 41.68 %

Attack(n_attackers=1, n_defenders=2)
Winning percentage for Attack(n_attackers=1, n_defenders=2) is 25.47 %

Attack(n_attackers=2, n_defenders=1)
Winning percentage for Attack(n_attackers=2, n_defenders=1) is 57.88 %

Attack(n_attackers=2, n_defenders=2)
Winning percentage for Attack(n_attackers=2, n_defenders=2) is 38.98 %

Attack(n_attackers=3, n_defenders=1)
Winning percentage for Attack(n_attackers=3, n_defenders=1) is 65.98 %

Attack(n_attackers=3, n_defenders=2)
Winning percentage for Attack(n_attackers=3, n_defenders=2) is 53.96 %

When an attacker attacks with one die and the defender defends with one die, the defender has the advantage since ties are settled in favour of the defender.

It’s also clear you should never attack with one die when the defender is defending with two!

The most common type of attack in Risk (at least when I play the game) is three attackers against two defenders. Although the attacker does have the advantage in this case, it’s not as high as one might expect when rolling three dice against the defender’s two!
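
If you’d like to cross-check the simulated values, the number of possible rolls is small enough to enumerate them all. The following is a minimal sketch, not part of the article’s script, that counts the attacker’s wins over every possible combination of dice using only the standard library. The function name exact_win_fraction is hypothetical. It reports the same per-bout quantity as the simulation, so the percentages it prints should be close to the ones above.

```python
from fractions import Fraction
from itertools import product


def exact_win_fraction(n_attackers, n_defenders):
    """Return the attacker's exact per-bout win fraction.

    Every possible roll of all the dice is enumerated, so this is only
    practical for the handful of dice used in Risk.
    """
    n_bouts = min(n_attackers, n_defenders)
    wins = 0
    total = 0
    for roll in product(range(1, 7), repeat=n_attackers + n_defenders):
        # Split the roll into the two sets of dice, highest value first.
        attack = sorted(roll[:n_attackers], reverse=True)
        defence = sorted(roll[n_attackers:], reverse=True)
        # zip() stops at the shorter set, which pairs up exactly n_bouts dice.
        wins += sum(a > d for a, d in zip(attack, defence))
        total += n_bouts
    return Fraction(wins, total)


for n_attackers in range(1, 4):
    for n_defenders in range(1, 3):
        fraction = exact_win_fraction(n_attackers, n_defenders)
        print(
            f"{n_attackers} v {n_defenders}: "
            f"{fraction} = {float(fraction) * 100:0.2f} %"
        )
```

For one die against one die, the attacker wins 15 of the 36 possible rolls, since the six ties go to the defender.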

Final Words

You’re now better placed to play and win Risk, as long as your opponents haven’t read this article as well!

More importantly, you’ve explored how to use NumPy arrays to perform “batch processing” using vectorization. This type of operation is what makes NumPy faster in many situations. I present a version of this code that does not use NumPy in the appendix and compare the time it took for the NumPy and non-NumPy versions to run. I won’t provide spoilers here, so glance at the appendix if you want to see the performance improvement when using NumPy.

Often, when people first learn about NumPy and start to use it, they think that all they need to do is replace lists with NumPy arrays, and the rest doesn’t change. However, the coding style changes when you want to use NumPy arrays effectively. Code that uses NumPy will look and feel very different from code that uses ‘vanilla’ Python.


Appendix: Using for Loops Instead Of NumPy

What if you don’t want to use NumPy? There are several ways to write this code without it. Here’s one alternative which uses for loops instead. I’ll present the code without explanation here, but you’ll spot a lot of the logic we’ve already used in the main article:

import random
from typing import NamedTuple

n_repeats = 10_000_000

class Attack(NamedTuple):
    n_attackers: int
    n_defenders: int

options = [
    Attack(x, y) for x in range(1, 4) for y in range(1, 3)
]

for option in options:
    print(option)
    win_tally = 0
    min_length = min(option.n_attackers, option.n_defenders)

    for _ in range(n_repeats):
        attack = [
            random.randint(1, 6)
            for _ in range(option.n_attackers)
        ]
        attack.sort(reverse=True)
        # print(attack)

        defence = [
            random.randint(1, 6)
            for _ in range(option.n_defenders)
        ]
        defence.sort(reverse=True)
        # print(defence)

        result = [
            attack_value > defence_value
            for attack_value, defence_value in zip(
                attack[:min_length], defence[:min_length]
            )
        ]

        # print(result)
        win_tally += sum(result)
        # print()

    winning_percentage = (
        win_tally / (n_repeats * min_length) * 100
    )
    print(
        f"Winning percentage for {option} is {winning_percentage:0.2f} %"
    )

Here’s the output from the code. The values agree with the NumPy version to within a few hundredths of a percentage point, as you would expect since both versions rely on random dice rolls:

Attack(n_attackers=1, n_defenders=1)
Winning percentage for Attack(n_attackers=1, n_defenders=1) is 41.68 %
Attack(n_attackers=1, n_defenders=2)
Winning percentage for Attack(n_attackers=1, n_defenders=2) is 25.44 %
Attack(n_attackers=2, n_defenders=1)
Winning percentage for Attack(n_attackers=2, n_defenders=1) is 57.85 %
Attack(n_attackers=2, n_defenders=2)
Winning percentage for Attack(n_attackers=2, n_defenders=2) is 38.97 %
Attack(n_attackers=3, n_defenders=1)
Winning percentage for Attack(n_attackers=3, n_defenders=1) is 65.96 %
Attack(n_attackers=3, n_defenders=2)
Winning percentage for Attack(n_attackers=3, n_defenders=2) is 53.98 %

If you’re relatively new to Python, this version may make more sense to you and fit the programming style you’re more used to. Programming in NumPy requires a different mindset to make use of the advantages of vectorization.

So let’s finish by quantifying the benefit of using NumPy in this example. It took 2.5 seconds to run the code using NumPy. It took the code in this appendix, which uses for loops instead of NumPy, 106.6 seconds using Python 3.11. The NumPy version is over 40 times faster for this scenario.
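
If you want to try a similar comparison on your own computer, here is a minimal sketch of one way to do it. It isn’t the script used to get the timings quoted above. The function names simulate_numpy and simulate_loops are hypothetical wrappers around the logic from the two versions, and n_repeats is much smaller than 10 million so the comparison finishes quickly. The times you get will depend on your machine and your Python and NumPy versions.

```python
import random
import time

import numpy as np


def simulate_numpy(n_attackers, n_defenders, n_repeats, rng):
    """Winning percentage per bout using the vectorized NumPy approach."""
    attack = rng.integers(1, 7, size=(n_repeats, n_attackers))
    attack.sort()
    attack = np.flip(attack, axis=1)
    defence = rng.integers(1, 7, size=(n_repeats, n_defenders))
    defence.sort()
    defence = np.flip(defence, axis=1)
    min_length = min(n_attackers, n_defenders)
    result = attack[:, :min_length] > defence[:, :min_length]
    return np.sum(result) / result.size * 100


def simulate_loops(n_attackers, n_defenders, n_repeats):
    """Winning percentage per bout using for loops and the random module."""
    min_length = min(n_attackers, n_defenders)
    win_tally = 0
    for _ in range(n_repeats):
        attack = sorted(
            (random.randint(1, 6) for _ in range(n_attackers)),
            reverse=True,
        )
        defence = sorted(
            (random.randint(1, 6) for _ in range(n_defenders)),
            reverse=True,
        )
        # zip() stops at the shorter list, so no explicit truncation is needed.
        win_tally += sum(a > d for a, d in zip(attack, defence))
    return win_tally / (n_repeats * min_length) * 100


n_repeats = 100_000  # Far fewer than 10 million, to keep the run short.
rng = np.random.default_rng()

start = time.perf_counter()
numpy_result = simulate_numpy(3, 2, n_repeats, rng)
numpy_time = time.perf_counter() - start

start = time.perf_counter()
loop_result = simulate_loops(3, 2, n_repeats)
loop_time = time.perf_counter() - start

print(f"NumPy: {numpy_result:0.2f} % in {numpy_time:0.3f} s")
print(f"Loops: {loop_result:0.2f} % in {loop_time:0.3f} s")
```

A single run with time.perf_counter() only gives a rough figure. For more careful measurements, the timeit module in the standard library repeats the code several times for you.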
