Probabilistic Simulation

Monte Carlo simulations feed randomly generated values into a mathematical model of a system. The idea is to run the model many times and see what range of outcomes is observed. These simulations are useful for gaining insight into real-world systems that are too chaotic to predict reliably. They are used in a variety of fields, such as the sciences, finance, and sports.

This page doesn’t explain how to make an actual Monte Carlo simulation, but it can give you an idea of how to create a simulation that repeats many times using random inputs for a very simple mathematical model.

Example: Soccer Season

If a soccer team plays 12 games and has a 0.600 probability (60%) of winning each one, what are its chances of winning at least 10 games?

As a start, here’s some code for simulating just one season of 12 games.

import random

for game in range(12):
    if random.random() < 0.600:
        print('W')
    else:
        print('L')

Notes:
1) random.random() is a function that returns a number from 0 to 1, not including 1. It’s useful for making something happen randomly a certain fraction of the time.
2) You need to call random.random() inside the loop to get a new random value for each iteration. If you instead did something like “x = random.random()” before the start of the loop, you’d be using the same random value over and over again.
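
As a quick check of note 1, here’s a small sketch that counts how often random.random() lands below 0.600 over many draws. The function name fraction_below and the trial count are my own illustrative choices, not part of the season example.

```python
import random

def fraction_below(threshold, trials):
    """Return the fraction of random.random() draws that fall below threshold."""
    hits = 0
    for _ in range(trials):
        # A fresh random value is drawn on every pass through the loop.
        if random.random() < threshold:
            hits += 1
    return hits / trials

print(fraction_below(0.600, 100000))  # typically close to 0.6
```

The result changes a little on every run, but it should stay near 0.6, which is why the comparison works as a 60% coin flip.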

Sample Output:

L
W
L
W
W
L
L
W
W
W
W
L

I can count those W’s to see that the team has a 7-5 record in that simulated season.

I can run it multiple times and see that sometimes my team wins more games, but also that sometimes the team has a losing season in spite of that 0.600 win probability.

I’d like the program to tell me how many wins the team has so I don’t have to count the W’s. This means I have to keep track of the wins as a variable:

wins = 0
for game in range(12):
    if random.random() < 0.600:
        wins += 1
print("Number of Wins:", wins)

Sample Output:

Number of Wins: 6

Notes:
1) The wins variable can’t be set to zero inside the for loop. That would erase the previous wins before every simulated game.
2) There’s no else branch this time, because losses don’t need to be stored or printed.
3) The print() call isn’t in the for loop’s code block. It comes after the for loop completes (study the indents above), because I only care about the win total after the 12-game season is complete.
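
One possible way to package the season is to put it in a function, so the win probability and season length become parameters. This is only a sketch: the name simulate_season and its default values are assumptions I picked to match the example above.

```python
import random

def simulate_season(win_prob=0.600, games=12):
    """Simulate one season and return the number of wins."""
    wins = 0  # set to zero once per season, before the loop starts
    for game in range(games):
        if random.random() < win_prob:
            wins += 1
    return wins  # returned only after every game has been played

print("Number of Wins:", simulate_season())
```

Calling simulate_season(0.500, 20) would simulate a 20-game season for an evenly matched team without changing any of the code inside the function.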

Let’s run this simulation 1000 times!

You need nested for loops. The inner loop runs a 12-game season, and the outer loop runs 1000 seasons.

The outer loop is easy to add to the previous example:
1) type “for i in range(1000):” at the top
2) indent the lines of code that need to repeat
(highlight them and hit the “tab” key)

for i in range(1000):
    wins = 0
    for game in range(12):
        if random.random() < 0.600:
            wins += 1
    print("Number of Wins:", wins)

Sample Output:

Number of Wins: 6
Number of Wins: 8
Number of Wins: 4
Number of Wins: 6
Number of Wins: 7
Number of Wins: 10
Number of Wins: 8
Number of Wins: 9
Number of Wins: 7
Number of Wins: 10
# goes on for another 990 lines...

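Note: don’t use the same loop variable in both of the for loops. The example uses i for the outer loop and game for the inner loop. If both loops shared one name, the inner loop would overwrite the outer loop’s value, which makes the code confusing and causes bugs as soon as you use that variable for anything.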

This would be a lot nicer if the computer kept track of the number of times the team won at least 10 games. This can be recorded in a number of ways, depending on how much information you want saved.

Method 1: Counter variable

tenplus = 0
for i in range(1000):
    wins = 0
    for game in range(12):
        if random.random() < 0.600:
            wins += 1
    if wins >= 10:
        tenplus += 1
print("Seasons with 10+ wins:", tenplus)

Sample Output:

Seasons with 10+ wins: 71

Fun side note: I ran this a bunch of times and noticed that I was getting results anywhere from 64 to 92, or 6.4% to 9.2% of the simulated seasons. I bumped up the number of trials to 100000 and found that my new range was about 8.3% to 8.4%. It makes statistical sense that the percentage ranges would have less random noise with a higher number of trials.
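
This particular model is simple enough to check against the exact binomial formula, so here’s a sketch that does that and also estimates how much noise to expect for a given number of trials. The function name prob_at_least is my own, and math.comb requires Python 3.8 or newer.

```python
from math import comb, sqrt

def prob_at_least(k, n, p):
    """Exact binomial probability of at least k wins in n games."""
    return sum(comb(n, w) * p**w * (1 - p)**(n - w) for w in range(k, n + 1))

exact = prob_at_least(10, 12, 0.600)
print(f"Exact P(10+ wins): {exact:.4f}")  # 0.0834

# Standard error of the simulated fraction for a given number of trials
for trials in (1000, 100000):
    se = sqrt(exact * (1 - exact) / trials)
    print(f"{trials} trials: standard error about {se:.5f}")
```

The exact answer is about 8.34%. The standard error works out to roughly 0.9 percentage points for 1000 trials and roughly 0.09 percentage points for 100000 trials, which fits the ranges I saw in both cases.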

Method 2: List of all win totals

If I start wondering how often my team wins at least 11 games, or exactly 8 games, or how often it has a losing season, I will need to know all the win totals, not just the number of times the team won at least 10 games.

Saving each win total in a list is a quick way to have all of those answers available.

This time, I’ll set the program to find the number of 10+ win seasons as well as the number of 12-0 seasons.

totals = []
for i in range(1000):
    wins = 0
    for game in range(12):
        if random.random() < 0.600:
            wins += 1
    totals.append(wins)
win10 = totals.count(10)
win11 = totals.count(11)
win12 = totals.count(12)
tenplus = win10 + win11 + win12
print("Number of 10+ win seasons:", tenplus)
print("Number of 12-0 seasons:", win12)

Sample Output:

Number of 10+ win seasons: 83
Number of 12-0 seasons: 3

Notice the use of the list.count() method to see how many times the list contains a value of 10, 11, or 12.

It only takes two lines of code to loop over every possible win total and count how many times each one appears in the list:
(There are 13 possible outcomes, from 0 to 12)

for t in range(13):
    print(t, 'wins:', totals.count(t))
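
Once the win totals are in a list, the other questions from the start of this method can be answered without rerunning the simulation. The sketch below is one way to do it. It assumes there are no draws, so a losing season means fewer than 6 wins and a 6-6 season doesn’t count as losing.

```python
import random

totals = []
for i in range(1000):
    wins = 0
    for game in range(12):
        if random.random() < 0.600:
            wins += 1
    totals.append(wins)

# Each comparison is True or False for one season; sum() counts the Trues.
tenplus = sum(t >= 10 for t in totals)
losing = sum(t < 6 for t in totals)  # assumption: 6-6 is not a losing season
exactly8 = totals.count(8)

print("Number of 10+ win seasons:", tenplus)
print("Number of losing seasons:", losing)
print("Number of 8-4 seasons:", exactly8)
```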

Method 3: Dictionary

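A dictionary can make the lookup process a little quicker: list.count() scans the whole list every time it’s called, while a dictionary goes straight to the stored count. It also stores your data more compactly in this case, since it keeps 13 counters instead of 1000 separate win totals.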

totals = {}
# Create all 13 possible entries (0 to 12)
for t in range(13):
    totals[t] = 0

for i in range(1000):
    wins = 0
    for game in range(12):
        if random.random() < 0.600:
            wins += 1
    totals[wins] += 1

for key in totals:
    print(key, 'wins:', totals[key])

Sample Output:

0 wins: 0
1 wins: 0
2 wins: 0
3 wins: 8
4 wins: 43
5 wins: 112
6 wins: 173
7 wins: 219
8 wins: 229
9 wins: 117
10 wins: 84
11 wins: 12
12 wins: 3

Notes:
1) Learn about dictionaries to make more sense of this example.
2) Notice that it only took two lines of code to iterate through the dictionary to print out the number of times each win total happened.
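
If you’d rather not create the 13 entries by hand, the standard library’s collections.Counter is a possible alternative. This sketch swaps it in for the plain dictionary above. It’s a variation, not a required change.

```python
import random
from collections import Counter

totals = Counter()
for i in range(1000):
    wins = 0
    for game in range(12):
        if random.random() < 0.600:
            wins += 1
    totals[wins] += 1  # a missing key counts as 0, so no setup loop is needed

for t in range(13):
    print(t, 'wins:', totals[t])  # win totals that never happened print as 0
```

Looping over range(13) instead of over the Counter itself keeps the output sorted and includes the win totals that never came up.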


Adding Complexity

So far, this is a simple 10-12 line program.

Here are some ways to make it more interesting:

  • Give the team a random win percentage.
  • Choose the random win percentage using a normal distribution: the random module has a function named normalvariate() that can do this for you.
  • Create a league of n teams and calculate the win percentage for each team. Use loops to make every team play every other team enough times to complete their schedules. For example, in a 7-team league, each team can play each other team twice to get to 12 total games.
  • Give each team a number to represent their competitive strength. Figure out a way to use those numbers to calculate an adjusted win probability for each match, depending on the strength of the two teams. Stronger teams should have higher win probability against weaker teams, but two strong teams or two weak teams should have approximately even odds against each other.
  • Give each team a random chance of an event that changes their competitive strength starting in the middle of the season. This could represent things like injuries or a good player transferring to the team.
  • Create a playoff system where the top teams play in a tournament to determine a league champion. How often does the strongest team win the tournament?
  • What if winning makes a team more motivated and increases their competitive strength slightly? How would this change the results?
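
Here’s a rough starting point that combines a few of those ideas: a 7-team league, strengths drawn with normalvariate(), and a win probability based on the two teams’ strengths. Everything specific in it is an assumption on my part: the ratio formula in win_probability, the mean of 1.0 and standard deviation of 0.25, the 0.1 floor that keeps strengths positive, and the rule that every game has a winner.

```python
import random

def win_probability(strength_a, strength_b):
    """Chance that team A beats team B (a hypothetical ratio formula)."""
    return strength_a / (strength_a + strength_b)

def simulate_league(strengths, rounds=2):
    """Play every pairing 'rounds' times; return a list of win totals."""
    n = len(strengths)
    wins = [0] * n
    for _ in range(rounds):
        for a in range(n):
            for b in range(a + 1, n):  # each pairing is visited once per round
                if random.random() < win_probability(strengths[a], strengths[b]):
                    wins[a] += 1
                else:
                    wins[b] += 1  # assumption: no draws
    return wins

# Draw strengths from a normal distribution, with a floor to keep them positive
strengths = [max(0.1, random.normalvariate(1.0, 0.25)) for _ in range(7)]
wins = simulate_league(strengths)

for team, (s, w) in enumerate(zip(strengths, wins)):
    print(f"Team {team}: strength {s:.2f}, record {w}-{12 - w}")
```

With this formula, two teams of equal strength get even odds, and a team that is three times as strong as its opponent wins 75% of the time. Other formulas would work too, so treat this one as a placeholder to experiment with.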