A quantitative recap of the 2022 FIFA World Cup

How each team’s probability of reaching a particular stage has evolved over time
python
altair
visualisation
football
Published

December 19, 2022

As the 2022 FIFA World Cup comes to an end, it is interesting to look at how each team’s probability of reaching a particular stage has evolved over time. As explained in my previous post, the implied probability of each outcome can be backed out from the odds quoted by betting markets. The full dataset of probabilities for the 2022 World Cup can be found here.
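
As a reminder of the general idea, here is a minimal sketch of one common way to turn decimal odds into implied probabilities: take the reciprocal of each quote and rescale so the probabilities sum to one, which removes the bookmaker's margin. The function name `implied_probabilities` and the odds are made up for illustration, and the proportional rescaling is an assumption rather than necessarily the exact method used to build the dataset.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds per outcome into probabilities that sum to 1."""
    # The reciprocal of a decimal quote is the raw implied probability
    raw = {outcome: 1.0 / odds for outcome, odds in decimal_odds.items()}
    # Raw probabilities sum to more than 1 because of the bookmaker's margin
    total = sum(raw.values())
    # Proportional rescaling (an assumption; other margin models exist)
    return {outcome: value / total for outcome, value in raw.items()}


# Hypothetical quotes for a three-team market, not taken from the dataset
odds = {"Team A": 1.8, "Team B": 3.6, "Team C": 3.6}
probs = implied_probabilities(odds)
print(probs)
```

Proportional rescaling is the simplest choice; it spreads the margin evenly across outcomes, which may understate the true probability of favourites relative to long shots.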

Probability of final position per team over time (animated)

Source: own computation, betting markets; dataset available here

In the chart above, the somewhat unexpected semi-final appearances of Morocco and Croatia stand out in particular.

Another way to visualise the dataset is to look at the probability of each team winning the overall tournament. The chart below shows that probability as a line chart:

Probability of winning the tournament per team over time (interactive)

Code
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

import pandas as pd
import altair as alt

# Load the probabilities and keep only the rows for winning the tournament
data = pd.read_csv('2022 World Cup - Probabilities.csv', parse_dates=['date'])
win = data.loc[data.stage == 'win', :]

# Order countries by their win probability on the first date in the dataset
order = list(win.loc[win.date == win.date.min(), :].sort_values('p', ascending=False)['country'])

# Drop zero probabilities, which cannot be shown on a log scale
alt.Chart(win[win.p != 0]).mark_line().encode(
    x=alt.X('date', axis=alt.Axis(format="%d-%b", title="")),
    y=alt.Y('p:Q', axis=alt.Axis(title="Probability", format="%"), scale=alt.Scale(type="log")),
    color=alt.Color('country', legend=None, scale=alt.Scale(scheme='category20'), sort=order),
    tooltip=['country', 'date', alt.Tooltip('p', format=".2%")]
).properties(
    width=600
)

Source: own computation, betting markets; dataset available here

The gradual rise of Argentina and France stands out here, as does the sudden rise of Croatia and Morocco after their quarter-final wins.
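
Jumps like these can also be located numerically rather than by eye. The sketch below computes the change in win probability between consecutive observations for each country and picks the largest one. It assumes the same `country`, `date`, `stage` and `p` columns as the code above, but runs on a tiny made-up frame (`toy`) so that it is self-contained; the values are not from the real dataset.

```python
import pandas as pd

# Toy data with the same columns as the dataset (values are made up)
toy = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "date": pd.to_datetime(["2022-12-08", "2022-12-09", "2022-12-10"] * 2),
    "stage": ["win"] * 6,
    "p": [0.02, 0.02, 0.08, 0.20, 0.22, 0.21],
})

# Change in win probability between consecutive dates, per country
win = (
    toy[toy["stage"] == "win"]
    .sort_values(["country", "date"])
    .assign(change=lambda d: d.groupby("country")["p"].diff())
)

# Row with the largest increase for each country
largest = win.loc[win.groupby("country")["change"].idxmax(), ["country", "date", "change"]]
print(largest)
```

On the real dataset, replacing `toy` with the `data` frame loaded earlier should give the date of each team's biggest jump, which can then be compared against the match calendar.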