Comparing 'common sense', random, and statistically modeled predictions of the World Cup Finals Group Stages
We received interest in our previous blog post, in which we compared our predictions for the 2018 FIFA World Cup Finals Group Stage with those of random dice throws. In particular, readers seemed interested in our concluding remarks, in which we drew a comparison with the more high-tech attempts of some European academics (Groll et al., 2018), whose research gained media attention, and of Goldman Sachs, the multinational investment bank and financial services company, whose predictions were also widely reported by the press.
In our previous post we identified that an initial reading of their results and ours suggested that, despite the investments of time and technology that these organizations had poured into their highly sophisticated, statistically calculated predictions, neither predicted the failure of the German team, and neither seemed much better at identifying the teams which progressed to the next phase of the tournament.
Given the interest in these claims, we thought it worth digging a bit deeper to provide another layer of analysis and test the proposition that our soccer knowledge yielded results superior to the statistical modeling of Groll et al. and of Goldman Sachs’ economists.
Comparison of Methods
First, let’s compare our approaches.
Goldman Sachs, on page 1 of its report ‘2018 – The World Cup and Economics’, explains that “The core of the publication is the forecasting model” and that they “augment the typical team level data with player level characteristics”, having expended “hours of number crunching” involving “200,000 probability trees, and 1 million simulations”. Their conclusion? “England meets Germany in the quarters, where Germany wins; and Germany meets Brazil in the final, and Brazil prevails.”
Groll et al. (2018) describe their approach as follows:
“In this work, we compare three different modeling approaches for the scores of soccer matches with regard to their predictive performances based on all matches from the four previous FIFA World Cups 2002 - 2014: Poisson regression models, random forests and ranking methods. While the former two are based on the teams' covariate information, the latter method estimates adequate ability parameters that reflect the current strength of the teams best.”
But the conclusions are remarkably similar:
“the FIFA World Cup 2018 is simulated repeatedly and winning probabilities are obtained for all teams. The model slightly favors Spain before the defending champion Germany.”
In contrast, our 'practical common sense' approach was to augment our soccer knowledge with a quick glance through the FIFA rankings, this year’s Panini sticker album, and a pull-out from When Saturday Comes magazine. We then filled out an Excel spreadsheet of the tournament’s fixtures, which we downloaded from Excely.com.
Finally, the random dice approach was just that: two dice rolled to predict the scores for each team.
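For readers who would like to repeat the dice approach without real dice, here is a minimal sketch. It rests on an assumption the text above does not spell out: that one die is rolled per team in each match and the face value is taken as that team's goals. The function names `roll_match` and `predict_fixtures`, the seed, and the two sample fixtures are purely illustrative.

```python
import random


def roll_match(rng):
    """Roll two dice: the first for one team's goals, the second for the other's.

    Assumption: one die per team, face value (1-6) read directly as goals.
    """
    return rng.randint(1, 6), rng.randint(1, 6)


def predict_fixtures(fixtures, seed=None):
    """Return a dict mapping each (team_a, team_b) fixture to a rolled score."""
    rng = random.Random(seed)  # a seed makes the 'random' predictions repeatable
    return {fixture: roll_match(rng) for fixture in fixtures}


# Two illustrative Group A fixtures.
fixtures = [("Russia", "Saudi Arabia"), ("Egypt", "Uruguay")]
predictions = predict_fixtures(fixtures, seed=2018)

for (team_a, team_b), (goals_a, goals_b) in predictions.items():
    print(f"{team_a} {goals_a}-{goals_b} {team_b}")
```

Note that under this assumption a goalless draw can never be predicted, since a die has no zero face.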
So, how do the results from these different approaches compare and which method proved most effective?
Findings
Unlike our own fully transparent and honest approach, neither the Groll et al. (2018) paper nor the Goldman Sachs report includes a breakdown of what they think the result of each game is likely to be. Instead they hide behind statistical probabilities, no doubt intended to blind the reader with science and fudge the issue so nobody can point out where they got things wrong.
But both reports do include predicted final group tables, and so it is these that we can compare with our own predictions.
For the purposes of simplicity and fairness we decided to run a couple of different scenarios, one very basic (Analysis X), the other a bit more nuanced (Analysis Y).
Analysis X: Teams to Qualify for the Knockout Stage
In Analysis X, the most basic test, we examined how accurately each ‘participant’ predicted which teams would progress to the next ‘knockout stage’ of the World Cup Finals by finishing in one of the top two positions in their group. In this scenario we were not interested in whether or not the actual position (1st or 2nd) was correct, just how many of the teams actually did get out of the Group stage. We awarded 1 point for each team correctly predicted to qualify for the next round (2 qualifiers x 8 groups = a maximum of 16 points).

| | Alex | Kevin | Dice | Goldman Sachs | Groll et al. |
|---|---|---|---|---|---|
| Group A | 2 | 2 | 0 | 1 | 2 |
| Group B | 2 | 2 | 1 | 2 | 2 |
| Group C | 2 | 2 | 1 | 2 | 2 |
| Group D | 2 | 1 | 1 | 2 | 2 |
| Group E | 2 | 1 | 1 | 2 | 2 |
| Group F | 1 | 1 | 1 | 1 | 1 |
| Group G | 2 | 2 | 1 | 2 | 2 |
| Group H | 1 | 1 | 1 | 1 | 1 |
| Total Points | 14 | 12 | 7 | 13 | 14 |
| Success Rate | 88% | 75% | 44% | 81% | 88% |

As the results show, in Analysis X there was not much difference between the Soccer Mad Boffins approach and that of Goldman Sachs or Groll et al. Their sophisticated application of statistics was fairly effective but, when compared with our non-statistically modeled predictions, did not seem to fully justify the additional effort and investment. Whilst Groll et al. did have the highest success rate (88%), this was equalled by Soccer Mad Boffins' Dr Alex G. Gillett. Meanwhile Goldman Sachs were 1 point behind and came third (13 points, 81% accuracy), whilst Dr Kevin D. Tennent was fourth with 12 points (75% accuracy), only 2 correct predictions behind Gillett and Groll et al.
The random factor of the dice was the least successful predictor, being just 44% accurate (7 correct predictions), although this is higher than we thought it might have been.
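As a rough illustration of the Analysis X scoring rule, the sketch below counts how many of a predictor's top two teams also finished in the actual top two, using the Group A tables from the appendix. The function names `qualifier_points` and `success_rate` are illustrative choices, and rounding half up to a whole percentage is an assumption about how the success rates were rounded; this is not necessarily how the original tallies were produced.

```python
import math


def qualifier_points(actual, predicted):
    """1 point per team predicted in the top two that actually finished in the top two.

    Exact position (1st or 2nd) is deliberately ignored.
    """
    return len(set(actual[:2]) & set(predicted[:2]))


def success_rate(points, maximum=16):
    """Points as a whole-number percentage of the maximum (assumed: rounded half up)."""
    return math.floor(100 * points / maximum + 0.5)


# Group A tables as listed in the appendix (1st to 4th).
actual_a = ["Uruguay", "Russia", "Saudi Arabia", "Egypt"]
predicted_a = {
    "Alex": ["Uruguay", "Russia", "Egypt", "Saudi Arabia"],
    "Dice": ["Saudi Arabia", "Egypt", "Uruguay", "Russia"],
    "Goldman Sachs": ["Uruguay", "Saudi Arabia", "Russia", "Egypt"],
}

points_a = {name: qualifier_points(actual_a, table) for name, table in predicted_a.items()}
print(points_a)          # {'Alex': 2, 'Dice': 0, 'Goldman Sachs': 1}
print(success_rate(14))  # 88, for a total of 14 points out of 16
```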
Analysis Y: Group Table Finishing Position Predictions
In Analysis Y we went a few steps further and explored how accurately each ‘participant’ predicted the correct final position in the Group Tables of every team in the tournament. We awarded 1 point for each team positioned correctly (4 teams per group x 8 groups = 32 teams/a maximum of 32 points).

| | Alex | Kevin | Dice | Goldman Sachs | Groll et al. |
|---|---|---|---|---|---|
| Group A | 2 | 2 | 0 | 2 | 4 |
| Group B | 2 | 0 | 1 | 2 | 2 |
| Group C | 2 | 2 | 0 | 4 | 2 |
| Group D | 0 | 1 | 1 | 0 | 0 |
| Group E | 2 | 2 | 2 | 4 | 2 |
| Group F | 1 | 0 | 2 | 1 | 0 |
| Group G | 4 | 4 | 1 | 4 | 2 |
| Group H | 2 | 0 | 1 | 1 | 2 |
| Total Points | 15 | 11 | 8 | 18 | 14 |
| Success Rate | 47% | 34% | 25% | 56% | 44% |

In the more complex scenario investigated in Analysis Y, none of the 'participants' were more than about half successful, reflecting the difficulty in predicting soccer. Whilst Goldman Sachs' method did prove most successful (56% accuracy, with 18 points) this was only 3 more correct predictions than Soccer Mad Boffins' Dr Alex G. Gillett who correctly predicted 15 teams' final Group Stage positions (47% accuracy) to beat Groll et al. by a single point (14 correct predictions = 44% accuracy).
Again, the dice were the least effective predictor, with a 25% accuracy of prediction, getting just 8 predictions correct, although this '1-in-4' success rate is still quite good and possibly better than some TV pundits!
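The stricter Analysis Y rule can be sketched in the same spirit: one point for every team placed in exactly the right position. The example again uses the Group A tables from the appendix; the function name `position_points` is an illustrative choice rather than anything taken from the original spreadsheet.

```python
def position_points(actual, predicted):
    """1 point for each team whose predicted position matches its actual position."""
    return sum(a == p for a, p in zip(actual, predicted))


# Group A tables as listed in the appendix (1st to 4th).
actual_a = ["Uruguay", "Russia", "Saudi Arabia", "Egypt"]
predicted_a = {
    "Alex": ["Uruguay", "Russia", "Egypt", "Saudi Arabia"],
    "Dice": ["Saudi Arabia", "Egypt", "Uruguay", "Russia"],
    "Goldman Sachs": ["Uruguay", "Saudi Arabia", "Russia", "Egypt"],
    "Groll et al.": ["Uruguay", "Russia", "Saudi Arabia", "Egypt"],
}

points_a = {name: position_points(actual_a, table) for name, table in predicted_a.items()}
print(points_a)  # {'Alex': 2, 'Dice': 0, 'Goldman Sachs': 2, 'Groll et al.': 4}
```

Swapping just two teams costs two points under this rule, which is one reason the Analysis Y scores fall so far below those of Analysis X.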
Conclusion and Discussion
This article, much like the reports from Goldman Sachs and Groll et al., is as much for fun as anything else.
So what did we find out? Well, that is a good question. With reference to the provocative question we raise in the title of this article, 'Human Brain or Advanced Statistical Modeling?', clearly there is a case for the application of sophisticated statistics and global economics type research techniques when trying to predict the World Cup, but is it worth the time and effort?
We have shown that when simply predicting which teams will progress out of the Group Stage (a sort of 'each way bet') we were as effective as the researchers 'scientifically' applying economic theory and advanced statistical methods to large data sets. Remember, our own methods involved using our working knowledge of the sport and browsing FIFA rankings and a couple of other things, all easily accessible simply by walking to your newsagent. The caveat is of course that we research and write about football as part of our jobs, but many people follow and consume the game just as closely.
When applied to the more nuanced scenario of exact finishing positions within each group, the Goldman Sachs analysis did yield more accurate predictions, but the analysis of Groll et al. was slightly less accurate than Soccer Mad Boffins' Dr Alex G. Gillett.
And let's not forget our previous blog post in which we showed that our predictions, like those of Goldman Sachs and Groll et al. failed to predict some of the 'shock' results - for example all but the 'random dice' predictions identified Germany as the eventual World Cup winners! In contrast, the dice predict a Mexico - Switzerland final with Switzerland ultimately lifting the Cup. So far more realistic than any of the informed predictions.
And there lies an important point. With the exception of the dice, the predictions of which teams would qualify from the group stages were quite impressive, ranging in accuracy from 75% to 88%. However, for the more sophisticated task of predicting actual finishing positions (which required a more accurate prediction of goals scored and conceded, and game outcomes) none of Soccer Mad Boffins, Goldman Sachs, or Groll et al. did particularly well: accuracy ranged from 34% to 56%.
Statistical modeling can be very useful in certain contexts, but some things are still very difficult to predict in that way. Science shouldn't put too much faith in the sums alone. Perhaps one lesson that our 'experiment' highlights is that the rationale behind doing things 'more scientifically' with statistical models is often to reduce the number of variables in a situation and to remove human irrationality from decision-making. But in some contexts perhaps a more complex array of variables and the acceptance of irrationality is necessary.
Perhaps this is because statistical models tend to assume rational behavior on the part of actors and ignore the role of agency. To critique the methodology of the Goldman Sachs and Groll et al. papers, their approaches perhaps assume too much that managers have perfect information about the abilities of their players and will act rationally to maximize utility - that is, they will know their best teams and tactics and always put them on the field. It also assumes that players can themselves maximize their own ability and always play to the best of their ability.
These assumptions are problematic when it comes to modeling football, which is a complex social system that relies on the interaction of two teams of 11 players plus coaches and officials! Managers may also use tactics which appear superficially sub-optimal, such as Gareth Southgate's controversial decision to rest Harry Kane, but which are intended to allow for long-term outcomes; thus managers may not play their best team in every match. Forecasting football thus has similar dangers to forecasting other social systems where a high degree of agency is in play. Prediction techniques may be better suited to analyzing individual sports such as tennis or chess rather than team sports. (We encourage readers to see also the work of Kuper & Szymanski, who raise similar points about the unpredictability of soccer.)
So, why do these analyses, and who takes them seriously?
Well, as we already said, it is a bit of fun, it presents an opportunity to test knowledge and methods, and of course it is a fairly effective way to achieve a bit of publicity. As we have reported, we found the Goldman Sachs and Groll et al. documents via online news reports.
Perhaps the best advice then is the small print at the bottom of the front cover of the Goldman Sachs report: 'Investors should consider this report as only a single factor in making their investment decision'!
References
Gillett, A.G. and Tennent, K.D. (2018) 'World Cup Finals Group Stages are over...how were your predictions?', blog post available online: <http://soccermadboffins.blogspot.com/2018/06/world-cup-finals-group-stages-are.html> [accessed 1st July 2018]
Groll, A., Ley, C., Schauberger, G., and van Eetvelde, H. (2018) 'Prediction of the FIFA World Cup 2018 – A random forest approach with an emphasis on estimated team ability parameters', arXiv.org Open Access, available online: <https://arxiv.org/pdf/1806.03208.pdf> [accessed 1st July 2018]
Kuper, S. and Szymanski, S. (2012) 'Soccernomics: Why England Loses, Why Spain, Germany, and Brazil Win, and Why the US, Japan, Australia, Turkey and Even Iraq Are Destined to Become the Kings of the World’s Most Popular Sport', Nation Books: New York.
APPENDIX: DATA ANALYSIS
Game X: 1 point for each team correctly predicted to qualify
GROUP A

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Uruguay | Uruguay | Uruguay | Saudi Arabia | Uruguay | Uruguay |
| Russia | Russia | Russia | Egypt | Saudi Arabia | Russia |
| Saudi Arabia | Egypt | Egypt | Uruguay | Russia | Saudi Arabia |
| Egypt | Saudi Arabia | Saudi Arabia | Russia | Egypt | Egypt |

GROUP B

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Spain | Portugal | Portugal | Morocco | Portugal | Spain |
| Portugal | Spain | Spain | Portugal | Spain | Portugal |
| Iran | Iran | Morocco | Spain | Iran | Morocco |
| Morocco | Morocco | Iran | Iran | Morocco | Iran |

GROUP C

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| France | France | France | Denmark | France | France |
| Denmark | Denmark | Denmark | Australia | Denmark | Denmark |
| Peru | Australia | Australia | France | Peru | Australia |
| Australia | Peru | Peru | Peru | Australia | Peru |

GROUP D

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Croatia | Argentina | Argentina | Argentina | Argentina | Argentina |
| Argentina | Croatia | Iceland | Nigeria | Croatia | Croatia |
| Nigeria | Iceland | Nigeria | Croatia | Iceland | Iceland |
| Iceland | Nigeria | Croatia | Iceland | Nigeria | Nigeria |
| POINTS | 2 | 1 | 1 | 2 | 2 |

GROUP E

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Brazil | Brazil | Brazil | Costa Rica | Brazil | Brazil |
| Switzerland | Switzerland | Serbia | Switzerland | Switzerland | Switzerland |
| Serbia | Costa Rica | Switzerland | Serbia | Serbia | Costa Rica |
| Costa Rica | Serbia | Costa Rica | Brazil | Costa Rica | Serbia |

GROUP F

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Sweden | Germany | Germany | Germany | Germany | Germany |
| Mexico | Mexico | Sweden | Mexico | Mexico | Sweden |
| South Korea | Sweden | Mexico | South Korea | Sweden | Mexico |
| Germany | South Korea | South Korea | Sweden | South Korea | South Korea |

GROUP G

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Belgium | Belgium | Belgium | Tunisia | Belgium | Belgium |
| England | England | England | England | England | England |
| Tunisia | Tunisia | Tunisia | Belgium | Tunisia | Panama |
| Panama | Panama | Panama | Panama | Panama | Tunisia |

GROUP H

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Colombia | Colombia | Senegal | Poland | Colombia | Colombia |
| Japan | Poland | Colombia | Japan | Poland | Poland |
| Senegal | Senegal | Poland | Colombia | Japan | Senegal |
| Poland | Japan | Japan | Senegal | Senegal | Japan |
| POINTS | 1 | 1 | 1 | 1 | 1 |

Game Y: 1 point for each correctly predicted position in final table
GROUP A

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Uruguay | Uruguay | Uruguay | Saudi Arabia | Uruguay | Uruguay |
| Russia | Russia | Russia | Egypt | Saudi Arabia | Russia |
| Saudi Arabia | Egypt | Egypt | Uruguay | Russia | Saudi Arabia |
| Egypt | Saudi Arabia | Saudi Arabia | Russia | Egypt | Egypt |

GROUP B

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Spain | Portugal | Portugal | Morocco | Portugal | Spain |
| Portugal | Spain | Spain | Portugal | Spain | Portugal |
| Iran | Iran | Morocco | Spain | Iran | Morocco |
| Morocco | Morocco | Iran | Iran | Morocco | Iran |

GROUP C

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| France | France | France | Denmark | France | France |
| Denmark | Denmark | Denmark | Australia | Denmark | Denmark |
| Peru | Australia | Australia | France | Peru | Australia |
| Australia | Peru | Peru | Peru | Australia | Peru |

GROUP D

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Croatia | Argentina | Argentina | Argentina | Argentina | Argentina |
| Argentina | Croatia | Iceland | Nigeria | Croatia | Croatia |
| Nigeria | Iceland | Nigeria | Croatia | Iceland | Iceland |
| Iceland | Nigeria | Croatia | Iceland | Nigeria | Nigeria |

GROUP E

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Brazil | Brazil | Brazil | Costa Rica | Brazil | Brazil |
| Switzerland | Switzerland | Serbia | Switzerland | Switzerland | Switzerland |
| Serbia | Costa Rica | Switzerland | Serbia | Serbia | Costa Rica |
| Costa Rica | Serbia | Costa Rica | Brazil | Costa Rica | Serbia |

GROUP F

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Sweden | Germany | Germany | Germany | Germany | Germany |
| Mexico | Mexico | Sweden | Mexico | Mexico | Sweden |
| South Korea | Sweden | Mexico | South Korea | Sweden | Mexico |
| Germany | South Korea | South Korea | Sweden | South Korea | South Korea |

GROUP G

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Belgium | Belgium | Belgium | Tunisia | Belgium | Belgium |
| England | England | England | England | England | England |
| Tunisia | Tunisia | Tunisia | Belgium | Tunisia | Panama |
| Panama | Panama | Panama | Panama | Panama | Tunisia |

GROUP H

| Actual Final Table | AG | KT | Dice | Goldman Sachs | Groll et al |
|---|---|---|---|---|---|
| Colombia | Colombia | Senegal | Poland | Colombia | Colombia |
| Japan | Poland | Colombia | Japan | Poland | Poland |
| Senegal | Senegal | Poland | Colombia | Japan | Senegal |
| Poland | Japan | Japan | Senegal | Senegal | Japan |
| POINTS | 2 | 0 | 1 | 1 | 2 |