Soccer-Mad Boffins: World Cup Predictions: Human Brain, Advanced Statistical Modeling, or Completely Random?

Comparing 'common sense', random, and statistically modeled predictions of the World Cup Finals Group Stages

We received interest in our previous blog post, in which we compared our predictions for the 2018 FIFA World Cup Finals Group Stage with those of random dice throws. In particular, readers seemed interested in our concluding remarks, in which we drew a comparison with the more high-tech attempts of some European academics (Groll et al., 2018), whose research gained media attention, and of Goldman Sachs, the multinational investment bank and financial services company, whose predictions were also widely reported by the press.

In our previous post we noted that an initial reading of their results and ours suggested that, despite the time and technology these organizations had poured into their highly sophisticated, statistically calculated predictions, neither predicted the failure of the German team, and neither seemed much better than us at identifying the teams which progressed to the next phase of the tournament.

Given the interest in these claims we thought it worth digging a bit deeper to provide another layer of analysis to test the proposition that our soccer knowledge yielded results superior to those of Groll et al. and Goldman Sachs’ economists and statistical modeling.

Comparison of Methods

First, let’s compare our approaches.

Goldman Sachs, on page 1 of its report ‘2018 – The World Cup and Economics’, states that “The core of the publication is the forecasting model” and that they “augment the typical team level data with player level characteristics”, having expended “hours of number crunching” involving “200,000 probability trees, and 1 million simulations”. Their conclusion? “England meets Germany in the quarters, where Germany wins; and Germany meets Brazil in the final, and Brazil prevails.”

Meanwhile, the European academics (Groll et al., 2018) provide the following, more technical, information in the abstract to their paper, the somewhat drily titled ‘Prediction of the FIFA World Cup 2018 – A random forest approach with an emphasis on estimated team ability parameters’:

“In this work, we compare three different modeling approaches for the scores of soccer matches with regard to their predictive performances based on all matches from the four previous FIFA World Cups 2002 - 2014: Poisson regression models, random forests and ranking methods. While the former two are based on the teams' covariate information, the latter method estimates adequate ability parameters that reflect the current strength of the teams best.”

But the conclusions are remarkably similar:

“the FIFA World Cup 2018 is simulated repeatedly and winning probabilities are obtained for all teams. The model slightly favors Spain before the defending champion Germany.”

In contrast, our 'practical common sense' approach was to augment our soccer knowledge with a quick glance through the FIFA rankings, this year’s Panini sticker album, and a pull-out from When Saturday Comes magazine. We then filled out an Excel spreadsheet of the tournament’s fixtures, which we downloaded from Excely.com.

Finally, the random dice approach was just that: two dice were rolled to predict the scores for each team.

So, how do the results from these different approaches compare and which method proved most effective?

Findings

Unlike our own fully transparent and honest approach, neither the Groll et al. (2018) paper nor the Goldman Sachs report includes a breakdown of what they think the result of each game is likely to be. Instead they hide behind statistical probabilities, no doubt intended to blind the reader with science and fudge the issue so nobody can point out where they got things wrong.

But both reports do include predicted final group tables, and it is these that we can compare with our own predictions.

For simplicity and fairness we decided to run two different scenarios, one very basic (Analysis X), the other a bit more nuanced (Analysis Y).

Analysis X: Teams to Qualify for the Knockout Stage

In Analysis X, the most basic test, we examined how accurately each ‘participant’ predicted which teams would progress to the next ‘knockout stage’ of the World Cup Finals by finishing in one of the top two positions in their group. In this scenario we were not interested in whether or not the actual position (1st or 2nd) was correct, just how many of the teams actually did get out of the Group Stage. We awarded 1 point for each team correctly predicted to qualify for the next round (2 qualifiers x 8 groups = a maximum of 16 points).
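The Analysis X rule can be sketched in a few lines of Python. This is only an illustration using the Group A tables from the appendix; the function name `qualifier_points` and the dictionary layout are our own assumptions rather than part of any of the original reports.

```python
# Actual final Group A table, top to bottom (taken from the appendix).
actual = ["Uruguay", "Russia", "Saudi Arabia", "Egypt"]

# Predicted Group A tables, top to bottom (taken from the appendix).
predictions = {
    "Alex": ["Uruguay", "Russia", "Egypt", "Saudi Arabia"],
    "Kevin": ["Uruguay", "Russia", "Egypt", "Saudi Arabia"],
    "Dice": ["Saudi Arabia", "Egypt", "Uruguay", "Russia"],
    "Goldman Sachs": ["Uruguay", "Saudi Arabia", "Russia", "Egypt"],
    "Groll et al.": ["Uruguay", "Russia", "Saudi Arabia", "Egypt"],
}


def qualifier_points(predicted, actual):
    """Hypothetical helper: 1 point per predicted top-two team that
    really finished in the top two, ignoring the order of the two."""
    return len(set(predicted[:2]) & set(actual[:2]))


group_a_points = {
    name: qualifier_points(table, actual) for name, table in predictions.items()
}
print(group_a_points)
```

Because the top two are compared as sets, a prediction with the right two qualifiers in the wrong order still earns both points, which is exactly the leniency Analysis X is meant to allow.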

	Alex	Kevin	Dice	Goldman Sachs	Groll et al.
Group A	2	2	0	1	2
Group B	2	2	1	2	2
Group C	2	2	1	2	2
Group D	2	1	1	2	2
Group E	2	1	1	2	2
Group F	1	1	1	1	1
Group G	2	2	1	2	2
Group H	1	1	1	1	1

Total Points	14	12	7	13	14

Success Rate	88%	75%	44%	81%	88%
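For anyone wanting to check the arithmetic, the totals and success rates follow directly from the per-group points in the table. The sketch below simply re-adds those points; the variable names are our own, and percentages are rounded to the nearest whole number as in the table.

```python
# Per-group Analysis X points (Groups A to H), copied from the table above.
points_by_group = {
    "Alex": [2, 2, 2, 2, 2, 1, 2, 1],
    "Kevin": [2, 2, 2, 1, 1, 1, 2, 1],
    "Dice": [0, 1, 1, 1, 1, 1, 1, 1],
    "Goldman Sachs": [1, 2, 2, 2, 2, 1, 2, 1],
    "Groll et al.": [2, 2, 2, 2, 2, 1, 2, 1],
}

MAX_POINTS = 2 * 8  # 2 qualifiers x 8 groups

# Map each participant to (total points, success rate in whole percent).
summary = {
    name: (sum(points), round(100 * sum(points) / MAX_POINTS))
    for name, points in points_by_group.items()
}

for name, (total, rate) in summary.items():
    print(f"{name}: {total} points, {rate}%")
```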

As the results show, in Analysis X there was not much difference between the Soccer Mad Boffins approach and that of Goldman Sachs or Groll et al. Their sophisticated application of statistics was fairly effective but, when compared with our non-statistically modeled predictions, did not seem to fully justify the additional effort and investment. Whilst Groll et al. did have the highest success rate (88%), this was equalled by Soccer Mad Boffins' Dr Alex G. Gillett. Meanwhile Goldman Sachs were 1 point behind in third place, whilst Dr Kevin D. Tennent was fourth with 12 points (75% accuracy), only 2 correct predictions behind Gillett and Groll et al.

The random factor of the dice was the least successful predictor, being just 44% accurate (7 correct predictions), although this is higher than we thought it might have been.

Analysis Y: Group Table Finishing Position Predictions

In Analysis Y we went a few steps further and explored how accurately each ‘participant’ predicted the correct final position in the Group Tables of every team in the tournament. We awarded 1 point for each team positioned correctly (4 teams per group x 8 groups = 32 teams/a maximum of 32 points).
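The stricter Analysis Y rule can be sketched the same way. Here we use the Group C tables from the appendix; `position_points` is a hypothetical helper name of our own, not something taken from either report.

```python
# Actual final Group C table, top to bottom (taken from the appendix).
actual = ["France", "Denmark", "Peru", "Australia"]

# Predicted Group C tables, top to bottom (taken from the appendix).
predictions = {
    "Alex": ["France", "Denmark", "Australia", "Peru"],
    "Kevin": ["France", "Denmark", "Australia", "Peru"],
    "Dice": ["Denmark", "Australia", "France", "Peru"],
    "Goldman Sachs": ["France", "Denmark", "Peru", "Australia"],
    "Groll et al.": ["France", "Denmark", "Australia", "Peru"],
}


def position_points(predicted, actual):
    """Hypothetical helper: 1 point for every team placed in exactly
    the position where it really finished."""
    return sum(p == a for p, a in zip(predicted, actual))


group_c_points = {
    name: position_points(table, actual) for name, table in predictions.items()
}
print(group_c_points)
```

Note how unforgiving this rule is: swapping just two adjacent teams costs two points at once, because both end up in the wrong slot.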

	Alex	Kevin	Dice	Goldman Sachs	Groll et al.
Group A	2	2	0	2	4
Group B	2	0	1	2	2
Group C	2	2	0	4	2
Group D	0	1	1	0	0
Group E	2	2	2	4	2
Group F	1	0	2	1	0
Group G	4	4	1	4	2
Group H	2	0	1	1	2
Total Points	15	11	8	18	14
Success Rate	47%	34%	25%	56%	44%

In the more complex scenario investigated in Analysis Y, none of the 'participants' got much more than half of their predictions right, reflecting the difficulty of predicting soccer. Whilst Goldman Sachs' method did prove most successful (56% accuracy, with 18 points), this was only 3 more correct predictions than Soccer Mad Boffins' Dr Alex G. Gillett, who correctly predicted 15 teams' final Group Stage positions (47% accuracy) to beat Groll et al. by a single point (14 correct predictions = 44% accuracy).

Again, the dice were the least effective predictor, with a 25% accuracy of prediction, getting just 8 predictions correct, although this '1-in-4' success rate is still quite good and possibly better than some TV pundits!
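As a rough yardstick for the dice, we can work out what pure chance would be expected to score. The sketch below assumes (and it is only an assumption about how dice-generated scorelines behave) that the dice are equally likely to produce any of the 24 possible orderings of a four-team group. The team labels are placeholders.

```python
from fractions import Fraction
from itertools import permutations

# Placeholder labels for one group's actual final table, top to bottom.
actual = ("T1", "T2", "T3", "T4")

# Assumption: every ordering of the four teams is equally likely.
orderings = list(permutations(actual))

# Analysis Y rule: count teams placed in exactly the right position.
exact_total = sum(
    sum(p == a for p, a in zip(order, actual)) for order in orderings
)

# Analysis X rule: count correct qualifiers, ignoring their order.
top_two_total = sum(
    len(set(order[:2]) & set(actual[:2])) for order in orderings
)

expected_exact = Fraction(exact_total, len(orderings))      # per group
expected_top_two = Fraction(top_two_total, len(orderings))  # per group

# Scale up to 8 groups and divide by the maximum available points.
baseline_y = expected_exact * 8 / 32
baseline_x = expected_top_two * 8 / 16

print(f"Analysis X chance baseline: {float(baseline_x):.0%}")
print(f"Analysis Y chance baseline: {float(baseline_y):.0%}")
```

Under that assumption, chance alone is worth one point per group in either analysis, so the dice's scores are roughly what random guessing would be expected to achieve.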

Conclusion and Discussion

This article, much like the reports from Goldman Sachs and Groll et al., is as much for fun as anything else.

So what did we find out? Well, that is a good question. With reference to the provocative question we raise in the title of this article, 'Human Brain or Advanced Statistical Modeling?', there is clearly a case for the application of sophisticated statistics and global economics type research techniques when trying to predict the World Cup, but is it worth the time and effort?

We have shown that when simply predicting which teams will progress out of the Group Stages (a sort of 'each way bet') we were as effective as the researchers 'scientifically' applying economic theory and advanced statistical methods to large data sets. Remember, our own methods involved using our working knowledge of the sport and browsing FIFA rankings and a couple of other things, all easily accessible simply by walking to your newsagent. The caveat is of course that we research and write about football as part of our jobs, but many people follow and consume the game just as closely.

When applied to the more nuanced scenario of exact finishing positions within each group, the Goldman Sachs analysis did yield more accurate predictions, but the analysis of Groll et al. was slightly less accurate than Soccer Mad Boffins' Dr Alex G. Gillett.

And let's not forget our previous blog post, in which we showed that our predictions, like those of Goldman Sachs and Groll et al., failed to predict some of the 'shock' results - for example all but the 'random dice' predictions identified Germany as the eventual World Cup winners! In contrast, the dice predicted a Mexico - Switzerland final with Switzerland ultimately lifting the Cup. So far more realistic than any of the informed predictions.

And there lies an important point. With the exception of the dice, the predictions of which teams would qualify from the group stages were quite impressive, ranging in accuracy from 75% to 88%. However, for the more sophisticated task of predicting actual finishing positions (which required a more accurate prediction of goals scored and conceded, and game outcomes) none of Soccer Mad Boffins, Goldman Sachs, or Groll et al. did particularly well - accuracy ranged from 34% to 56%.

Statistical modeling can be very useful in certain contexts, but some things are still very difficult to predict in that way. Science shouldn't put too much faith in the sums alone. Perhaps one lesson that our 'experiment' highlights is that the rationale behind doing things 'more scientifically' with statistical models is often to reduce the number of variables in a situation and to remove human irrationality from decision-making. But in some contexts perhaps a more complex array of variables and the acceptance of irrationality is necessary.

Perhaps this is because statistical models tend to assume rational behavior on the part of actors and ignore the role of agency. To critique the methodology of the Goldman Sachs and Groll et al. papers, their approaches perhaps assume too much that managers have perfect information about the abilities of their players and will act rationally to maximize utility - that is, they will know their best teams and tactics and always put them on the field. They also assume that players can maximize their own ability and always play to the best of it.

These assumptions are problematic when it comes to modeling football, which is a complex social system that relies on the interaction of two teams of 11 players plus coaches and officials! Managers may also use tactics which appear superficially sub-optimal, such as Gareth Southgate's controversial decision to rest Harry Kane, but which are intended to allow for long-term outcomes - thus managers may not play their best team in every match. Forecasting football therefore has similar dangers to forecasting other social systems where a high degree of agency is in play. Prediction techniques may be better suited to analyzing individual sports such as tennis or chess rather than team sports. (We encourage readers to see also the work of Kuper & Szymanski, who raise similar points about the unpredictability of soccer.)

So, why do these analyses, and who takes them seriously?

Well, as we already said, it is a bit of fun, it presents an opportunity to test knowledge and methods, and of course it is a fairly effective way to achieve a bit of publicity - as we have reported, we found the Goldman Sachs and Groll et al. documents via online news reports.

Perhaps the best advice then is the small print at the bottom of the front cover of the Goldman Sachs report: 'Investors should consider this report as only a single factor in making their investment decision'!

References

Gillett, A.G. and Tennent, K.D. (2018) 'World Cup Finals Group Stages are over...how were your predictions?', blogpost available online: <http://soccermadboffins.blogspot.com/2018/06/world-cup-finals-group-stages-are.html> [accessed 1st July 2018]

Goldman Sachs (2018) '2018: The World Cup and Economics', research report prepared by the Goldman Sachs Global Macro Research Department, available online: <http://www.goldmansachs.com/our-thinking/pages/world-cup-2018/multimedia/report.pdf> [accessed 1st July 2018]

Groll, A., Ley, C., Schauberger, G., and van Eetvelde, H. (2018) 'Prediction of the FIFA World Cup 2018 – A random forest approach with an emphasis on estimated team ability parameters', arXiv.org Open Access, available online <https://arxiv.org/pdf/1806.03208.pdf> [accessed 1st July 2018]

Kuper, S., & Szymanski, S. (2012). Soccernomics: Why England Loses, Why Spain, Germany, and Brazil Win, and Why the US, Japan, Australia, Turkey and Even Iraq Are Destined to Become the Kings of the World’s Most Popular Sport. Nation Books: New York.

APPENDIX: DATA ANALYSIS

Game X: 1 point for each team correctly predicted to qualify

GROUP A

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Uruguay	Uruguay	Uruguay	Saudi Arabia	Uruguay	Uruguay
Russia	Russia	Russia	Egypt	Saudi Arabia	Russia
Saudi Arabia	Egypt	Egypt	Uruguay	Russia	Saudi Arabia
Egypt	Saudi Arabia	Saudi Arabia	Russia	Egypt	Egypt

POINTS	2	2	0	1	2

GROUP B

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Spain	Portugal	Portugal	Morocco	Portugal	Spain
Portugal	Spain	Spain	Portugal	Spain	Portugal
Iran	Iran	Morocco	Spain	Iran	Morocco
Morocco	Morocco	Iran	Iran	Morocco	Iran

POINTS	2	2	1	2	2

GROUP C

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
France	France	France	Denmark	France	France
Denmark	Denmark	Denmark	Australia	Denmark	Denmark
Peru	Australia	Australia	France	Peru	Australia
Australia	Peru	Peru	Peru	Australia	Peru

POINTS	2	2	1	2	2

GROUP D

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Croatia	Argentina	Argentina	Argentina	Argentina	Argentina
Argentina	Croatia	Iceland	Nigeria	Croatia	Croatia
Nigeria	Iceland	Nigeria	Croatia	Iceland	Iceland
Iceland	Nigeria	Croatia	Iceland	Nigeria	Nigeria
POINTS	2	1	1	2	2

GROUP E

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Brazil	Brazil	Brazil	Costa Rica	Brazil	Brazil
Switzerland	Switzerland	Serbia	Switzerland	Switzerland	Switzerland
Serbia	Costa Rica	Switzerland	Serbia	Serbia	Costa Rica
Costa Rica	Serbia	Costa Rica	Brazil	Costa Rica	Serbia

POINTS	2	1	1	2	2

GROUP F

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Sweden	Germany	Germany	Germany	Germany	Germany
Mexico	Mexico	Sweden	Mexico	Mexico	Sweden
South Korea	Sweden	Mexico	South Korea	Sweden	Mexico
Germany	South Korea	South Korea	Sweden	South Korea	South Korea

POINTS	1	1	1	1	1

GROUP G

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Belgium	Belgium	Belgium	Tunisia	Belgium	Belgium
England	England	England	England	England	England
Tunisia	Tunisia	Tunisia	Belgium	Tunisia	Panama
Panama	Panama	Panama	Panama	Panama	Tunisia

POINTS	2	2	1	2	2

GROUP H

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Colombia	Colombia	Senegal	Poland	Colombia	Colombia
Japan	Poland	Colombia	Japan	Poland	Poland
Senegal	Senegal	Poland	Colombia	Japan	Senegal
Poland	Japan	Japan	Senegal	Senegal	Japan
POINTS	1	1	1	1	1

Game Y: 1 point for each correctly predicted position in final table

GROUP A

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Uruguay	Uruguay	Uruguay	Saudi Arabia	Uruguay	Uruguay
Russia	Russia	Russia	Egypt	Saudi Arabia	Russia
Saudi Arabia	Egypt	Egypt	Uruguay	Russia	Saudi Arabia
Egypt	Saudi Arabia	Saudi Arabia	Russia	Egypt	Egypt

POINTS	2	2	0	2	4

GROUP B

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Spain	Portugal	Portugal	Morocco	Portugal	Spain
Portugal	Spain	Spain	Portugal	Spain	Portugal
Iran	Iran	Morocco	Spain	Iran	Morocco
Morocco	Morocco	Iran	Iran	Morocco	Iran

POINTS	2	0	1	2	2

GROUP C

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
France	France	France	Denmark	France	France
Denmark	Denmark	Denmark	Australia	Denmark	Denmark
Peru	Australia	Australia	France	Peru	Australia
Australia	Peru	Peru	Peru	Australia	Peru

POINTS	2	2	0	4	2

GROUP D

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Croatia	Argentina	Argentina	Argentina	Argentina	Argentina
Argentina	Croatia	Iceland	Nigeria	Croatia	Croatia
Nigeria	Iceland	Nigeria	Croatia	Iceland	Iceland
Iceland	Nigeria	Croatia	Iceland	Nigeria	Nigeria

POINTS	0	1	1	0	0

GROUP E

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Brazil	Brazil	Brazil	Costa Rica	Brazil	Brazil
Switzerland	Switzerland	Serbia	Switzerland	Switzerland	Switzerland
Serbia	Costa Rica	Switzerland	Serbia	Serbia	Costa Rica
Costa Rica	Serbia	Costa Rica	Brazil	Costa Rica	Serbia

POINTS	2	2	2	4	2

GROUP F

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Sweden	Germany	Germany	Germany	Germany	Germany
Mexico	Mexico	Sweden	Mexico	Mexico	Sweden
South Korea	Sweden	Mexico	South Korea	Sweden	Mexico
Germany	South Korea	South Korea	Sweden	South Korea	South Korea

POINTS	1	0	2	1	0

GROUP G

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Belgium	Belgium	Belgium	Tunisia	Belgium	Belgium
England	England	England	England	England	England
Tunisia	Tunisia	Tunisia	Belgium	Tunisia	Panama
Panama	Panama	Panama	Panama	Panama	Tunisia

POINTS

GROUP H

Actual Final Table	AG	KT	Dice	Goldman Sachs	Groll et al
Colombia	Colombia	Senegal	Poland	Colombia	Colombia
Japan	Poland	Colombia	Japan	Poland	Poland
Senegal	Senegal	Poland	Colombia	Japan	Senegal
Poland	Japan	Japan	Senegal	Senegal	Japan
POINTS	2	0	1	1	2

Comments:

Mcqueen Amber 15 July 2018 at 21:43
Fifa Worldcup final won by France with the 4-2 score against Croatia in all important Fifa world cup 2018 Russia Final. Croatia played extremely well throughout the tournament but they remain unable to win final, this was the first time by when Croatia went into Final of Fifa Worldcup.

Monday, 2 July 2018
