The idea of making football “calculable” thus came to the author as early as his childhood. In the early 1980s it became more and more concrete, thanks to access to the Harris mainframe computer at the Free University of Berlin and to gradually acquired programming skills. The first attempts at forecasting were made.
When he acquired his own PC in 1988, the project continued at home. Just in time for the 1990 World Cup, the first executable programme was able to calculate odds for all possible events of the tournament.
What were the basics, what was the programme based on, what could it do, what were the algorithms? Well, the first thing to do was to come up with sensible rankings based on what football is all about: goals. These are divided into goals scored and goals conceded. If you want to make a playing strength number out of these two values, then it has to be the expected goals to be scored versus the expected goals to be conceded.
Three problems result from this basic idea: a) how do you calculate the expected goals for an upcoming match, even if these two values are known for both teams? b) How do you determine the values in the first place? And c) how do you maintain and adjust the values based on the results that then occur?
The bigger problems here are certainly b), the initialisation of the numbers, and c), the speed of adjustment. Problem a) is a purely mathematical one and was therefore easy to solve: the goal average of the event (the league or, as in the first case, the World Cup) is included in the calculation, and the values are packed into a handy but conclusive formula. More about this elsewhere, I’m sure.
So if you had playing strengths for the teams in terms of expected goals scored and expected goals conceded (a measure of the offensive and defensive strength of both teams, so to speak), then you could calculate goal expectations FOR AN UPCOMING MATCH. Once you have calculated these, you can use them to determine probabilities for the possible match outcomes 1 – X – 2, which was initially done via a simulation.
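The text does not give the actual formula, so the following is only a minimal sketch of one common way to combine the three ingredients it names: a team's expected goals scored, the opponent's expected goals conceded, and the goal average of the competition. The multiplicative form, the function name `expected_goals` and all numbers are assumptions for illustration, not the author's formula.

```python
def expected_goals(attack, opp_defence, league_avg):
    """Goal expectation for one team in one match (assumed multiplicative form).

    attack      -- goals the team is expected to score against an average opponent
    opp_defence -- goals the opponent is expected to concede against an average team
    league_avg  -- average goals per team per match in the competition
    """
    if league_avg <= 0:
        raise ValueError("league_avg must be positive")
    # An average attack meeting an average defence yields exactly league_avg.
    return attack * opp_defence / league_avg


# Hypothetical playing strengths, for illustration only.
league_avg = 1.5
team1 = {"scored": 1.8, "conceded": 1.2}
team2 = {"scored": 1.5, "conceded": 1.5}

team1_xg = expected_goals(team1["scored"], team2["conceded"], league_avg)
team2_xg = expected_goals(team2["scored"], team1["conceded"], league_avg)
```

With this form, a team facing a defence that concedes exactly the competition average is simply expected to score its own strength value.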
But how do you get playing strengths for the two teams? Well, one could initialise all teams with an average value and then let the programme do its job with the help of the correct update parameters (the problem described under c: how does one optimally adjust the playing strengths on the basis of the real results?), until one perhaps ends up with reasonable values after a few years. To do this, one could even take the data from previous years, perhaps 10 years back, feed it to the programme and let it determine the best possible, practically usable values. This way would be feasible in any case, but for practical reasons another way was chosen.
Just to add: ranking the teams according to the goals scored and goals conceded over the years, divided by the number of matches, is the worse way and, strictly speaking, an unsuitable one. After all, the teams have all had very different opponents. This is especially true for countries (realistic for the World Cup) that are far apart and don’t play each other at all. Asian or African teams, for example, could not be ranked reasonably, at least not on the basis of these values.
For the World Cup, the classification was largely intuitive. It was a first attempt (which did not produce bad results at all), and the programme was then gradually applied to everyday league play. It is optimally suited to league play anyway, since you often have a few constants (teams) over the years and everyone reliably plays everyone else, even twice.
So: at some point you have good estimates for all teams (on these pages you would find them under “Spielstärkerangliste”), and you can adjust the teams well on the basis of the real results that occur. These changes are indispensable, not as a toy to bring the ranking to life, but because things change constantly and everywhere. Since what is at stake here are assumptions that are as realistic as possible, resulting in forecasts that are as good as possible, the adjustment has to happen.
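The update rule itself is not spelled out here. As a minimal sketch of what such an adjustment could look like, the following moves a strength value a small step towards what the actual result suggests. The linear rule, the name `update_strength` and the parameter `rate` (standing in for the “speed of adjustment” of problem c) are assumptions for illustration only.

```python
def update_strength(old_value, expected, actual, rate=0.05):
    """Nudge a strength value towards the observed result (assumed linear rule).

    old_value -- current expected goals scored (or conceded) of the team
    expected  -- goal expectation the programme had for this match
    actual    -- goals actually scored (or conceded) in this match
    rate      -- hypothetical speed of adjustment; 0 means "never change"
    """
    # Strength values are goal counts, so they must not drop below zero.
    return max(0.0, old_value + rate * (actual - expected))


# Hypothetical example: a team expected to score 1.8 goals scores 3.
new_attack = update_strength(1.8, expected=1.8, actual=3)
```

A small `rate` keeps the ranking stable but reacts slowly to real changes, while a large one chases every single result; finding the right balance is exactly the difficulty named under c).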
Now you have goal expectations for the individual upcoming games. In the first version, the games were simulated by distributing the expected goals over the 90 minutes. There was a goal chance virtually every minute, one for each team (not unrealistic), which was converted with a probability of team 1’s goal expectation divided by 90 for team 1 and team 2’s goal expectation divided by 90 for team 2.
However, since it had long been observed that teams fight much less as long as the score is level, the values were dampened while the match was drawn and increased accordingly when one team or the other was leading. This resulted in the first quite good figures in the sense of a forecast.
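The minute-by-minute simulation described above can be sketched as follows. The per-minute conversion chance (goal expectation divided by 90) comes from the text; the size of the dampening and the increase is not given, so `draw_factor` and `lead_factor` and their values are made-up placeholders, as are the function names.

```python
import random


def simulate_match(xg1, xg2, rng, draw_factor=0.9, lead_factor=1.1, minutes=90):
    """Simulate one match minute by minute and return the final score."""
    g1 = g2 = 0
    for _ in range(minutes):
        # Dampen the chances while the score is level, raise them otherwise.
        factor = draw_factor if g1 == g2 else lead_factor
        # One goal chance per team per minute.
        if rng.random() < factor * xg1 / minutes:
            g1 += 1
        if rng.random() < factor * xg2 / minutes:
            g2 += 1
    return g1, g2


def outcome_probabilities(xg1, xg2, runs=5000, seed=0):
    """Estimate the 1 - X - 2 probabilities from repeated simulations."""
    rng = random.Random(seed)
    counts = {"1": 0, "X": 0, "2": 0}
    for _ in range(runs):
        g1, g2 = simulate_match(xg1, xg2, rng)
        key = "1" if g1 > g2 else "2" if g2 > g1 else "X"
        counts[key] += 1
    return {key: n / runs for key, n in counts.items()}


# Hypothetical goal expectations for one match.
probs = outcome_probabilities(1.8, 1.2)
```

In practice the two factors would have to be calibrated so that the average number of simulated goals still matches the goal expectations; the sketch does not attempt that.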
Forecasts, by the way, should not be misunderstood. A forecast merely indicates the most realistic possible distribution of chances; it is never a “prediction” that this or that team will win a match, a tournament or a championship.
For the long-term bets (i.e. the questions of who will be champion, who will be relegated or who will get into the Champions League), a simulation of ALL OUTSTANDING FIXTURES is carried out and repeated (for the figures available here there are currently always 5000 simulation runs), and the final table is then formed on the basis of these simulated results. Whoever then finishes 1st is champion for this run, whoever finishes 17th and 18th is directly relegated, and so on.
So much, for now, about the programme used. You will find further short notes under the individual points, exactly where the numbers are published.
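The repeated simulation of the remaining fixtures can be sketched like this. Only the overall procedure (simulate every outstanding match, build the final table, repeat 5000 times) comes from the text. The data layout, the function names, three points for a win, and the tie-break by simulated goal difference and then by lot are simplifying assumptions; the match sampler is also a plain version without the score-dependent adjustment.

```python
import random
from collections import defaultdict


def simulate_match(xg1, xg2, rng, minutes=90):
    """Plain minute-by-minute sampler: one goal chance per team per minute."""
    g1 = sum(rng.random() < xg1 / minutes for _ in range(minutes))
    g2 = sum(rng.random() < xg2 / minutes for _ in range(minutes))
    return g1, g2


def simulate_season(points, fixtures, runs=5000, seed=0):
    """Return, per team, the share of runs in which it finished in each position.

    points   -- dict: team -> points already won
    fixtures -- list of (home, away, home_xg, away_xg) for outstanding matches
    """
    rng = random.Random(seed)
    position_counts = {team: defaultdict(int) for team in points}
    for _ in range(runs):
        table = dict(points)
        goal_diff = {team: 0 for team in points}  # simulated matches only
        for home, away, home_xg, away_xg in fixtures:
            g_home, g_away = simulate_match(home_xg, away_xg, rng)
            goal_diff[home] += g_home - g_away
            goal_diff[away] += g_away - g_home
            if g_home > g_away:
                table[home] += 3  # assumption: three points for a win
            elif g_away > g_home:
                table[away] += 3
            else:
                table[home] += 1
                table[away] += 1
        # Rank by points, then goal difference, then by lot.
        ranking = sorted(
            table,
            key=lambda t: (table[t], goal_diff[t], rng.random()),
            reverse=True,
        )
        for position, team in enumerate(ranking, start=1):
            position_counts[team][position] += 1
    return {
        team: {pos: n / runs for pos, n in counts.items()}
        for team, counts in position_counts.items()
    }


# Hypothetical three-team mini league with its outstanding fixtures.
standings = {"A": 6, "B": 4, "C": 1}
remaining = [("A", "B", 1.6, 1.1), ("B", "C", 1.8, 0.9), ("C", "A", 1.0, 1.7)]
result = simulate_season(standings, remaining, runs=1000)
title_chance_a = result["A"].get(1, 0.0)
```

The share of runs in which a team finishes 1st is its estimated title chance; relegation or Champions League chances follow in the same way by summing the shares of the relevant positions.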