The “calculation” of a football match

As soon as the term “calculation” appears, especially in connection with a football match, the reader’s doubt about whether it is worth reading on is likely to peak. On the one hand, many people prefer to run away as soon as they hear the word “calculate”; on the other, it is supposedly an established fact that in football there is “nothing to calculate”.

For that reason, a very cautious, simple, but all the more illustrative route is chosen here to show that the possibility exists nonetheless. First, however, a few preliminary questions must be settled.

1) The goals

Not many words are needed about the goal of the calculation. It should at least be mentioned that, besides being exciting and scientifically interesting in its own right, the results could be used on the betting market – and the better the quality of the calculation, the better the financial results in the long run. Even if it should not be enough to live on – and no such illusion should be created here – one can either enjoy cheap and pleasant additional entertainment or even break even or come out slightly ahead on the betting market.

2) What should be calculated?

What one wants to “calculate” is also of great interest in this context and must be clarified first. Even if it is a small disappointment, it must be said that the aim is by no means the naive idea, which the word may suggest, of predicting or “calculating” the outcome of a game, the next world champion or the next German champion. Only the probabilities of the individual outcomes can be determined, as accurately as possible, in their distribution.

3) Is it possible to check the quality of the calculated numbers?

This immediately raises the question of whether probabilities, once determined, can be put to any scientific use at all. For that, one would at least have to offer a method for testing the quality of the numbers. After all, everyone senses immediately that a probability allows for both the occurrence and the non-occurrence of an event – even at a ratio of 90:10 or 99:1. It can still “go wrong” or, put more neutrally, happen or not happen. This is remarkable because the person who would be “blamed”, i.e. the calculator, when told “Hey, you said it would happen 99% of the time. But it didn’t”, could simply shrug his shoulders and say: “Well, the one per cent just happened. That’s why I wrote it down. So what?”

If one pursues this a little further – or later, in the clarifying text – one realises that he should actually go further and say: “It’s fortunate that it happened. Because if the one per cent never happened, the event would not have had a whole per cent at all, but much less. In this respect, the occurrence even confirms my prediction.” The shoulder shrugging will then probably change sides. “Huh?” Either the reader finds this understandable – or he gratefully accepts (and later reads) the reference to the later clarification.

In any case, a method exists to check these numbers. People also like to joke that, as soon as one enters the betting market with the numbers, there is a much simpler method of checking their quality: counting the remaining money. If it is all gone, one has a fairly reliable answer…

4) How to “calculate”?

In order to keep the mathematics in the background as long as possible – and to keep the reader on board – the method used is derived and explained here without mentioning numbers, let alone formulas.

Imagine, then, that you simply want to simulate a football match, and naturally as “realistically” as possible, whatever that may mean. The first consequence would be that one has produced a match result, a random and almost arbitrary one: Werder Bremen – VfB Stuttgart 2:1. Great. Is that how it is supposed to end? I could just as well have thrown dice! Still, one should at least be able to confirm this much: “Yes, it could end like this.” (It would be different if the result were 13:8.)

The first prerequisite, then, would be to make the simulation “realistic”. Before that, assuming such a “function” exists, one should bear in mind how it differs from reality: in reality, the match is played once under the given conditions. This is in contrast to the throw of a die, the fall of the roulette ball or the drawing of the lottery numbers, where one is fairly sure of knowing the probabilities well and where, because the event can be repeated, one can also compile statistics for verification, for fun or out of doubt. This is a very important aspect.

If you now consider this difference, a realistic simulation would make a football match repeatable – with exactly the same distribution of chances every time it is run, of course – and it soon becomes apparent what advantage this offers: if the match were played 1000 times in this way, a reasonable ratio of wins for Werder, draws and wins for Stuttgart could surely be expected to emerge. The prerequisite remains: it must be “as realistic as possible” – and this requirement may well become the decisive hurdle. But first of all, if the function existed and were repeated 1000 times, one would obtain a result in the form of relative frequencies (don’t be alarmed!), which would provide a reasonable percentage estimate for the individual match outcomes.

And not only that. One could even read off the probabilities for the 2:1 victory, the 0:3 defeat or the (unpleasant) 0:0, since these scorelines would also occur with approximately the expected frequency, given a sufficiently high number of runs.
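The counting described here can be sketched in a few lines of Python. The `simulate_match` function is only a stand-in: the 90 possessions per team and the conversion chances of 0.015 and 0.012 are numbers assumed for illustration, not the author's values, and any simulator that returns a score could be plugged in instead.

```python
import random
from collections import Counter


def simulate_match(rng, p_home=0.015, p_away=0.012, possessions=90):
    """Stand-in simulator (assumed numbers): every possession is a small goal chance."""
    home = sum(rng.random() < p_home for _ in range(possessions))
    away = sum(rng.random() < p_away for _ in range(possessions))
    return home, away


def relative_frequencies(runs, seed=0):
    """Repeat the match `runs` times; return outcome and scoreline frequencies."""
    rng = random.Random(seed)
    scores = Counter(simulate_match(rng) for _ in range(runs))
    outcomes = Counter()
    for (home, away), count in scores.items():
        if home > away:
            outcomes["home win"] += count
        elif home == away:
            outcomes["draw"] += count
        else:
            outcomes["away win"] += count
    outcome_freq = {k: outcomes[k] / runs for k in ("home win", "draw", "away win")}
    score_freq = {score: count / runs for score, count in scores.items()}
    return outcome_freq, score_freq


if __name__ == "__main__":
    outcome_freq, score_freq = relative_frequencies(1000)
    print(outcome_freq)
    print("2:1 ->", score_freq.get((2, 1), 0.0))
```

With more runs the frequencies fluctuate less from one seed to the next, which is the sense in which a "sufficiently high number of runs" is needed.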

Thinking one step further, the simulation can be run for every match in a season, so entire seasons can be simulated – and, by repeating the process, one gets reasonable estimates of how likely it is that all the experts will be right and Bayern will win the title. (A simulation carried out on 10.9.2010 showed “only” 40.2% for this event; however, this was before the third matchday of the 2010/2011 season, when Bayern had already lost a match against Kaiserslautern, which of course costs a few percentage points.)
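A season simulation of this kind might look roughly like the following sketch. Everything concrete in it is an assumption made for illustration: the four-team league, the team names, the conversion chance per possession, the 90 possessions per team, and the tie-break by goal difference and then by lot. It also ignores the opponent's defence and any home advantage.

```python
import random
from collections import Counter
from itertools import permutations

POSSESSIONS_PER_TEAM = 90  # assumed number of possessions per team and match

# Hypothetical teams with a made-up conversion chance per possession.
TEAMS = {"Team A": 0.020, "Team B": 0.016, "Team C": 0.014, "Team D": 0.012}


def simulate_match(p_home, p_away, rng):
    """Each possession turns into a goal with the team's conversion chance."""
    home = sum(rng.random() < p_home for _ in range(POSSESSIONS_PER_TEAM))
    away = sum(rng.random() < p_away for _ in range(POSSESSIONS_PER_TEAM))
    return home, away


def simulate_season(teams, rng):
    """Play every home/away pairing once and return the champion."""
    points = dict.fromkeys(teams, 0)
    goal_diff = dict.fromkeys(teams, 0)
    for home, away in permutations(teams, 2):
        hg, ag = simulate_match(teams[home], teams[away], rng)
        goal_diff[home] += hg - ag
        goal_diff[away] += ag - hg
        if hg > ag:
            points[home] += 3
        elif hg < ag:
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    # Rank by points, then goal difference; remaining ties are broken at random.
    return max(teams, key=lambda t: (points[t], goal_diff[t], rng.random()))


def title_probabilities(teams, seasons, seed=0):
    """Relative frequency with which each team finishes first."""
    rng = random.Random(seed)
    champions = Counter(simulate_season(teams, rng) for _ in range(seasons))
    return {team: champions[team] / seasons for team in teams}


if __name__ == "__main__":
    print(title_probabilities(TEAMS, seasons=1000))
```

To estimate title chances partway through a season, as in the 10.9.2010 example, the points already won would be kept fixed and only the remaining fixtures simulated.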

All in all, simulation is the crucial approach. What one wants to find out is clear. What one could do with it is also clear. How it could be tested will be explained in more detail later and must be taken on trust for now (the money-counting method was a suggestion too, wasn’t it?). How to approach it is now clear as well. The only hurdle left is how to do it “as realistically as possible”, and that hurdle is admittedly a decisive one.

To achieve a simulation that is “as realistic as possible”, further preconditions must be met. One question is how to simulate in the first place. In itself this is quite simple: each team gets a chance to score every now and then and, following the random principle, occasionally converts one. All that then needs to be determined is how many chances each team gets and how large the probability of converting a single one is. The version of the simulation presented here assumes that practically every possession represents a kind of scoring chance. That is surely the case, since the ball would only have to be passed from teammate to teammate, all the way to the goal, and then put away. The chance exists, even if it is very small.

Another approach would be to generate only “real” scoring opportunities, as counted by the sports magazine “Kicker”: in effect shots on goal that occur with a certain frequency and then, depending on playing strength, lead to goals with given probabilities. The two approaches differ little, since the one chosen here merely includes the probability of turning a possession into a scoring opportunity. If the chance per possession is (realistically) small enough, just as many goals are scored as with the other method.
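The equivalence of the two approaches can be checked with a little arithmetic. In the sketch below all numbers are hypothetical: 90 possessions per team, a 10% chance that a possession becomes a "real" opportunity, and a 15% chance that such an opportunity is converted. The per-possession goal chance of the first approach is then simply the product of the two.

```python
def expected_goals_per_possession(possessions, p_goal):
    """Approach 1: every possession is a direct, very small chance to score."""
    return possessions * p_goal


def expected_goals_via_chances(possessions, p_chance, p_convert):
    """Approach 2: a possession first becomes a real chance, which is then converted."""
    expected_chances = possessions * p_chance
    return expected_chances * p_convert


if __name__ == "__main__":
    possessions = 90   # assumed possessions per team
    p_chance = 0.10    # hypothetical: possession -> real scoring opportunity
    p_convert = 0.15   # hypothetical: opportunity -> goal
    p_goal = p_chance * p_convert  # matching per-possession goal chance

    print(expected_goals_per_possession(possessions, p_goal))
    print(expected_goals_via_chances(possessions, p_chance, p_convert))
```

Both functions return the same expected number of goals as long as the per-possession chance equals the product of the two stage probabilities.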

So the questions are: how many chances does each team get, and how large is the probability of converting one? How are realistic conditions created? To begin with, this can be reduced to the question of size alone. Although this part is not quite realistic in the sense of “this is how a football match goes”, it comes to much the same thing in practice. The model used assumes that possession changes twice per minute, so that both teams get their scoring chances equally often. Compared to reality, one would have to say that the favourite, the better team, often has more possession, but this would affect only the frequency of possession changes. That is logical: either one team has the ball or the other, so the number of possessions stays the same for both. Only the phases of possession would be longer on one side and shorter on the other. This is by no means a shortcoming. The only unrealistic part is assuming the same frequency of possession changes even when one team is clearly superior. Everything else is represented simply by the size of the chance to convert a possession. A heavy favourite has a high conversion percentage, the clear underdog a very low one.
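A minimal sketch of this possession model follows. It reads "twice per minute" over 90 minutes as 180 alternating possession phases, 90 per team, with no stoppage time; that reading and the two conversion chances in the usage example are assumptions for illustration.

```python
import random

MINUTES = 90
CHANGES_PER_MINUTE = 2  # from the text: possession changes twice per minute


def simulate_match(p_home, p_away, rng):
    """Alternate possessions; each one becomes a goal with the team's conversion chance."""
    home_goals = away_goals = 0
    for possession in range(MINUTES * CHANGES_PER_MINUTE):
        if possession % 2 == 0:
            # Even-numbered phases belong to the home team.
            if rng.random() < p_home:
                home_goals += 1
        else:
            # Odd-numbered phases belong to the away team.
            if rng.random() < p_away:
                away_goals += 1
    return home_goals, away_goals


if __name__ == "__main__":
    rng = random.Random(2010)
    # Hypothetical conversion chances: a favourite at home against an underdog.
    print(simulate_match(0.022, 0.009, rng))
```

Because both teams get the same number of possessions, the whole difference in strength sits in the two conversion chances.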

Another magic word is the “parameters” that determine the relative sizes of the two teams’ scoring chances in a specific match. Which parameters are responsible for this? Obviously they differ from team to team: every team has its own playing strength. But how are they defined, and how are their values determined?

The first subdivision made here is that each team is assigned an offensive strength and a defensive strength, each as a measurable number. A team’s offensive strength determines how well it converts its own chances, while its defensive strength determines how well it thwarts the opponent’s scoring chances. When two teams meet, the offensive and defensive strengths are offset against each other using a simple but very logical algorithm. The result is a “goal expectation” for each team in that specific match. These goal expectations directly determine how large the probability of conversion per possession is.
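The text does not spell out the offsetting algorithm, so the multiplicative rule below is a substitute chosen purely for illustration, not the author's method. It assumes that offensive strength is expressed as goals scored against an average defence, defensive strength as goals conceded against an average attack, and that a team has 90 possessions per match. The final step, dividing the goal expectation by the number of possessions, follows from the model itself.

```python
POSSESSIONS_PER_TEAM = 90  # assumed: 90 minutes, two alternating possession changes per minute


def goal_expectation(offence, opponent_defence, league_average):
    """Assumed rule: scale the attack by how leaky the opposing defence is.

    offence          -- goals the team scores against an average defence
    opponent_defence -- goals the opponent concedes against an average attack
    league_average   -- average goals per team and match
    """
    return offence * opponent_defence / league_average


def conversion_probability(expected_goals, possessions=POSSESSIONS_PER_TEAM):
    """Chance per possession that yields the given goal expectation on average."""
    return expected_goals / possessions


if __name__ == "__main__":
    # Hypothetical strengths for a single pairing.
    expected = goal_expectation(offence=1.8, opponent_defence=1.2, league_average=1.5)
    print(expected, conversion_probability(expected))
```

With this rule, an average attack against an average defence gets exactly the league average as its goal expectation, which is a useful sanity check for any offsetting rule.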

However, at least one further parameter must be taken into account. It is so evident that no model could do without it, but it adds unwelcome complexity to the concrete calculation that follows. It is therefore only mentioned here for the time being: the home advantage. It applies to every team, even if not to exactly the same degree. What is certain is that the chances always shift in favour of the home team, regardless of the basic playing strengths.