#### The predictability of football

Sooner or later the question has to be asked: “Is football predictable?” The answer seems all but obvious, and one sometimes hears it from experts (football experts, mind you). It can only be: “No.” The reason given is just as clear: “How are you supposed to calculate that? It is impossible to know whether a ball bounces this way or that, whether one shot goes in and another hits the post, let alone where the players will move in detail, or how, when and where the referee will blow the whistle and what he will let go. No, it’s impossible.”

Nevertheless, predictability is at least a topic in many discussions. As soon as the Germans have (once again) been drawn into one of the (easiest) preliminary or qualifying groups, one thing is certain: “They can do it. They have to do it.” Or when Bayern meet some (arbitrary) opponent in the knockout phase of the Champions League, people speak of a “solvable task” or a “difficult draw”. In any case the odds keep people busy, and often enough a clear favourite is identified.

This means that everyone actually makes some kind of “calculation” in their head, some more, some less. At the same time the question arises what one could actually do with the knowledge if one had it. Isn’t it precisely the suspense of who will win that makes you want to watch? The uncertainty of the outcome of a football match (or any other match) is seen as an essential part of what draws people to the stadium or to the television screen. If the winner were always certain, people would know the outcome in advance, but stadium attendance and viewing figures would probably both tend towards zero. Who would still want to watch that? No, such a vision of the future makes one shudder rather than hope.

So it looks as if there are no reliable statements about the outcome, only assessments. And here we are gradually getting to the crucial point. A “reliable” prediction would mean rating an event at 100%. Brazil will beat Schwarz-Weiß Essen. That is certain. Surely something must be certain? Tiger Woods will win an 18-hole golf match against Dirk Paulsen? Pete Sampras will win a best-of-five tennis match against the same opponent?

This is exactly where it becomes clear who is a “serious” prophet. Anyone who predicts any future event at 100% is not serious. (Not to get too philosophical, let’s stay with sport; otherwise one could even ask whether it is certain that time will continue to run.) There must always be some imponderable, some tiny probability that stands against it. A small remark: we are not talking about manipulation here.

As soon as we accept that not every goal-scoring opportunity is automatically a goal, that a player from Schwarz-Weiß Essen plays with the same ball and aims at a goal of the same size, that he can shoot forward and on target, and that the ball can bounce in, be deflected, or slip into the net through a goalkeeper’s mistake or through a shot that is not merely a Sunday shot but a shot of the century, we know that we have to allow for a counter-chance. Tiger Woods or Pete Sampras might sprain an arm or miss every shot. How ridiculous this sounds is palpable. But this is not about the proverbial “painting the devil on the wall”; it is simply about pointing out that you should never forecast 100%. How small the opposing chance actually is may be comparable to the chance that a monkey composes Beethoven by stringing notes together at random, or that the same monkey beats Kasparov at chess knowing only the legal moves.

That distinguishing between chances of the order of 1:1000 (and smaller) is very difficult for every human being is readily apparent: they are so improbable that one does not bother with them. Another very important distinction, by the way, is whether we are talking about positive or negative events. The negative ones could confidently be called “fears”. You feel them almost permanently, and you even take out insurance against them so as to have to deal with them as little as possible. But to set off in earnest, get on a plane or into a car, and calculate the probability of actually reaching your destination? No, and it is not only superstition that forbids it. There is a chance against it, we know that. But to calculate it exactly? No, we don’t do that, and we couldn’t do it anyway.

On the positive side, things look quite different. You calculate the chance of winning the lottery (well, it is calculated for you), or you think about how to combine a good chance with a big win in the lottery; you play bingo or buy a lucky ticket at the fair. The chance is there, small, medium or a little bigger; often you don’t calculate but simply hope. This stands in contrast to how one treats the negative case, the occurrence of an unwanted event.

If by a stroke of luck you run into a good old acquaintance purely by chance, you have no basis for calculating that event either, but you find it so spectacular that you tell others: “Just imagine who I met today.” One doesn’t generally care to calculate such events, but one is glad to let this kind of fate come one’s way. You give chance a chance and welcome it.

So dealing with small probabilities is very difficult, as is estimating or comparing them. One should simply know that they are always there. This also applies to sporting events. Certainly, the organiser of a game, a sport or a sporting event will make sure that all participants play at a similar level if possible. Favourites may well be recognisable; that is even intended and is almost inevitably the case. But just think for a moment how many spectators an event would attract if all participants had exactly the same chances. Take a world championship in dice throwing: who throws the higher number, the whole thing as a knockout tournament. Who would want to watch that? Take part, maybe, why not, but watch and pay admission? No, you wouldn’t. There must be differences in performance, and the performances should generally be at the highest possible level. The underdogs may be a little weaker, but they should be able to give the favourites a tough match in which the neutral spectator can see a real chance of the underdog prevailing, and here and there it may even happen.

However, the organiser is happy to accept a few elements of luck; in some cases they have even been introduced deliberately. If the predictability becomes too great, the whole thing loses much of its appeal: “No, I won’t go there. The same person always wins anyway.” And that even though it doesn’t always (not 100%!) happen there either. (Where? In chess, for example?)

More concretely, and in terms of football, this means that football is of course not exactly predictable, but it is predictable to a certain extent. Even where one side is vastly superior, one has to concede an imponderable. In a football match, to stay with the German Bundesliga, you simply have to allow for larger, realistic counter-chances, even when Bayern play at home against Freiburg. That is what experience teaches. A Toto player may call such a game a “banker” (what else should he pick?), but that does not make it “safe”. The chance that the surprise happens and the favourite loses is somewhere in the single-digit percentage range, perhaps up to 6%. And that’s not even counting the draws.

To sum up: you know that nothing is certain, but you also know that the outcomes are not purely random. Often there is a recognisable favourite, but there are also the typical “three-way games” in which it is in principle completely open (equal chances) who will come out on top.

There are favourites and outsiders, sometimes strong favourites, sometimes rank outsiders; there are slight tendencies, that is, a narrow favourite; and there are evenly balanced games. All in all, you can sense certain differences, and that is almost the beginning of the secret. The question “Is football predictable?” cannot be answered with “yes” or “no”, only with a quantitative statement: it can be “calculated” within a certain framework, where the statements obtained merely assign different probabilities to the individual outcomes.

“Calculability”, which in colloquial use seems to suggest that one can find out who wins, thus refers to determining probabilities that are as exact as possible: here it’s 20%, there it’s 75%, for this or that event.

The reader will not only sit up and take notice at this point but will immediately object: “Haha, if you calculate probabilities, all that comes out is that this one or that one wins. I knew that beforehand too.” And above all, he adds: “I wouldn’t rely on that. What is 70% supposed to tell me anyway? Nah, an unreliable thing. No further use. It happens or it doesn’t, that I understand; anything else doesn’t fit into my world. Basta.”

If you look a little deeper and more closely, you will find the last and perhaps biggest problem: even if the 70% is correct, in this one case the “victory of the favourite” will either happen or not happen.

After that, however, the game is not repeated, so nothing can be done statistically. The mathematician, even an amateur one, would put it like this: “From a single run of a random experiment with unknown probability, nothing at all can be determined or stated, let alone the quality of the statement checked.”

This is correct insofar as one cannot say anything about the actual probability of occurrence with a single execution. However, one can check many probability estimates of a single forecaster in the long run. So it is not the quality of a single prediction that can be checked, but the quality of the person who made it in the long term. As I said, in the long term.

There is a mathematical procedure for this, which is presented repeatedly here. To what extent it will find mathematical recognition (it has already been discussed scientifically) is currently an open question, especially since it ultimately depends on what one can actually achieve with it. One has probability estimates for events that are fairly independent of each other (dependent, admittedly, insofar as the same teams play again and again, perhaps also against each other, but always under different conditions), and over a fairly long run one can check whether these estimates are contradictory or appear reasonable, and do so with a mathematical, or better, “statistical” figure.
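The text does not spell out the procedure at this point, so the following is only a minimal sketch of one common way to judge a forecaster over many matches: the multi-category Brier score. It is an assumption that this resembles the author's method; the function name `brier_score` and the sample forecasts are hypothetical and purely illustrative.

```python
from typing import Dict, Sequence

OUTCOMES = ("home", "draw", "away")


def brier_score(forecasts: Sequence[Dict[str, float]],
                results: Sequence[str]) -> float:
    """Mean multi-category Brier score over many matches.

    0.0 means every forecast put 100% on what happened;
    2.0 is the worst possible value. Lower is better.
    """
    if len(forecasts) != len(results) or not forecasts:
        raise ValueError("need one result per forecast, and at least one match")
    total = 0.0
    for probs, result in zip(forecasts, results):
        for outcome in OUTCOMES:
            happened = 1.0 if outcome == result else 0.0
            total += (probs[outcome] - happened) ** 2
    return total / len(forecasts)


# Hypothetical forecasts and results, invented for illustration only.
forecasts = [
    {"home": 0.70, "draw": 0.20, "away": 0.10},
    {"home": 0.40, "draw": 0.30, "away": 0.30},
    {"home": 0.55, "draw": 0.25, "away": 0.20},
]
results = ["home", "draw", "away"]

# A forecaster with no knowledge at all: one third on every outcome.
uniform = [{"home": 1 / 3, "draw": 1 / 3, "away": 1 / 3}] * len(results)

print("forecaster:", round(brier_score(forecasts, results), 4))
print("know-nothing baseline:", round(brier_score(uniform, results), 4))
```

A single match says almost nothing here; the comparison with the know-nothing baseline only becomes meaningful over a long series of forecasts.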

Now a statistical figure always has the same flaw: it, too, indicates nothing definitive. A fitting comparison is to determine the average body weight of the population and then pick a test person who deviates from it, perhaps even considerably: the mathematician cannot do much more than shrug his shoulders. “Yes, this person deviates considerably from the population mean. The probability of this was low, because the weight lies more than three standard deviations from the mean. But these things happen…”

So if you want to seriously test the value of these probabilities, calculated and verified as well as possible, you inevitably have to put them up for comparison on the betting market. This is the link between scientific work on the possible predictability of football and practical usability. In this respect, the two topics “the predictability of football” and “the betting market” should always be treated together.

There is an immediately obvious connection between the probability of an event and the odds offered on that event on the betting market. The betting market “knows”, intuitively or otherwise (this is explained later in the text), that there are more probable and less probable events. The connection is this: the lower the probability of occurrence, the higher the odds. The mathematical relationship, which is quite simple, is also explained, and a possible winning strategy (successful in practice) is presented.
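As a rough illustration of "the lower the probability, the higher the odds", here is a minimal sketch that assumes decimal odds and a market without any bookmaker margin, in which case fair odds are simply the reciprocal of the probability. Both assumptions, and the function names, are illustrative and not taken from the text; real market odds include a margin.

```python
def fair_decimal_odds(probability: float) -> float:
    """Fair decimal odds for an event, assuming no bookmaker margin."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / probability


def implied_probability(decimal_odds: float) -> float:
    """Probability implied by decimal odds, again ignoring any margin."""
    if decimal_odds < 1.0:
        raise ValueError("decimal odds cannot be below 1.0")
    return 1.0 / decimal_odds


# The two example probabilities mentioned earlier in the text.
for p in (0.20, 0.75):
    print(f"probability {p:.0%} -> fair odds {fair_decimal_odds(p):.2f}")
```

Under these assumptions a 20% event carries much higher odds than a 75% event, which is the inverse relationship the paragraph describes.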

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

EXCURSUS APPLICABLE ELSEWHERE

Before taking this idea further, it should first be pointed out that events are about occurrence or non-occurrence. This, however, would only cover the two-valued case: it happens or it does not. In a football match alone there are already three different outcomes. The probabilities of all outcomes of a random experiment must always add up to 1, which means something like: “I don’t know how it will turn out, only that it will turn out.” Strictly speaking, one would first have to show that football matches are random experiments at all. And even that contains a small trap: contrary to the hoped-for (but, once given, certainly unacceptable) answer “Yes, it is” to the question “Is football predictable?”, what has to be shown is precisely that football is a particular kind of random experiment, one whose outcomes are not equally probable (unlike the frequently cited die, whose six outcomes are considered equally probable when the die is fair), are indeterminate, and yet are determinable within a certain range.

For events with several different outcomes, one can state a probability for each individual outcome, but one can also group several outcomes together. For example, whether the game ends 0:0 or 1:1 makes no difference in terms of points: both are draws. A 1:0 and a 4:2 differ somewhat more, since goal difference often plays a role, but both are victories for team 1. These are groupings. You can also pick out a single outcome and examine it: “Will the game end 4:0?”, and set against it the chance “Will the game NOT end 4:0?” So you can determine very different probabilities of occurrence and combine some of them again; you can always reduce the question to two cases (will the outcome under examination happen or not?) or leave all outcomes in their full complexity. Note that the probabilities of all possible outcomes in the event space considered for the investigation sum to 1. This always means: “One of the specified events will occur.” Any remainder that one has no knowledge of, or does not wish to consider individually, falls under “all other possibilities”, “the entire remainder”.
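The grouping of outcomes described above can be sketched in a few lines. The score probabilities below are hypothetical numbers chosen only for illustration, and `prob_any` is an illustrative helper name; the point is merely that grouped outcomes add up and that every partition of the event space sums to 1.

```python
import math

# Hypothetical probabilities for a few exact scores (team 1 : team 2).
# These values are invented for illustration, not measured.
score_probs = {
    (0, 0): 0.08,
    (1, 1): 0.12,
    (1, 0): 0.11,
    (4, 2): 0.01,
    (4, 0): 0.02,
}

# Everything not listed individually falls under "all other possibilities".
remainder = 1.0 - sum(score_probs.values())


def prob_any(scores):
    """Probability that the game ends with any one of the listed scores."""
    return sum(score_probs[score] for score in scores)


p_00_or_11 = prob_any([(0, 0), (1, 1)])  # two of the possible draws, grouped
p_40 = prob_any([(4, 0)])                # "Will the game end 4:0?"
p_not_40 = 1.0 - p_40                    # "Will the game NOT end 4:0?"

# Both ways of carving up the event space add up to 1.
assert math.isclose(sum(score_probs.values()) + remainder, 1.0)
assert math.isclose(p_40 + p_not_40, 1.0)

print(f"0:0 or 1:1: {p_00_or_11:.2f}")
print(f"4:0: {p_40:.2f}, not 4:0: {p_not_40:.2f}")
print(f"all other results: {remainder:.2f}")
```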

After this little digression, continuing the previous thought should be even easier. In football, the focus should first be on the three possible outcomes of a game, win, draw and defeat, whose probabilities add up to 100%. If one had no idea about favourites and underdogs, nor about how many goals might be scored, and could not even watch the game in question, then for lack of an alternative one could only say: 1/3 of the time team 1 wins, 1/3 of the time the game ends in a draw, 1/3 of the time team 1 loses. This would be comparable to the prediction in a pure Laplace experiment: all outcomes are equally likely.

Certainly, if one devotes more effort to the task and waits for some statistics, or is allowed to use old ones, one could work out that in Germany’s First League, for example, the share of draws was “only” 29% for many years. On a purely statistical basis, without any further knowledge, that would already change the forecast to 35.5% win, 29% draw, 35.5% defeat.
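The arithmetic behind these two baseline forecasts can be written down as a small sketch. It assumes, as the text does, that nothing is known about which team is stronger, so whatever is not assigned to the draw is split equally between win and defeat. The function name `baseline_forecast` is hypothetical.

```python
from typing import Dict, Optional


def baseline_forecast(draw_rate: Optional[float] = None) -> Dict[str, float]:
    """Forecast for team 1 without any knowledge about the two teams.

    With no draw rate given, all three outcomes get 1/3 (the Laplace case).
    With a known long-run draw rate, the rest is split equally between
    win and defeat, since nothing favours either side.
    """
    if draw_rate is None:
        return {"win": 1 / 3, "draw": 1 / 3, "defeat": 1 / 3}
    if not 0.0 <= draw_rate <= 1.0:
        raise ValueError("draw_rate must be between 0 and 1")
    side = (1.0 - draw_rate) / 2.0
    return {"win": side, "draw": draw_rate, "defeat": side}


print(baseline_forecast())      # the pure Laplace forecast
print(baseline_forecast(0.29))  # using the 29% draw share from the text
```

With a draw share of 29%, the remaining 71% splits into 35.5% for each side, matching the figures in the paragraph above.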

The concept to be derived here is that of commitment. How far can one commit to a favourite in such a forecast? Put more generally and more precisely: how far can I commit to a favoured outcome? Everyone senses that there is a favourite. Sometimes it is singled out in the pre-match coverage, but how large the differences are is rarely mentioned. One favourite has a 55% chance of winning, another 73%.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx