1) A little insight into the mental life of a player
2) Introducing the problem of “comparability of predictions”
Everything I have read or seen about this problem so far has unfortunately been very superficial and unhelpful. So: you have studied all the previous chapters assiduously and learned absolutely nothing new. Never mind; one way or another, you are now capable of fulfilling the basic profile of a profile player, er, I meant professional player. In all games you at least know what it takes to play with an advantage, and you are familiar with the ratio of the probability of occurrence to the payout ratio. Now we have to try to actually determine probabilities of occurrence. Suppose we approach the problem cautiously: we note down our probability estimates, but we do not necessarily want to bet yet; first we want to check whether our estimates are any good. Do you know how I started? I developed a computer programme that made the predictions for me. Note that “prediction” here is not a prediction in the classical sense: my predictions are merely estimates of the probability of occurrence. The “true probability” is not known at all. Does God know it?
As we have already seen, we cannot do much with these estimates. We have some values, then the predicted event is carried out exactly once under the given conditions, and some result comes out. If we bet on it, then of course we have a measure: do we have more money or less money than before? But is that representative? Certainly, if you bet regularly over the long term, the financial outcome tells you something.
Nevertheless, we may have predicted many events but bet on only one, and been unlucky in the process. So: is there a method with which we can check the quality of predictions in the long run? I won’t draw out the suspense: there is one, you guessed right. And the most amazing thing is that it does not yet exist in mathematics. The reason is obvious and has already been mentioned often enough: mathematicians give this whole subject a wide berth. There is no foothold, nothing provable. And even with my method, I have to admit, that is where it remains. But we have already seen above that even the statistician cannot manage a 100% statement. That is also the case here.
To illustrate this, I have chosen a somewhat simpler example. We try to predict the probability of rain on a certain day at a certain place. And I am sure you can confirm that this experiment is also only ever carried out once under the given conditions.
3) The quality of predictions
In the “prediction test” file, a number of independent events are assessed for their occurrence or probability of occurrence by a number of participants with different qualities or characteristics. Since the events are independent of each other and have different probabilities, it is not easy to check the quality of an individual prediction. In a sense, one cannot test the quality of a participant’s prediction for a single event at all; one can only check the quality of the predictor himself over a longer period of time.
Let’s move on to the practical example that makes the matter clearer: you try to predict the probability that it will rain at a certain time in a certain place. Now, I am sure you will agree with me that this probability cannot be determined exactly, unlike in the Laplace experiments so often used elsewhere (which, as indicated above, do not exist in practice).
For our experiment, we nevertheless assume the ideal case that we actually know this probability in the individual case (let’s say it is God-given). We are still skating on thin ice, because even if we know it in an individual case, it would still be difficult to check how good the individual prediction was.
I would like to illustrate this with an example:
Certainly, the sum of the hits over a certain period of time comes to mind as a possible check. So let’s assume that it rains on average on 10 days in a given month (i.e. 33% of the time). Then someone could make it easy for himself and write down a probability of 33% for every single day, and he would probably even get a good hit yield (should it actually rain on 10 days, the prediction would even be perfect in this sense). In fact, however, the forecast for every single day would be rather bad to very bad. Why? Quite simply: the true daily probabilities of rain over the 30 days have a completely different distribution than 33-33-33-33 etc., namely, for example, this: 90-20-10-45-22-16-88-12-5-22 etc. These are the probabilities that are unknown but assumed to be fixed for this example. They add up to 1000% over the 30 days, i.e. to 10 rainy days. With the flat average estimate one would commit a permanent error which, however, would have no effect on the total. So how can I prove this error? Apart from that, the average derived from previous years does not have to occur in this particular month. So if you know nothing at all and orientate yourself only on these statistics, you are inevitably at a huge disadvantage compared to the person who actually forecasts the weather, i.e. looks at whether clouds and low-pressure areas are coming in and draws conclusions from that. If the person who proceeds in this last-described way concludes that it is a low-rainfall month and puts only 7 rainy days into his set of forecasts, justifiably, he naturally has an even greater advantage. But even if the 10 days should turn out to be correct, the average player, who only ever forecasts the average, must be making a demonstrable mistake.
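A minimal Python sketch makes this flat-average error concrete. The first ten daily probabilities are the ones from the text; the remaining twenty are invented for illustration so that the month totals exactly 10 rainy days (an assumption of this sketch, not a value from the text):

```python
# Hypothetical "true" daily rain probabilities for a 30-day month.
# First ten values are from the text; the rest are invented so the sum is 10.
true_p = [0.90, 0.20, 0.10, 0.45, 0.22, 0.16, 0.88, 0.12, 0.05, 0.22,
          0.70, 0.30, 0.15, 0.60, 0.10, 0.05, 0.80, 0.25, 0.12, 0.40,
          0.55, 0.18, 0.08, 0.65, 0.20, 0.10, 0.75, 0.30, 0.14, 0.28]

# The "average player": the same flat value (10/30 = 33.3%) every day.
flat = [sum(true_p) / len(true_p)] * 30

# The monthly total of expected rainy days matches by construction ...
print(round(sum(flat), 2))  # → 10.0

# ... but the average daily error against the true probabilities is large.
per_day_error = sum(abs(f - t) for f, t in zip(flat, true_p)) / 30
print(round(per_day_error, 2))  # → 0.22
```

The totals agree, yet on an average day the flat forecast is off by about 22 percentage points, which is exactly the permanent error the text describes.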
For this purpose I have introduced the following, also illustrative, terms: the average expected probability (= the determination) and the probability of the occurred event. The average expected probability is simple to calculate, but perhaps a little harder to understand.
Here is an explanation. We are in the initial phase of forecasting and start keeping records. We introduce four simple columns, for example in Excel: the prediction in per cent that it will rain; the prediction in per cent that it will not rain (the counter-probability, i.e. 1 minus “it will rain”); in the third column we note whether it actually rained; and in the fourth column we note the probability of the event that occurred. Concretely: day 1, 20% it rains, 80% it does not rain; it actually rains, so 20% goes into column 4. Day 2: 60% it rains, 40% it does not rain; it rains again, so 60% goes into column 4. Day 3: 30% it rains, 70% it does not rain; it does not rain, so 70% goes into column 4. Now, we must not forget that we only have one value with which to check the quality of our prediction, namely the probability of occurrence that we ourselves assumed. We do not know the truth (i.e. the actual probability of occurrence). Nevertheless, or precisely because of this, I raise the question: what average value do we expect in the 4th column? What value would have to appear there if the assessment were correct? In the long term, of course.
Well, actually it is quite simple: on the first day we said 20-80, so we expect that 20% of the time a 20 will appear in column 4, and 80% of the time an 80. This corresponds to the basic formula for an expected value: 0.2 · 0.2 + 0.8 · 0.8 = 0.04 + 0.64 = 0.68.
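The three example days and the column-4 expectation can be written out in a small Python sketch (the day values are the ones from the text):

```python
# The four-column record from the text, for three example days.
# Columns: P(rain), P(no rain), did it actually rain?
records = [
    (0.20, 0.80, True),   # day 1: 20% rain predicted, it rained
    (0.60, 0.40, True),   # day 2: 60% rain predicted, it rained
    (0.30, 0.70, False),  # day 3: 30% rain predicted, it stayed dry
]

# Column 4: the probability we assigned to the event that occurred.
occurred = [p_rain if rained else p_dry for p_rain, p_dry, rained in records]
print(occurred)  # → [0.2, 0.6, 0.7]

# Expected value of column 4 for a single 20-80 day:
expected_day1 = 0.2 * 0.2 + 0.8 * 0.8
print(round(expected_day1, 2))  # → 0.68
```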
To test the correctness of this number, I will use an illustrative Laplace experiment: we roll a die 600 times and it happens by chance that the 6 is actually rolled exactly 100 times (which is nevertheless theoretically the most probable of a multitude of possible outcomes). Now, for simplicity’s sake and unrealistically (in reality we do not know this probability either, but we probably have a good approximation), we actually wrote down 16.66% and 83.33% for occurrence and non-occurrence every time.
In column 4, our result is easy to calculate: 100 · 0.1666 + 500 · 0.8333 = 433.33. This is the sum. Divided by the number of attempts to get the average: 433.33 / 600 = 0.7222. But what did we expect? Well, according to the above formula, per line: 0.1666 · 0.1666 + 0.8333 · 0.8333 = 1/6 · 1/6 + 5/6 · 5/6 = 26/36 = 0.7222. So the values are identical. 0.7222 is therefore the average expected probability, and if, knowing the truth, this knowledge should also prevail (which, according to certain mathematical laws, should be the case in the long run, i.e. the relative frequency approaches the probability of occurrence as closely as desired), then we achieve dazzling results: the average expected probability comes as close as desired to the average occurred probability.
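The dice check can be reproduced in a few lines of Python, using the exact fractions 1/6 and 5/6:

```python
# 600 rolls, the six appears exactly 100 times, and the forecaster
# writes down 1/6 vs 5/6 every single time.
p_six, p_other = 1 / 6, 5 / 6

# Column-4 sum: 100 rolls contribute p_six, 500 rolls contribute p_other.
occurred_sum = 100 * p_six + 500 * p_other
average_occurred = occurred_sum / 600

# Average expected probability per roll (the "determination"):
average_expected = p_six * p_six + p_other * p_other  # = 26/36

print(round(average_occurred, 4))  # → 0.7222
print(round(average_expected, 4))  # → 0.7222
```

The two averages coincide, which is exactly the identity the text derives by hand.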
Not knowing the actual probability, we now obtain results that still have to be interpreted. As a basis we only have our own assessment of the probabilities; we do not know the real ones. So if we predict 20-80, but in fact it is the other way round, 80-20, the following happens: we assume that the average expected probability is 0.68, which per our records is true even in this case. In fact, however, 80% of the time there will be a 20 in column 4, and 20% of the time an 80. We would therefore wrongly expect 0.68, whereas the connoisseur of probability would know that the expected value = 0.8 · 0.2 + 0.2 · 0.8 = 0.32. On a single event this has no visible effect, but with consecutive errors we would deviate considerably from our expected value in the long term. This is exactly what the “prediction test” attachment file expresses.
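A short simulation shows this long-run effect of miscalibration (a sketch; the sample size of 100,000 is an arbitrary choice for illustration):

```python
import random

random.seed(1)  # reproducible sketch

# We always write down 20-80, but the true rain probability is 80%.
pred_rain, pred_dry = 0.20, 0.80
true_rain = 0.80

n = 100_000
col4 = []
for _ in range(n):
    rained = random.random() < true_rain
    col4.append(pred_rain if rained else pred_dry)

average_occurred = sum(col4) / n

# What we wrongly expect from our own records:
average_expected = pred_rain ** 2 + pred_dry ** 2       # 0.68
# What the connoisseur of probability knows to expect:
true_expectation = true_rain * pred_rain + (1 - true_rain) * pred_dry  # 0.32

print(round(average_expected, 2))   # → 0.68
print(round(true_expectation, 2))   # → 0.32
print(round(average_occurred, 2))   # close to 0.32, far from 0.68
```

The realized column-4 average settles near 0.32, exposing the consistently wrong 20-80 forecast even though no single day could have revealed it.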
I would just like to briefly explain the term “determination”: if someone says (which one of the participants in the experiment does, although his result is highly boring) “I don’t know whether it will rain or not”, this person “does not commit”; he says 50% yes, 50% no, so to speak. His expected value is easy to calculate: 0.5 · 0.5 + 0.5 · 0.5 = 0.5. And he actually achieves this value. Colloquially, it is immediately obvious: he says “it will rain or it will not rain”, and in fact it rains or it does not rain (this reminds me of my dog Waldi, who obeys me to the letter: when I say “come here or not”, he comes here or not).
If the random experiment allows a determination at all (which is supposedly not the case with a coin toss, for example), i.e. the achievable average expected probability is really greater than 50%, then with the minimum determination (i.e. none at all) one can achieve an apparently good result (50 expected, 50 occurred). But if someone makes the possible determination (let’s say it is 58%, which corresponds for example to 30-70, i.e. 0.3 · 0.3 + 0.7 · 0.7 = 0.58) and actually reaches 58%, then he is obviously better than the 50-50 man. And even if he only reaches 57% or 59%, I would still interpret the result as “better”, because he has come close to the obviously possible determination. The aim of the game is to commit to a determination that is as close as possible to reality, and at the same time to get as close as possible to this value in the occurred average.
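The determination values used above follow one simple formula, sketched here as a small helper (the function name is my own, chosen for illustration):

```python
def determination(p):
    """Average expected probability for a constant p / (1 - p) forecast."""
    return p * p + (1 - p) * (1 - p)

print(round(determination(0.5), 2))  # → 0.5   (no commitment at all)
print(round(determination(0.3), 2))  # → 0.58  (the 30-70 forecaster)
```

Note that the value is minimal at p = 0.5 and grows the more strongly the forecaster commits, which is why a deviation from it is only meaningful relative to the commitment made.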
In the present experiment, several predictors have lined up at the start:
Unfortunately, the first one is immediately the best, because he knows the probabilities (this is, by the way, completely unrealistic and only meant for illustration). Surprisingly, however, even this person does not hit the mark exactly, but regularly shows a deviation in the sum of occurred minus expected probability. The reason: despite knowledge of the probabilities, it remains a random experiment.
The second participant is the clueless bore who takes no risks and simply says 50-50. The third tries to achieve a good result by always predicting 99% for whichever event is the favourite. Since this does not correspond to reality, he naturally shows a large deviation between expected and occurred average probability, but his average expected is nevertheless far above that of all the others. The size of the deviation, however, speaks volumes: he has committed himself too strongly.
The fourth is the biggest competitor for the perfect one. Although he regularly has a small deviation from reality, he hits it relatively well (the additional columns for this one only serve to determine the deviation, which had to be kept mathematically correct within a given range).
The last one keeps himself within a reasonable range, but since he only “guesses”, he too achieves only 50% average occurred probability, like the bore. But he is the braggart who pretends to know something and yet has no idea. His result is even worse than that of the clueless bore, because the latter at least admits to knowing nothing (worse because, unlike the bore, he produces a deviation between expected and occurred).