Since I have already explained my approach to football betting and the determination of estimates and odds as well as the subsequent processing of the bets, I would like to do the same for tennis. The approach differs in a few points. And it also offers some mathematical content that can be used in other ways as well.
1) The first, insufficient, approach
I tried to transfer the approach I had with football to tennis. Playing strength is measured in terms of the probability of scoring a point and the counter probability of conceding a point. The home advantage, applied to tennis, is the serve. Obviously, the probability of scoring a point increases when you have the serve. In principle, playing strength is then the probability of scoring a single point in service games and a single point in return games. The playing strengths of the two players can then be combined, analogously to football, in such a way that you get the probabilities that player 1 will score a single point on serve or on return in the specific match.
If you know these two probabilities, you can not only simulate (which would also be possible), but also calculate exactly how likely player 1 is to win the whole game, on serve or on return (player 2 has the opposite probability). From this you can even calculate exactly the probability for a whole set and from that for the whole match and even for a whole tournament. This is relatively easy in tennis because of the pairing system used, since the possible opponents one can meet are already known on the first day of the tournament.
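As an illustration of the first step (from a single point to a whole game), here is a minimal sketch in Python. It assumes that every point in the game is won independently with the same probability p, and the function name is chosen here for illustration only; it is not taken from my programme.

```python
def game_win_probability(p: float) -> float:
    """Probability of winning a game if each point is won with probability p.

    Assumption: points are independent and p is constant within the game.
    """
    q = 1.0 - p
    # Game won before deuce: to love (1 way), to 15 (4 ways), to 30 (10 ways).
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Deuce is reached after three points each (20 possible orders).
    reach_deuce = 20 * p**3 * q**3
    # From deuce, two points in a row are needed before the opponent gets them.
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.65):
        print(f"point {p:.2f} -> game {game_win_probability(p):.4f}")
```

The same idea (enumerate the ways to finish early, then treat the tie situation separately) carries over to sets and matches, using the game probabilities on serve and on return as inputs.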
You soon realise that this approach does not work. The results are unsatisfactory, and not only when you bet on them (which you should refrain from doing if possible): they also look wrong when compared with your own intuitive assessment. Why it doesn’t work only becomes clear, at least it did to me, after some thought. But the reason is then obvious: there are always games that a player simply “gives up”. Pete Sampras typically won his matches 6:4, 6:4. Why that was the case is also clear. He makes one break per set, and that is enough. If you now take the individual points scored as the basis for his playing strength, you get a distorted picture. Measured in points, such a score indicates a relatively even match. The computer sees no reason to adjust his playing strength upwards, so it underestimates the player.
Boris Becker provided the other reason: the so-called “big points”. Whenever it mattered, he made the point. The term speaks for itself: the big points are the match-deciding points. For those, the number of points scored before, and even the proportion in which they were scored, is irrelevant.
2) The ultimate approach
So one looks for another approach. And this one is mathematically highly interesting and also further applicable, not only to tennis but in principle everywhere. What’s more, as befits a mathematical approach, it is absolutely flawless.
First of all, the basic idea is that each player’s playing strength is expressed as a probability (one thing is certain: it is not “the” that is the most frequent word in this book). This probability expresses, in principle, his probability of winning. But against whom? Well, to put it carefully, I would have to say “against the average player”. That is also correct in that sense. But who is the average player? Who likes to be an average player at all?
The whole system, once it gets going, takes care of itself. You have initial assessments. If you do it for the tennis tour, you can start with the match records. For each player, you divide their wins by the number of matches they have played. In doing so you initially make the mistake of ignoring that some players may have achieved their match record against better opponents.
In my experience, when looking for a formula for such problems, one always proceeds in this way: The trivial cases must be guaranteed to remain. So you at least always test a formula with these trivial cases. Occasionally, however, they are also helpful in the derivation.
The trivial cases are: What percentage do you get against the average player? And: What percentage do you get against a player who has exactly the same playing strength? In principle, the answer to question 1 is given by definition. The playing strength is defined in such a way that you have this probability, i.e. your own, exactly against the average player. So the formula must provide that as an answer. To question 2, two players of equal playing strength against each other, the answer can only be: Exactly 50% must come out. The formula must also provide that as an answer.
The formula is actually not that difficult to find. Again in a slight analogy to football, we convert the playing strength into the ratio of victories to defeats, and we do that for both players. The playing strength is then expressed as a quotient. Dividing these two quotients gives the superiority (or inferiority) of one player over the other. This quotient must then be converted back into a percentage. Even if it sounds a bit complicated, the example makes it relatively simple:
We consider two players. One has a playing strength of 65%, the other a playing strength of 42%.
Now we convert the playing strengths of the two players into a win-loss ratio. Player 1 wins 65% of his games, so he loses 35%. So the ratio of wins to losses is 65/35. That is about 1.857. The ratio for player 2 is 42/58. That is about 0.724. One player has a (converted) playing strength of 1.857, the other of 0.724. How much more often does player 1 win against player 2? Yes, that’s what we converted for. He wins 1.857/0.724 = 2.565 times as often.
Now we just have to calculate back to 100%. The question that arises from this kind of calculation and the expression now found as a measure of the superiority (or inferiority) of one over the other is: What percentage of matches do I win if I win 2.565 times as often as the other? And actually, such rule-of-three problems are taught in the eighth grade, but I always have a hard time with them.
So, reformulated as a rule of three, we are looking for two (percentage) numbers whose sum is 100% and whose quotient is 2.565. We find them by dividing 2.565/(2.565 + 1), i.e. 2.565/3.565 (this can be reliably obtained by solving two equations with two unknowns: from x + y = 1 and x/y = 2.565 it follows that x = 2.565/(2.565 + 1)). But we can also just look at the result and check it, and then you realise it’s right. The result is 2.565/3.565 = 71.95%. The counter probability of 71.95% is 28.05%. Dividing 71.95/28.05 = 2.565. So the calculation is correct. The sum is 100%, the quotient is 2.565, all done correctly.
So we divide both players’ playing strengths (the playing strength given as the probability of winning against the average player) by their “playing weakness”, i.e. the probability of losing against the average player. Then we divide these two values by each other, and we divide this quotient by itself plus 1. This is the probability of victory of the player whose value was in the numerator. The probability of victory of the other is 1 minus this value. It does not matter which player we put in the numerator, as long as we remember whose probability we have calculated.
Written out, it looks like this. S1 is the playing strength of player 1, S2 is the playing strength of player 2. So the formula for the probability that player 1 wins is:
((S1/(1-S1)) / (S2/(1-S2))) / (((S1/(1-S1)) / (S2/(1-S2))) + 1)
Now we only have to check whether the two conditions are also fulfilled. First we enter S1 = 65% and S2 = 50%; the result must be 65%. Player 1 gives 65/35 = 1.857. The opponent gives 50/50 = 1. So 1.857/1 = 1.857. Then we divide this by itself plus 1, so 1.857/2.857. And that gives 0.65 = 65%. So that’s right.
The second test, 65% S1 and 65% S2, should give 50%. We check: It first gives: 0.65/0.35 = 1.857. Same for player 2. Then divide 1.857/1.857. That gives 1. Then divide 1 / (1+1) = 1/2 = 0.5 or 50%. That’s right too. Incorruptible mathematics! That’s how easy it is to derive and verify a formula.
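The calculation and the two trivial tests can also be written down as a short Python sketch. The function name is hypothetical, and the sketch assumes that both playing strengths lie strictly between 0 and 1, since otherwise a division by zero occurs.

```python
def win_probability(s1: float, s2: float) -> float:
    """Probability that player 1 beats player 2.

    s1, s2: playing strengths, i.e. the probability of beating the
    average player. Assumption: 0 < s1 < 1 and 0 < s2 < 1.
    """
    odds1 = s1 / (1.0 - s1)  # win-loss ratio of player 1
    odds2 = s2 / (1.0 - s2)  # win-loss ratio of player 2
    ratio = odds1 / odds2    # how many times as often player 1 wins
    return ratio / (ratio + 1.0)  # convert the ratio back to a probability


if __name__ == "__main__":
    print(f"{win_probability(0.65, 0.42):.4f}")  # the worked example
    print(f"{win_probability(0.65, 0.50):.4f}")  # against the average player
    print(f"{win_probability(0.65, 0.65):.4f}")  # against an equal player
```

Swapping the two arguments gives the counter probability, so the two results always add up to 1.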
At least the mathematics is correct. And tennis is not the only game. You can apply this formula to all other games as well, at least to games where there are two parties and it’s a question of winning or losing. You could even apply it to football or any sport where there are also draws. The ratios of wins to losses result from the same calculation method. Only the probability of a draw would have to be calculated separately. That would also be possible, but I don’t want to do it now.
Nevertheless, two further questions arise as a result: How does one obtain the original assessment of playing strength? And how does one use it practically?
Well, how to obtain the playing strength is not quite simple, but at least this much is clear: you have to start with reasonable assessments. How else should you assess a player about whom you have no information? You simply have to give him an assessment. This is done automatically in my programme, and I can always update that starting value because the database is growing. I simply take the average value of all the players whose first match I recorded. That can be a mistake, of course, because certainly not all players who play their first match (on the tour) are equally good. Nevertheless, I don’t have a better value and just start with that. But in the first matches I react much faster to the results than later. That is the beginner correction factor, so to speak.
The key is to adjust correctly anyway. So for newcomers, I now have a value and the database grows. And after a while everything has settled down, so that the playing strengths gradually become more and more accurate. Nevertheless, the problem remains: how to react to the results? Of course, the playing strengths of all players must be constantly updated. I certainly remember the early days of Roger Federer. I wasn’t quite sure then, nor was it clear that he would make such a rapid rise to the top. The results were changeable. It was different with Andy Roddick. He won the first 10 matches I recorded.
So the next question is how to react appropriately to the results. You can’t react too quickly and you can’t react too slowly. The playing strengths are changing, that’s obvious. Young players develop faster, that’s also obvious. In practice, it looks like this: If you lose a match, you lose playing strength. When you win, you get better. The two changes need to be of equal size for winner and loser for the data to remain consistent. If a player who was 70% favourite loses, he loses more than he would have gained by winning. The opponent’s playing strength must also be taken into account, which is just as understandable. And finally, you also have to take the margin of the victory into account a little bit. So a 6:1, 6:1 is better than a 7:6, 4:6, 7:6. Nevertheless, the fact that you won carries more weight. Thus, little by little, the data are all adjusted and the playing strengths of all players settle at a level as close as possible to their ability.
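The exact adjustment rule is not spelled out here, so the following Python sketch is only one possible way to meet the requirements just listed, not the rule used in my programme. It is an assumption that the adjustment happens on the logarithm of the win-loss ratio, and the names update_strengths, k and margin are hypothetical: k controls how fast you react (larger for newcomers), margin is a small factor for the clarity of the win.

```python
import math


def logit(p: float) -> float:
    """Logarithm of the win-loss ratio of a playing strength."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Convert a log win-loss ratio back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def update_strengths(s_winner: float, s_loser: float,
                     k: float = 0.1, margin: float = 1.0) -> tuple[float, float]:
    """Return the adjusted strengths (winner, loser) after one match.

    Assumed rule, for illustration only: both players move by the same
    step on the log-ratio scale, and the step grows with the surprise.
    """
    # Probability that the winner would win, same formula as for the odds.
    expected = inv_logit(logit(s_winner) - logit(s_loser))
    # An expected win changes little, an upset changes a lot.
    step = k * margin * (1.0 - expected)
    return inv_logit(logit(s_winner) + step), inv_logit(logit(s_loser) - step)


if __name__ == "__main__":
    print(update_strengths(0.70, 0.50))  # favourite wins: small change
    print(update_strengths(0.50, 0.70))  # favourite loses: larger change
```

With these assumed values, the 70% favourite gains less from a win than he loses from a defeat, and a larger k for a player’s first matches would play the role of the beginner correction factor.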
The practical benefit is then also clear. With the probabilities for win and loss, one also has the fair odds and thus one can bet on the matches again, as in football. Nevertheless, it remains a more difficult business. The cause is quickly identified, however: in a one-on-one sport, the personal condition of a player counts much more. An injury might have a decisive influence on the outcome; in football, the player would simply be replaced and the odds might not change at all. In addition, manipulation is also much easier in a one-on-one sport because there only have to be two insiders, also in contrast to football.
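How the fair odds follow from a win probability can be sketched in a few lines of Python. The sketch assumes decimal odds (stake included in the payout), and the comparison with a bookmaker’s price is added here only as an illustration; both function names are hypothetical.

```python
def fair_decimal_odds(p: float) -> float:
    """Fair decimal odds for an event with probability p (0 < p <= 1)."""
    return 1.0 / p


def has_positive_expectation(p: float, offered_odds: float) -> bool:
    """True if a bet at the offered decimal odds wins money on average,
    assuming the estimated probability p is correct."""
    return p * offered_odds > 1.0


if __name__ == "__main__":
    p = 0.7195  # estimate from the worked example (65% against 42%)
    print(f"fair odds: {fair_decimal_odds(p):.3f}")
    print(has_positive_expectation(p, 1.50))
    print(has_positive_expectation(p, 1.30))
```

Everything here depends on the estimate p being good, which is exactly where injuries and other personal factors make tennis harder than football.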
Nevertheless, the theoretical concept is absolutely conclusive, in my opinion. Practically, it could also be applied to other sports. I don’t have the time for that, I’m busy with my tasks. But what if you want to open up another sport? Go ahead, the formulas work.