## Aggregated Football Scores and Gambling

by admin on Nov.10, 2012, under Uncategorized

A few months ago my roommate won $500 on a game that cost him only $20 to enter. It was the first time he played, so naturally all of the regulars were very upset. The game works like this: a 10×10 table is set up, with each row and each column corresponding to a digit 0-9. These digits represent the last digit of the home team's score and of the away team's score. One quarter of the money goes to the person who bought the square at the intersection of the last digits of the two teams' scores at the end of the first half. Another quarter is paid out the same way, but counting only the points scored in the second half. The remaining half of the money goes to the person whose name is in the box corresponding to the last digits of the two teams' final scores.
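To make the rules concrete, here is a minimal sketch of how the three winning squares and the payout split could be worked out. The function names and the example score are made up for illustration, and the $2,000 pot assumes all 100 squares were sold at $20 each, which the story implies but doesn't state outright.

```python
def winning_squares(home_half, away_half, home_final, away_final):
    """Return the (home digit, away digit) square for each of the three payouts."""
    # Points scored in the second half only: final score minus halftime score.
    home_second = home_final - home_half
    away_second = away_final - away_half
    return {
        "first_half": (home_half % 10, away_half % 10),
        "second_half": (home_second % 10, away_second % 10),
        "final": (home_final % 10, away_final % 10),
    }


def payouts(pot):
    """Split the pot: a quarter per half, and the remaining half for the final score."""
    return {"first_half": pot / 4, "second_half": pot / 4, "final": pot / 2}


# Hypothetical game: home leads 10-7 at halftime and wins 24-17.
squares = winning_squares(home_half=10, away_half=7, home_final=24, away_final=17)
prizes = payouts(pot=100 * 20)  # assumption: 100 squares at $20 each

print(squares)
print(prizes)
```

Under that assumption, a $500 win corresponds to one of the two quarter payouts.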

*Example Scorecard*

If the last digits of football scores were uniformly random, you could pick any box, cross your fingers, and expect to get back exactly 1 for every 1 you put in (no taxes or rake). But scores aren’t random. Points are only added in discrete amounts of 2, 3, 6, 7, and 8. Some boxes had to be better than others, and I needed to find out exactly how much better they could be.

Using the statistics at http://www.pro-football-reference.com and a combination of Perl scripts and Excel, I was able to come up with some neat conclusions. The data I used was from every pro game (including playoffs) from the 1974 season through November 10th, 2012. This is a huge dataset that includes over 18,000 final scores.

First I looked at overall scores. I was expecting to see some scores happen more frequently than others, but I was surprised at how extreme the effect was.

The highest peak is a score of 17, which is seen in almost 8% of final scores. Nearby peaks are 10, 20, and 24, all at about 6.5%. The double peak of 13 and 14 is also not a surprise. But I still hadn’t answered my question. I didn’t need to know about the final score, just the last digit of the final score. So I ran the same analysis across all 18,000 final scores to get a similar chart.
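The last-digit tally is easy to reproduce. The original analysis was done with Perl and Excel, so the Python below is only a sketch of the same idea, and the function name is my own. The list of scores is a toy placeholder, not the real dataset.

```python
from collections import Counter


def last_digit_frequencies(scores):
    """Fraction of scores ending in each digit 0-9."""
    counts = Counter(score % 10 for score in scores)
    total = len(scores)
    return {digit: counts.get(digit, 0) / total for digit in range(10)}


# Toy placeholder scores, NOT real data; swap in the full list of final scores.
toy_scores = [17, 10, 24, 20, 13, 14, 27, 7, 3, 31]
freqs = last_digit_frequencies(toy_scores)

for digit, share in freqs.items():
    print(digit, f"{share:.0%}")
```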

If the final digit were random you would expect the results to all be hovering around 10%. Obviously there is something else going on here because you can see the same oscillations from the first chart again in this chart. It appears that 0, 3, 4 and 7 occur more frequently than you would expect, and might give you an edge when you are betting. But how much of an edge? It gets more complicated when you combine a great number, like 7, with an average number, like 6. Is it still worth it? How worth it? To answer these questions I compiled a cheat sheet showing the expected return on investment for each of the 100 positions on the chart (note that the chart is mirrored across the diagonal because it shouldn’t matter which team is which).
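A cheat sheet like this can be sketched as follows. This version makes two simplifying assumptions that may not match the original spreadsheet: it only models the final-score payout (not the two half payouts), and it mirrors the table by averaging each box with its reflection across the diagonal. The function name and the toy games are hypothetical.

```python
from collections import Counter


def expected_return_grid(games, n_boxes=100):
    """Expected return per unit staked for each (digit, digit) box.

    games is a list of (home_final, away_final) scores. A value of 1.0
    means the box is exactly fair; above 1.0 means it has an edge.
    """
    counts = Counter((home % 10, away % 10) for home, away in games)
    n_games = len(games)
    grid = [[0.0] * 10 for _ in range(10)]
    for i in range(10):
        for j in range(10):
            # Average the box with its mirror image, since it shouldn't
            # matter which team is which.
            prob = (counts[(i, j)] + counts[(j, i)]) / (2 * n_games)
            # The pot is n_boxes stakes, so a fair box (prob 1/100) returns 1.
            grid[i][j] = n_boxes * prob
    return grid


# Toy placeholder games, NOT real data.
toy_games = [(24, 17), (17, 24), (10, 7), (20, 13)]
grid = expected_return_grid(toy_games)
print(grid[4][7], grid[7][4], grid[0][7])
```

Whatever the data, the 100 entries average to exactly 1, which is the "bet on every square and get your money back" case.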

So now we know that 7 and 7 is the best box to bet on, and that betting on 2 and 2 is a good way to throw away your money. But in this game you can place multiple bets on the same board. How does that change the betting strategy? If you rank the boxes by expected value and keep betting on more and more of them, you should expect your probability of winning to rise while your expected value per dollar shrinks, until they both equal one (when you bet on every square and win all of your own money back). This last chart illustrates the entire betting spectrum.
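Here is one way the betting spectrum could be computed, as a sketch rather than a reconstruction of the original spreadsheet. It assumes a single payout, and it relies on the fact that exactly one box wins that payout, so the probabilities of the boxes you hold simply add up. The digit weights are artificial placeholders (digits 0, 3, 4 and 7 counted twice as likely as the rest, and the two teams treated as independent), not measured frequencies.

```python
def betting_curve(box_probs):
    """For k = 1..N best boxes, return (k, P(win), expected return per dollar)."""
    ranked = sorted(box_probs, reverse=True)
    n_boxes = len(ranked)
    curve = []
    p_win = 0.0
    for k, prob in enumerate(ranked, start=1):
        p_win += prob  # boxes are mutually exclusive, so probabilities add
        # Winning pays the whole pot (n_boxes stakes) for k stakes put in.
        curve.append((k, p_win, n_boxes * p_win / k))
    return curve


# Artificial placeholder weights, NOT measured data.
weights = [2 if digit in (0, 3, 4, 7) else 1 for digit in range(10)]
total = sum(weights)
box_probs = [
    (weights[i] / total) * (weights[j] / total)
    for i in range(10)
    for j in range(10)
]

curve = betting_curve(box_probs)
print(curve[0])   # best single box
print(curve[-1])  # every box on the board
```

With this formulation both curves land on 1 at 100 bets, up to ordinary floating-point noise.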

When making this chart I was also able to demonstrate (accidentally) the computational error you run into when multiplying together 100 different numbers that are all very near, but below, one. Both curves should approach 1, or 100%, as the number of bets approaches 100. However, truncation/rounding error causes both curves to miss their final target (both the probability of winning and the expected value are reported too low in all cases). Because of the way I calculated the probabilities, I would expect the accuracy to decrease as the number of bets placed increases. Linearly rescaling the previous chart (probably not the right way to take the errors into account) results in a different chart.
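It is worth checking how big the rounding error in such a product really is. The post doesn't show the formula behind the chart, so the following is a guess: if the probability of winning was computed as one minus the product of (1 - p) over the chosen boxes, that treats the boxes as independent events, whereas exactly one box wins each payout and the probabilities can simply be added. The sketch below compares the two on uniform placeholder probabilities (every box at 1/100) and measures the floating-point error of the product against exact rational arithmetic.

```python
import math
from fractions import Fraction

# Uniform placeholder probabilities, NOT real data: 100 boxes at 1/100 each.
probs = [1 / 100] * 100

# Mutually exclusive boxes: the probability of holding the winner is a sum.
p_sum = math.fsum(probs)

# Product-of-complements formula (assumed, not confirmed by the post).
survive = 1.0
for p in probs:
    survive *= 1.0 - p
p_product = 1.0 - survive

# Exact product of the same floating-point factors, to isolate rounding error.
exact = Fraction(1)
for p in probs:
    exact *= Fraction(1.0 - p)
rounding_error = abs(float(exact) - survive)

print("sum of probabilities:", p_sum)
print("1 - product(1 - p):  ", p_product)
print("rounding error in product:", rounding_error)
```

On these placeholder numbers the product formula stops well short of 1 even with every box covered, while the rounding error in the product itself is far too small to show up on a chart. If the original calculation did use a product like this, the shortfall would come from the formula rather than from truncation.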