The Problem – Turning Numbers into Curves
It’s an NFL Sunday and the whole betting market has Justin Jefferson lined at 85 receiving yards. Meanwhile, you notice that a new sportsbook, DraftDuel, has its line set at 82 receiving yards. Is this enough of a difference for you to bet? What if his line were 75 or 80 yards? In these situations, you only have a single number, a value you are treating as the “true” median. Your goal is to use that single number to price a bet offered at a different line. At some point the DraftDuel line obviously moves far enough from the market to create a good bet. Yet defining exactly where that point is remains a challenging problem.
Or maybe you build your own projections or have access to a solid projection source. Your projection model projects Justin Jefferson to have 88 yards. Is it profitable to bet his over, under, or neither? Again, you must be able to convert point estimates, or single numbers, into probability distributions. But in this case, you have to handle the situation differently than you would with market lines, because you have a mean instead of a median.
The Easy Solutions – Helpful but Wrong
The normal distribution is the one most people are familiar with. It is the classic bell curve. People most commonly misapply it to sports betting in the following ways:
- 1. Applying the normal distribution to underlying data that isn’t normally distributed.
- 2. Incorrectly calculating the standard deviation of the distribution.
Very few types of bets have an underlying probability distribution that is well suited to the normal distribution. Below is a sample of what Justin Jefferson’s probability distribution (blue shaded area) may look like if his median receiving yards is 85. The plot compares it to a normal distribution with the same median and variance (red dashed line).
A stat type like receiving yards is going to be right skewed, and so are most other player-specific bet types. Additionally, to use the normal distribution, you must estimate the mean and standard deviation. This becomes a challenge as:
- Unlike the medians in the form of market lines, they aren’t publicly available anywhere.
- There is no single “correct” way to estimate them. Different stat types, and even individual players within the same stat type, will have very different standard deviations. For example, Tyreek Hill and Justin Jefferson could have the same receiving yard line. However, their probability distributions, shown in the sample below, could be very different.
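To make the comparison above concrete, here is a minimal sketch that stands in a lognormal curve for the right-skewed distribution. The lognormal choice, the two `sigma` values, and the "tighter"/"wider" player labels are all assumptions for illustration, not fitted to any real player. It compares the chance of going over a line under the skewed curve against a normal curve with the same median and variance.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def lognormal_cdf(x, median, sigma):
    """P(X <= x) for a lognormal with the given median and log-scale sigma."""
    if x <= 0:
        return 0.0
    return normal_cdf(math.log(x), math.log(median), sigma)

def lognormal_mean_sd(median, sigma):
    """Mean and standard deviation implied by a lognormal's median and sigma."""
    mean = median * math.exp(sigma ** 2 / 2.0)
    sd = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, sd

def compare_over_probs(line, median, sigma):
    """Return (skewed, normal) probabilities of finishing over the line."""
    _, sd = lognormal_mean_sd(median, sigma)
    skewed = 1.0 - lognormal_cdf(line, median, sigma)
    # Normal curve centered on the same median with the same variance.
    normal = 1.0 - normal_cdf(line, median, sd)
    return skewed, normal

MEDIAN = 85.0  # market line treated as the true median
# Hypothetical spreads: two players with the same line but different volatility.
for label, sigma in (("tighter player", 0.45), ("wider player", 0.75)):
    mean, sd = lognormal_mean_sd(MEDIAN, sigma)
    print(f"{label}: median {MEDIAN:.0f}, implied mean {mean:.1f}, sd {sd:.1f}")
    for line in (75.0, 82.0):
        skewed, normal = compare_over_probs(line, MEDIAN, sigma)
        print(f"  over {line:.0f}: skewed {skewed:.3f} vs normal {normal:.3f}")
```

Under these assumed spreads, both players are 50/50 at 85, yet the over at 75 comes out at roughly 61% for the tighter player and roughly 57% for the wider one. The implied mean also sits well above the median in both cases, which is why a mean projection cannot be compared directly to a line.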
The Poisson distribution is popular in sports betting. It only requires estimating one parameter: the mean. In contrast, most other distributions, like the normal distribution discussed above, need multiple parameter estimates. This means you just need an average projection for a stat to be able to quantify any line. For example, if I know Joe Burrow’s projected passing touchdowns in a game are 1.7, I can determine if betting on O/U 1.5 -115 is a good bet by plugging in the mean of 1.7 into a Poisson calculator.
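The Joe Burrow example can be sketched in a few lines. This is a minimal version of what a Poisson calculator does, assuming both sides of O/U 1.5 are priced at -115; the function names are my own for illustration.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_prob_over(line, mean):
    """P(X > line), e.g. P(X >= 2) for a line of 1.5."""
    highest_losing_count = math.floor(line)
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(highest_losing_count + 1))

def implied_prob(american_odds):
    """Break-even win probability for a bet at the given American odds."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)

projected_tds = 1.7                      # projected mean from the example
p_over = poisson_prob_over(1.5, projected_tds)
p_under = 1.0 - p_over
break_even = implied_prob(-115)          # assumes -115 on both sides

print(f"P(over 1.5)  = {p_over:.4f}")
print(f"P(under 1.5) = {p_under:.4f}")
print(f"break-even at -115 = {break_even:.4f}")
```

With a mean of 1.7, the over lands at about 50.7% and the under at about 49.3%, while -115 needs about 53.5% to break even. Under these assumptions neither side is a bet.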
The Poisson distribution is simple and extremely useful in sports betting. Yet I see it frequently misapplied. There is a list of assumptions that must be met to use the Poisson distribution, and the one most often broken is that the stat accrues “one event at a time.” The table below lists some stat types that are suitable for Poisson and some that are not. Misapplying the Poisson distribution leads to wrong and often overstated calculated edges.
| OK for Poisson | Not OK for Poisson |
My Solutions – Math and Common Sense
The best way to convert point estimates into probability distributions is to use an appropriate distribution for the type of bet you are trying to price. For me, this has taken the form of:
- Branching out to use more distributions than the most common Normal and Poisson. This article is definitely not the place to expand upon these. However, some of the most useful for me personally have been negative binomial, beta, or empirical if I have enough data. Many others could also apply to sports betting. It just depends on what you’re betting on.
- Using compound distributions instead of a single distribution. For example, a player’s points in basketball cannot be modelled using a simple Poisson, because multiple “events” can happen at once: a player might score two or three points with a single shot. An improvement would be to model the player’s free throws, two-point field goals, and three-point field goals as separate Poisson processes and then add them together. This still isn’t great due to potential correlations between the three distributions, but it is at least a step in the right direction.
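As a rough sketch of the second point, the code below builds the exact distribution of points from three independent Poisson counts of made shots and compares it to a single Poisson with the same mean. The per-shot-type means (4.0 free throws, 5.0 twos, 2.0 threes), the 24.5 line, and the independence assumption are all hypothetical and chosen only to illustrate the idea.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def points_distribution(ft_mean, two_mean, three_mean, max_makes=40):
    """PMF of points = FT + 2*twos + 3*threes for independent Poisson makes.

    Each make count is truncated at max_makes, which drops a negligible
    tail for means of this size.
    """
    dist = {0: 1.0}
    for mean, value in ((ft_mean, 1), (two_mean, 2), (three_mean, 3)):
        updated = {}
        for points, prob in dist.items():
            for makes in range(max_makes + 1):
                total = points + value * makes
                updated[total] = updated.get(total, 0.0) + prob * poisson_pmf(makes, mean)
        dist = updated
    return dist

def prob_over(dist, line):
    """P(points > line) from a PMF stored as {points: probability}."""
    return sum(prob for points, prob in dist.items() if points > line)

# Hypothetical means of made free throws, twos, and threes.
dist = points_distribution(4.0, 5.0, 2.0)
mean_points = sum(points * prob for points, prob in dist.items())
variance = sum((points - mean_points) ** 2 * prob for points, prob in dist.items())

line = 24.5
compound_over = prob_over(dist, line)
simple_over = 1.0 - sum(poisson_pmf(k, mean_points) for k in range(math.floor(line) + 1))

print(f"mean {mean_points:.2f}, variance {variance:.2f}")
print(f"P(over {line}): compound {compound_over:.3f} vs simple Poisson {simple_over:.3f}")
```

The compound version has the same mean as the simple Poisson (20 points here) but about twice the variance, so it gives noticeably more weight to outcomes far from the mean. A simple Poisson would understate the chance of clearing a line well above the projection.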
Very few people will or should care about the two points I just wrote. The good news is that I don’t think you need to do either of those things to win or price most player props. Here are my tips for people who want to bet but have no interest in the more math-based approaches above:
- Most player props are right skewed. If you are using a projected mean from a data source, it is vital that you remember this. You cannot compare a mean (projection) against a line (median) and bet based on that, because in a right-skewed stat the median will be lower than the mean. Instead, compare your projection source to the lines across the full set of your projections. How much higher your projections are than the lines on average will give you an idea of how right skewed that specific stat is.
- Go through the mental exercise of pricing the middle. To compare two lines intuitively, think about what price you would need to bet the middle between them. Consider this example from an odds screen:
In this case, let’s take for granted that FanDuel’s line is correct and the fair value for this line is 20.5. The odds screen is then trying to answer: given that the fair median is 20.5, what is a fair price for 19.5? A helpful exercise is to think of the odds you would need to bet the middle. Or, in this example, what odds would you need to bet that RJ Barrett has exactly 20 points plus rebounds? The over at 20.5 wins 50% of the time by assumption, so the 57.31% chance of winning shown here implies that this odds screen thinks RJ Barrett will have exactly 20 points + rebounds 7.31% of the time. Fair odds for the middle would be about +1270. If the sportsbook offered a bet on exactly 20 points + rebounds, does that seem like a fair price, too high, or too low? Using this logic can help you intuit your way to a fair price and, if nothing else, identify clearly bad plays.
Lastly, use common sense. If you do not have precise methods for turning point estimates into probability distributions, you will have to rely on common sense and intuition. In general, most people understate randomness and overstate their edges. It is better to pass on a questionably good bet than to make a bad one.
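The arithmetic behind the middle can be checked with a short sketch. It assumes, as in the example, that the over at 20.5 is exactly 50% and that 57.31% is the screen's win probability for the over at 19.5; the helper name is my own.

```python
def prob_to_american(p):
    """Fair American odds for a win probability p, where 0 < p < 1."""
    if p >= 0.5:
        return -100.0 * p / (1.0 - p)
    return 100.0 * (1.0 - p) / p

p_over_fair_line = 0.50     # over 20.5, taken as the fair median
p_over_lower_line = 0.5731  # over 19.5, from the odds screen

# The only outcome that wins over 19.5 but loses over 20.5 is exactly 20.
p_middle = p_over_lower_line - p_over_fair_line
fair_middle_odds = prob_to_american(p_middle)

print(f"P(exactly 20) = {p_middle:.4f}")
print(f"fair odds on the middle = +{fair_middle_odds:.0f}")
```

This reproduces the roughly +1270 figure. Running the same calculation for other pairs of lines gives a quick sanity check on whether the implied price of landing on one exact number is believable.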