Starting work on Bayesian football power ratings

The more I learn about Bayesian statistics, the more I want to use the approach in my own sports research. I have found some other football rating systems that use Bayesian methodology (including those by Ed Kambour and, most recently, @DSMok1), so most of what I'm doing here is not novel. However, I feel it's important to document new things I'm working on as much as possible, because you can always learn something new from seeing how someone else tackles the same problem. And I usually learn something by forcing myself to write about it. Anyway, that's more or less the pedagogic motivation for this article. In this post, I'll just introduce the framework and starting point for the model and show some initial predictive results from last season. I promise to make this as high-level as possible.

Probability is about calculating the likelihood of a given set of data given some known parameters. The classic example is flipping a coin, which we "know" is equally likely to come up heads or tails. Statistics is essentially the inverse problem (or "inverse probability"), i.e. calculating the likelihood of a parameter or set of parameters lying within a certain range of values, given a known set of data. If you didn't know a coin was "fair", how many flips would it take for you to figure it out? The example I always give to people who know nothing about Bayesian statistics is to imagine flipping a coin 10 times and having it come up heads 8 times. Most people (hopefully everyone reading this) will know intuitively that just because the coin doesn't come up heads exactly 5 times, it doesn't (necessarily) mean the coin is an unfair one. The purpose of Bayesian statistics, then, is to combine the evidence that a current set of data presents (i.e. 8/10 flips coming up heads) with our "prior knowledge" of the phenomenon under study. Figuring out how to do that is where all the math stuff comes in handy, and up until the early '90s, it was actually a very difficult problem to solve in all but a few well-studied "toy" problems, because there was not enough computational power to tackle the really complex models.
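To make the coin example concrete, here is a minimal Python sketch of the simplest way to blend data with prior knowledge: a Beta prior updated by the observed flips. The prior strength (acting as if we had already seen 50 heads and 50 tails) is an assumption picked purely for illustration, and the function name is made up.

```python
def posterior_mean(heads, flips, prior_heads, prior_tails):
    """Posterior mean of P(heads) under a Beta(prior_heads, prior_tails) prior."""
    return (prior_heads + heads) / (prior_heads + prior_tails + flips)

# Observed data from the example: 8 heads in 10 flips.
raw_rate = 8 / 10

# Hypothetical prior: as confident in fairness as if we had already
# seen 50 heads and 50 tails (an assumption, not a recommendation).
shrunk_rate = posterior_mean(heads=8, flips=10, prior_heads=50, prior_tails=50)

print(f"raw estimate:      {raw_rate:.3f}")
print(f"posterior mean:    {shrunk_rate:.3f}")
```

The posterior mean sits between the raw 80% and the prior's 50%, and much closer to 50% because ten flips are weak evidence next to a prior worth a hundred.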

Nowadays, any statistical hack like myself can download the necessary software tools for free and run extremely sophisticated models on a PC or even a laptop (I'm doing this on a MacBook Air) that 20 years ago probably would have required a supercomputer (or two). Unquestionably, the computational tool that kickstarted the widespread Bayesian revolution in statistics over the past two decades (and the one that anyone can download for free) is BUGS (Bayesian inference Using Gibbs Sampling). I'm using JAGS (Just Another Gibbs Sampler), which is essentially like BUGS, but runs easily within R on a Mac using the package rjags.

The way these programs work is that you enter the data along with your data model and prior knowledge. The data model and the prior knowledge are typically in the form of probability distributions. Given all this information, the program crunches through thousands and thousands of simulations of the model with the given set of data (using so-called Monte Carlo methods), which in the end generates a probability distribution for the parameters of the model you are interested in. This is called the "posterior" distribution. Along with that, you can feed in other parameters to the model that you want predicted. Let me show you how it works with the NFL model I've been setting up.
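If you want a feel for what "crunching through simulations" means, here is a rough Python sketch using the coin from earlier (8 heads in 10 flips). To be clear about the assumptions: this is a random-walk Metropolis sampler, a simpler cousin of the Gibbs sampling that BUGS and JAGS use, and the Beta(50, 50) prior, step size, and number of draws are all arbitrary choices for illustration.

```python
import math
import random

def log_posterior(p, heads=8, flips=10, a=50.0, b=50.0):
    """Unnormalized log posterior: Beta(a, b) prior times binomial likelihood."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return (a - 1 + heads) * math.log(p) + (b - 1 + flips - heads) * math.log(1 - p)

def metropolis(n_draws=20000, burn_in=2000, step=0.05, seed=1):
    """Random-walk Metropolis: propose a nearby value, accept it with
    probability min(1, posterior ratio), otherwise keep the current value."""
    rng = random.Random(seed)
    p = 0.5
    current = log_posterior(p)
    draws = []
    for i in range(n_draws + burn_in):
        proposal = p + rng.gauss(0.0, step)
        candidate = log_posterior(proposal)
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            p, current = proposal, candidate
        if i >= burn_in:  # discard the warm-up draws
            draws.append(p)
    return draws

draws = metropolis()
sampled_mean = sum(draws) / len(draws)
print(f"posterior mean from simulation: {sampled_mean:.3f}")
```

The list of draws is the "posterior" in practice: its average approximates the posterior mean, and a histogram of it approximates the whole distribution.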

model {
  for (i in 1:N) {
    mov[i] ~ dnorm(mov.hat[i], tau1)
    mov.hat[i] <- b0 + inprod( b[] , x[i,] )
  }

This part describes the model for the margin of victory (mov) when two teams face each other. It says that each margin is drawn from a normal distribution whose mean is the difference between the two teams' power ratings plus home field advantage (b0). The second argument, tau1, is what JAGS calls the precision (the inverse of the variance), and it will be estimated by the model during the simulation. Think of it as capturing the error that comes with any prediction.
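Here is a small Python sketch of what the mov.hat line computes. It assumes each row of x holds +1 in the home team's column and -1 in the away team's column, which is a common encoding but not something spelled out above. The three-team league and its ratings are invented for the example.

```python
def design_row(home, away, teams):
    """Assumed encoding: +1 for the home team, -1 for the away team, 0 elsewhere."""
    row = [0] * len(teams)
    row[teams.index(home)] = 1
    row[teams.index(away)] = -1
    return row

def expected_margin(row, ratings, home_field=3.0):
    """Python mirror of: mov.hat[i] <- b0 + inprod(b[], x[i,])"""
    return home_field + sum(b * x for b, x in zip(ratings, row))

teams = ["SF", "NYG", "GB"]    # toy league; the real model has 32 teams
ratings = [5.0, 1.0, 8.0]      # hypothetical power ratings, not model output

row = design_row("SF", "NYG", teams)
margin = expected_margin(row, ratings)
print(row, margin)
```

With that encoding the inner product reduces to (home rating minus away rating), so the expected margin is just the rating gap plus the home-field constant.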

  for (i in 1:32) {
    b[i] ~ dnorm(0, tau2)
  }

Here, we're setting the prior distribution for the array of 32 coefficients that represent each team's power rating. It's again a normal distribution, but we specify the mean to be 0 (i.e. we "know" this), and a second precision (inverse-variance) parameter, tau2, which will also be determined by the simulation.

  b0 <- 3
  tau1 ~ dexp(0.1)
  tau2 ~ dexp(0.1)
}

Here we set the home-field advantage to a "known" constant (as opposed to estimating it from the data), and we set exponential priors on the two precision parameters (JAGS parameterizes the normal distribution by precision, the inverse of the variance). What you should note so far is that the only things we have explicitly said are "known" are that the mean of all team ratings should be zero and that HFA = 3 points, neither of which is a controversial assumption. Everything else is extremely general, and we just wait for the simulation to tell us what the actual parameter values turn out to be.
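One practical wrinkle: the second argument of dnorm in JAGS is a precision (1 / variance), so tau1 and tau2 need converting before you can read them as spreads in points. A tiny Python sketch of that conversion follows. The helper names and the example tau value are made up for illustration.

```python
import math

def precision_to_sd(tau):
    """Convert a JAGS-style precision (1 / variance) to a standard deviation."""
    return 1.0 / math.sqrt(tau)

def exponential_prior_mean(rate):
    """Mean of an exponential distribution with the given rate, as in dexp(rate)."""
    return 1.0 / rate

example_tau = 0.01  # hypothetical sampled value, not a result from the model
example_sd = precision_to_sd(example_tau)
prior_mean = exponential_prior_mean(0.1)
print(f"tau = {example_tau} corresponds to an SD of {example_sd:.1f} points")
print(f"dexp(0.1) has prior mean {prior_mean:.1f}")
```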

The data fed into the model consisted of the game results from the regular season. The ratings produced by the simulation were as follows:

Based on the regular season, that looks about right. Looking at a histogram of the distribution of team ratings, it looks normal as would be expected.

Distribution of team ratings.

Within the simulation itself, one can make predictions about future events simply by monitoring certain variables of interest. In this case, I set up the simulation to monitor "virtual" matchups between teams that actually met in the post-season. In other words, I used the model based on regular season data to predict (out-of-sample) post-season results. For each game, we can look at the resulting probability distribution. Here's the prediction for SF vs. NYG as an example:

Bayesian prediction for Niners-Giants NFC Championship game.

You can see that, according to the simulation, the Giants had almost no chance of winning the game. The probability mass lies almost entirely to the right of zero and is centered around 8.44 points. With hindsight, we can say that either the simulation was incomplete, or the Giants got extremely lucky. A little luck was involved, to be sure, but I think it's safe to say the model is far from perfect at this point. Let's take a look at the predictions for all the post-season games:
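Turning a predictive distribution like this into a win probability is just counting: what share of the simulated margins lands above zero? The Python sketch below uses stand-in draws, since the actual monitored samples aren't reproduced here. The 8.44 center comes from the prediction above, but the spread of 4 points is invented purely so the example runs.

```python
from statistics import NormalDist

def home_win_probability(predicted_margins):
    """Share of simulated margins of victory that favor the home team (> 0)."""
    return sum(m > 0 for m in predicted_margins) / len(predicted_margins)

# Stand-in for the monitored posterior predictive draws.
# mu is the center quoted in the post; sigma is a made-up value.
stand_in = NormalDist(mu=8.44, sigma=4.0)
margins = stand_in.samples(10000, seed=42)

p_home = home_win_probability(margins)
print(f"P(home team wins) under these stand-in draws: {p_home:.3f}")
```

With real output you would pass in the monitored draws for that matchup instead of the stand-in samples.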

So how did the model do overall? Well, to be perfectly honest, not all that great. It beat the spread 6/11 times, but lost big time to the Vegas spreads in terms of error (least squares columns). In non-Giants games, it beat Vegas spreads 6/8 times, but there always seems to be a "hot" team in January, and it doesn't help bettors to realize that only after the SB has been played. Still, this is just scratching the surface of what Bayesian models are capable of doing. As others have pointed out, we can try to build in the effects of turnovers and also try to account for trends within and across seasons. Much remains to be done. Let's go!

Why you should be excited and nervous to be a Chicago Bulls fan in 2012-13


I'm endeavoring to do a post for all 30 teams before the season starts on why if I were a fan of that team, I'd be excited and nervous. I'll talk about the great, the good, the bad, and oftentimes, the really ugly.

The Bulls are on pause. After Derrick Rose went down with an ACL tear against Philly in Game 1 of the first round of the Playoffs, not only were Chicago's hopes for a 7th Championship dashed in the short term, but it quickly became clear that this season would be a placeholder for those hopes going forward.

A look at scoring productivity for several "Player of the Century" candidates

This is going to be a quick hit, basically just an excuse to put up a cool plot. A while back I developed a relatively simple metric for measuring scoring productivity that combines usage (i.e. volume) and shooting efficiency. I also did some work on aging curves using that metric. Motivated by Dime Mag's naming Kobe Bryant the "Player of the Century" (or, at least, the first 0.12 parts thereof), I thought I'd re-visit my scoring metric, and post aging curves for several of Kobe's potential rivals for that honor. Enjoy, debate, stare blankly, whatever! (Keep in mind this doesn't account for defense or any aspects of offense aside from scoring.)

Aging curves for several POC candidates.

Why you should be excited and nervous to be a Charlotte Bobcats fan in 2012-13

I'm endeavoring to do a post for all 30 teams before the season starts on why if I were a fan of that team, I'd be excited and nervous. I'll talk about the great, the good, the bad, and oftentimes, the really ugly.

Let's get this out of the way. If you're a Bobcats fan, you know there's a steep climb ahead. The good news is that Michael Jordan hired Rich Cho, and by doing so, may have finally got the team headed in the right direction. In fact, this was evident to me on Draft Night when they actually took the second best player available in Michael Kidd-Gilchrist. Very few people expected them to do the right thing there, perhaps because it was unclear who was calling the shots. If you remember, MJ doesn't exactly have a strong record as far as evaluating other people's talent.

Why you should be excited and nervous to be a Brooklyn Nets fan in 2012-13


I'm endeavoring to do a post for all 30 teams before the season starts on why if I were a fan of that team, I'd be excited and nervous. I'll talk about the great, the good, the bad, and oftentimes, the really ugly.

No sweep 'til Brooklynnnnn.

First off, sweet logo. Sweet new faux-weatherized steel arena. Even if the Nets remain terrible, fans will have a lot to be excited about in Brooklyn. The good news is the Nets probably won't be terrible. They'll almost surely be quite a bit better than they've been most of the past decade.

Why you should be excited and nervous to be a Boston Celtics Fan in 2012-13


I'm endeavoring to do a post for all 30 teams before the season starts on why if I were a fan of that team, I'd be excited and nervous. I'll talk about the great, the good, the bad, and oftentimes, the really ugly.

First, and most importantly, you should be excited to be a Celtics fan because you still have Kevin Garnett, Paul Pierce, and Rajon Rondo. Garnett and Pierce are inching ever closer to retirement, but it's pretty clear that each is still in the upper echelon of NBA players. Garnett can still defend and shoot long 2's and Pierce is still one of the game's best offensive threats. As evidence, note that Pierce shot 56.7% TS on 28.1% USG. Only Karl Malone has shot with a higher efficiency on that much volume after age 34. The combined RAPM of Garnett (+5.6), Pierce (+1.8), and Rondo (+1) is +8.4. A team with that rating (for a full 48 mpg) would win about 63 games.

Addendum on Long Twos: Rankings using Multiple Years

Using the same multi-level model as in the last post, but with shooting data compiled since 2000-01 (the oldest year available in Basketball-Reference's PlayIndex+ tool), here are the rankings for current players (defined as having at least 25 two-point FGA from beyond 18 feet). The "10-YR %-ile" represents the percentile rank of that player among all 728 players since 2001 that had at least 25 long 2-pt FGA. (Astute readers will note that it's really an 11-YR %-ile, and hopefully forgive me for rounding to a nicer sounding number.)

Go to Google Spreadsheet

Stephen Curry is a great shooter, but he ain't *that* great

Don't worry, I've got mad love for ya, Steph.

I know you're already foaming at the mouth after reading the title, but that's the only way I thought I could lure you in for what follows. Here's a more appropriate title, but one you might have never thought twice about clicking on:

A method for predicting FG% in basketball based on a small number of shots

So here's a list of the top 20 players sorted by FG% on long twos (I'm defining a "long two" as 18-23 ft). The data was acquired using the PlayIndex+ tool over at basketball-reference. Here is the exact query, if you want to see the data for yourself.

Top 20 Players by Maximum-Likelihood Estimate (MLE)

MLE is simply a statistics shorthand for "taking the straight mean" (actually, it means a lot more than that, but for our purposes, I think that's good enough).

PLAYER TEAM MLE FGA
Stephen Curry GSW 68.30% 63
Steve Novak NYK 59.50% 37
Kurt Thomas POR 53.10% 64
Quincy Pondexter MEM 52.90% 34
Joakim Noah CHI 52.00% 25
Daequan Cook OKC 51.70% 29
Boris Diaw TOT 51.20% 41
Brandon Rush GSW 50.90% 57
Jonas Jerebko DET 50.80% 61
Kevin Garnett BOS 50.20% 237
Kris Humphries NJN 50.00% 36
Yi Jianlian DAL 50.00% 32
Chris Paul LAC 49.60% 127
Michael Beasley MIN 49.30% 69
Tim Duncan SAS 49.10% 114
Kevin Durant OKC 48.80% 121
Brandon Bass BOS 48.70% 119
Anthony Parker CLE 48.60% 70
Dirk Nowitzki DAL 48.60% 175
Charles Jenkins GSW 48.00% 127

The problem we face here is that the sample sizes vary so much between players and some are really small. Is Yi Jianlian really a 50% shooter from 18-23 ft? (If he is, somebody should probably sign him already.) What about Kris Humphries? Maybe not. Conversely, Dirk shot 48.6% on 175 FGA from this range, which is a larger sample size and about where we might expect him to be. Overall, the pool of 248 players with at least 25 FGA from 18-23 ft had a mean FG% of roughly 38%.

What can help us "shrink" these estimates closer to the mean is taking into account the variation between players in both FG% and FGA (i.e. sample size). One way to do this is to use a multi-level model (the other is a fully Bayesian approach, which Gelman says is roughly equivalent when there are a large number of "groups" such as this). If you're interested in this type of model, I highly recommend Gelman's "ARM" book.
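To see the shrinking idea in action without fitting anything, here is a Python sketch of a simpler stand-in: beta-binomial shrinkage toward the league mean. This is not the multi-level model itself. The 38% league mean comes from the paragraph above, the "prior attempts" value of 150 is an assumption I picked for illustration, and the made-shot counts are inferred from the rounded percentages in the table above.

```python
def shrunk_fg_pct(made, attempts, league_mean=0.38, prior_attempts=150):
    """Blend a player's record with 'prior_attempts' imaginary shots
    taken at the league-average rate (a stand-in for partial pooling)."""
    return (made + league_mean * prior_attempts) / (attempts + prior_attempts)

# (made, attempted) on long twos; makes inferred from the rounded MLE column.
players = {
    "Stephen Curry": (43, 63),
    "Kevin Garnett": (119, 237),
    "Yi Jianlian": (16, 32),
}

estimates = {name: shrunk_fg_pct(m, a) for name, (m, a) in players.items()}
for name, (m, a) in players.items():
    print(f"{name:15s} raw {m / a:.1%}  shrunk {estimates[name]:.1%}  ({a} FGA)")
```

The fewer attempts a player has, the harder his number gets pulled toward 38%, which is exactly the behavior we want for the small-sample guys.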

In R, creating the model basically takes one step:

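library(lme4)  # glmer() fits the binomial multi-level model; lmer() with family= is deprecated
twos.mlm <- glmer(cbind(FGM, FGA - FGM) ~ (1 | PLAYER), family = binomial(), data = long_twos)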

From that, I get a list of coefficients (called random effects) which can then be converted to our new (hopefully) more predictive FG%'s.
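The conversion step is worth spelling out. A binomial model like this works on the log-odds scale, so each player's estimate is the overall intercept plus his random effect, pushed back through the inverse logit. Here is a Python sketch of that arithmetic. The intercept (set to the log-odds of the 38% league mean) and the player effect are hypothetical values, not output from the fitted model.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def predicted_fg_pct(fixed_intercept, player_effect):
    """Player estimate = inverse logit of (overall intercept + random effect)."""
    return inv_logit(fixed_intercept + player_effect)

intercept = logit(0.38)   # assumption: intercept sits near the league mean
player_effect = 0.25      # hypothetical random effect for an above-average shooter

average_player = predicted_fg_pct(intercept, 0.0)
good_shooter = predicted_fg_pct(intercept, player_effect)
print(f"average player: {average_player:.1%}, good shooter: {good_shooter:.1%}")
```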

Before I show the new list of players and their estimates, take a look at how the spread of the histogram of FG%'s shrinks when going from the MLE to the multi-level model:


Now here's the top 50 according to their multi-level estimates:

Top 50 18-23 ft FG%

The column MULTI is the multi-level estimate.

PLAYER TEAM MLE MULTI FGA
Stephen Curry GSW 68.30% 46.9% 63
Kevin Garnett BOS 50.20% 45.5% 237
Dirk Nowitzki DAL 48.60% 43.7% 175
Chris Paul LAC 49.60% 43.3% 127
Kevin Durant OKC 48.80% 42.8% 121
Tim Duncan SAS 49.10% 42.8% 114
Brandon Bass BOS 48.70% 42.8% 119
Jose Calderon TOR 47.30% 42.6% 148
Charles Jenkins GSW 48.00% 42.6% 127
Kurt Thomas POR 53.10% 42.6% 64
Pau Gasol LAL 47.30% 42.3% 129
Steve Novak NYK 59.50% 42.3% 37
LaMarcus Aldridge POR 45.20% 42.2% 208
Drew Gooden MIL 45.60% 41.9% 158
Sebastian Telfair PHO 47.50% 41.9% 101
Jonas Jerebko DET 50.80% 41.8% 61
Anthony Morrow NJN 46.80% 41.7% 109
Steve Nash PHO 47.40% 41.7% 95
Michael Beasley MIN 49.30% 41.6% 69
Brandon Rush GSW 50.90% 41.6% 57
Jamal Crawford POR 45.10% 41.5% 144
Anthony Parker CLE 48.60% 41.4% 70
Ben Gordon DET 44.30% 41.3% 158
Nick Young TOT 44.20% 41.3% 163
Darren Collison IND 46.20% 41.1% 91
Chris Bosh MIA 44.10% 41.1% 152
Arron Afflalo DEN 47.20% 41.1% 72
Klay Thompson GSW 43.70% 41.0% 158
Boris Diaw TOT 51.20% 40.9% 41
David West IND 45.60% 40.9% 90
Quincy Pondexter MEM 52.90% 40.9% 34
Russell Westbrook OKC 43.50% 40.8% 147
D.J. White CHA 45.60% 40.7% 79
Marreese Speights MEM 44.70% 40.7% 94
Carlos Boozer CHI 45.70% 40.6% 70
David Lee GSW 44.40% 40.5% 90
Jarrett Jack NOH 44.40% 40.5% 90
Kris Humphries NJN 50.00% 40.4% 36
Steve Blake LAL 47.10% 40.4% 51
Daequan Cook OKC 51.70% 40.4% 29
Jared Dudley PHO 43.10% 40.2% 109
Yi Jianlian DAL 50.00% 40.2% 32
Jason Smith NOH 43.00% 40.2% 107
DeMarcus Cousins SAC 42.20% 40.2% 147
Joakim Noah CHI 52.00% 40.1% 25
Spencer Hawes PHI 46.00% 40.1% 50
Grant Hill PHO 43.60% 40.0% 78
Jason Terry DAL 42.70% 39.9% 96
Nate Robinson GSW 43.70% 39.9% 71
Ramon Sessions TOT 43.90% 39.9% 66

Now it's starting to make more sense. Here's the bottom 50:

Bottom 50

PLAYER TEAM MLE MULTI FGA
Glen Davis ORL 24.50% 33.3% 94
John Wall WAS 29.50% 33.7% 183
Corey Maggette CHA 26.50% 33.9% 98
Dorell Wright GSW 19.60% 34.2% 46
Andray Blatche WAS 24.20% 34.3% 66
Paul George IND 22.80% 34.3% 57
Markieff Morris PHO 22.20% 34.3% 54
Ivan Johnson ATL 20.80% 34.4% 48
Daniel Gibson CLE 21.30% 34.5% 47
Antawn Jamison CLE 31.30% 34.8% 166
DeMar DeRozan TOR 32.20% 34.9% 205
Carlos Delfino MIL 24.00% 35.0% 50
Paul Pierce BOS 28.90% 35.0% 90
Josh Howard UTA 28.80% 35.2% 80
John Lucas CHI 28.60% 35.2% 77
Byron Mullens CHA 32.80% 35.3% 204
Austin Daye DET 25.00% 35.3% 48
Leandro Barbosa TOT 30.80% 35.4% 104
Tracy McGrady ATL 29.00% 35.5% 69
C.J. Watson CHI 30.00% 35.6% 80
Metta World Peace LAL 22.90% 35.6% 35
Chauncey Billups LAC 20.70% 35.7% 29
Jeremy Pargo MEM 20.70% 35.7% 29
Marcus Camby TOT 25.60% 35.7% 43
Danilo Gallinari DEN 29.40% 35.7% 68
C.J. Miles UTA 29.20% 35.7% 65
Lamar Odom DAL 23.50% 35.8% 34
James Johnson TOR 30.90% 35.8% 81
Luc Mbah a Moute MIL 20.00% 35.9% 25
Andrew Goudelock LAL 24.20% 35.9% 33
Andre Iguodala PHI 32.80% 35.9% 122
J.J. Hickson TOT 28.30% 36.1% 46
Brandon Knight DET 32.30% 36.1% 93
Wesley Johnson MIN 32.30% 36.1% 93
Jodie Meeks PHI 25.00% 36.1% 32
Norris Cole MIA 32.20% 36.1% 90
Derrick Brown CHA 29.60% 36.1% 54
Monta Ellis TOT 34.40% 36.2% 195
Dominic McGuire GSW 27.50% 36.2% 40
Tyreke Evans SAC 33.30% 36.2% 120
Tyler Hansbrough IND 32.10% 36.3% 78
Travis Outlaw SAC 28.20% 36.4% 39
Courtney Lee HOU 32.50% 36.4% 83
Earl Clark ORL 27.80% 36.4% 36
Zach Randolph MEM 24.00% 36.4% 25
Ray Allen BOS 31.70% 36.5% 63
Danny Granger IND 33.70% 36.6% 95
Jordan Farmar NJN 30.20% 36.6% 43
Josh Smith ATL 35.80% 36.6% 316
Marvin Williams ATL 33.70% 36.6% 92

Conclusions

Now, do we have any evidence that Stephen Curry is closer to a 46.9% shooter from 18-23 ft rather than truly being a 68.3% shooter according to the 63 shots he took last year? Sure we do! Just go back to 2010-11 when he shot 49.1% on 214 FGA (still great!). Or his rookie season when he shot 47% on 232 FGA (also great!). Now that 46.9% makes a lot more sense, right? (By the way, the title of this post should make some more sense right about now, too.) Stephen Curry is a great shooter, he just ain't *that* great.

Just as Stephen Curry probably isn't a near-70% shooter on long 2's, Glen Davis is probably better than a 25% shooter on those shots. Indeed, in 2011, Davis shot 35.8% on 226 FGA. And he was a 38.9% shooter in 2010, but on only 36 FGA. You know, it's important to point out that just because a player takes a small number of attempts doesn't necessarily mean he'll be at the top or bottom of lists like this. Sometimes the player will "randomly" fall in the middle, too.

So, hopefully, this made some sense to you. Next time you see an analyst talking about how a player led the league with an astronomically high FG% on 12 shots of a certain type (say in the 4th quarter of games on the road on Sundays), think about this post. Heck, maybe e-mail the guy a link to it.

Why You Should Be Excited and Nervous to Be An Atlanta Hawks Fan for 2012-13

I'm endeavoring to do a post for all 30 teams before the season starts on why if I were a fan of that team, I'd be excited and nervous. I'll talk about the great, the good, the bad, and oftentimes, the really ugly. The posts will be in alphabetical order starting with Atlanta.

Having moved back to the Bay Area after living in Atlanta since 2008, I just have to say, you chose a fine time to radically change the team, guys. Ok, make that just one guy: Danny Ferry, who was brought in as the new GM this off-season. After being mired in the mode of "getting to the playoffs every year only to be eliminated early every year", it took new leadership to do the two things that everyone around the league had been saying to do for years: 1) Break up the team; and 2) Get some more guys who can shoot the damn ball.

Which Set of Rankings Passes The Eyeball Test?

It would be great if statistical rankings and "eyeballs" matched up, right? I've got the rankings for three different stats (two that you've seen before on my blog, and one that is new and will be unveiled in a subsequent post). I'm curious to see which set of rankings people "prefer". This is not a good way to decide which stat is better (that should be done more objectively), but perhaps, it will turn out that the "best" (most predictive) stat is also the one people "like the best". That's why I'm showing the rankings before I talk about the new stat, so as not to bias the poll results.

In the comments section, you can discuss specifically why you voted the way you did. I think that will be helpful, actually.

A Grown Man NBA Blog