A Comparison of Regressed Four Factor Weights vs. Theoretical Expectations

You're probably thinking to yourself, what the hell does that title mean, right?

Actually, it's not as cryptic as it sounds (hopefully). I recently determined the following regression equation which relates the adjusted four factors to (my) 2.5-year RAPM (and I say "my" RAPM, to distinguish it from the prior-informed "gold standard" RAPM that Jeremias Engelmann publishes on his website):

A4PM = 1.77 * eFG%(own) - 1.60 * eFG%(opp)
     + 0.14 * FTA/FGA(own) - 0.11 * FTA/FGA(opp)
     - 1.37 * TOV%(own) + 0.80 * TOV%(opp)
     + 0.31 * OREB%(own) - 0.41 * OREB%(opp)
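To make the equation concrete, here is a minimal Python sketch that just evaluates it. The coefficients are copied from the equation above; the function name `a4pm`, the dictionary keys, and the assumption that each factor is passed in as a %-point difference from league average are mine for illustration, not a description of how the regression was actually set up.

```python
# Coefficients (own, opp) copied from the regression equation in the post.
COEFS = {
    "efg": (1.77, -1.60),
    "fta_rate": (0.14, -0.11),
    "tov": (-1.37, 0.80),
    "oreb": (0.31, -0.41),
}


def a4pm(own, opp):
    """Evaluate the regression equation.

    `own` and `opp` are dicts keyed like COEFS. Assumption (not from the
    post): values are %-point differences from league average.
    """
    total = 0.0
    for name, (c_own, c_opp) in COEFS.items():
        total += c_own * own[name] + c_opp * opp[name]
    return total


# Hypothetical input: +1 %-point of own eFG%, everything else at average.
average = {"efg": 0.0, "fta_rate": 0.0, "tov": 0.0, "oreb": 0.0}
better_shooting = dict(average, efg=1.0)
print(a4pm(better_shooting, average))
```

With everything else held at average, the result is just the own-eFG% coefficient, which is the quantity the sections below try to reason about from first principles.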

I thought it would be informative to compare the coefficients determined by regression to what we might reason them to be...in theory. Hence, the title of the post.

So, let's take the terms one factor at a time.


What is the +/- equivalent of a 1 %-point increase in team-level eFG%? Well, the league average eFG% is roughly 50%. (As of this post in the current season, it's roughly 48%, but usually it's around 50%, and to be honest, it won't make a significant difference in our calculation.) Assuming an average team takes about 84.5 2-pt FGA (counting 3-pt attempts as 1.5 2-pt FGA) per game, then a 1 %-point increase in eFG% creates:

0.01*2*84.5=1.69 pts

We can do an alternative calculation that takes into account the fact that missed shots sometimes result in offensive rebounds:

84.5*2*(0.01)+1.07*84.5*(-0.01)*0.26=1.45 pts

The first term is the same as above. The second term assumes that 26% of missed shots are rebounded by the offense (the league-average OREB%) and that each of those extra possessions is worth about 1.07 points. The negative sign is a bit counterintuitive, but the idea is that a slightly more efficient shooting rate results in slightly fewer points created after offensive rebounds (since there are fewer misses to rebound!).
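For anyone who wants to check the arithmetic, here are both eFG% estimates as a short Python sketch. The numbers are the ones used in the two equations above; the variable names are just mine.

```python
FGA_EQUIV = 84.5   # 2-pt-equivalent FGA per game, from the post
PPP = 1.07         # points per possession, from the post
OREB_RATE = 0.26   # share of misses rebounded by the offense, from the post
DELTA = 0.01       # a 1 %-point increase in eFG%

# First estimate: extra makes times 2 points each.
simple = DELTA * 2 * FGA_EQUIV

# Second estimate: subtract points lost because there are fewer misses
# available to turn into offensive rebounds.
with_oreb = simple + PPP * FGA_EQUIV * (-DELTA) * OREB_RATE

print(round(simple, 2), round(with_oreb, 2))  # 1.69 1.45
```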

Looking at the regression coefficients (1.77/1.60), we see better agreement with the first equation. Maybe my proposed "correction" for OREB is incorrect (well, I am wrong sometimes).


The turnover term is straightforward to determine theoretically. The average team turns the ball over 15 times per 100 possessions (15%). A 1 %-point increase in TOV% means 16 turnovers per 100 possessions, or effectively 1 fewer offensive possession. Since the average possession results in about 1.07 points, that is what we should expect the magnitude of the regression coefficient to be. Compare that to 1.37 on offense and 0.80 on defense, and you can see there is some disconnect. The order of magnitude is about what we should expect (~1), but there is significantly more importance (penalty) placed on offensive turnovers compared to the reward given to players for creating turnovers on defense. Not sure why yet, but it's probably too large a difference to simply be chalked up to noise.
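Here is the same turnover comparison as a quick Python sketch, so the size of the gap on each side is explicit. The inputs are the numbers quoted above; the variable names are only illustrative.

```python
PPP = 1.07                     # points per possession, from the post
theoretical = PPP * 1          # one possession lost per 1 %-point of TOV%

reg_own, reg_opp = 1.37, 0.80  # magnitudes of the regression coefficients

# Gap between each regression coefficient and the theoretical value.
gap_own = reg_own - theoretical
gap_opp = reg_opp - theoretical

print(round(gap_own, 2), round(gap_opp, 2))  # 0.3 -0.27
```

The two gaps are similar in size but opposite in sign, which is the asymmetry between offense and defense described above.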


The average OREB% is about 26% and there are about 41 total rebounds per NBA game, meaning there are roughly 11 ORB and 29 DRB acquired by an average team. Instead of looking at a 1 %-point increase in ORB, let's consider an increase of 1 ORB, so 12 ORB and 28 DRB. That would be equivalent to 30% OREB% or an increase of 4 %-points. That 1 extra rebound should result in 1.07 additional points on average, so we can reason that a 1 %-point increase in OREB% results in roughly 1.07/4=0.27 pts. Indeed, that is not too far off the regression coefficients (0.31/0.41), although there seems to be more weight on the defensive end. My thought is that if we split the difference (0.36) and multiply by 4 (doing the inverse of the previous operation), we get 1.44 pts, which I'm guessing is roughly the expected points after an offensive rebound. Don't quote me on it, but that seems like a reasonable estimate (we're all about reason here). At any rate, once again, the regression results appear more or less intuitive based on our theoretical estimates.
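The offensive rebounding arithmetic can be sketched in Python as follows. It follows the rounding used in the paragraph above (a 26% baseline and a 30% rate after the extra rebound), so treat it as a back-of-the-envelope check rather than an exact calculation; the variable names are mine.

```python
PPP = 1.07              # points per possession, from the post
BASE_OREB_PCT = 26.0    # league-average OREB%, from the post

# One extra offensive rebound: 12 ORB and 28 DRB instead of 11 and 29.
orb_after, drb_after = 12, 28
oreb_pct_after = 100 * orb_after / (orb_after + drb_after)

# %-points of OREB% corresponding to that single extra rebound.
step = oreb_pct_after - BASE_OREB_PCT

# Theoretical value of a 1 %-point increase in OREB%.
per_point = PPP / step

# Reverse direction: points per extra rebound implied by the average
# of the two regression coefficient magnitudes.
implied_pts_per_orb = (0.31 + 0.41) / 2 * step

print(oreb_pct_after, step, round(per_point, 2), round(implied_pts_per_orb, 2))
# 30.0 4.0 0.27 1.44
```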


There are roughly 30 FTA per 100 FGA. What does 1 extra FTA mean? Your initial thought is probably 1 point, but the regression coefficients are only 0.14/0.11, so the true value is probably well below that. What's the problem? Well, remember that when a player gets fouled, the alternative would have been a FGA or a TOV. The estimate that people have come up with is that 1 FTA is roughly equivalent to 0.44 possessions. So that 1 extra FTA uses up roughly 0.44 possessions. We would expect 1.07*0.44=0.47 pts to be scored on that "partial" possession anyway. But we would expect that extra FTA to result in 0.76 points (76% FT% * 1 FTA). The difference is 0.76-0.47=0.29 pts. That's better, but it's still quite a bit higher than the coefficient determined by regression. Of the four factors, this one appears to be the most disconnected from theory. If you have any ideas, let me know!
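The free throw estimate, sketched in Python with the numbers from the paragraph above (the variable names are mine):

```python
PPP = 1.07            # points per possession, from the post
FT_PCT = 0.76         # league-average FT%, from the post
POSS_PER_FTA = 0.44   # possessions used per FTA, from the post

# Points an average "partial" possession would have produced anyway.
baseline_pts = PPP * POSS_PER_FTA

# Points expected from the extra free throw attempt itself.
fta_pts = FT_PCT * 1

# Net value of one extra FTA over the average alternative.
net_pts = fta_pts - baseline_pts

print(round(baseline_pts, 2), round(net_pts, 2))  # 0.47 0.29
```

Even after netting out the possession cost, the result is roughly two to three times the regression coefficients, which is the gap described above.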


2 thoughts on “A Comparison of Regressed Four Factor Weights vs. Theoretical Expectations”

  1. When I did four-factors out-of-sample offensive control, FTR was the one that didn't add up to roughly 100%
    i.e., where FTR = teamA OFTR * a + teamB DFTR * b
    (a+b) < 1

    Perhaps this is from the error from referees?
