Year-to-Year Correlation of A4PM and Most Increasingly Positive Player Award

One of the questions that often comes up when discussing player metrics involves year-to-year correlation (i.e. how consistent is a metric across years?). In fact, one of the main criticisms leveled against adjusted +/- (APM or RAPM) is that it's not "very" consistent. (The quotes are there because this is clearly a somewhat subjective term.) This post is not going to be about that debate, as it's been covered elsewhere many times, and significantly better and in more depth than I care to spend time on at the moment. But since the question is often asked, and has been raised about my new(ish) A4PM metric, I wanted to address it a bit. It's also a good prelude to looking at "Most Improved Player", or to be safer (by acknowledging that "improvement" is subject to the validity of the metric), what I'm calling the "Most Increasingly Positive" player (according to A4PM), which is factually true, if nothing else.

To calculate $R^2$ values (a measure of correlation), I used a lower threshold of 1000 possessions played in 2011 and 2012, which leaves 246 players. I ran a simple linear regression weighted by the total possessions played in the two seasons. The function call in R looks like:

```
Call: lm(formula = A4PM12 ~ A4PM11 + 0,
         weights = POSS11 + POSS12)
```
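
For anyone who wants to try this at home, here is a minimal sketch of how such a fit might be set up. The data frame `players` and every number in it are synthetic and made up for illustration (only the column names come from the call above), so the resulting value says nothing about the real A4PM results. One thing worth knowing: `+ 0` drops the intercept, and for such models R reports $R^2$ relative to zero rather than relative to the mean, which the last few lines check by hand.

```r
set.seed(1)

# Synthetic stand-in data: NOT real player ratings or possession counts.
n <- 246
players <- data.frame(
  A4PM11 = rnorm(n, mean = 0, sd = 2),
  POSS11 = round(runif(n, 1000, 6000)),
  POSS12 = round(runif(n, 1000, 6000))
)
# Assumption for illustration: this season's rating is a shrunken copy of
# last season's rating plus noise.
players$A4PM12 <- 0.5 * players$A4PM11 + rnorm(n, sd = 2)

# Weighted linear regression through the origin, mirroring the call above.
fit <- lm(A4PM12 ~ A4PM11 + 0, data = players, weights = POSS11 + POSS12)
r2 <- summary(fit)$r.squared

# For a no-intercept model, R computes R^2 against zero, not the mean.
w <- players$POSS11 + players$POSS12
explained <- sum(w * fitted(fit)^2)
unexplained <- sum(w * resid(fit)^2)
r2_by_hand <- explained / (explained + unexplained)

print(c(lm = r2, by_hand = r2_by_hand))
```

Because of that definition, $R^2$ values from a through-the-origin fit are not directly comparable to those from a fit with an intercept.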

For A4PM, I found that $R^2 = 0.24$ and for RAPM, a somewhat lower $R^2=0.16$. That seems like a fairly large difference, and gives me a bit more confidence that A4PM will turn out to be a good predictor when averaged over multiple seasons (in the same way that Jeremias Engelmann calculates his multi-year RAPM). Of course, I can't say at this point whether A4PM would beat Jerry's "gold standard" RAPM. So let that be a clear disclaimer.

With that said, here are some plots, so you can see for yourself that A4PM appears to give a tighter fit:

Here are the same plots with player names instead of dots:


In these plots, being closer to the upper-right corner means a player has been consistently better in both seasons, whereas being closer to the opposite (lower-left) corner means the player has been consistently worse. To find players who "improved" or "became increasingly positive", look toward the upper-left for players who had low ratings last season but more positive ratings this season. Good examples here are Larry Sanders and Beno Udrih (according to A4PM).

While I'm on the topic, I might as well hand out the award for Most Improved Player... whoops, I mean the player most increasingly positive according to A4PM in a way that is factually accurate and not at all intended to be inflammatory. Here is a table of Δ-values (A4PM12 - A4PM11) for all players who played at least 1000 possessions in 2011 and at least 2500 possessions in 2012. (I've given the possession data in the last two columns, so if you want to use your own cutoffs, have at it.)
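
If you do want to play with the cutoffs, the filtering and sorting might look something like the sketch below. The four players and all of their numbers are invented placeholders, and the `PLAYER` and `DELTA` column names are my own assumptions; only the two thresholds and the A4PM/POSS naming come from this post.

```r
# Synthetic stand-in data: NOT real player ratings or possession counts.
players <- data.frame(
  PLAYER = c("Player A", "Player B", "Player C", "Player D"),
  A4PM11 = c(-1.5, 2.0, 0.5, -0.8),
  A4PM12 = c(1.0, 1.2, 3.1, -2.0),
  POSS11 = c(1200, 4000, 900, 3000),
  POSS12 = c(3000, 5000, 4000, 2600)
)

# Thresholds from the post; change these to use your own cutoffs.
min_poss11 <- 1000
min_poss12 <- 2500

# Change in rating from last season to this season.
players$DELTA <- players$A4PM12 - players$A4PM11

# Keep qualifying players and sort from most to least increasingly positive.
qualified <- subset(players, POSS11 >= min_poss11 & POSS12 >= min_poss12)
ranked <- qualified[order(-qualified$DELTA), ]
print(ranked[, c("PLAYER", "DELTA", "POSS11", "POSS12")], row.names = FALSE)
```

In this toy data, "Player C" has the largest jump but is dropped for falling short of the 2011 threshold, which shows how much the cutoffs can shape the list.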


You should keep three things in mind when looking at that table: 1) regression to the mean; 2) aging curves; and 3) noise. Don't be surprised to see young players at the top and older players at the bottom, for the first two reasons. Because of the third, don't be surprised to see outliers either (they're everywhere!). There are some really interesting names on the list at both ends (Tyson Chandler?!). Anyway, it's still an interesting list, and it correlates somewhat with the MIP discussion in the media at large (e.g. Dragic, Hayward, Rush, Gallo). It looks like Blake Griffin and Russell Westbrook also took significant steps forward this season. Just throwing this out there, but my bet is that next season one of those two, or maybe both, will be in serious discussion for league MVP.

Ok, now some more fun plots to wrap up the post. Here are year-to-year plots (with player names) for each of the 8 components of A4PM. But first, a table of $R^2$ values for each factor in descending order:

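| RANK | FACTOR | $R^2$ |
|------|--------|-------|
| 1 | OEFG | 0.20 |
| 2 | OREB | 0.16 |
| 3 | OFTR | 0.16 |
| 4 | DTOR | 0.14 |
| 5 | OTOR | 0.13 |
| 6 | DFTR | 0.12 |
| 7 | DEFG | 0.08 |
| 8 | DREB | 0.07 |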
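
A ranking like this one could be produced by repeating the same weighted, no-intercept regression once per factor. The sketch below does that on synthetic data, so its numbers will not match the table. Column names such as `OEFG11` and `OEFG12` are assumed by analogy with `A4PM11` and `A4PM12`, and the 0.4 carry-over used to generate the fake data is arbitrary.

```r
set.seed(2)

factors <- c("OEFG", "OREB", "OFTR", "OTOR", "DEFG", "DREB", "DFTR", "DTOR")
n <- 246

# Synthetic stand-in data: NOT real ratings or possession counts.
players <- data.frame(
  POSS11 = round(runif(n, 1000, 6000)),
  POSS12 = round(runif(n, 1000, 6000))
)
for (f in factors) {
  players[[paste0(f, "11")]] <- rnorm(n)
  players[[paste0(f, "12")]] <- 0.4 * players[[paste0(f, "11")]] + rnorm(n)
}

# Fit e.g. OEFG12 ~ OEFG11 + 0 for each factor and collect the R^2 values.
r2 <- sapply(factors, function(f) {
  form <- as.formula(paste0(f, "12 ~ ", f, "11 + 0"))
  summary(lm(form, data = players, weights = POSS11 + POSS12))$r.squared
})

# Sort in descending order of R^2 and attach a rank.
r2_table <- data.frame(FACTOR = factors, R2 = unname(r2))
r2_table <- r2_table[order(-r2_table$R2), ]
r2_table$RANK <- seq_len(nrow(r2_table))
print(r2_table[, c("RANK", "FACTOR", "R2")], row.names = FALSE, digits = 2)
```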

Going forward, it will be important to keep these values in mind when looking at which players had surprisingly good or bad seasons according to +/-. For example, if a player has a very high DREB rating, it is more likely to be due to noise. Conversely, if a player has a high OEFG rating, that is more likely to be "real". Of course, as I said earlier, some form of multi-season averaging is clearly needed to improve predictive capability. Ok, now here are the plots like I promised.

For OTOR, keep in mind that negative ratings are better (i.e. the player's own team has fewer turnovers when he is on the floor).

For DFTR, a negative rating means the opponent went to the line less often.

For DEFG, a negative rating means the opponent had a lower eFG%.

DREB is somewhat of a misnomer. Really, this is the change in the opponent's OREB% when the player is on. Therefore, negative ratings are better here.

2 thoughts on “Year-to-Year Correlation of A4PM and Most Increasingly Positive Player Award”

1. A4PM will be better than rAPM, since out-of-sample four-factors have significantly different coefficients than in-sample four-factors. Turnovers be extra important!

2. crow says:

Adjusted 4 factors measures the impact of a player on the team's 4 factors, directly or indirectly. I believe that is the case for your version too. So while a change in the factors probably generally involves the player's individual production on that factor changing in that direction, it may also involve a change in their indirect impact on that factor. With this data alone you can't say for sure how much of the change was direct or indirect, or in which direction for each. But if you split the 4 factors (offense and defense) into direct and indirect player impacts (or call them local and global impacts), by taking the total impact on a specific factor from the A4PM method and subtracting the estimated value of direct individual boxscore impact (from one of your other metrics) to find the estimated indirect impact, then you would have even more information to work with. And you would be doing something I proposed over and over years ago, before Joe Sill did the first cut of adjusted 4 factors, and something I did with his data after he produced it. Many folks couldn't follow what I was describing again and again, or just weren't interested in it. I either partly got the idea from what I saw at protrade years ago or used it as support and validation for the already formed idea.