Quantifying Contributions of Individual Play Types to Overall Team Shooting Efficiency

In my last post I summarized 2010-11 Synergy Data. Here, I'm going to dig a little deeper into those stats. Just to refresh your memory, here are the 11 play categories that Synergy tracks:

• Isolation (ISO)
• P&R Ball Handler (BALL)
• Post-Up (POST)
• P&R Man (ROLL)
• Spot-Up (SPOT)
• Off Screen (SCREEN)
• Hand off (HAND)
• Cut (CUT)
• Offensive Rebound (REB)
• Transition (TRANS)
• All other plays (OTHER)

Recall that the overall PPP for a team is:

$PPP=\sum_{i=1}^{11} PPP_i \cdot RATE_i$

$\sum_{i=1}^{11} RATE_i = 1$
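As a quick illustration of the weighted sum above, here is a minimal Python sketch. The per-play numbers are hypothetical values I made up for the example (they are not Synergy data), and the `overall_ppp` helper is just an illustrative name.

```python
# Hypothetical (PPP_i, RATE_i) pairs for one team, one per Synergy play type.
# These values are invented for illustration only.
plays = {
    "ISO":    (0.85, 0.12),
    "BALL":   (0.80, 0.10),
    "POST":   (0.88, 0.10),
    "ROLL":   (1.00, 0.05),
    "SPOT":   (0.98, 0.20),
    "SCREEN": (0.90, 0.05),
    "HAND":   (0.85, 0.03),
    "CUT":    (1.20, 0.08),
    "REB":    (1.05, 0.06),
    "TRANS":  (1.15, 0.13),
    "OTHER":  (0.45, 0.08),
}


def overall_ppp(plays):
    """Return sum_i PPP_i * RATE_i, after checking that the rates sum to 1."""
    total_rate = sum(rate for _, rate in plays.values())
    if abs(total_rate - 1.0) > 1e-9:
        raise ValueError(f"rates sum to {total_rate}, expected 1")
    return sum(ppp * rate for ppp, rate in plays.values())


team_ppp = overall_ppp(plays)
print(f"Overall PPP: {team_ppp:.3f}")
```

Because the rates are constrained to sum to 1, raising the frequency of one play type necessarily lowers the frequency of at least one other, which is worth keeping in mind when reading the rate coefficients in the regressions.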

What I want to know is how the efficiency ($PPP_i$) and frequency ($RATE_i$) of each play type contribute to overall $PPP$ on offense and defense. One way of addressing this is to fit a multiple linear regression model using some or all of these quantities as predictors. So, I did that. Here are the results:

Offense

(Note that all values were standardized before running the model.)

```
Call:
lm(formula = TOT ~ SPOTPPPI + REBPPPI + TRANSPPPI + TRANSRATE +
    ISOPPPI + POSTPPPI + OTHERPPPI + ISORATE, data = PPP2011_off_pivot)

Residuals:
     Min       1Q   Median       3Q      Max
-0.39784 -0.13765  0.02817  0.11230  0.35331

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.004622   0.042871   0.108  0.91516
SPOTPPPI     0.526211   0.056530   9.308 6.69e-09 ***
REBPPPI      0.288688   0.048475   5.955 6.54e-06 ***
TRANSPPPI    0.253230   0.055811   4.537  0.00018 ***
TRANSRATE    0.247018   0.048504   5.093 4.82e-05 ***
ISOPPPI      0.205189   0.062470   3.285  0.00353 **
POSTPPPI     0.197200   0.052158   3.781  0.00110 **
OTHERPPPI    0.172761   0.049844   3.466  0.00231 **
ISORATE     -0.139141   0.049104  -2.834  0.00995 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2348 on 21 degrees of freedom
Multiple R-squared: 0.9601, Adjusted R-squared: 0.9448
F-statistic:  63.1 on 8 and 21 DF,  p-value: 5.974e-13
```
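The adjusted R-squared in the output above can be recovered from the multiple R-squared with the standard adjustment formula. The sketch below is in Python rather than R, and the function name `adjusted_r2` is my own. It assumes 30 observations (one per team) and 8 predictors, which is consistent with the 21 residual degrees of freedom reported (30 - 8 - 1).

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Multiple R-squared as printed in the offensive model summary (rounded to 4 decimals).
offense_adj = adjusted_r2(0.9601, n_obs=30, n_predictors=8)
print(f"Adjusted R-squared (offense): {offense_adj:.4f}")
```

Since the input R-squared is itself rounded, the result agrees with the reported 0.9448 only to about three decimal places.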

The parameters are listed in descending order of importance, judged by the absolute size of the standardized regression coefficients. Only 8 of a possible 22 variables were needed to reach an adjusted $R^2$ of just under 95%. Note that only two rates appear on the list: TRANS and ISO. Interestingly, ISORATE has a negative coefficient: holding the other variables in the model fixed, teams that ran isolation plays more often tended to have a lower PPP. The other six parameters in the model all represent efficiencies: SPOT, REB, TRANS, ISO, POST, and OTHER. None of the pick and roll or screen plays made the cut. Let's look at how teams stacked up in these categories (except for PPP, all values shown here have been standardized):

MODEL represents the value predicted by the model, which should be compared to SDPPP. The columns have been arranged left to right (starting with SPOTPPPI) in order of importance in the model.

| RANK | TEAM | PPP | SDPPP | MODEL | SPOTPPPI | REBPPPI | TRANSPPPI | TRANSRATE | ISOPPPI | POSTPPPI | OTHERPPPI | ISORATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | DEN | 0.99 | 1.63 | 1.54 | 1.99 | 0.23 | 0.05 | 1.30 | 0.49 | 0.98 | 0.61 | 2.21 |
| 2 | MIA | 0.98 | 1.32 | 1.39 | 0.79 | -0.16 | 1.77 | 0.27 | 1.81 | 0.37 | 0.44 | 0.12 |
| 3 | DAL | 0.98 | 1.32 | 1.43 | 1.48 | -1.12 | -0.44 | 0.14 | 1.37 | 2.36 | 0.61 | -1.45 |
| 4 | SAS | 0.98 | 1.32 | 1.33 | 1.30 | 0.43 | 0.54 | -0.34 | 1.37 | -0.09 | 0.28 | -1.15 |
| 5 | PHO | 0.98 | 1.32 | 1.29 | 1.65 | 1.01 | -0.93 | 0.00 | -0.84 | 0.37 | 1.75 | -1.22 |
| 6 | NYK | 0.97 | 1.00 | 0.90 | 0.79 | 0.04 | -0.44 | 0.69 | 1.15 | 1.44 | 0.44 | 1.26 |
| 7 | HOU | 0.97 | 1.00 | 1.40 | 1.65 | -0.74 | 0.54 | 0.41 | 0.93 | 0.37 | 1.42 | 0.04 |
| 8 | OKC | 0.97 | 1.00 | 0.85 | -0.24 | 0.62 | 1.52 | 0.69 | 0.71 | -1.47 | 3.06 | 0.99 |
| 9 | LAL | 0.96 | 0.69 | 0.37 | 0.27 | 1.20 | -0.69 | -1.30 | 1.37 | 0.98 | 0.28 | 1.03 |
| 10 | BOS | 0.96 | 0.69 | 0.61 | -0.58 | 0.81 | 0.79 | 0.62 | 0.49 | 0.83 | -0.70 | -1.34 |
| 11 | GSW | 0.95 | 0.38 | 0.02 | 1.13 | -1.90 | 1.28 | 0.69 | 0.27 | -1.63 | -0.87 | 0.76 |
| 12 | UTA | 0.95 | 0.38 | 0.26 | -0.41 | 0.04 | 0.54 | 0.75 | 0.27 | 0.98 | -1.36 | -0.92 |
| 13 | ORL | 0.95 | 0.38 | 0.26 | 0.79 | 2.17 | -2.41 | -1.30 | -0.18 | 0.68 | -1.20 | -1.87 |
| 14 | CHI | 0.94 | 0.06 | 0.03 | -0.76 | 0.62 | 0.79 | -0.41 | 1.59 | -0.86 | -0.87 | -1.03 |
| 15 | PHI | 0.94 | 0.06 | 0.20 | -0.76 | -0.54 | 1.03 | 1.30 | -0.40 | 0.98 | 0.93 | 0.73 |
| 16 | ATL | 0.94 | 0.06 | -0.13 | 0.10 | 1.20 | -0.20 | -0.55 | -0.62 | 0.06 | -0.70 | 0.76 |
| 17 | NOH | 0.93 | -0.25 | 0.07 | -0.24 | 0.23 | 1.28 | -1.78 | 0.49 | 0.37 | 0.44 | 0.04 |
| 18 | DET | 0.93 | -0.25 | -0.26 | 0.45 | 0.43 | -0.44 | -0.07 | 0.04 | -1.01 | -0.70 | 1.30 |
| 19 | MEM | 0.93 | -0.25 | 0.01 | -1.44 | 1.78 | 0.54 | 0.07 | -0.18 | 0.37 | -0.05 | -0.53 |
| 20 | POR | 0.93 | -0.25 | -0.54 | -0.76 | -0.74 | 1.28 | -1.51 | -1.06 | 0.06 | 0.93 | -1.15 |
| 21 | TOR | 0.92 | -0.56 | -0.63 | -1.44 | -0.16 | -0.20 | 0.96 | 0.27 | 0.52 | -1.03 | -0.04 |
| 22 | CHA | 0.91 | -0.88 | -0.70 | -0.93 | 1.40 | 0.30 | -1.16 | -0.62 | 0.06 | -1.20 | 0.57 |
| 23 | IND | 0.91 | -0.88 | -0.76 | -0.76 | -0.54 | -0.69 | 1.16 | -0.62 | -0.25 | -0.87 | -0.04 |
| 24 | NJN | 0.91 | -0.88 | -1.16 | -0.58 | -1.32 | -0.44 | -1.37 | -0.62 | 0.22 | -0.05 | -0.50 |
| 25 | LAC | 0.91 | -0.88 | -1.05 | -1.10 | -0.54 | -0.20 | 0.55 | -2.16 | -0.25 | 0.12 | -0.50 |
| 26 | MIL | 0.89 | -1.51 | -1.33 | -0.41 | -1.12 | -0.44 | -1.71 | -0.40 | -1.17 | -0.05 | -0.50 |
| 27 | MIN | 0.89 | -1.51 | -1.16 | 0.45 | -0.54 | -1.18 | -1.16 | -1.06 | -1.01 | -1.20 | 0.23 |
| 28 | SAC | 0.89 | -1.51 | -1.57 | -1.27 | -0.74 | -0.93 | 0.75 | -1.06 | -1.78 | 0.12 | 0.65 |
| 29 | CLE | 0.89 | -1.51 | -1.54 | -0.58 | -1.32 | -1.92 | 0.75 | -1.28 | -1.17 | -0.21 | 0.15 |
| 30 | WAS | 0.89 | -1.51 | -1.27 | -0.76 | -0.93 | -0.69 | 1.51 | -1.28 | -1.32 | -0.54 | 1.34 |
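For readers wondering what "standardized" means here, the SDPPP column in the table above appears to be a plain z-score of PPP across the 30 teams. That is my reading rather than something stated explicitly, so treat the sketch below as an assumption: it uses Python's `statistics` module with the sample standard deviation (n - 1 denominator) and the rounded PPP values copied from the table.

```python
from statistics import mean, stdev

# Offensive PPP by team, copied from the table above (already rounded to 2 decimals).
off_ppp = {
    "DEN": 0.99, "MIA": 0.98, "DAL": 0.98, "SAS": 0.98, "PHO": 0.98,
    "NYK": 0.97, "HOU": 0.97, "OKC": 0.97, "LAL": 0.96, "BOS": 0.96,
    "GSW": 0.95, "UTA": 0.95, "ORL": 0.95, "CHI": 0.94, "PHI": 0.94,
    "ATL": 0.94, "NOH": 0.93, "DET": 0.93, "MEM": 0.93, "POR": 0.93,
    "TOR": 0.92, "CHA": 0.91, "IND": 0.91, "NJN": 0.91, "LAC": 0.91,
    "MIL": 0.89, "MIN": 0.89, "SAC": 0.89, "CLE": 0.89, "WAS": 0.89,
}

values = list(off_ppp.values())
mu = mean(values)
sigma = stdev(values)  # sample standard deviation (n - 1 denominator)

# z-score: how many standard deviations each team sits from the league mean.
sdppp = {team: (ppp - mu) / sigma for team, ppp in off_ppp.items()}

for team in ("DEN", "CHI", "WAS"):
    print(team, round(sdppp[team], 2))
```

With these rounded inputs the z-scores line up with the SDPPP column to two decimals for the teams I checked, which supports the z-score reading. The other columns were presumably standardized the same way from their raw values, which are not shown here.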

Defense

```
Call:
lm(formula = TOT ~ SPOTPPPI + POSTPPPI + BALLPPPI + TRANSPPPI +
    OTHERPPPI + ISOPPPI + REBPPPI + TRANSRATE, data = PPP2011_def_pivot)

Residuals:
     Min       1Q   Median       3Q      Max
-0.36528 -0.15750  0.04531  0.12798  0.28213

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.001084   0.038752   0.028  0.97795
SPOTPPPI    0.405595   0.054487   7.444 2.56e-07 ***
POSTPPPI    0.255369   0.045460   5.617 1.42e-05 ***
BALLPPPI    0.193931   0.060888   3.185  0.00446 **
TRANSPPPI   0.187854   0.055651   3.376  0.00286 **
OTHERPPPI   0.155197   0.043497   3.568  0.00182 **
ISOPPPI     0.153785   0.048829   3.149  0.00484 **
REBPPPI     0.134633   0.045206   2.978  0.00717 **
TRANSRATE   0.131741   0.046743   2.818  0.01029 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2122 on 21 degrees of freedom
Multiple R-squared: 0.9673, Adjusted R-squared: 0.9548
F-statistic: 77.63 on 8 and 21 DF,  p-value: 7.484e-14
```

One might expect the defensive model to be virtually the same as the offensive one, but that is not entirely the case here. There are several differences. ISO rate drops off the list, while BALL efficiency moves onto it. TRANSRATE and REBPPPI move down the list. POSTPPPI moves up slightly. What retains its importance, not surprisingly, is defending SPOT plays.

Note that on defense, negative standardized values are better.

| RANK | TEAM | PPP | SDPPP | MODEL | SPOTPPPI | POSTPPPI | BALLPPPI | TRANSPPPI | OTHERPPPI | ISOPPPI | REBPPPI | TRANSRATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CHI | 0.84 | -2.15 | -1.98 | -2.11 | -0.48 | -1.08 | 0.04 | -0.97 | -2.81 | -1.24 | -0.36 |
| 2 | BOS | 0.84 | -2.15 | -1.78 | -1.47 | -1.60 | -1.33 | -1.04 | 0.05 | -1.02 | -0.51 | -0.82 |
| 3 | ORL | 0.86 | -1.42 | -1.64 | -1.03 | -0.70 | -2.29 | -1.25 | 0.82 | 0.26 | -1.97 | -2.00 |
| 4 | MIL | 0.86 | -1.42 | -1.38 | -1.68 | 0.20 | -1.08 | -0.39 | -0.72 | -1.79 | -0.15 | -0.46 |
| 5 | MIA | 0.87 | -1.06 | -1.34 | -1.25 | -0.70 | -1.33 | -0.82 | 0.05 | -0.77 | -1.06 | 0.09 |
| 6 | LAL | 0.87 | -1.06 | -1.06 | -0.39 | -2.50 | -1.08 | -0.39 | 0.56 | -0.51 | -0.15 | 0.18 |
| 7 | PHI | 0.88 | -0.69 | -0.48 | -0.39 | -0.03 | 0.84 | -0.61 | -0.46 | 0.26 | -1.61 | -0.91 |
| 8 | DAL | 0.88 | -0.69 | -0.74 | -1.03 | -0.70 | -0.12 | 1.12 | -1.74 | -0.77 | 0.95 | -0.55 |
| 9 | SAS | 0.89 | -0.33 | 0.01 | 1.12 | 1.09 | -1.33 | -1.04 | -1.74 | 0.51 | 0.77 | -1.36 |
| 10 | NOH | 0.89 | -0.33 | -0.42 | 0.04 | -0.48 | -0.12 | -0.17 | -0.46 | 1.02 | -1.24 | -1.36 |
| 11 | MEM | 0.89 | -0.33 | -0.51 | -0.17 | -1.37 | 0.60 | 0.26 | -1.23 | -0.51 | -0.15 | 0.27 |
| 12 | IND | 0.89 | -0.33 | -0.32 | -0.39 | -0.48 | -0.12 | -0.17 | -0.46 | 1.54 | -0.88 | -0.27 |
| 13 | POR | 0.90 | 0.04 | -0.04 | 1.55 | -0.25 | -0.84 | -1.47 | -1.23 | -0.51 | 0.40 | 0.36 |
| 14 | OKC | 0.90 | 0.04 | -0.10 | 0.04 | 0.42 | -0.60 | -1.47 | 1.33 | -0.51 | 0.95 | -0.64 |
| 15 | CHA | 0.90 | 0.04 | -0.05 | -0.17 | -0.25 | 0.36 | -1.04 | -0.46 | -0.51 | 1.50 | 1.18 |
| 16 | ATL | 0.90 | 0.04 | -0.14 | -1.03 | -0.25 | 0.60 | 0.69 | -1.23 | 0.77 | 2.05 | -0.82 |
| 17 | WAS | 0.91 | 0.40 | 0.61 | 0.26 | 1.09 | 0.60 | 0.26 | -0.46 | 0.00 | -0.15 | 1.18 |
| 18 | LAC | 0.91 | 0.40 | 0.43 | 0.47 | 1.09 | 0.12 | -0.61 | 1.59 | -1.02 | -0.88 | 0.64 |
| 19 | HOU | 0.91 | 0.40 | 0.59 | 0.26 | -0.48 | 0.36 | 0.48 | 1.07 | 1.54 | 0.04 | 0.27 |
| 20 | DEN | 0.91 | 0.40 | 0.29 | 0.04 | 0.20 | 0.12 | -1.69 | 1.33 | 1.02 | 0.77 | 0.36 |
| 21 | UTA | 0.92 | 0.76 | 0.78 | 1.12 | -0.70 | 1.57 | 0.48 | 1.07 | 1.02 | -1.43 | -0.18 |
| 22 | SAC | 0.92 | 0.76 | 1.04 | 0.04 | -0.48 | 0.84 | 1.34 | 1.07 | 1.28 | 0.95 | 1.82 |
| 23 | PHO | 0.92 | 0.76 | 0.51 | 0.26 | 1.54 | 0.60 | -0.17 | 0.31 | 0.00 | -0.33 | -0.55 |
| 24 | NYK | 0.92 | 0.76 | 1.01 | 1.77 | 0.42 | 0.12 | -0.17 | 0.31 | 0.77 | 0.22 | 0.00 |
| 25 | NJN | 0.92 | 0.76 | 0.89 | -0.39 | 1.09 | 1.08 | 1.12 | 1.59 | 0.00 | 1.13 | -0.36 |
| 26 | GSW | 0.92 | 0.76 | 0.60 | 0.69 | 0.87 | -1.08 | 0.48 | -0.97 | 1.02 | -0.15 | 1.73 |
| 27 | MIN | 0.93 | 1.13 | 1.01 | 0.26 | 0.87 | 0.36 | 1.12 | 1.33 | -0.26 | 0.22 | 1.55 |
| 28 | TOR | 0.94 | 1.49 | 1.41 | 1.34 | 1.77 | 1.33 | 1.12 | 0.05 | 0.00 | 0.40 | -0.82 |
| 29 | DET | 0.94 | 1.49 | 1.45 | 0.47 | 1.54 | 0.84 | 1.99 | -0.21 | -0.77 | 1.50 | 2.09 |
| 30 | CLE | 0.94 | 1.49 | 1.33 | 1.77 | -0.70 | 1.81 | 1.77 | 0.05 | 0.77 | 0.04 | -0.18 |
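To make the MODEL column concrete: it is the fitted value from the regression, i.e. the intercept plus each coefficient times the team's standardized value for that variable. Here is a small Python sketch (not the original R workflow) that recomputes it for the top two defenses, using the coefficients from the defensive `lm()` output and the rounded inputs from the table above. The `predict` helper is a name I chose for illustration.

```python
# Coefficients copied from the defensive lm() output.
DEF_COEF = {
    "(Intercept)": 0.001084,
    "SPOTPPPI": 0.405595,
    "POSTPPPI": 0.255369,
    "BALLPPPI": 0.193931,
    "TRANSPPPI": 0.187854,
    "OTHERPPPI": 0.155197,
    "ISOPPPI": 0.153785,
    "REBPPPI": 0.134633,
    "TRANSRATE": 0.131741,
}

# Standardized inputs copied from the defensive table (rounded to 2 decimals).
TEAMS = {
    "CHI": {"SPOTPPPI": -2.11, "POSTPPPI": -0.48, "BALLPPPI": -1.08,
            "TRANSPPPI": 0.04, "OTHERPPPI": -0.97, "ISOPPPI": -2.81,
            "REBPPPI": -1.24, "TRANSRATE": -0.36},
    "BOS": {"SPOTPPPI": -1.47, "POSTPPPI": -1.60, "BALLPPPI": -1.33,
            "TRANSPPPI": -1.04, "OTHERPPPI": 0.05, "ISOPPPI": -1.02,
            "REBPPPI": -0.51, "TRANSRATE": -0.82},
}


def predict(coef, row):
    """Linear prediction: intercept + sum of coefficient * standardized value."""
    return coef["(Intercept)"] + sum(coef[name] * value for name, value in row.items())


model = {team: predict(DEF_COEF, row) for team, row in TEAMS.items()}

for team, value in model.items():
    print(team, round(value, 2))
```

Because the inputs are rounded to two decimals, small differences from the MODEL column in the last digit are possible for other teams. The same calculation with the offensive coefficients and the offensive table gives the offensive MODEL values.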

Summary

What these regression models suggest is that, for the most part, efficiency accounts for overall team efficiency much more than play frequency does. In other words, whatever plays you run or defend, the key is to run them efficiently, not simply more often. It's not how many post plays you run, but how efficiently you run them. It's not how many spot-up plays you generate, but how efficiently you hit those shots. And so on, at least within the range of play frequencies that NBA teams typically run. I'm certainly not suggesting that a team could run all post plays or no post plays and still hope to compete. That's not how it works.

What the data show is the result of years of optimization of personnel and strategies by players, coaches, and GMs. What I would suggest, however, is that the models shown here represent the current state of the NBA as of 2011. If I had access to previous years of data, my strong guess is that the models would look vastly different. Regression models are meant to explain the data that are fed to them, and should not be used to extrapolate or predict results for parameter values outside that range. This should go without saying, but I say it nevertheless, to shield myself from those obvious questions.

Comments

Ok, since they say "play" they probably just treat the play after the offensive rebound as a new play but are not focusing on offensive rebounding directly. TOs are in it, at least TOs that happen after you can categorize the type of play being run.

Right. I assume "REB" plays are tip-ins and basically any play that occurs immediately without resetting the offense in the halfcourt scheme.

Does this model based on Synergy data essentially ignore the factors of turnovers and offensive rebounds? I am not sure if Synergy is rolling that information into the reported PPP data or not.

This is all I know, they say, "Plays ending in FGA, TO, and FTs".

I only went with the stats I could find in the above tables... "for the offense + defense comparison". I just went with the listed / most important stats in order from the regressions earlier for the offense-offense comparison and defense comparison.

Record of the better 2nd round team on their Offense + opp. Defense comparison (recognizing negative is better for the defense): PPP 3-1, SDPPP 2-2, MODEL 4-0, SPOTPPPI 4-0, POSTPPPI 2-2, TRANSPPPI 3-1, OTHERPPPI 3-1, REBPPPI 0-4, TRANSRATE 2-2.

Thanks for running this. Kind of interesting that the model outpredicted PPP itself. Are you planning to check the first round? If not, I can do it.

I wasn't planning on checking the first round, so feel free. The 3rd and 4th rounds will be most interesting to me to check back on and see, especially if you want to do it. I only went with the stats I could find in the above tables.

In the first round, teams went 5-3 if they had the advantage according to the model. The only pattern I can see is that all the teams that were >2 SD units better than their opponent won (DAL, CHI, MIA, BOS). ORL was 1.91 SD units better than ATL, but they lost. SAS was 0.8 SD units better than MEM. DEN was 0.3 SD units better than OKC. It's only 8 matchups, so hardly enough to make a rule yet.

And in the second round, obviously no team was >2 SD units better.

But maybe I should have done crosses of them instead of like-with-like comparisons.

For defense: SPOTPPPI 2-2, POSTPPPI 1-3, BALLPPPI 2W-1 tie-1L, TRANSPPPI 2-2, OTHERPPPI 1W-1 tie-1L, ISOPPPI 2W-1 tie-1L, REBPPPI 2-2, TRANSRATE 2-2. Looks like offensive superiority predicted the second round better.

Second round W-L results of the better team in that second round matchup on the following offensive stats: SPOTPPPI 3W-1L, REBPPPI 0-4, TRANSPPPI 4-0, TRANSRATE 3-1, ISOPPPI 3W-1 tie, POSTPPPI 1-3, OTHERPPPI 3-1, ISORATE 2-2.

Yeah, I didn't expect the public version would, but I would hope that the pro version that teams use has them, or will, or that they track that themselves. When I wrote "scrabbles", I meant "scramble", but scrabbles might capture the idea of making something out of the situation fairly well, by accident.

I was waiting back a bit, but thanks for the spreadsheets and now the regressions too. I think I agree with your main conclusion, but I will have to spend more time with the detail to decide what lesson(s) to take from it all. One initial thing I wonder about is the average time of shot by play / shot type and the distribution of the times. Which of these plays / shots besides transition are more often in early offense, and which are more often in late offense? Late offense often becomes a scrabbles for any kind of decent shot or even pass the hot potato to avoid hurting individual stats and taking blame. Does the Synergy data give time of shots? How much of the variation in efficiency between play / shot types might be associated with or due to time of shot? If one had the Synergy play-by-play database available to query, one could look at various play sequencing questions, something I have been curious about for some time but without having the database or time to pursue it. There would probably be both simple and advanced techniques that could be applied to analyze this data. I wonder how much and how well teams have investigated play sequencing to date and learned from it, and even sequencing of actions towards a certain type of shot outcome within single possessions. And how successful they have been at changing the behavior of coaches and players based on those findings.

The data I am using do not have those times, unfortunately.

I am guessing that "Other" includes desperation shots, judging from the abysmal results.