ezPM 1.0: Now with Play-By-Play Data!

(Note #1: If you haven't read my introduction to the ezPM model, you might want to do that first and then come back.)

(Note #2: The "1.0" modifier in the title signifies that this model is a work-in-progress, not an end in itself. This is the first version, and as much as I already like it, I like making improvements even more! Incremental updates will be 1.1, 1.2, etc., and major model revisions will be 2.0, 3.0, etc.)

I didn't think I would get around to incorporating play-by-play (PBP) data before the New Year, but being stuck in a blizzard has provided some extra motivation (er, down time). Also, the Ruby programming language has proven to be very useful for a task like this that involves so much text processing. (FYI: my Ruby script parses ~ 550,000 PBP results in a little over two minutes.) In short, I've been much more productive than I had anticipated...but that's a good thing, right?

Getting back to the model, recall that we want to credit (debit) players for each positive (negative) result during a game. Box score stats enable you to get a lot of data, but there are some things that are better handled by parsing PBP data:

  • assisted/unassisted field goals
  • 2pt/3pt assists
  • missed rebound opportunities
  • team rebounds/turnovers
  • defense (i.e. opponent missed field goals and free throws)
  • AND1, technicals, violations, etc.
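To illustrate why PBP text helps with the first two items, here is a minimal Python sketch that tallies assisted and unassisted made shots. The event wording ("makes 2-pt shot (... assists)") is a hypothetical format chosen for illustration, not the actual BasketballValue layout, and `tally_made_shot` is an invented helper name. The author's own parser is written in Ruby and is not shown in this post.

```python
import re
from collections import Counter, defaultdict

# Hypothetical event wording; real PBP feeds will phrase this differently.
MADE_SHOT = re.compile(
    r"^(?P<shooter>.+?) makes (?P<pts>[23])-pt shot"
    r"(?: \((?P<passer>.+?) assists\))?$"
)

def tally_made_shot(line, stats):
    """Update per-player counters for one made-shot line.

    Returns True if the line was a made shot, False otherwise.
    """
    m = MADE_SHOT.match(line)
    if m is None:
        return False
    pts = m.group("pts")
    shooter, passer = m.group("shooter"), m.group("passer")
    if passer:
        # Assisted make: credit is split between shooter and passer.
        stats[shooter]["AFG" + pts] += 1
        stats[passer]["AST" + pts] += 1
    else:
        # Unassisted make: shooter gets full credit.
        stats[shooter]["UFG" + pts] += 1
    return True

stats = defaultdict(Counter)
for event in [
    "LeBron James makes 3-pt shot (Mo Williams assists)",
    "Dwyane Wade makes 2-pt shot",
    "Timeout",
]:
    tally_made_shot(event, stats)
```

The counter names (AFG2, AFG3, UFG2, UFG3, AST2, AST3) match the terms in the model, so the tallies could feed the formula directly.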

For reference, here is the current ezPM model:

ezPM = 0.7(AFG2)+1.4(AFG3)+UFG2+2(UFG3)+0.3(AST2)+0.6(AST3)-0.7(MISS)+FTM-0.5(FTA)+AND1+0.7(ORB+0.2(TEAMORB))+0.3(DRB+0.2(TEAMDRB))-ORBFRAC*ORBM-(0.2)(0.3)(TEAMORBM)-DRBFRAC*DRBM-(0.2)(0.7)(TEAMDRBM)+STL-TOV+OPPFTPTS-(0.2)OPP2PTM-0.4(OPP3PTM)+0.7(BLK)+0.14(OFGMISS)-0.2(TEAMTOV)+0.2(TEAMTAKE)

ezPM100 = ezPM*100/POS (ezPM standardized to 100 possessions)

The logic for each term is discussed in the previous article, but just to be sure, let me recap here:

  • ezPM Marginal points produced by player
  • 0.7(AFG2) Shooter is credited 0.7 pts for making assisted 2pt shot
  • 1.4(AFG3) Double that for assisted 3pt shot
  • UFG2 Shooter gets 1 pt for unassisted 2pt shot
  • UFG3 Double that for unassisted 3pt shot
  • 0.3(AST2) Passer gets 0.3 pts for assist to 2pt shot
  • 0.6(AST3) Double that for assist to 3pt shot
  • -0.7(MISS) Player gets debited 0.7 pts for missing any field goal.
  • FTM-0.5(FTA) Player gets 0.5 pts for each made free throw, and loses 0.5 pts for each missed free throw (Note: this expression is the same as 0.5(FTM)-0.5(FTA-FTM), in terser form)
  • AND1 Player gets 1 pt for making And1, doesn't lose anything for missing one
  • 0.7(ORB+0.2TEAMORB) Player gets credited 0.7 pts for acquiring offensive rebound and shared credit (1/5)*0.7 for team offensive rebound with his teammates
  • 0.3(DRB+0.2TEAMDRB) Same thing but for defensive rebounds
  • ORBFRAC*ORBM Player gets debited some fraction of 0.7 pts for "missing" offensive rebounds, the amount varying with position according to league averages (i.e. point guards are not debited as much as centers, and so on)
  • DRBFRAC*DRBM Same as above, but for defensive rebound misses
  • (0.2)(0.3)TEAMORBM Players share the blame for missing rebounds that are counted as team defensive rebounds by the opponent
  • (0.2)(0.7)TEAMDRBM Same as above but for missed defensive rebounds by team
  • STL Player gets a point for a steal
  • TOV Player loses a point for turnover
  • 0.2TEAMTOV Each player on a unit shares blame for team turnover
  • 0.2TEAMTAKE Same as above, except each player is credited for opponent turnover credited to team
  • 0.7BLK Player gets 0.7 pts for a block (note that this is the same amount the opponent loses by missing the shot)
  • OPPFTPTS Player loses points when his foul results in made free throws (this term will virtually always be negative)
  • -0.2OPP2PTM Each player loses 0.2 pts when opponent makes 2pt field goal
  • -0.4OPP3PTM Same as above, except for 3pt made by opponent
  • 0.14OFGMISS Each player on defensive unit is credited 1/5 of 0.7 pts for an opponent missed field goal (note that all players on a floor unit will benefit from having a defensive stopper as a teammate)
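As a rough sketch of how the terms above combine, the following Python code applies the listed weights to a dictionary of per-player counts and then standardizes to 100 possessions. The position-dependent rebound-miss debits (ORBFRAC, DRBFRAC) are not given numerically in this post, so they are passed in as parameters and treated here as points debited per missed rebound. That reading is an assumption, as are the function names. OPPFTPTS is assumed to be stored as a negative number already, as described above.

```python
# Fixed weights taken from the formula; keys are the stat names used in the post.
WEIGHTS = {
    "AFG2": 0.7, "AFG3": 1.4, "UFG2": 1.0, "UFG3": 2.0,
    "AST2": 0.3, "AST3": 0.6, "MISS": -0.7,
    "FTM": 1.0, "FTA": -0.5, "AND1": 1.0,
    "ORB": 0.7, "TEAMORB": 0.7 * 0.2,
    "DRB": 0.3, "TEAMDRB": 0.3 * 0.2,
    "TEAMORBM": -0.2 * 0.3, "TEAMDRBM": -0.2 * 0.7,
    "STL": 1.0, "TOV": -1.0,
    "OPPFTPTS": 1.0,  # assumed to be recorded as a negative value
    "OPP2PTM": -0.2, "OPP3PTM": -0.4,
    "BLK": 0.7, "OFGMISS": 0.14,
    "TEAMTOV": -0.2, "TEAMTAKE": 0.2,
}

def ezpm(counts, orb_frac, drb_frac):
    """Marginal points for one player from a dict of stat counts.

    orb_frac / drb_frac: assumed per-miss debits for missed offensive and
    defensive rebounds (position-dependent; values not given in the post).
    """
    total = sum(w * counts.get(stat, 0) for stat, w in WEIGHTS.items())
    total -= orb_frac * counts.get("ORBM", 0)
    total -= drb_frac * counts.get("DRBM", 0)
    return total

def ezpm100(counts, orb_frac, drb_frac, possessions):
    """ezPM standardized to 100 possessions."""
    return ezpm(counts, orb_frac, drb_frac) * 100 / possessions
```

Keeping the weights in one table makes it easy to try alternative values when revisiting the model assumptions.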

Before getting to the data, I need to express my gratitude to Aaron Barzilai who runs BasketballValue.com. Through his site, he supplies PBP and matchup data. The matchup data lists all the players in every unit during a game. Without the matchup file, in particular, this endeavor would have taken much longer, as I would have had to compile those matchups myself (and it's not a trivial matter at all). So, thanks again to Aaron, and I encourage everyone to visit that site.

Ok, now back to the show. For this post, I have crunched the 2009-10 season. My Ruby code seems to have a couple of bugs that still need to be worked out, but it appears that I am getting at least 90-95% coverage of all the PBP data. (LeBron James is ranked #1, so it must be working, right?) In the table (or spreadsheet), you'll see the ezPM100 broken into offensive and defensive parts. Rankings are given for overall ezPM100, ezPM100 by position, and ezPM100 offensive and defensive parts. Wins refer to how many wins a player would produce above 41 (assuming he played every possession during a season), if he were playing with 4 teammates who each had 0.0 ezPM100.

In future posts, I will use this metric to analyze the current season, in addition to other typical questions, like who is the most improved, best rookies, MVP, etc. Of course, I will be looking more closely at the Warriors, in particular. I will also be frequently re-visiting the model assumptions (using my own judgement, and hopefully getting ideas from my readers). That's it for now, I suppose. Until next time...

(Shortcut: This is a link to a Google Spreadsheet with ALL the data used for the calculations.)

ezPM 2009-10 NBA Season

All players with at least 1392 possessions.

YEAR RANK ORANK DRANK NAME TEAM POS EZPM100 OFF100 DEF100 #POSS
2009 1 1 6 James, LeBron CLE SF 8.98 7.71 1.27 6390
2009 2 4 8 Wade, Dwyane MIA SG 6.88 6.18 0.70 5826
2009 3 2 27 Paul, Chris NOH PG 6.83 7.14 -0.31 3429
2009 4 12 13 Durant, Kevin OKC SF 5.38 4.90 0.48 6861
2009 5 9 17 Ginobili, Manu SAS SG 5.24 5.10 0.15 4675
2009 6 52 1 Howard, Dwight ORL C 4.87 3.17 1.71 5942
2009 7 21 10 Rondo, Rajon BOS PG 4.76 4.14 0.62 5953
2009 8 3 162 Nash, Steve PHX PG 4.53 6.38 -1.85 6170
2009 9 22 14 Duncan, Tim SAS C 4.46 4.11 0.35 5059
2009 10 60 2 Kirilenko, Andrei UTA SF 4.42 3.00 1.42 3431
2009 11 10 48 Gasol, Pau LAL PF 4.36 5.09 -0.73 4874
2009 12 11 52 Williams, Deron UTA PG 4.11 4.90 -0.79 6016
2009 13 6 107 Billups, Chauncey DEN PG 4.09 5.45 -1.36 5301
2009 14 77 4 Smith, Josh ATL SF 4.09 2.73 1.36 5882
2009 15 76 5 Wallace, Gerald CHA SF 4.04 2.73 1.31 5922
2009 16 5 121 Roy, Brandon POR SG 4.00 5.48 -1.48 4891
2009 17 36 20 Bryant, Kobe LAL SG 3.76 3.68 0.08 5898
2009 18 8 150 Bosh, Chris TOR PF 3.40 5.14 -1.74 5374
2009 19 115 3 Camby, Marcus LAC C 3.37 1.99 1.38 3218
2009 20 32 38 Miller, Andre POR PG 3.31 3.86 -0.55 4942
2009 21 86 11 Kidd, Jason DAL PG 3.13 2.55 0.58 5889
2009 22 20 81 Lowry, Kyle HOU PG 3.03 4.17 -1.15 3406
2009 23 38 47 Evans, Tyreke SAC PG 2.88 3.59 -0.71 5254
2009 24 7 217 Landry, Carl HOU PF 2.85 5.30 -2.44 2927
2009 25 105 9 Bogut, Andrew MIL C 2.84 2.18 0.66 4225
2009 26 37 62 Batum, Nicolas POR SF 2.75 3.61 -0.86 1873
2009 27 30 92 Salmons, John MIL SG 2.69 3.94 -1.24 2067
2009 28 33 83 Carter, Vince ORL SG 2.61 3.77 -1.16 4759
2009 29 31 98 Johnson, Joe ATL SG 2.61 3.88 -1.27 5962
2009 30 13 191 Ridnour, Luke MIL PG 2.61 4.73 -2.12 3333
2009 31 14 176 Lawson, Ty DEN PG 2.51 4.51 -2.00 2800
2009 32 49 54 Pierce, Paul BOS SF 2.46 3.25 -0.79 4817
2009 33 96 18 Wallace, Ben DET C 2.42 2.29 0.13 3735
2009 34 27 134 Anthony, Carmelo DEN SF 2.42 4.01 -1.59 5636
2009 35 123 12 Camby, Marcus POR C 2.38 1.87 0.51 1509
2009 36 35 105 Nelson, Jameer ORL PG 2.37 3.73 -1.35 3763
2009 37 18 167 Crawford, Jamal ATL PG 2.36 4.27 -1.91 5021
2009 38 71 35 Westbrook, Russell OKC PG 2.31 2.78 -0.47 5920
2009 39 87 24 Felton, Raymond CHA PG 2.26 2.54 -0.28 5094
2009 40 73 37 Granger, Danny IND SF 2.22 2.75 -0.53 4775
2009 41 46 87 Rose, Derrick CHI PG 2.22 3.41 -1.19 5740
2009 42 23 168 Love, Kevin MIN PF 2.19 4.10 -1.91 3366
2009 43 62 50 Williams, Jason ORL PG 2.17 2.96 -0.78 3602
2009 44 58 66 Boozer, Carlos UTA PF 2.12 3.03 -0.91 5655
2009 45 113 19 Price, A.J. IND PG 2.11 2.01 0.10 1650
2009 46 43 109 Horford, Al ATL C 2.11 3.50 -1.39 5866
2009 47 93 28 Andersen, Chris DEN C 2.07 2.39 -0.33 3702
2009 48 44 119 Williams, Louis PHI SG 2.00 3.47 -1.46 3897
2009 49 48 102 Mohammed, Nazr CHA C 2.00 3.32 -1.32 1934
2009 50 47 110 Bynum, Andrew LAL C 1.98 3.38 -1.39 4122
2009 51 50 100 Haywood, Brendan WAS C 1.92 3.23 -1.31 3136
2009 52 66 67 Varejao, Anderson CLE PF 1.92 2.86 -0.94 4520
2009 53 121 22 West, Delonte CLE SG 1.89 1.93 -0.05 3002
2009 54 15 226 Calderon, Jose TOR PG 1.87 4.41 -2.54 3856
2009 55 92 41 Watson, C.J. GSW PG 1.86 2.47 -0.61 3815
2009 56 39 151 Udrih, Beno SAC PG 1.83 3.59 -1.76 4928
2009 57 45 136 Nowitzki, Dirk DAL PF 1.80 3.43 -1.63 6223
2009 58 133 21 Garnett, Kevin BOS PF 1.77 1.76 0.01 4148
2009 59 63 85 Davis, Baron LAC PG 1.76 2.93 -1.17 5004
2009 60 25 203 Redick, J.J. ORL SG 1.75 4.08 -2.33 3616
2009 61 55 108 Parker, Tony SAS PG 1.68 3.06 -1.38 3560
2009 62 28 207 Randolph, Zach MEM PF 1.64 3.98 -2.34 6387
2009 63 218 7 Allen, Tony BOS SG 1.62 0.76 0.86 1610
2009 64 69 89 Noah, Joakim CHI C 1.61 2.80 -1.20 3744
2009 65 34 192 Lee, David NYK C 1.61 3.75 -2.14 6157
2009 66 54 120 Hill, George SAS PG 1.60 3.08 -1.48 4596
2009 67 16 240 Douglas, Toney NYK PG 1.57 4.36 -2.78 2110
2009 68 40 178 Gasol, Marc MEM C 1.57 3.57 -2.01 5099
2009 69 26 228 Johnson, Amir TOR PF 1.51 4.07 -2.56 2976
2009 70 29 213 Jack, Jarrett TOR PG 1.51 3.94 -2.43 4744
2009 71 168 16 Moon, Jamario CLE SF 1.51 1.35 0.16 2069
2009 72 83 79 Richardson, Jason PHX SG 1.47 2.59 -1.12 5669
2009 73 42 188 Robinson, Nate NYK PG 1.45 3.51 -2.06 1470
2009 74 19 245 Stoudemire, Amare PHX PF 1.41 4.24 -2.83 6439
2009 75 103 56 Maynor, Eric OKC PG 1.37 2.20 -0.83 1679
2009 76 61 140 Hilario, Nene DEN C 1.34 2.98 -1.64 5875
2009 77 53 153 Williams, Mo CLE PG 1.33 3.11 -1.78 4831
2009 78 106 61 Ilyasova, Ersan MIL SF 1.30 2.16 -0.86 3660
2009 79 67 128 Blair, DeJuan SAS PF 1.29 2.83 -1.55 2959
2009 80 17 263 Maggette, Corey GSW SF 1.25 4.35 -3.11 4613
2009 81 141 34 Odom, Lamar LAL PF 1.24 1.70 -0.46 5128
2009 82 65 143 Terry, Jason DAL SG 1.19 2.87 -1.67 5139
2009 83 140 36 Marion, Shawn DAL SF 1.18 1.70 -0.52 4882
2009 84 158 25 Deng, Luol CHI SF 1.18 1.47 -0.29 5331
2009 85 97 77 Korver, Kyle UTA SF 1.16 2.27 -1.11 1974
2009 86 91 106 Curry, Stephen GSW PG 1.13 2.48 -1.36 6360
2009 87 108 74 Murphy, Troy IND PF 1.12 2.15 -1.03 4731
2009 88 74 137 Arroyo, Carlos MIA PG 1.11 2.74 -1.63 2903
2009 89 79 132 Anderson, Ryan ORL PF 1.11 2.69 -1.58 1892
2009 90 138 40 Iguodala, Andre PHI SF 1.10 1.71 -0.61 6262
2009 91 88 115 Arenas, Gilbert WAS SG 1.09 2.52 -1.43 2302
2009 92 51 194 Dragic, Goran PHX SG 1.06 3.23 -2.16 2935
2009 93 80 139 Collison, Nick OKC C 1.05 2.69 -1.64 3203
2009 94 143 42 Richardson, Quentin MIA SG 1.04 1.68 -0.64 4234
2009 95 167 29 Jennings, Brandon MIL PG 1.02 1.38 -0.36 5126
2009 96 184 23 Fernandez, Rudy POR SG 1.02 1.14 -0.12 2726
2009 97 124 68 Dalembert, Samuel PHI C 0.90 1.86 -0.96 4066
2009 98 111 88 Ford, T.J. IND PG 0.90 2.08 -1.19 2326
2009 99 134 60 Barnes, Matt ORL SF 0.88 1.74 -0.86 4431
2009 100 75 163 Millsap, Paul UTA PF 0.87 2.74 -1.87 4735
2009 101 24 271 Lopez, Robin PHX C 0.82 4.09 -3.27 2162
2009 102 70 177 Allen, Ray BOS SG 0.79 2.79 -2.00 5691
2009 103 238 15 Delfino, Carlos MIL SG 0.76 0.45 0.32 4279
2009 104 56 202 Thornton, Marcus NOH SG 0.76 3.06 -2.29 3637
2009 105 187 31 Harden, James OKC SG 0.74 1.12 -0.37 3392
2009 106 101 124 Gibson, Daniel CLE PG 0.71 2.21 -1.50 2138
2009 107 104 122 Miller, Mike WAS SG 0.69 2.18 -1.49 3502
2009 108 81 181 Collison, Darren NOH PG 0.65 2.67 -2.02 4098
2009 109 157 58 Stuckey, Rodney DET PG 0.64 1.47 -0.83 4777
2009 110 72 193 Aldridge, LaMarcus POR PF 0.62 2.77 -2.15 5701
2009 111 119 113 Ibaka, Serge OKC PF 0.55 1.95 -1.40 2446
2009 112 120 112 Gay, Rudy MEM SF 0.54 1.94 -1.40 6620
2009 113 59 224 Brooks, Aaron HOU PG 0.53 3.02 -2.49 5959
2009 114 202 33 Farmar, Jordan LAL PG 0.52 0.98 -0.46 2768
2009 115 165 65 Williams, Marvin ATL SF 0.49 1.39 -0.90 4938
2009 116 57 232 Martin, Kevin HOU SG 0.43 3.04 -2.61 1814
2009 117 137 101 Hill, Grant PHX SF 0.40 1.71 -1.31 5424
2009 118 197 44 Brewer, Ronnie UTA SF 0.40 1.06 -0.66 3457
2009 119 107 152 Amundson, Louis PHX PF 0.39 2.16 -1.77 2366
2009 120 144 96 Wilkins, Damien MIN SG 0.39 1.66 -1.27 3018
2009 121 78 209 Prince, Tayshaun DET SF 0.34 2.72 -2.38 3226
2009 122 221 30 Artest, Ron LAL SF 0.33 0.69 -0.36 5237
2009 123 85 201 Barea, Jose DAL PG 0.28 2.56 -2.29 3021
2009 124 100 175 Bonner, Matt SAS PF 0.25 2.23 -1.98 2463
2009 125 126 135 Gallinari, Danilo NYK SF 0.24 1.86 -1.61 5488
2009 126 208 45 House, Eddie BOS PG 0.22 0.93 -0.70 1542
2009 127 112 159 Matthews, Wes UTA SG 0.19 2.04 -1.84 4202
2009 128 178 72 Gortat, Marcin ORL C 0.18 1.20 -1.01 2089
2009 129 195 64 Watson, Earl IND PG 0.18 1.07 -0.89 4729
2009 130 147 123 Randolph, Anthony GSW C 0.15 1.65 -1.50 1583
2009 131 210 51 Brown, Shannon LAL SG 0.10 0.88 -0.79 3075
2009 132 154 111 Okafor, Emeka NOH C 0.09 1.49 -1.40 4778
2009 133 142 138 Dampier, Erick DAL C 0.06 1.70 -1.64 2630
2009 134 185 76 Jefferson, Richard SAS SF 0.04 1.14 -1.10 5162
2009 135 181 80 Dooling, Keyon NJN PG 0.02 1.16 -1.14 1763
2009 136 213 59 Green, Jeff OKC SF -0.00 0.84 -0.85 6235
2009 137 152 129 Bibby, Mike ATL PG -0.02 1.53 -1.55 4518
2009 138 131 154 Sessions, Ramon MIN PG -0.04 1.78 -1.82 3225
2009 139 234 39 Jackson, Stephen CHA SF -0.06 0.52 -0.57 5513
2009 140 217 55 Perkins, Kendrick BOS C -0.06 0.76 -0.83 4343
2009 141 118 187 Conley, Mike MEM PG -0.10 1.96 -2.06 5370
2009 142 109 199 Gooden, Drew DAL C -0.10 2.14 -2.24 2058
2009 143 132 164 Scola, Luis HOU PF -0.11 1.77 -1.87 5442
2009 144 164 126 Blake, Steve POR PG -0.12 1.39 -1.51 2695
2009 145 211 73 Parker, Anthony CLE SG -0.15 0.88 -1.03 4852
2009 146 153 145 Barbosa, Leandro PHX SG -0.17 1.52 -1.69 1690
2009 147 64 260 Lopez, Brook NJN C -0.21 2.89 -3.10 5577
2009 148 95 227 Morrow, Anthony GSW SF -0.22 2.33 -2.55 4369
2009 149 136 174 Augustin, D.J. CHA PG -0.23 1.72 -1.96 2672
2009 150 193 103 Battier, Shane HOU SF -0.24 1.08 -1.33 4316
2009 151 162 142 Diaw, Boris CHA SF -0.25 1.42 -1.67 5602
2009 152 243 43 Chalmers, Mario MIA PG -0.26 0.38 -0.64 3375
2009 153 225 69 Smith, J.R. DEN SG -0.33 0.66 -0.99 4398
2009 154 150 173 Thabeet, Hasheem MEM C -0.34 1.61 -1.95 1721
2009 155 89 247 West, David NOH PF -0.34 2.51 -2.86 5951
2009 156 82 253 Bynum, Will DET PG -0.36 2.61 -2.97 3137
2009 157 194 117 Chandler, Tyson CHA C -0.36 1.08 -1.44 2141
2009 158 241 49 Pietrus, Mickael ORL SF -0.37 0.41 -0.78 3321
2009 159 248 46 Salmons, John CHI SF -0.38 0.33 -0.70 3214
2009 160 242 53 Butler, Caron DAL SF -0.38 0.40 -0.79 1791
2009 161 199 116 Graham, Stephen CHA SG -0.39 1.04 -1.43 1392
2009 162 90 249 Foye, Randy WAS PG -0.39 2.51 -2.90 3008
2009 163 129 198 Mayo, O.J. MEM SG -0.40 1.81 -2.21 6464
2009 164 176 141 Ellis, Monta GSW SG -0.43 1.22 -1.65 5709
2009 165 68 273 Smith, Craig LAC PF -0.47 2.82 -3.29 2253
2009 166 94 248 Martin, Kevin SAC SG -0.52 2.35 -2.87 1550
2009 167 219 97 Webster, Martell POR SF -0.53 0.74 -1.27 3816
2009 168 172 157 O'Neal, Jermaine MIA C -0.54 1.29 -1.83 4016
2009 169 247 63 McDyess, Antonio SAS C -0.55 0.33 -0.88 3284
2009 170 230 78 Casspi, Omri SAC SF -0.56 0.55 -1.11 3874
2009 171 102 242 Boykins, Earl WAS PG -0.58 2.21 -2.79 2025
2009 172 160 186 Haslem, Udonis MIA C -0.59 1.46 -2.06 4138
2009 173 190 147 Wright, Dorell MIA PF -0.60 1.09 -1.69 2740
2009 174 231 82 Price, Ronnie UTA PG -0.61 0.54 -1.15 1539
2009 175 191 149 Daniels, Marquis BOS SG -0.63 1.09 -1.72 1805
2009 176 277 26 Sefolosha, Thabo OKC SF -0.65 -0.35 -0.29 4780
2009 177 130 216 Chandler, Wilson NYK SF -0.66 1.78 -2.44 4613
2009 178 177 165 Budinger, Chase HOU SF -0.66 1.22 -1.88 3064
2009 179 214 127 Iverson, Allen PHI PG -0.68 0.84 -1.52 1577
2009 180 99 251 Bayless, Jerryd POR PG -0.68 2.25 -2.93 2449
2009 181 212 133 Jeffries, Jared NYK SF -0.72 0.87 -1.59 2922
2009 182 98 255 Evans, Maurice ATL SF -0.72 2.26 -2.98 2600
2009 183 205 144 Lee, Courtney NJN SG -0.72 0.96 -1.68 4482
2009 184 135 218 Jefferson, Al MIN C -0.73 1.73 -2.45 4703
2009 185 276 32 Ariza, Trevor HOU SF -0.74 -0.35 -0.39 5423
2009 186 239 86 Carter, Anthony DEN PG -0.75 0.43 -1.18 1620
2009 187 148 210 Turkoglu, Hedo TOR SF -0.76 1.62 -2.39 4794
2009 188 161 197 Duhon, Chris NYK SG -0.76 1.44 -2.20 4231
2009 189 200 158 Belinelli, Marco TOR SG -0.80 1.03 -1.83 2315
2009 190 41 294 Gooden, Drew LAC C -0.82 3.52 -4.33 1435
2009 191 265 57 Udoka, Ime SAC SF -0.83 0.00 -0.83 1819
2009 192 173 190 Mbah a Moute, Luc MIL PF -0.83 1.28 -2.11 3384
2009 193 244 93 Dunleavy, Mike IND SF -0.86 0.38 -1.25 3063
2009 194 222 131 Blatche, Andray WAS C -0.89 0.69 -1.58 4243
2009 195 235 114 Carney, Rodney PHI SF -0.92 0.50 -1.42 1573
2009 196 251 91 Head, Luther IND SG -0.93 0.31 -1.24 1603
2009 197 203 169 Holiday, Jrue PHI PG -0.94 0.97 -1.91 3403
2009 198 198 179 Young, Thaddeus PHI SF -0.95 1.06 -2.01 4212
2009 199 139 238 Harris, Devin NJN PG -0.97 1.71 -2.67 4212
2009 200 84 282 Walker, Bill NYK SF -0.97 2.58 -3.55 1576
2009 201 233 125 Hinrich, Kirk CHI SG -0.98 0.52 -1.50 4913
2009 202 170 205 Gordon, Eric LAC SG -0.99 1.34 -2.33 4514
2009 203 188 189 O'Neal, Shaquille CLE C -1.00 1.10 -2.10 2521
2009 204 250 99 Fisher, Derek LAL PG -1.00 0.31 -1.31 4603
2009 205 224 148 Hayes, Chuck HOU C -1.01 0.69 -1.70 3622
2009 206 259 95 Butler, Caron WAS SF -1.10 0.17 -1.27 3574
2009 207 149 239 Afflalo, Arron DEN SG -1.11 1.62 -2.73 4572
2009 208 117 261 Dudley, Jared PHX PF -1.13 1.97 -3.10 4390
2009 209 215 172 Biedrins, Andris GSW C -1.13 0.82 -1.94 1590
2009 210 169 222 Boone, Josh NJN C -1.13 1.34 -2.48 1792
2009 211 146 243 Maxiell, Jason DET PF -1.14 1.65 -2.79 2961
2009 212 179 206 McGee, JaVale WAS C -1.15 1.19 -2.33 1715
2009 213 192 200 Gibson, Taj CHI PF -1.15 1.09 -2.24 4296
2009 214 264 84 Murray, Ronald (Flip) CHA SG -1.16 0.01 -1.17 1916
2009 215 262 104 Jamison, Antawn CLE PF -1.18 0.15 -1.33 1739
2009 216 186 204 Thompson, Jason SAC PF -1.20 1.13 -2.33 4710
2009 217 254 118 Wright, Julian NOH SF -1.20 0.25 -1.45 1658
2009 218 116 269 Jerebko, Jonas DET PF -1.26 1.98 -3.25 4211
2009 219 209 195 Anthony, Joel MIA C -1.27 0.92 -2.19 2413
2009 220 110 276 Green, Willie PHI SG -1.28 2.11 -3.39 2939
2009 221 128 264 Blake, Steve LAC PG -1.28 1.83 -3.11 1452
2009 222 252 130 Mason, Roger SAS SG -1.29 0.29 -1.58 2877
2009 223 175 225 Weems, Sonny TOR SG -1.29 1.23 -2.52 2801
2009 224 220 182 Lewis, Rashard ORL PF -1.29 0.73 -2.02 4857
2009 225 183 219 Hickson, J.J. CLE C -1.31 1.15 -2.46 3518
2009 226 266 90 Martin, Kenyon DEN PF -1.32 -0.08 -1.23 4183
2009 227 229 170 Posey, James NOH SG -1.35 0.58 -1.93 3294
2009 228 166 241 Frye, Channing PHX PF -1.40 1.39 -2.79 5061
2009 229 125 272 Derozan, DeMar TOR SG -1.42 1.86 -3.28 3385
2009 230 163 246 Young, Sam MEM SF -1.43 1.41 -2.84 2472
2009 231 280 71 Rush, Brandon IND SG -1.45 -0.45 -1.00 4875
2009 232 145 265 Warrick, Hakim MIL PF -1.49 1.66 -3.15 1959
2009 233 196 230 George, Devean GSW SF -1.50 1.06 -2.56 1689
2009 234 182 236 Brand, Elton PHI PF -1.50 1.15 -2.66 4557
2009 235 189 233 Okur, Mehmet UTA C -1.51 1.10 -2.61 4493
2009 236 159 254 Pachulia, Zaza ATL C -1.51 1.47 -2.98 2192
2009 237 253 161 Bogans, Keith SAS SG -1.56 0.29 -1.85 3086
2009 238 122 281 Landry, Carl SAC PF -1.61 1.92 -3.53 1960
2009 239 151 268 Hibbert, Roy IND C -1.62 1.59 -3.22 4124
2009 240 223 208 Stojakovic, Peja NOH SF -1.66 0.69 -2.34 3931
2009 241 261 155 Miles, C.J. UTA SG -1.68 0.15 -1.83 3129
2009 242 258 166 Douglas-Roberts, Chris NJN SG -1.69 0.20 -1.89 3117
2009 243 174 258 Jamison, Antawn WAS PF -1.73 1.27 -3.00 3132
2009 244 127 284 Gordon, Ben DET SG -1.74 1.83 -3.58 3351
2009 245 288 75 Hughes, Larry NYK SF -1.87 -0.83 -1.05 1698
2009 246 249 196 Miller, Brad CHI C -1.89 0.31 -2.20 3931
2009 247 236 212 Turiaf, Ronny GSW C -1.94 0.48 -2.42 1933
2009 248 289 70 Stackhouse, Jerry MIL SF -1.95 -0.96 -0.99 1573
2009 249 286 94 Williams, Marcus MEM PG -1.96 -0.71 -1.25 1637
2009 250 240 214 Ellington, Wayne MIN SG -2.02 0.41 -2.44 2496
2009 251 267 171 Beasley, Michael MIA PF -2.03 -0.09 -1.94 4526
2009 252 228 237 Gomes, Ryan MIN SF -2.08 0.58 -2.66 4266
2009 253 155 283 Thornton, Al LAC SF -2.08 1.48 -3.56 2679
2009 254 207 259 Daye, Austin DET SF -2.11 0.93 -3.04 1657
2009 255 237 235 Flynn, Jonny MIN PG -2.17 0.47 -2.64 4455
2009 256 180 275 Graham, Joey DEN SF -2.20 1.18 -3.38 1423
2009 257 114 293 Williams, Reggie GSW SG -2.22 2.00 -4.22 1723
2009 258 245 231 Jones, Dahntay IND SG -2.22 0.35 -2.57 3624
2009 259 257 220 Humphries, Kris NJN PF -2.26 0.21 -2.47 1662
2009 260 279 160 Bell, Charlie MIL SG -2.26 -0.42 -1.84 3059
2009 261 274 183 Greene, Donte SAC SF -2.34 -0.30 -2.04 3251
2009 262 275 184 Brown, Devin NOH SG -2.35 -0.30 -2.05 1895
2009 263 201 278 Harrington, Al NYK PF -2.45 0.99 -3.45 4496
2009 264 263 229 Ilgauskas, Zydrunas CLE C -2.46 0.10 -2.56 2771
2009 265 216 274 Krstic, Nenad OKC C -2.51 0.78 -3.29 3627
2009 266 270 215 Butler, Rasual LAC SF -2.54 -0.10 -2.44 5335
2009 267 206 280 Villanueva, Charlie DET PF -2.56 0.94 -3.50 3547
2009 268 227 270 Bargnani, Andrea TOR PF -2.67 0.58 -3.25 5969
2009 269 290 146 Williams, Terrence NJN SF -2.67 -0.98 -1.69 3127
2009 270 284 180 Thomas, Kurt MIL PF -2.68 -0.67 -2.01 2026
2009 271 256 257 Young, Nick WAS SG -2.78 0.21 -2.99 2726
2009 272 291 156 Wallace, Rasheed BOS C -2.84 -1.01 -1.83 3460
2009 273 281 211 Howard, Josh DAL SF -2.90 -0.50 -2.40 1715
2009 274 255 267 Hamilton, Richard DET SG -3.00 0.21 -3.21 3021
2009 275 269 256 Carroll, DeMarre MEM SF -3.08 -0.09 -2.98 1544
2009 276 285 223 Kaman, Chris LAC C -3.16 -0.69 -2.48 5121
2009 277 226 288 Tolliver, Anthony GSW C -3.17 0.64 -3.82 3112
2009 278 260 277 Howard, Juwan POR PF -3.27 0.16 -3.43 2998
2009 279 273 262 Wright, Antoine TOR SF -3.34 -0.23 -3.11 2910
2009 280 278 252 Andersen, David HOU C -3.35 -0.41 -2.94 1789
2009 281 287 234 Nocioni, Andres SAC SF -3.36 -0.74 -2.62 3066
2009 282 232 291 Jordan, DeAndre LAC C -3.49 0.52 -4.02 2112
2009 283 283 250 Peterson, Morris NOH SG -3.50 -0.59 -2.90 1988
2009 284 268 279 Hawes, Spencer SAC C -3.58 -0.09 -3.49 3792
2009 285 204 295 Davis, Glen BOS C -3.65 0.96 -4.60 1751
2009 286 295 221 Brewer, Corey MIN SF -3.75 -1.28 -2.47 4792
2009 287 156 297 Speights, Marreese PHI C -3.91 1.47 -5.38 1945
2009 288 171 296 Hunter, Chris GSW C -4.02 1.30 -5.32 1733
2009 289 294 244 Radmanovic, Vladimir GSW SF -4.07 -1.27 -2.80 1577
2009 290 271 290 Kapono, Jason PHI SG -4.12 -0.14 -3.98 1894
2009 291 293 266 Jianlian, Yi NJN PF -4.16 -1.01 -3.15 2962
2009 292 272 292 Williams, Jawad CLE SF -4.31 -0.16 -4.16 1430
2009 293 297 185 Pargo, Jannero CHI PG -4.47 -2.41 -2.05 1492
2009 294 282 289 Hassell, Trenton NJN SF -4.50 -0.52 -3.97 1863
2009 295 292 286 Hayes, Jarvis NJN SF -4.72 -1.01 -3.71 1861
2009 296 296 287 Songaila, Darius NOH C -5.42 -1.62 -3.80 2657
2009 297 246 298 Hollins, Ryan MIN C -6.67 0.34 -7.01 2182
2009 298 298 285 Pavlovic, Sasha MIN SF -7.63 -4.02 -3.61 1540

 

58 thoughts on “ezPM 1.0: Now with Play-By-Play Data!”

  1. This is pretty awesome stuff. It seems to pass the eyeball test - the players we generally perceive as being good definitely float to the top. There are some that would be surprised by where people ended up - for instance, Kobe being behind Gerald Wallace and AK47, and Monta being net negative and Curry being only 1+ per 100 possessions. Looking forward to seeing updates on this.

    Also wanted to point out that the spreadsheet very nearly foots - total plus minus comes to positive 1266 which is just 3+ per player over the entire 2009-10 season, which is very close to balancing out.

  2. Thanks for the work and the release.

    A few notes from one perspective, if interested:

    Average position designation of the top 20 and top 50 on EZPM100 (by the system that includes a decimal place) are at 2.95 and 2.89 (just short of SF). Top 100 is at about 2.76. Just a slight lean at the top toward perimeter players on the overall measure.

    Bottom 50 clearly leans towards bigs with an average position designation of 3.8. Bottom 100 is 3.6.

    Average position designation on Off100- for top 20, top 50 and top100 are all close to 2.7. Bottom 50 3.46, bottom 100 3.36.

    Average position designation on Def100- for top 20 is 3.1 but top 50 is 2.7 and top 100 is 2.6. Bottom 50 3.46, bottom 100 3.64.

  3. A lot of other stuff can be done, including comparisons of various metrics.

    One simple and admittedly very tiny comparison I did of Def100 and "Defensive Rating" just for the main rotation of one team (Thunder) showed the 2 to have very different ratings with only a .21 correlation. Using play by play based shot defense for when you are on the court made a notable difference compared to average shot defense for the whole team and all minutes, whether you were on the court or not. At least in that small sample. A full league comparison would be worthwhile.

    Assigning some additional responsibility for good and bad defense when your assumed position counterpart makes the decisive final play instead of dividing it equally among all on the court would change things again, but you have to want to make that difference mark and see what it says. It is still only a flawed estimate but maybe it has additional value.

    1. Just to be clear, by "Defensive Rating", are you referring to the Win Shares metric? Also, is that for 2009 data?

      As a sanity check, I'm now trying to compare the ezPM to +/- data for the top units on each team last season. It looks good so far, but I want to do a WLS regression to account for large variation in possessions. I assumed you looked at OKC because it had the top unit in terms of possessions (2522). The Warriors had the least used leading team unit with only 212 possessions. Given the disparity, a WLS seems appropriate to me.
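For reference, a weighted least squares fit with a single predictor has a closed form, sketched below in plain Python. Using each unit's possession count as its weight is an assumption about how the weighting would be done. `weighted_linear_fit` is an illustrative name, not code from the author's analysis.

```python
def weighted_linear_fit(x, y, w):
    """Fit y ~ slope * x + intercept, minimizing sum(w_i * residual_i**2).

    x: predictor values (e.g. unit ezPM), y: responses (e.g. unit +/-),
    w: non-negative weights (e.g. possessions played by each unit).
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return slope, intercept
```

With equal weights this reduces to ordinary least squares, so heavily used units only pull the line toward themselves when their possession counts are larger.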

  4. The correlation of Def100 and "Defensive Rating" (which is Oliver's metric and yes used in WinShares) for part of one team was just a quick check to suggest an area for further research. I looked at OKC because I follow them but I would not claim statistical significance off that size sample.

    Ideally it would be desirable to compare the ezPM to +/- data for a 2 to 4 year period for greater stability in many ways and thus a better comparison and closer to true talent.

    The ezPM data and the comparison with Adjusted +/- will probably be affected and probably significantly by the number of starters faced or number of good players faced. And the same with teammates. Adjusted +/- has its method for accounting for that. Your model could handle it in different variations but here is a general approach: Label the play by play data with the actual play by play count of starters faced and played with, and compile that information both at individual and league level. You could see at individual, team and league levels how much that affects performance numbers split by various levels of starters with and faced. You could then try to apply an appropriate team and / or league level data-based adjustment(s) to the individual performance data (offense and defense) for this variation in context(s).

    Perhaps the presence of a starter counterpart should get a .2 or .3 weight (out of the 1.0 total among the 9 others on the court) while the 4 other opponents get a .075 or .05 rather than give each a .1 weight since facing a starter counterpart probably has more impact on individual offensive and defensive performance than the presence of them at other positions. You could try different weight sets and see which produced the best general correlation between ezPM and Adjusted +/-.

  5. In your previous post you talked about the assumption of 1.0 PPP. I guess I am talking about the same topic though more about the way to find the appropriate adjustment. Your way might be the simplest to actually do it in the formula but the research I suggest might help you set the right PPP adjustment levels.

    1. As always, appreciate your input. Just for quick reference, the regression equation I get for the model (+/- data from bball-value):

      Adj+/- = 0.863*ezPM(team) - 1.69 (R^2=0.486)

      Unadj+/- = 0.837*ezPM(team) + 0.58 (R^2=0.455)

      Haven't tested for significance yet, but I am fairly confident judging by the plots the p-value will be low. Need to throw all this into R.

      1. Would you consider displaying the correlations for perimeters and big men, and depending on the results, maybe using different equations?

        1. The correlation between ezPM and Adjusted +/- for just perimeters and just big men in contrast to the correlation between these metrics for everyone all at once. If the correlations are notably different maybe perimeter and big men splits should have different weights on EZPM in your equation and / or different constants.

    2. Ah, and I should point out that is not weighted. As a quick and dirty way of doing this, if I only look at the teams with above average possessions in the top unit (ATL, BOS, DEN, HOU, LAL, NOH, OKC, ORL, PHO - ranging from 934 to 2522 possessions), I get the following:

      Adj+/- = 1.02*ezPM -3.14 (R^2 = 0.73)

      Unadj+/- = 0.942*ezPM+0.35 (R^2=0.79)

        1. I just dropped the teams that had top units with below average # possessions, the logic being that those units have much more error.

    3. One thought I'm having is to set up a number of coefficients as parameters and perform a global optimization, perhaps, using simulated annealing, which I have some experience with. It's sort of like an inverse way of approaching the +/- data using advanced stats.

        1. Me too, time willing. ;)

          The basic idea would be to set up a cost function involving ezPM and Adj+/-, and treat the various weights in the model as variable parameters to optimize (minimize) the cost function.
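A minimal sketch of that idea in Python, under several assumptions: the cost is a plain sum of squared errors between a weighted stat total and a target rating, a move perturbs one weight at a time, and the temperature cools geometrically. None of these choices come from the post. The function names, parameters, and the toy numbers in the usage lines are all made up for illustration.

```python
import math
import random

def cost(weights, rows, targets):
    """Sum of squared errors between weighted stat totals and target ratings."""
    return sum(
        (sum(w * s for w, s in zip(weights, row)) - t) ** 2
        for row, t in zip(rows, targets)
    )

def anneal(weights, rows, targets, steps=5000, temp=1.0,
           cooling=0.999, step_size=0.05, seed=0):
    """Simulated annealing over the weight vector; returns (best, best_cost)."""
    rng = random.Random(seed)
    current = list(weights)
    current_cost = cost(current, rows, targets)
    best, best_cost = list(current), current_cost
    for _ in range(steps):
        # Propose a small random change to one weight.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.uniform(-step_size, step_size)
        cand_cost = cost(candidate, rows, targets)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling
    return best, best_cost

# Toy data (made up): each row is a stat line, each target a rating to match.
rows = [[10, 2, 5], [4, 6, 1], [7, 3, 3], [2, 8, 6]]
targets = [6.0, 1.5, 4.0, 0.5]
start = [1.0, -1.0, 0.5]
best, best_cost = anneal(start, rows, targets)
```

Because the best weights seen so far are tracked separately, the returned cost is never worse than the starting cost, even when uphill moves are accepted along the way.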

        2. Sounds good. I've always been interested in looking deeply for a better or ideal blend of Adjusted +/- and a Boxscore based Statistical Model. I've tried 50-50 and 1:2. The inclusion of shot defense in your model in my mind makes it the probable best yet Boxscore based Statistical Model framework... with perhaps room to get even better with some sort of context adjustment.

        3. Actually I tried other mixes besides 50/50 and 1:2. Dan Rosenbaum's "Overall +/-" essentially used 4 parts "Statistical" and 1 part Adjusted +/- and I replicated that sometimes after he stopped talking about it. This was before multi-season or regularized Adjusted +/- and their lowering of estimated average errors. I'd be more comfortable these days with a higher share of Adjusted +/- in a blended "overall" metric as long as it was multi-season based and / or regularized.

  6. Yes, the Def100 and "Defensive Rating" comparison was for 2009-10 data. I didn't do a significance test as it was just a quick check. I just found the simple correlation. At this point it is merely suggestive. More could be done.

  7. Back to my comments about the average position of the top and bottom ezPM rankings, I'd add what was implied-

    that at the top the overall ezPM rating is pretty position neutral (likely in contrast, even sharp contrast to other widely used / discussed models)

    that different positions vary in their average offensive and defensive contributions (because of rebounding and interior position generally increases shot defense responsibility)

    and that the greater presence of big men at the bottom of the overall ratings confirms that the relative depth of talent in the league declines as you go taller.

    1. Here's what I'm getting:

      All positions:
      2yr Adj+/- = 0.76*ezPM - 0.39 (R^2=0.2522)

      PG:
      2yr Adj+/- = 1.03*ezPM -2.19 (R^2=0.37)

      SG:
      2yr Adj+/- = 0.92*ezPM + 0.09 (R^2=0.324)

      SF:
      2yr Adj+/- = 0.79*ezPM + 0.42 (R^2=0.403)

      PF:
      2yr Adj+/- = 0.666*ezPM + 0.081 (R^2=0.1475)

      C:
      2yr Adj+/- = 0.774*ezPM - 0.774 (R^2=0.224)

      Crow, you definitely appear to be correct that PF/C don't match up nearly as well.
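To make the fits easier to compare, here is a short sketch that plugs one ezPM value into each reported line. The slopes, intercepts, and R^2 values are copied from the comment above; the `predicted_adj_pm` helper and the example ezPM of +5.0 are hypothetical.

```ruby
# Fitted lines reported above: 2yr Adj+/- = slope * ezPM + intercept.
FITS = {
  "ALL" => { slope: 0.76,  intercept: -0.39,  r2: 0.2522 },
  "PG"  => { slope: 1.03,  intercept: -2.19,  r2: 0.37 },
  "SG"  => { slope: 0.92,  intercept: 0.09,   r2: 0.324 },
  "SF"  => { slope: 0.79,  intercept: 0.42,   r2: 0.403 },
  "PF"  => { slope: 0.666, intercept: 0.081,  r2: 0.1475 },
  "C"   => { slope: 0.774, intercept: -0.774, r2: 0.224 }
}.freeze

# Predicted 2yr Adj+/- for a given ezPM under one position's fitted line.
def predicted_adj_pm(ezpm, position)
  fit = FITS.fetch(position)
  fit[:slope] * ezpm + fit[:intercept]
end

# Hypothetical ezPM of +5.0 evaluated under each fit.
FITS.each do |pos, fit|
  puts format("%-3s predicted Adj+/- = %+.2f (R^2 = %.3f)", pos, predicted_adj_pm(5.0, pos), fit[:r2])
end
```

The same ezPM maps to noticeably different Adj+/- values by position, and the PF and C lines explain the least variance of the five positions.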

  8. Evan, would you be interested in sharing the Ruby script? I'm not a great scripter (more of a math guy). I should probably teach myself, though.

    You're from GT? At my job I work quite a bit with GTSTRUDL, which was developed there at GT.

      1. Biomechanics? Sounds scary.

        I'm a PE in structural engineering; I design bridges for my job. My M.S. thesis was on polymer fiber reinforcement of concrete.

        1. I'm mainly interested in cell mechanics. If it helps, you can think of a cell as a polymer (gel) reinforced with stiff cables. That's actually a model a colleague and I are working on. He's doing more of the computational stuff; I'm mainly an experimentalist.

        2. That's pretty cool. What is the objective of the modeling? To understand the structural/mechanical physics of the cell's interactions with the outside environment? I must say, that sounds fairly complicated. Is the modeling FEM-based?

  9. Questions:

    Does the -2.19 in the formula for PGs (almost 3 times the constant of any other position) concern you that maybe the assist weight is too high or too high specifically for PG?

    What if the individual credit for defensive rebounds went to .4 or .5 and the split credit for an opponent miss was reduced to offset? Would the weight in front of ezPM for interior players return closer to 1? Is that useful / worth doing?

    1. Sure, but give me a few weeks to clean it up a bit and do some more validation. I'd actually like to make this open source, if people are interested (I guess even if they are not).

    2. Crow, to answer your question, it doesn't really bother me. I think having higher R^2 and a coefficient close to 1.0 is much more important. I'm going to run a multiple regression using position as a dummy variable. Seems to me like that is a more rigorous way to compare ezPM to Adj. +/-. Would you agree?

      I have to admit I'm not at all a statistics guy. That's one of the reasons I wanted to do this kind of "direct" model, because I can wrap my head around it a little better. In fact, I like to use "ez" not only because those are my initials, but because I think it is also relatively "easy" compared to things like Adj. +/- and DSMok1's ASPM (which is awesome, btw, just a bit hard to grasp for me).
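For readers unfamiliar with dummy variables, here is a minimal sketch of how position could be encoded for such a regression. It only builds the design-matrix rows and does not fit anything; the choice of SF as the baseline, the `design_row` helper, and the sample players are all assumptions for illustration.

```ruby
POSITIONS = %w[PG SG SF PF C].freeze
BASELINE  = "SF" # arbitrary reference position; its dummy column is dropped

# One design-matrix row: [intercept, ezPM, one 0/1 column per non-baseline position].
# Column order after ezPM is PG, SG, PF, C.
def design_row(ezpm, position)
  dummies = (POSITIONS - [BASELINE]).map { |pos| pos == position ? 1 : 0 }
  [1, ezpm] + dummies
end

# Hypothetical players: [ezPM, position].
sample = [[4.2, "PG"], [1.1, "SF"], [-0.7, "C"]]
rows = sample.map { |ezpm, pos| design_row(ezpm, pos) }
rows.each { |r| p r }
```

Regressing Adj +/- on rows like these gives one shared ezPM slope plus a per-position intercept shift relative to the baseline, so a single coefficient near 1.0 can be judged across all positions at once.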

      1. I am not highly trained in statistics. I try to use them responsibly and think about ways to use them even more creatively and responsibly.

  10. The PG constant could also be picking up PG weakness on defensive switches due to size, likely higher than for any other position. The 2nd highest weakness, in the C constant, could be the combination of size advantage and speed weakness. Giving everyone equal credit for opponent missed shots is an understandable simplification, but it may not match true impact as closely as a more nuanced split, which might be supported by the data and interpretation of the data.

  11. Trying to adjust positions on average for missed opponent shots (or anything else) will advantage and disadvantage certain individuals who are not position typical on defense. For that reason I think moving to some larger share of the total credit for an opponent miss going to the position counterpart defender (and less to the rest of players) might improve things as stated before.

    But if you don't do that, then instead of equal 20% shares for an opponent miss, something like 16% to the PG, 18% to SG, 20% to SF, 22% to PF and 24% to C might be worth a look.
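As a quick worked example of what that proposal would mean per miss: the sketch assumes the model's 0.14(OFGMISS) term is an equal 20% split of a 0.7-point total for an opponent missed field goal, and simply rescales that total by the suggested shares. The constant and helper names are hypothetical.

```ruby
MISS_VALUE = 0.7 # assumed total credit per opponent missed field goal (5 x 0.14)

EQUAL_SHARES    = { "PG" => 0.20, "SG" => 0.20, "SF" => 0.20, "PF" => 0.20, "C" => 0.20 }.freeze
PROPOSED_SHARES = { "PG" => 0.16, "SG" => 0.18, "SF" => 0.20, "PF" => 0.22, "C" => 0.24 }.freeze

# Per-position credit for one opponent miss under a given set of shares.
def miss_credit(shares)
  shares.transform_values { |share| (MISS_VALUE * share).round(3) }
end

equal    = miss_credit(EQUAL_SHARES)
proposed = miss_credit(PROPOSED_SHARES)

proposed.each do |pos, credit|
  puts format("%-2s equal: %.3f  proposed: %.3f", pos, equal[pos], credit)
end
```

Under these assumptions a center would receive 0.168 per opponent miss instead of 0.14, and a point guard 0.112, while the total handed out per miss stays at 0.7.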

    1. If I ever make a PbP metric, I think I'll do something like the latter, but superimpose some additional malus to the player counterpart.

    2. Yeah, like I said, this is 1.0.

      So, my thoughts in terms of optimization are that any time you start to propose ad hoc numbers (and I know I've already done that to some extent), you might as well head for something like simulated annealing or genetic algorithms, and just let it crunch. We can add a parameter to weight the position counterpart. I've also thought about creating a "defense distribution" function which would be sort of like a Gaussian above each player (obviously the position counterpart+equal weights is a very sharp approximation of that).

      For example, if we take the PG, he would have the highest weight, but the next highest weights would be the center/SG. The idea being that when a PG gets beat, the responsibility would fall more on those two positions than SF/PF. I'm not sure if that's true; it would depend on how the defensive rotations are set up. But I think it would be interesting to try. For SG, the other two largest weights would be on PG/SF, and so on.
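One way to sketch such a "defense distribution" is a Gaussian over positions arranged in a ring, so that PG and C count as neighbors, which reproduces the PG example of C/SG getting the next highest weights. The ring layout, the `sigma` width parameter, and the helper names are all assumptions; in an optimization framework `sigma` would be one more parameter to fit.

```ruby
RING = %w[PG SG SF PF C].freeze # treated as a ring, so PG and C are neighbors

# Steps between two positions going the short way around the ring.
def ring_distance(a, b)
  d = (RING.index(a) - RING.index(b)).abs
  [d, RING.size - d].min
end

# Gaussian-shaped weights centered on the position counterpart, normalized to sum to 1.
# sigma is a hypothetical width: a small sigma puts nearly all weight on the counterpart.
def defense_weights(shooter_position, sigma: 1.0)
  raw = RING.to_h do |pos|
    d = ring_distance(pos, shooter_position)
    [pos, Math.exp(-(d**2) / (2.0 * sigma**2))]
  end
  total = raw.values.sum
  raw.transform_values { |w| w / total }
end

pg = defense_weights("PG")
pg.each { |pos, w| puts format("%-2s %.3f", pos, w) }
```

As `sigma` grows the weights flatten toward the equal 20% split, and as it shrinks they approach assigning everything to the position counterpart, so both existing schemes are limiting cases of the same function.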

      Another thing to do is take into account location. For example, if a PG scores on a 3, that should mainly be the responsibility of his counterpart to defend (usually). If the PG drives to the basket and scores, the inside guys get a little more blame. Man, there's all kinds of things you could try.

      But I think you really need to set up a framework involving global optimization to make all this work well. Somewhere in there, the error will be minimized, but finding that spot is not trivial.

      1. Yes, that sounds right. I don't use ad hoc numbers at all on the ASPM metric I developed. I don't know much about genetic algorithms; I generated my weights and model using regression and Excel.

      2. Several different ways to try to optimize shot defense credit further and probably shot defense blame too.

        A bit more credit / blame to adjacent positions might be a way to get at switch impact on average to some degree. A bit more credit / blame based on shot location and "responsibility" is something I was also thinking about last night too.

        I think there are plenty of better models of shot defense than none at all; or equal to everyone based on team results for all minutes regardless of whether you are on the court; or even equal to all actually on the court for that play. How complicated to get is a judgment call. User acceptance may or may not be the right deciding factor.

        P.S. I got turned around giving a possible rationale for adjusting the defensive rebound credit. But it still might be interesting to look at, especially if matching up Adjusted +/- and the statistical model were a high priority.

        1. You raise a good point that I'm wondering about myself:

          "especially if matching up Adjusted +/- and the statistical model were a high priority."

          Is it? I am working under that assumption (and team +/-). Other criteria could be put into a cost function. I'm sure you guys have had plenty of conversations about this at APBR, right? If you can point me to those threads, I'd appreciate it.

        2. Obviously, the objective is to tease out a player's value from all of the data. The true measure is how the team actually fared (though that is also subject to "random" variation). Thus, some sort of cross validation, where a model's estimate of a player's value is checked against an independent set of data, is critical.

          Advanced Plus/Minus is independent from box score data, and thus very useful for those dealing with box-score based metrics. Unfortunately, it is extremely noisy due to multicollinearity issues. Thus my use of 6-year average APM numbers to generate the weights for my box-score-based ASPM model.

          This is to say, "Yes, I agree that matching APM and the statistical model is critical."

          Basically, we know the true (subject to random variation) value of a bunch of 10 player combinations. The 10 individual players are very hard to separate, but the objective must always be to EXPLAIN the results the 10 player matchups gave.
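A rough sketch of what "explaining the 10-player matchups" could mean in practice: predict each stint's margin from the ratings of the ten players on the floor and measure the error on stints that were not used to build the ratings. The ratings, stints, and helper names here are invented for illustration, and the purely additive prediction is an assumption.

```ruby
# Hypothetical per-100-possession ratings for ten players (invented numbers).
RATINGS = {
  "A1" => 3.0, "A2" => 1.0, "A3" => 0.5, "A4" => -0.5, "A5" => -1.0,
  "B1" => 2.0, "B2" => 0.0, "B3" => 0.0, "B4" => -1.0, "B5" => -2.0
}.freeze

# Additive prediction: sum of one five's ratings minus the other five's.
def predicted_margin(team_five, opponent_five, ratings)
  team_five.sum { |p| ratings.fetch(p) } - opponent_five.sum { |p| ratings.fetch(p) }
end

# Root-mean-square error of the predictions over a set of held-out stints.
def rmse(stints, ratings)
  squared = stints.sum do |s|
    (predicted_margin(s[:team], s[:opponent], ratings) - s[:margin])**2
  end
  Math.sqrt(squared / stints.size)
end

team_a = %w[A1 A2 A3 A4 A5]
team_b = %w[B1 B2 B3 B4 B5]

# Held-out stints with observed margins per 100 possessions (invented).
holdout = [
  { team: team_a, opponent: team_b, margin:  6.0 },
  { team: team_b, opponent: team_a, margin: -2.0 }
]

puts format("predicted A vs B: %+.1f, holdout RMSE: %.2f",
            predicted_margin(team_a, team_b, RATINGS), rmse(holdout, RATINGS))
```

Any rating system, whether ezPM, ASPM, or Adjusted +/-, can be dropped into `RATINGS` and scored the same way, which gives a common yardstick for comparing them.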

  12. Matching up Adjusted +/- and the statistical model is probably not an active high priority for many, who may use one and renounce the other, or essentially renounce both as being "one number" and unable to deliver on their stated goal. But if you think both say something useful, then I think it should be a high priority to match them up and to blend them.

  13. I don't have specific threads to point to above others, but look at any Adjusted +/- or Statistical +/- thread and the odds are pretty good that there was some mention of the other. I've compared the two general approaches a fair amount and called for blends for quite a while. Not much active public endorsement from others for the latter. Still, I'd say the "matchup" of the metrics / approaches is a lingering general background topic for many.
