Saturday, December 21, 2013

1997-98 RAPM: Prior Informed (RPI)

Background for what RAPM is: +/- was a revolution for the NBA because it allowed a completely new method of evaluating players. You look at how a team scores and defends with you on the court and without you. When you set players as variables, you can use regression to calculate player impact. It's a full scope view of what matters in a game: outscoring your opponent. However, it's noisy for a number of reasons. One is that some players almost always share the court while other combinations are rare, so the regression struggles to separate their individual effects (this is known as collinearity). Another is that the models don't deal well with players with low minutes, as they don't have enough of a sample for an accurate estimate and will often produce a ludicrous result just to "fit" the data better.

In simple terms, RAPM deals with this by introducing a heavy dose of regression to the mean. While traditional adjusted +/- creates a model by minimizing error (the difference between the actual points per possession scored/allowed and the expected), RAPM also minimizes the coefficients in the model using a lambda term. The coefficients are reduced toward the "prior," which can be set as zero or as a set of prior values (like the previous season's result.) Players with few possessions/minutes will have results close to their priors because their sample size isn't big enough to prove to the model they're more or less valuable.
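The shrinkage described above can be sketched in a few lines. This is a minimal illustration, not the model behind the spreadsheet: the function name `fit_rapm`, the toy stints, and the margins are all made up, and a real fit would use thousands of stints weighted by possessions. It minimizes squared error plus lambda times the squared distance of each rating from its prior, solving one player at a time (coordinate descent).

```python
def fit_rapm(stints, margins, priors, lam, sweeps=200):
    """Ridge regression shrunk toward per-player priors (coordinate descent).

    stints:  rows of +1 (on court for the home side), -1 (away side), 0 (off)
    margins: home scoring margin per 100 possessions for each stint
    priors:  starting value per player (all zeros for a non-prior-informed fit)
    lam:     penalty strength; larger values pull ratings harder to the prior
    """
    beta = list(priors)
    for _ in range(sweeps):
        for j in range(len(priors)):
            num = lam * priors[j]
            den = lam
            for row, y in zip(stints, margins):
                if row[j] == 0:
                    continue
                # Residual with player j's own contribution left out.
                others = sum(x * b for k, (x, b) in enumerate(zip(row, beta)) if k != j)
                num += row[j] * (y - others)
                den += row[j] ** 2
            beta[j] = num / den
    return beta


# Hypothetical toy data: 4 players, 5 stints.
stints = [
    [1, -1, 0, 0],
    [1, -1, 0, 0],
    [1, 0, -1, 0],
    [0, 1, -1, 0],
    [1, 0, 0, -1],  # player 3 shows up in only one stint
]
margins = [8.0, 4.0, 6.0, -2.0, -30.0]
zero_priors = [0.0, 0.0, 0.0, 0.0]

light = fit_rapm(stints, margins, zero_priors, lam=1.0)
strong = fit_rapm(stints, margins, zero_priors, lam=50.0)
huge = fit_rapm(stints, margins, zero_priors, lam=1e9)
informed = fit_rapm(stints, margins, [2.0, -1.0, 0.0, 1.0], lam=1e9)

print([round(b, 2) for b in light])
print([round(b, 2) for b in strong])
```

With a very large lambda every rating collapses onto its prior, which is the "regression to the mean" in its extreme form; smaller values let the data move players away from it.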

Single season advanced +/- models are interesting, but they're prone to fluky results and collinearity issues, even with RAPM. This is where prior-informed comes from: instead of assuming all the players have the same value, use a starting point based on the results of the previous season. This "daisy-chain" style produces some of the most reliable and usable +/- stats, especially after you get three or four seasons in a chain. There's no official name or easy nomenclature for this yet in basketball circles, so I'm using something similar to NPI (non-prior informed): RPI, which means it's a "pure" RAPM model informed only by a previous RAPM model. I've put the results in the same spreadsheet but on another tab for a quick comparison.

The names at the top range from great to elite, with several MVP winners and a few surprising results to keep things interesting. One of the purposes of +/- is to identify players who have a positive but hidden impact on the game like Shane Battier. This is like a retroactive spotlight on the underrated games of the late '90s. Mookie Blaylock is rated as the third best player thanks to fantastic defense for a point guard and plenty of offensive value, even though his TS% was 46 (he relied on the shortened line, which ended in '97). There are a lot of debates about Nash versus Stockton, and here's fuel for the fire, as Stockton rates very well in advanced +/-. Divac wasn't invited to many all-star games, but it appears his impact is worthy of a selection. Robert Horry, the man with seven titles, also looks like more than a role player here.

 *When you reference the spreadsheet, try to include the version number. This will reduce future discrepancies.

One new tweak was attempted for this model: using a different lambda for rookies. The common technique is to give all rookies the same prior, but in a prior-informed model this can be harsh for rookies. Obviously, a player with a stat from the previous season shouldn't be treated the same as a rookie. To deal with this problem, I used the same negative prior for rookies but cut the lambda in half (this is done with the penalty factor in R.) Consequently, Tim Duncan's great rookie season has been given freedom to shine. He's rated just a hair above David Robinson, which agrees with many subjective opinions on the value of the two big men, and 22nd overall. Only four other rookies were significantly above average: Brevin Knight at +3.6, who somehow didn't make the all-rookie team; Ron Mercer at +2.0; Keith Van Horn at +1.1; and Zydrunas Ilgauskas at +0.7.
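To see why halving lambda gives a rookie more freedom, consider the simplest possible case: one player, everything else held fixed. The shrunk rating is then a weighted average of what the data says and the prior, with lambda acting as the weight on the prior. The sketch below is only an illustration of that idea in Python, not the R call used for the spreadsheet; `shrunk_rating`, the penalty factor argument, and all numbers are hypothetical.

```python
def shrunk_rating(observed, n_stints, prior, lam, penalty_factor=1.0):
    """One-player ridge estimate pulled toward a prior.

    observed:        average margin the data alone would assign (hypothetical)
    n_stints:        amount of data behind that observation
    prior:           value the rating is shrunk toward
    lam:             base penalty shared by all players
    penalty_factor:  per-player multiplier on lam (0.5 = half the shrinkage)
    """
    effective_lam = lam * penalty_factor
    return (n_stints * observed + effective_lam * prior) / (n_stints + effective_lam)


# Same data and same negative prior; only the penalty differs.
full_penalty = shrunk_rating(6.0, 100, -2.0, 200.0)
half_penalty = shrunk_rating(6.0, 100, -2.0, 200.0, penalty_factor=0.5)

print(round(full_penalty, 2))  # 0.67
print(round(half_penalty, 2))  # 2.0
```

With the full penalty the strong season is dragged most of the way back toward the negative prior; with half the penalty the same evidence produces a clearly positive rating.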

The top offensive player was Karl Malone with Barkley and Shaq close behind. Tim Hardaway and Jordan round out the top five. This shows most of Shaq's impact was on offense even when he was younger. Barkley's an interesting result since this was past his prime; perhaps he really was an offensive savant. Sweet shooters Reggie Miller and Hornacek were a short distance away from Jordan on offense, suggesting that outside shooting was quite valuable even in the 90's. As for defense, Mutombo blows everyone away with a +5.7 rating and secures his Defensive Player of the Year trophy. Mourning, McKie, Ewing, and Tyrone Hill fill out the top five. David Robinson and Olajuwon weren't far behind, but this was past their respective primes. McKie's probably one of the most underrated role players of his era, and Hill is probably forgotten but he was part of those tough defensive Iverson/76ers teams.

As a last note, ten out of the top 23 players were on an all-NBA team, while the lowest rated all-NBA player was Vin Baker at +0.2 and, perhaps not surprisingly, the lowest rated all-star was Antoine Walker at -2.5.

Click here for the link to the spreadsheet.

Thursday, December 19, 2013

Portland, Indiana, and the Skewed Perception of Supporting Casts

Even after both teams lost last night, the Pacers and Blazers have two of the four best records in the league. ESPN's forecast gave Indiana a modest improvement to 54 wins from 49 last season, while the forecast also saw a five game improvement for Portland but that still landed them at only 38 wins. Yet the Pacers have destroyed their competition this season with Miami needing a few iffy calls and foul trouble for Hibbert to pull out a slim victory at home, and Portland is almost on pace to double their wins from last season. What happened? What was missed?

Legends are often compared to each other by their team success relative to their supporting casts -- Olajuwon won without another star while David Robinson had to wait for Duncan. Unfortunately, this is a messy and inaccurate method. If we can't judge how valuable the legends are and have to resort to a weird team success method, how can we judge an entire supporting cast of six to nine rotation players? Not only that, but playing style and coaching can drastically alter a team's course.

Looking at Portland's additions, there's nothing to suggest a 20-win improvement. They added Robin Lopez, Mo Williams, Dorell Wright, Thomas Robinson, and CJ McCollum, who hasn't even played. Given the massive change in the effectiveness of the team, one would think Robin Lopez is the next Shaquille O'Neal. But the starting point for how teams improve (who the new guys replace) is not the same in every case. The Blazers' bench was one of the worst of all time. For instance, ESPN writer Kevin Pelton's WARP metric ranked them second worst ever, behind only the 1999 Bulls, in terms of how many wins the bench (defined as everyone except the five most common starters) produced. They shot under 40% and scored the third fewest points in the last 15 years -- and it's not like they were out there for their defense.

When working relative to that point, just adding league average players can make a large difference. There's another factor with the Blazers, though: they used JJ Hickson as the starting center, and it was a disaster for the interior defense. Hickson is barely big enough for a power forward, much less a center. He's also probably the player most overrated by his box score stats. Over several years, multiple teams, and different sets of teammates, advanced forms of adjusted plus/minus continue to find Hickson a net negative player. So you have a player with empty double doubles playing out of position with the next best option being the disappointing rookie Meyers Leonard -- how were we supposed to know how effective the Lillard-Matthews-Batum-Aldridge core was without a competent center?

In fact, going by scoring margin, the disparity between the starters and the rest of the team last season wasn't as large as one would think. Using NBAWOWY.com, the net rating of the starters was a mediocre -0.9, while all other lineups were -4.5. However, this is largely due to Hickson. Looking at all lineups with only Lillard, Wesley Matthews, Batum, and Aldridge, the team was +1.8, while all other lineups were -6.5. In replacing players who dragged the team down along with the sort of natural improvement you'd assume from a young core including a rookie point guard, it's not entirely shocking the Blazers are a good team. It was just so hard to see that with one of the worst sets of benches and centers ever.

| Season | Team | Players | Net rating | Net rating (all other lineups) | Difference |
|---|---|---|---|---|---|
| 2013 | POR | Lillard-Wes-Batum-LA-Hickson | -0.9 | -4.5 | 3.6 |
| 2013 | POR | Lillard-Wes-Batum-LA | 1.8 | -6.5 | 8.3 |
| 2014 | POR | Lillard-Wes-Batum-LA-Lopez | 9.1 | 6.4 | 2.7 |
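For anyone checking the arithmetic, the difference column is just the lineup's net rating minus the net rating of every other lineup, and net rating itself is the scoring margin per 100 possessions. A small sketch, where the function names are mine and the raw totals passed to `net_rating` are a made-up example rather than NBAWOWY data:

```python
def net_rating(points_for, points_against, possessions):
    """Scoring margin per 100 possessions."""
    return 100.0 * (points_for - points_against) / possessions


def lineup_gap(lineup_net, others_net):
    """How much better a lineup is than all other lineups, to one decimal."""
    return round(lineup_net - others_net, 1)


# Net ratings for the Portland lineups as quoted in the post.
portland = [
    ("2013 with Hickson", -0.9, -4.5),
    ("2013 core four", 1.8, -6.5),
    ("2014 with Lopez", 9.1, 6.4),
]
gaps = {label: lineup_gap(lineup, others) for label, lineup, others in portland}
print(gaps)

# Hypothetical totals: outscoring opponents 1050-1000 over 1000 possessions.
example = net_rating(1050, 1000, 1000)
print(example)  # 5.0
```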

The Pacers, meanwhile, held onto their starters while shuffling their bench. Again, their additions are not world-changing: Luis Scola, CJ Watson, Orlando Johnson, Solomon Hill, and Chris Copeland. What's remarkable is that Danny Granger has not played, and he was the driving reason for most of the improvement seen in the projection of the team's season. A reasonable person would not say those players are good enough to turn a good team into an elite one, but that's exactly what's happening. It's not just the improvement of Hibbert and Paul George (one counter is David West has actually been worse this season); the bench is vastly more effective. Last season, the difference between the starters and all other lineups was enormous: roughly the same difference between the '96 Bulls and the '96 Timberwolves. Their starting lineup is as good as any lineup for a contender, and with a decent bench that pushes them into the NBA's stratosphere.

| Season | Team | Players | Net rating | Net rating (all other lineups) | Difference |
|---|---|---|---|---|---|
| 2013 | IND | Hill-Lance St.-George-West-Roy | 13.8 | -4.8 | 18.6 |
| 2014 | IND | Hill-Lance St.-George-West-Roy | 14.6 | 4.4 | 10.2 |

After no major personnel changes, two small market teams are consuming wins at a ferocious pace. This is evidence that we have little understanding of how valuable supporting casts are. If you want to judge LeBron and Dirk by what they achieved when they were "solo," remember that the bench is more difficult to judge than the star player, and all it takes is a bad bench or a misplaced starter to keep a team from piling up wins. Going just by year-to-year changes, one would think Robin Lopez and Luis Scola are all-star players, but we're missing all the details: staring at a single Douglas fir and missing the forest.

Monday, December 9, 2013

The Nightmare in Brooklyn

After going all-in on a bet to become a competitive team now, paying more in luxury taxes than most teams do overall and shelling out first round draft picks, sinking to the bottom of a terrible conference is the worst possible outcome. With a team relying on veterans and a couple stars in their mid-30's, the present was supposed to be at least modestly productive. With injuries and terrible coaching, the blame has already largely been placed. But what else has gone wrong? Are there any signs of improvement or freakishly bad luck?

The problem with the team as a whole starts with their league-worst defense. After bringing in Garnett and Pierce, this was not supposed to be a problem, as even the elder Garnett still had a large, positive effect on defense. However, Basketball-Reference lists their defensive efficiency last in the league at 110.9 points allowed per 100 possessions, or 6.3 worse than average. They've given up 100-plus points to teams like the Magic, Wizards, and the Knicks. With the huge Brook Lopez playing better defense in the middle and the addition of Garnett, this is surprising -- has Garnett really fallen off this quickly? Last season one version of adjusted plus/minus (RAPM) found he was one of the three best defenders in the league and the Celtics were nine points per 100 possessions worse on defense when he was off the court; that's a large impact substantiated by years of results.

Yet the box score stats don't see his defense declining. In fact, he's grabbing 31.4% of all available defensive rebounds, which would be a career high; only 23 qualifying seasons in history better that mark. His block% is the best since his days in Minnesota and is a little above his career average, and his steal% is also slightly better than his career average. Brooklyn is one of the worst defensive rebounding teams in the league, ranked 25th, which inflates his numbers, and one should never blindly trust box score stats, but blocks and steals are athletic indicators whose decline usually precedes the fall, and defensive rebounding is a key component of defense. Synergy isn't pessimistic either: he ranks 157th overall in points per play allowed, which obviously isn't great but isn't a disaster either. He's only been in 12 isolation plays and given up a scant 0.42 PPP, so he's not getting burned. A quarter of his defensive plays have been post-ups, but he's only allowed 0.77 PPP, which is better than average. He's not getting killed in pick and rolls either -- at 0.92 PPP allowed, he's still better than average. Interestingly, roughly 43% of his defensive plays have involved spot-ups, which is a high number for a player you'd rather use on the interior.

Using the brand new SportVU data, Garnett ranks tenth out of 86 players in opponent field-goal percentage near the rim when he's within five feet of the shot at 42.2% (for players who defend at least four such attempts per game). That's remarkable as the league average is near 58%, according to NBA.com. Right now the leader (guarding four attempts a game) is Taj Gibson of the Bulls. Hibbert is fifth, but he also guards the sixth most field-goal attempts at the rim of anyone in the league. But surprisingly, there's a Brooklyn player ranked higher: Brook Lopez is fourth at 39.5%. He's an improved player on defense who's fixed his mistakes by dropping back on pick and rolls, blocks a respectable number of shots, and uses his huge frame (a 9'5" standing reach) fairly well. But with Garnett and Lopez locking down the paint, how can they be last in defensive efficiency?

Breaking down their opponent field-goal percentage by zone, the Nets actually perform well in two key categories: corner threes and the restricted area. They're fifth in FG% in the restricted area and 13th in attempts, while they're somehow first in corner three-point percentage and 15th in attempts per game. Defending the two most efficient spots on the floor well is the same formula the Pacers and Spurs use for their defenses. For example, the Pacers are first in restricted area FG% and second in corner three percentage, and the Spurs are fourth and 12th, respectively. In attempts per game, the Pacers and Spurs are fifth and sixth, respectively, in the restricted area and third and sixth for corner three-pointers.

Since the Nets actually defend the two most important areas well, it's all the more impressive their defense overall is the worst in the league. Outside of the restricted area, they don't defend two-pointers well, but not horribly either. The outlier, however, is non-corner three-pointers: 44.9% is like turning every team into Stephen Curry. The next closest team is at 38.8%. If the Nets defended these shots near the league average of, say, 35% on the same number of attempts, this would mean about 31 fewer three-pointers made or 93 points overall. That's a drastic difference of 4.65 points per game. It also translates to about 2.7 more wins over 20 games and 11 wins over a full season. Those three-pointers alone are enough to bring them close to league average on defense.
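The chain of arithmetic in that paragraph is short enough to write out. This sketch takes the figures quoted above as inputs (31 fewer made threes over 20 games, and 2.7 wins over those 20 games) and only does the multiplication; the function and variable names are mine, and the full-season figure is a straight scaling to 82 games rather than any formal win model.

```python
def points_saved(fewer_made_threes, games):
    """Total and per-game points saved by allowing fewer made three-pointers."""
    total = 3 * fewer_made_threes
    return total, total / games


total_points, per_game = points_saved(fewer_made_threes=31, games=20)
print(total_points)        # 93
print(round(per_game, 2))  # 4.65

# Scale the quoted 2.7 wins over 20 games to an 82-game season.
wins_over_20_games = 2.7
full_season_wins = wins_over_20_games * 82 / 20
print(round(full_season_wins))  # 11
```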

| Brooklyn Nets (opponent shooting by zone) | FG% | FGA/game |
|---|---|---|
| Restricted area | 56.7 (5th) | 25.9 (13th) |
| In the paint (non-RA) | 41.0 (22nd) | 11.7 (12th) |
| Mid-range | 41.1 (23rd) | 22.2 (12th) |
| Corner 3 | 31.6 (1st) | 5.7 (15th) |
| Above the break 3 | 44.9 (30th) | 10.7 (18th) |

Source: stats.NBA.com

The most popular play in the league is the pick and roll, and good defenses defend these well and know how to communicate. Thus, Brooklyn's defense is likely awful here, and you can see it in the game footage. For example, versus the Lakers when you click on Steve Blake's attempts and scroll to the fourth video, you can see Shaun Livingston give Blake a huge amount of space to pressure Gasol. He scurries back, but not quickly, and there certainly wasn't a rotation to cover his lazy double team. Playing the Rockets, the Nets give up a perfect 6/6 to Chandler Parsons from behind the arc. On his first two attempts there's a simple pick set by Howard at the line; the defender goes under and the shot goes up. This is the three-point shot happy Rockets: you have to know your opponent and fight over the pick. And on the third three-pointer, the team simply forgets about him during an offensive rebound and leaves him with as much real estate as possible. You can see these mistakes in other games too.

Howard sets a pick as soon as he reaches the half court: they run this two times in a row for open shots

Rubio handoff to Martin, then a pick by Pekovic: Nets screw this up by basically setting a pick on their own player.

With poor communication and shoddy pick and roll defense, teams are feasting at the top of the three-point line, an area where teams normally shoot around 35%. It might also be why the Nets defend corner threes so well: why swing the ball all the way around when you're open right away? Getting beat at the top of the floor can also explain the high percentage on midrange shots and the high foul rate for the team.

Going back to Garnett: is he a liability now? Has he fallen off that much? The advanced numbers say he's not the source of the league's worst defense, and going through the game footage this is believable. On Pekovic's second to the last field goal attempt here, you can see Garnett defend a pick and roll well (though it's easier when the ball-handler, Rubio, can't shoot) and force a missed shot from Pekovic. Going to a game where defending the ball-handler is trickier because the guards can shoot, Portland, you can see Garnett pull off his patented "harass the ball-handler and jump back to defend the roll man" move.

 Aldridge set a high pick for Mo Williams but Garnett hops out behind the line to harass him

Then Garnett darts back around to cut off an angle for a pass to Aldridge

Though if Garnett is doing well, how can this be the league's worst defense? They're significantly below average in each of the four factors: their effective field-goal percentage is too high because of the shots they give up in the middle of the floor, they don't force many turnovers, they don't rebound well, and they foul too often. With Lopez in the middle, they have always had problems on rebounds, but they also have Reggie Evans and Garnett. With some size on the wings, this wasn't supposed to be a glaring problem, but Pierce has been injured and looking old, while Kirilenko has barely played. In fact, the biggest culprit for their disappointing defense in terms of personnel is Kirilenko's absence: he's the one athletic, active wing player they can use on high scorers, and he can force more turnovers for their deprived team. Instead Alan Anderson has played the third most minutes on the team -- RAPM models found him to be one of the worst overall defenders in the league while the Russian was a large net positive. In addition, Teletovic has played more than anyone expected, and he has trouble staying in front of NBA-level players.

Though even if their defense were much better, the offense is holding them back too. With Lopez coming into his own and plenty of outside shooting, this shouldn't have been an issue, but there have been some historically bad results from some of their top guys, including their three new weapons. There's little chance all of Pierce/Williams/Kirilenko will continue to shoot that poorly, and Deron Williams isn't even old enough for such a giant decline. Along with Brook Lopez, he's the centerpiece of the offense, and if they want to climb high in the standings he'll need to rediscover his jump shot. And Garnett's percentages are so atrocious it's inconceivable they could remain that low. He's been relying on that long two-pointer for most of his offense as he's aged into his 30's, and there's no reason it should suddenly abandon him. But the real problem has been how he fares inside the paint: he's shooting 48% at the rim, which would be terrible even for a guard, after years of being above 60%, and his foul rate is a mere quarter of what it was last season. We've seen slow starts from Garnett before, however, and he'll need to regain his lift again.

| TS% | 2014 | 2013 | Career average |
|---|---|---|---|
| Paul Pierce | 50.4 | 55.9 | 56.8 |
| Deron Williams | 49.3 | 57.4 | 55.8 |
| Andrei Kirilenko | 51.4 | 59.0 | 57.2 |
| Kevin Garnett | 37.9 | 53.5 | 54.7 |
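For reference, TS% (true shooting percentage) folds two-pointers, three-pointers, and free throws into one efficiency number using the standard formula: points divided by twice the sum of field-goal attempts and 0.44 times free-throw attempts. A quick sketch; the stat line in the example is invented, not any of the players above.

```python
def true_shooting_pct(points, fga, fta):
    """True shooting percentage: PTS / (2 * (FGA + 0.44 * FTA)), as a percent."""
    return 100.0 * points / (2.0 * (fga + 0.44 * fta))


# Hypothetical game: 20 points on 15 field-goal attempts and 5 free throws.
example_ts = true_shooting_pct(points=20, fga=15, fta=5)
print(round(example_ts, 1))  # 58.1
```

The 0.44 multiplier is the usual approximation for how many free-throw attempts use up a possession, which is why a player whose foul rate collapses can see TS% fall even if the jump shot holds steady.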

Synergy has their offense as a heavy post-up system (13.4% of their plays are post-ups, one of the highest rates in the league) with a high amount of spot-up shots and plays from the ball-handler in the pick and roll. But they're 20th and 23rd, respectively, in points per play for ball-handler and spot-up plays. If Deron Williams plays like he's able to and some of their shooters like Pierce regress upward toward the mean, there's plenty of room to improve. The team probably doesn't even need a sophisticated system to succeed because they have a number of isolation/post scorers.

With the worst Eastern Conference in history and an Atlantic division of losing teams, there's still hope to make the playoffs. There's potential too. How can these players continue to shoot so poorly, and how can they continue to allow 45% shooting from behind the line (on non-corner attempts)? There's been awful coaching and dysfunction, where the best coaching move so far has been a spilled drink, but this has been heavily highlighted by the media and changes are already underway. Given the strange management from the eccentric Russian billionaire and the GM Billy King, there's no guarantee they'll "fix" the coaching, but it's hard to believe the situation could get any more dire (let's hope Kidd's solution isn't to become the first player-coach in decades). But with 101 million dollars spent on the roster and the loss of several first round picks, hampering the future of a team that already has 62 million on the books in 2016 for Joe Johnson, Deron Williams, and Brook Lopez alone, a low seed isn't a victory by itself; they want to contend.

Good luck.

Friday, December 6, 2013

1997-98 RAPM: Non-prior Informed

Background for what RAPM is: +/- was a revolution for the NBA because it allowed a completely new method of evaluating players. You look at how a team scores and defends with you on the court and without you. When you set players as variables, you can use regression to calculate player impact. It's a full scope view of what matters in a game: outscoring your opponent. However, it's noisy for a number of reasons. One is that some players almost always share the court while other combinations are rare, so the regression struggles to separate their individual effects (this is known as collinearity). Another is that the models don't deal well with players with low minutes, as they don't have enough of a sample for an accurate estimate and will often produce a ludicrous result just to "fit" the data better.

In simple terms, RAPM deals with this by introducing a heavy dose of regression to the mean. While traditional adjusted +/- creates a model by minimizing error (the difference between the actual points per possession scored/allowed and the expected), RAPM also minimizes the coefficients in the model using a lambda term. The coefficients are reduced toward the "prior," which can be set as zero or as a set of prior values (like the previous season's result.) Players with few possessions/minutes will have results close to their priors because their sample size isn't big enough to prove to the model they're more or less valuable.

Continuing my work on breaking down the new play-by-play data from NBA.com, here's the non-prior informed (NPI) version of RAPM. This version is less predictive than an informed RAPM set, and it often has some wonky results, but it's another tool to use in evaluating NBA players. 1998 was an intriguing season historically as it was the last for Jordan on the Bulls, preceded the destructive lockout that delayed the '99 season, and was perhaps the last hurrah for a generation of stars who gave way to Duncan, Shaq, Kobe, et al. Going by his stats, it wasn't Jordan's best season during their six-peat, but he still nailed a crucial shot over the Jazz for a championship. It was the second straight year they met in the finals, with Jordan also taking back the MVP trophy after Malone stole it in 1997. However, by RAPM and considering minutes, Karl Malone was the most valuable player.

 *When you reference the spreadsheet, try to include the version number. This will reduce future discrepancies.

Although Jordan and Malone got all the MVP attention, Shaq was an absolute monster -- a +6 RAPM on non-prior informed is enormous (all players, including Shaq, were regressed to 0 as a prior, so a high value is impressive.) Alas, he only played 2000 minutes as he missed many games with injuries. The iron-man Stockton was actually the same: he only played 1800+ minutes with nearly 20 missed games. If you scoff at these two leading the league, refer to NBA.com's raw +/- for 1998 where both players lead the league. Jordan is only 6th here and near teammate Kukoc. Strangely, Kukoc's defensive RAPM is one of the highest for the season; there's perhaps a weird interaction between those two and Pippen's missed games. This is where a prior-informed version would excel: assume Jordan has a superstar impact, Harper and Kukoc with more modest impacts, and crunch the numbers from there.

As for the leaders for each category, Barkley was nearly +5 just on offense, while interestingly Olajuwon was a paltry -0.8. This is likely a one-year hiccup due to limited data, but it's an intriguing note. Shaq was second, as that was one of his many seasons destroying the league: a 58.7 TS% with a 33 usage. Reggie Miller's shooting brings him in at number three, and MVP Malone is a close fourth. Detlef Schrempf was fifth, as the Sonics survived the loss of Kemp.

On defense, there were a few bizarre players in the top spots, but there were also the usual suspects. Mutombo won the DPOTY award, and RAPM agrees: he has the top spot. Yet Ben Wallace and David Robinson were third and fifth, respectively, while Kukoc and Jaren Jackson were second and fourth. As for an unheralded guy, Aaron McKie was a close sixth. He's known for being a tough defender, but not Defensive Player of the Year worthy. He was traded midseason (Pistons to the 76ers), but neither team was enormously better with him; both actually only slightly improved on defense when he was on the team. Also, Duncan won Rookie of the Year in one of the best rookie seasons ever; RAPM agrees.

With two consecutive season files completed for RAPM, a prior-informed version can now be created for this season. Hopefully, some of the strange results, like Kukoc as a +4 on defense, will be eliminated or reduced. Look for a prior-informed version to be posted soon along with a version of RAPM informed by a statistical plus/minus. The great MVP debates for Jordan and Malone now have more information, and it's exciting to have these advanced metrics for legends including those two players and a young, explosive Shaq.

Click here for the link to the spreadsheet.

Edit: I fixed a problem in the matchup file producing better results for version 2.0.