Existing TS% Metrics in the NBA
In the NBA, efficiency metrics like True Shooting Percentage (TS%), Relative True Shooting Percentage (rTS%), and True Shooting Plus (TS+) have become ubiquitous for evaluating player performance. However, these metrics have limitations that a Z-score normalized metric can address.
But before I get to these limitations, here is a brief review of what these metrics set out to achieve:
True Shooting Percentage (TS%):
TS% is a shooting efficiency metric that accounts for field goals, three-point field goals, and free throws.
Limitations
- TS% is a raw percentage and does not account for the context within the league or the variability among players. Nor does it account for league-wide growth in scoring efficiency over time.
- It treats all players equally without considering how they compare to the league average.
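For reference, TS% is conventionally computed as points divided by twice the "true shooting attempts", where free throw attempts are weighted by 0.44. The sketch below uses that standard formula; the function name and the stat line are made up for illustration and are not taken from the data in this post.

```python
def true_shooting_pct(points: float, fga: float, fta: float) -> float:
    """Conventional TS% formula: PTS / (2 * (FGA + 0.44 * FTA))."""
    attempts = fga + 0.44 * fta  # estimated true shooting attempts
    if attempts == 0:
        raise ValueError("player has no shooting attempts")
    return points / (2 * attempts)


# Hypothetical season line: 2000 points on 1500 FGA and 500 FTA.
ts = true_shooting_pct(points=2000, fga=1500, fta=500)
print(round(ts, 3))  # 0.581
```

Note that nothing in the formula refers to the rest of the league, which is exactly the limitation listed above.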
Relative True Shooting Percentage (rTS%):
rTS% measures a player’s TS% relative to the league average TS%.
Limitations
- While it contextualizes a player’s performance, it does not account for the distribution or variability of TS% within the league.
- It can be influenced by extreme values (outliers), which skews the perception of a player’s efficiency.
- It is harder to interpret because it is not expressed on a normalized scale.
True Shooting Plus (TS+):
TS+ is an index where 100 represents the league average, and values above or below 100 indicate better or worse performance, respectively. It is widely considered the gold standard for comparing efficiency across NBA eras. Although it shares the limitations of rTS%, it does normalize to some extent.
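To make the two league-relative metrics concrete, here is a minimal sketch using their usual definitions: rTS% as the difference from the league average (in percentage points) and TS+ as the ratio to the league average, scaled so that 100 is average. The function names and the numbers are hypothetical.

```python
def relative_ts(player_ts: float, league_ts: float) -> float:
    """rTS%: player TS% minus league-average TS%, in percentage points."""
    return 100 * (player_ts - league_ts)


def ts_plus(player_ts: float, league_ts: float) -> float:
    """TS+: player TS% as a share of league-average TS%, where 100 is average."""
    return 100 * player_ts / league_ts


# Hypothetical player at .600 TS% in a league averaging .550.
r_ts = relative_ts(0.600, 0.550)
plus = ts_plus(0.600, 0.550)
print(round(r_ts, 1), round(plus, 1))  # 5.0 109.1
```

Both metrics use only the league mean. Neither uses the spread of TS% across players, which is the gap the Z-score approach is meant to fill.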
Creating the Z-score normalized TS%
Z-score normalization converts each data point into the number of standard deviations it sits away from the mean, which accounts for both the distribution of scores and their variability.
Z-scores are a superior method for normalizing efficiency metrics in the NBA due to their ability to provide contextual relevance, robustness to outliers, enhanced comparability across eras, and clearer interpretation. While traditional metrics like TS%, rTS%, and TS+ are valuable, they lack the depth and robustness that Z-scores offer. By adopting Z-score normalization, we can achieve a more accurate and comprehensive understanding of player performance across eras, with easier interpretation.
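Written out, the Z-score for a player's season looks like the following. The indexing is an assumption on my part: i denotes a player and s a season, with the mean and standard deviation taken over the qualifying players in that season.

```latex
z_{i,s} = \frac{\mathrm{TS\%}_{i,s} - \mu_s}{\sigma_s},
\qquad
\mu_s = \frac{1}{n_s}\sum_{j=1}^{n_s} \mathrm{TS\%}_{j,s},
\qquad
\sigma_s = \sqrt{\frac{1}{n_s - 1}\sum_{j=1}^{n_s}\left(\mathrm{TS\%}_{j,s} - \mu_s\right)^2}
```

Here n_s is the number of qualifying players in season s. The sample (n_s - 1) form of the standard deviation is shown; with hundreds of players per season, the population form gives nearly the same value.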
To understand the process behind the estimation of the metric, here’s a stepwise list:
- All TS% scores were ordered by possessions played, and only the top 90% were kept for further analysis, removing small samples and extreme outliers.
- All filtered TS% scores were centred and scaled based on the Gaussian (normal) distribution. To examine how well the scores were distributed, a Shapiro-Wilk test for normality was conducted across 32 seasons and was significant at p<0.001. (NOTE: Shapiro-Wilk is fairly flawed, but it is still a good rudimentary check of the distribution.)
- All the Z-scores were converted to percentile norms using the normal distribution, for ease of comparison. You can refer to the probabilities of a standard normal distribution to get a better idea of the conversion.
- The final output was a new zTS% metric that allows easier comparison across eras.
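The steps above can be sketched roughly as follows. This is not the code used to produce the tables in this post; it is a minimal standard-library illustration that assumes the normalization is done separately for each season, that the cutoff is applied by rank on possessions, and that the sample standard deviation is used. The function name, dictionary keys, and toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def z_ts_by_season(rows, keep_fraction=0.90):
    """rows: dicts with 'player', 'season', 'poss' and 'ts_pct' keys (assumed layout)."""
    by_season = defaultdict(list)
    for row in rows:
        by_season[row["season"]].append(row)

    std_normal = NormalDist()  # mean 0, standard deviation 1
    scored = []
    for players in by_season.values():
        # Step 1: keep the top 90% of players by possessions played.
        ranked = sorted(players, key=lambda r: r["poss"], reverse=True)
        kept = ranked[: round(len(ranked) * keep_fraction)]
        if len(kept) < 2:
            continue  # not enough players to estimate a spread
        # Step 2: centre and scale TS% within the season.
        ts_values = [r["ts_pct"] for r in kept]
        mu, sigma = mean(ts_values), stdev(ts_values)
        if sigma == 0:
            continue  # every player identical, z-scores undefined
        for r in kept:
            z = (r["ts_pct"] - mu) / sigma
            # Step 3: map the z-score to a percentile via the normal CDF.
            scored.append({**r, "ts_pct_z": z, "ts_perc": 100 * std_normal.cdf(z)})
    return scored


# Toy season: nine regulars plus one low-possession outlier shooting .900.
toy_rows = [
    {"player": f"P{i}", "season": "toy", "poss": 1000 - 100 * i, "ts_pct": 0.50 + 0.02 * i}
    for i in range(9)
]
toy_rows.append({"player": "LowMinutes", "season": "toy", "poss": 50, "ts_pct": 0.90})
result = z_ts_by_season(toy_rows)
```

In the toy season the low-possession outlier is dropped before the mean and standard deviation are computed, so it cannot distort anyone else's score.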
How is it an improvement?
- Robustness to Outliers:
  - Outliers are extreme values that can disproportionately affect statistical analyses.
  - Z-scores reduce the impact of outliers by contextualizing each player’s performance within the league’s overall variability. Extreme values have less influence on the mean and standard deviation compared to how they would skew simple averages or ratios.
- Enhanced Interpretability:
  - Z-scores standardize the scale of measurement, making it easier to compare performances across different seasons or contexts. A Z-score of 1.0, for example, consistently indicates one standard deviation above the mean, regardless of the underlying distribution’s specifics.
- Translating Z-Scores into Percentiles:
  - By converting Z-scores to percentiles, we provide an intuitive measure of performance. A player with a zTS% of 85% is more easily understood as being in the top 15% of efficient players in the league.
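As a quick illustration of the Z-score to percentile translation, the standard normal CDF does the conversion in one direction and its inverse does it in the other. This is a small sketch using Python's standard library; the helper names are made up.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def z_to_percentile(z: float) -> float:
    """Share of a normal population (in %) falling below this z-score."""
    return 100 * std_normal.cdf(z)


def percentile_to_z(percentile: float) -> float:
    """Inverse mapping: the z-score that sits at the given percentile."""
    return std_normal.inv_cdf(percentile / 100)


print(round(z_to_percentile(1.0), 2))  # 84.13
print(round(percentile_to_z(85), 2))   # 1.04
```

So a Z-score of 1.0 lands at roughly the 84th percentile, and the 85th percentile from the example above corresponds to a Z-score of about 1.04. Feeding in the rounded Z-scores printed in the tables may differ slightly from the listed percentiles, since the tables presumably used unrounded values.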
Kobe, an inefficient chucker or a product of his era?
The only thing left was to put this new method to the test using the most controversial player of the modern era: Kobe Bryant. Kobe is often considered an inefficient player compared to several other stars.
PLAYER_NAME | NICKNAME | GP | MIN | POSS | year | TS_PCT | TS_PCT_Z | TS_PERC |
---|---|---|---|---|---|---|---|---|
Kobe Bryant | Kobe | 71 | 15.5 | 2211 | 1996-97 | 0.54 | 0.37 | 64.48 |
Kobe Bryant | Kobe | 79 | 26.1 | 4157 | 1997-98 | 0.55 | 0.73 | 76.79 |
Kobe Bryant | Kobe | 50 | 37.9 | 3676 | 1998-99 | 0.55 | 0.83 | 79.61 |
Kobe Bryant | Kobe | 66 | 38.2 | 5047 | 1999-00 | 0.55 | 0.70 | 75.87 |
Kobe Bryant | Kobe | 68 | 41.0 | 5460 | 2000-01 | 0.55 | 0.91 | 81.93 |
Kobe Bryant | Kobe | 80 | 38.3 | 5991 | 2001-02 | 0.54 | 0.72 | 76.47 |
Kobe Bryant | Kobe | 82 | 41.5 | 6699 | 2002-03 | 0.55 | 0.88 | 81.00 |
Kobe Bryant | Kobe | 65 | 37.7 | 4853 | 2003-04 | 0.55 | 0.94 | 82.72 |
Kobe Bryant | Kobe | 66 | 40.8 | 5196 | 2004-05 | 0.56 | 0.90 | 81.63 |
Kobe Bryant | Kobe | 80 | 41.0 | 6371 | 2005-06 | 0.56 | 0.74 | 77.06 |
Kobe Bryant | Kobe | 77 | 40.8 | 6263 | 2006-07 | 0.58 | 0.94 | 82.65 |
Kobe Bryant | Kobe | 82 | 38.9 | 6488 | 2007-08 | 0.58 | 0.92 | 82.04 |
Kobe Bryant | Kobe | 82 | 36.1 | 5911 | 2008-09 | 0.56 | 0.59 | 72.31 |
Kobe Bryant | Kobe | 73 | 38.8 | 5570 | 2009-10 | 0.54 | 0.23 | 59.12 |
Kobe Bryant | Kobe | 82 | 33.9 | 5361 | 2010-11 | 0.55 | 0.33 | 62.82 |
Kobe Bryant | Kobe | 58 | 38.5 | 4332 | 2011-12 | 0.53 | 0.21 | 58.42 |
Kobe Bryant | Kobe | 78 | 38.6 | 6020 | 2012-13 | 0.57 | 0.85 | 80.23 |
Kobe Bryant | Kobe | 35 | 34.5 | 2434 | 2014-15 | 0.48 | -0.94 | 17.42 |
Kobe Bryant | Kobe | 66 | 28.2 | 3820 | 2015-16 | 0.47 | -1.34 | 9.07 |
Based on the above, you can say that Kobe was around the 80th percentile in league efficiency across most of his prime, or you can simply state that he was more efficient than 80% of the league during his prime.
Similarly, for LeBron:
PLAYER_NAME | NICKNAME | GP | MIN | POSS | year | TS_PCT | TS_PCT_Z | TS_PERC |
---|---|---|---|---|---|---|---|---|
LeBron James | LeBron | 79 | 39.6 | 6053 | 2003-04 | 0.49 | -0.36 | 35.85 |
LeBron James | LeBron | 80 | 42.3 | 6423 | 2004-05 | 0.55 | 0.72 | 76.45 |
LeBron James | LeBron | 79 | 42.5 | 6404 | 2005-06 | 0.57 | 0.92 | 82.08 |
LeBron James | LeBron | 78 | 40.9 | 6124 | 2006-07 | 0.55 | 0.40 | 65.67 |
LeBron James | LeBron | 75 | 40.4 | 5831 | 2007-08 | 0.57 | 0.77 | 78.06 |
LeBron James | LeBron | 81 | 37.7 | 5769 | 2008-09 | 0.59 | 1.18 | 88.17 |
LeBron James | LeBron | 76 | 39.0 | 5728 | 2009-10 | 0.60 | 1.40 | 91.94 |
LeBron James | LeBron | 79 | 38.8 | 5916 | 2010-11 | 0.59 | 1.23 | 89.03 |
LeBron James | LeBron | 62 | 37.5 | 4533 | 2011-12 | 0.60 | 1.60 | 94.53 |
LeBron James | LeBron | 76 | 37.9 | 5549 | 2012-13 | 0.64 | 2.16 | 98.48 |
LeBron James | LeBron | 77 | 37.7 | 5634 | 2013-14 | 0.65 | 2.28 | 98.86 |
LeBron James | LeBron | 69 | 36.1 | 4889 | 2014-15 | 0.58 | 0.98 | 83.66 |
LeBron James | LeBron | 76 | 35.6 | 5344 | 2015-16 | 0.59 | 1.10 | 86.44 |
LeBron James | LeBron | 74 | 37.8 | 5655 | 2016-17 | 0.62 | 1.48 | 93.11 |
LeBron James | LeBron | 82 | 36.9 | 6250 | 2017-18 | 0.62 | 1.41 | 92.11 |
LeBron James | LeBron | 55 | 35.2 | 4218 | 2018-19 | 0.59 | 0.70 | 75.76 |
LeBron James | LeBron | 67 | 34.6 | 4910 | 2019-20 | 0.58 | 0.32 | 62.68 |
LeBron James | LeBron | 45 | 33.4 | 3145 | 2020-21 | 0.60 | 0.67 | 74.88 |
LeBron James | LeBron | 56 | 37.2 | 4380 | 2021-22 | 0.62 | 1.04 | 85.11 |
You can see that he’s hovered around the 90th percentile during his prime, and that number jumps to the 98th percentile in 2012-13 and 2013-14.
Finally, for my favourite player, Nikola Jokic:
PLAYER_NAME | NICKNAME | GP | MIN | POSS | year | TS_PCT | TS_PCT_Z | TS_PERC |
---|---|---|---|---|---|---|---|---|
Nikola Jokic | Nikola | 80 | 21.7 | 3488 | 2015-16 | 0.58 | 0.98 | 83.58 |
Nikola Jokic | Nikola | 73 | 27.9 | 4264 | 2016-17 | 0.64 | 1.90 | 97.10 |
Nikola Jokic | Nikola | 75 | 32.6 | 4983 | 2017-18 | 0.60 | 1.05 | 85.43 |
Nikola Jokic | Nikola | 80 | 31.3 | 5140 | 2018-19 | 0.59 | 0.72 | 76.37 |
Nikola Jokic | Nikola | 73 | 32.0 | 4733 | 2019-20 | 0.60 | 0.80 | 78.91 |
Nikola Jokic | Nikola | 72 | 34.6 | 5084 | 2020-21 | 0.65 | 1.47 | 92.93 |
Nikola Jokic | Nikola | 74 | 33.5 | 5136 | 2021-22 | 0.66 | 1.77 | 96.18 |
Nikola Jokic | Nikola | 69 | 33.7 | 4873 | 2022-23 | 0.70 | 2.14 | 98.40 |
Nikola Jokic | Nikola | 79 | 34.6 | 5599 | 2023-24 | 0.65 | 1.42 | 92.17 |
He’s an efficiency god, with multiple seasons above the 90th percentile.
How to Access the data
Unfortunately, I am still in the process of creating the app that will make all the data publicly available. In the meantime, if you’re interested in data for a single player, I created a hacky solution for you: https://codepen.io/Gareth-Bale-the-bashful/pen/qBzZejW
I love this
Any way you could do this for playoff data?
So basically you are trying to work on a WAR for basketball?
This is great, good luck on further work with it
99.71% for 2016 Steph. I wonder what’s the closest a volume shooter got to 100%.
This the kinda shit I’ll see in a jxmyhighroller video and have to pretend like I understand
Why is there only data for the play-by-play era? Wouldn’t this all be calculated just with basic boxscore stats?
This is great. I think the ultimate stat would be to take this and add in both shot quantity and shot difficulty. I would imagine Kobe would absolutely dominate that combined metric if he already manages 80th percentile with how ludicrously hard his attempts were and just how much he shot.
This is awesome! I was just talking to a friend about how z-scores would be a better tool for evaluating GOATs across eras and would love to see it for the advanced metrics like PER, win shares, and so on. The GOAT debate is more emotional than it is data driven (some people just have stronger attachments to some players and that’s fine since that’s sort of what sports fandom is anyway), but it would still make it easier to argue who is “better” when there are too many confounding variables to make direct comparisons across positions/eras actually useful. Well done!
westbrook’s numbers got me fucked up rn
Something that’s always bothered me about rTS% is that while correcting for eras having different scoring environments is a good idea, at least some of the recent spike in shooting percentages is due to improved decision making, right?
If two players have the same rTS%, but the first guy did it back when his competition was regularly shooting contested 20 footers, isn’t the second guy still marginally better?
So just elementary statistics
Well, I could use more advanced methods and find non-linear combinations of several efficiency metrics, convert them back into z-scores and do the same. But this is simple and intuitive. Parsimony is the heart of statistics; I won’t go run UMAPs or neural networks for simpler statistical designs just to show off.
It’s fine, I just think it’s funny how “let’s look at the distribution” just blows the mind of stats noobs.
Yes.
But have you seen the replies? This is impressive stuff to most readers.
The real shocker is that no one has done this before. The NFL has been using a z-score normalized rating (DVOA) for over a decade. Bill Barnwell has talked in terms of z-scores since his Grantland days.
The NBA is probably on the cutting edge inside of league offices, but the actual reported stats are still weirdly basic.
Sorting by Stephen Curry and Kevin Durant is wild
consistent high 90th percentage. imagine if they become teammates
Durant had 7 straight seasons above the 95th percentile. Curry in 99th twice.
Not a statistician, but I wonder how much this changes based on the population of players you use to compute the average and std dev. You mentioned top 90% by possessions played, which seems like you will include a lot of scrubs. You might get a different distribution with just starters or top 100 by minutes
Kobe’s legacy being boosted by the Kwame Browns of the world
Scrubs often have solid percentages because of low volume though.
I just want to note the hilarity of running Shapiro-Wilk (which involves ordering the observations which allows you to read off percentiles directly) just so you can derive a less accurate percentile.
It’s just a test for normality, it wasn’t used in derivation of the efficiency metric. Instead of manually examining each distribution for each season, it’s just simpler to do it this way.
I did mention it’s not the best way, but the CLT is enough justification to use normal quantiles.