Friday, October 1, 2010

Wins Produced and Statistical "Skepticism"

Because I'm stuck using a Mac for the time being, at the moment I have no idea how to make the nice images I like to include in my posts; in an effort to produce some meaningful content, I'm going to do a general post on Wins Produced and productivity in basketball.

Wins Produced (WP) is the best metric for evaluating NBA players. Don't just take my word for it; read all of these articles. But there will always be "skeptics", who, when presented with statistical evidence, proven predictive power, and the background theory behind WP, just can't let go of their "gut" feelings.

With that in mind, I'm going to try to explain the rationale behind WP. David Berri, Arturo Galletti, Andres Alvarez, and others in the Wages of Wins network have largely appealed to the reality-based community and relied on statistical arguments in favour of WP; I'm going to forgo the numbers and deal mostly with words.

How to win in basketball

How do you win a basketball game? It seems obvious: all you have to do is score more than your opponent. That means that doing things that get you points is helpful, as is doing things that prevent your opponent from scoring. Things you do that cost your team points are not helpful, and neither are things that lead to your opponent scoring points. So here are the four major categories of actions in a basketball game:
  1. Actions that help your team score points (positive outcome)
  2. Actions that help your opponent score points (negative outcome)
  3. Actions that prevent your opponent from scoring points (positive outcome)
  4. Actions that prevent your team from scoring points (negative outcome)
And the three derivative categories:
  1. Actions that both help your team score points and prevent your opponent from scoring points (double positive outcome)
  2. Actions that both help your opponent score points and prevent your team from scoring points (double negative outcome)
  3. Actions that don't fall into any of the above six categories (neutral outcome)
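
For anyone who thinks better in code, here is a rough sketch of the seven categories as a lookup table. The +1/0/-1 encoding and the names `Category`, `CATEGORY_TABLE`, and `classify` are my own illustrative assumptions; they are not part of Wins Produced or anyone's published method.

```python
from enum import Enum


class Category(Enum):
    HELPS_OWN_SCORING = "helps your team score (positive)"
    HELPS_OPP_SCORING = "helps your opponent score (negative)"
    PREVENTS_OPP_SCORING = "prevents your opponent from scoring (positive)"
    PREVENTS_OWN_SCORING = "prevents your team from scoring (negative)"
    DOUBLE_POSITIVE = "helps your team and hurts your opponent (double positive)"
    DOUBLE_NEGATIVE = "helps your opponent and hurts your team (double negative)"
    NEUTRAL = "none of the above (neutral)"


# Hypothetical encoding: +1 = helps that side score, -1 = prevents it, 0 = no effect.
# Keys are (effect on your team's scoring, effect on your opponent's scoring).
CATEGORY_TABLE = {
    (+1, 0): Category.HELPS_OWN_SCORING,
    (0, +1): Category.HELPS_OPP_SCORING,
    (0, -1): Category.PREVENTS_OPP_SCORING,
    (-1, 0): Category.PREVENTS_OWN_SCORING,
    (+1, -1): Category.DOUBLE_POSITIVE,
    (-1, +1): Category.DOUBLE_NEGATIVE,
}


def classify(own_effect: int, opp_effect: int) -> Category:
    """Map an action's two effects onto one of the seven categories."""
    if own_effect not in (-1, 0, 1) or opp_effect not in (-1, 0, 1):
        raise ValueError("effects must be -1, 0, or +1")
    # Anything outside the six named categories lands in the catch-all.
    return CATEGORY_TABLE.get((own_effect, opp_effect), Category.NEUTRAL)


print(classify(+1, 0).value)   # first major category
print(classify(+1, -1).value)  # positive derivative category
print(classify(0, 0).value)    # neutral derivative category
```

Note that an action that helps both teams score, or hurts both, isn't one of the six named categories, so under this encoding it falls into the neutral catch-all.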

Now that we have that settled, it should be pretty easy to figure out which on-court actions help your team win. Here are some of the on-court actions that clearly belong in the first major category, "actions that help your team score points":
  • Making a 2pt shot
  • Making a 3pt shot
  • Making a free-throw
  • Getting an assist
These directly lead to your team scoring points; if you do any of these things, your team will have scored one or more points.

Here are some of the on-court actions that clearly belong in the second major category, "actions that help your opponent score points":
  • Letting an opponent score points (i.e., playing terrible defence)
  • Fouling an opponent (leads to free-throws)
  • Getting a technical foul (leads to free-throws)
  • Goaltending
  • Scoring on your own basket
It's pretty clear that doing any of these things leads to your opponent scoring; if you do any of these things, the opposing team will have scored one or more points.

Now for the on-court actions that belong in the third major category, "actions that prevent your opponent from scoring points". It's actually very hard to think of on-court actions that only belong in this category (if you can think of any more, leave me a comment):
  • Preventing an opponent from scoring (i.e., playing good defence)
You could probably make the argument that even this example doesn't truly belong solely in this one category.

Finally, the fourth major category, "actions that prevent your team from scoring points":
  • Missing a 2pt shot
  • Missing a 3pt shot
  • Missing a free-throw
A missed shot is one or more points not scored.

Now we get into the three derivative categories, which I find much more interesting. Let's go over the positive one first, "actions that both help your team score points and prevent your opponent from scoring points":
  • Grabbing an offensive rebound
  • Grabbing a defensive rebound
  • Stealing the ball from an opponent
  • Forcing a turnover
  • Drawing a charge
  • Blocking the shot of an opponent
  • Winning a jump-ball
All of these actions get your team possession; getting possession not only affords you more opportunities to score, but also prevents the opposing team from gaining possession (and thus prevents the opposing team from scoring as well).

Now let's look at the negative derivative category, "actions that both help your opponent score points and prevent your team from scoring points":
  • Turning the ball over
  • Getting an offensive foul
  • Losing a jump-ball
  • Allowing an opponent to get an offensive rebound
  • Allowing an opponent to get a defensive rebound
Again, these are ways of losing possession. The last one is a bit contrived; teams generally rebound 70% - 75% of an opponent's misses, so "allowing" an opponent to get a defensive rebound is really to be expected.

Then we have the neutral derivative category, "actions that don't fall into any of the above six categories". This is another one that is hard to think up examples for:
  • Winning an opening tip
  • Losing an opening tip
Winning an opening tip means you gain the opening possession of the first and fourth quarters, but lose the opening possessions of the second and third quarters. Similarly, losing an opening tip means you lose the opening possession of the first and fourth quarters, but gain the opening possession of the second and third quarters. The net effect of either is neutral; the purpose of the opening tip is to start the game without giving an unfair advantage to either team.
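
The bookkeeping is simple enough to check with a few lines of code. This is only a sketch of the rule as I described it above (tip winner opens the first and fourth quarters, tip loser opens the second and third); the function names are made up for illustration.

```python
def opening_possessions(won_tip: bool) -> dict:
    """For each quarter, True if your team opens it with the ball."""
    winner_quarters = {1, 4}  # quarters opened by the team that won the tip
    loser_quarters = {2, 3}   # quarters opened by the team that lost the tip
    mine = winner_quarters if won_tip else loser_quarters
    return {quarter: quarter in mine for quarter in (1, 2, 3, 4)}


def net_opening_possessions(won_tip: bool) -> int:
    """Opening possessions gained minus opening possessions lost."""
    possessions = opening_possessions(won_tip)
    gained = sum(possessions.values())
    lost = len(possessions) - gained
    return gained - lost


print(net_opening_possessions(True))   # 0
print(net_opening_possessions(False))  # 0
```

Either way you open two quarters and your opponent opens two, so the net is zero.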

To what degree?

Now that we know which actions do what, it's up to the stats experts to do analysis and come up with formulas that can determine to what degree each on-court action is responsible for winning. Unfortunately, some on-court actions - like quality of defence - are notoriously hard to measure. Stats experts have to rely on the only on-court actions that are consistently recorded for basketball - box score statistics. Yes, box score statistics can't measure all defensive contributions, but they do measure some of the defensive contributions each player makes (remember, steals, turnovers, and blocks are all defensive measures that are a part of the standard box score). So, using the standard box score stats, David Berri came up with Wins Produced (WP), which explains 95% of a team's wins. It's not perfect, no; but 95% is pretty damn good, and far more accurate than anything else out there. If you are interested in the methodology behind the specific way that WP determines productivity, I won't even try to give a detailed explanation - I'll just give you the links and let you have fun without me.

I hope that this post will help all you stats "skeptics" out there to reconsider your arguments. But, if that's not the case, I hope it was at least somewhat interesting.

- Devin.


  1. Devin,
    Brilliant! I know this kind of stuff is often said in pieces and at the start of articles, but it's always good to have a full rundown.

    The odd part is most people agree with this type of analysis. The problem is that once you say "Ok now let's try and put a number to some of these", people get upset.

    I've been digging a lot of your explaining things in human language, and your glossary is great. Keep up the good work!

  2. "actions that prevent your opponent from scoring points" -- holding the ball for the last shot in the quarter?

  3. I like WP48, and I agree that it is the best metric.

    That said, most debate in the stats community is not over which things are important, but how important they are. And there is a lot of evidence, anecdotal or written and statistical, that shows defensive rebounds as less valuable (due to diminishing returns) and shot creation as more valuable (aka maintaining efficiency with increased usage).

    I know Berri often says in the comments or posts that 'the diminishing effects on defensive rebounding are not large' but I personally have never seen any evidence that they are so insignificant as to be ignored. And I've also never seen evidence that players should be expected to maintain efficiency with more usage. So those are the real issues.

  4. Devin,

    I have to take issue with the direction of this piece; it is far too clear and precise! Many of us prefer more wiggle room for 'debate.' Your accuracy is a bit of a buzzkill.

    But seriously, nice work!

  5. Andres: I wrote this with Dan in mind - this is how I'd start explaining WP to him. Once this first basic part is done, then you move into the correlation and math stuff, which yes, must be explained separately and differently, and yes, it's where you lose a lot of people. Of course I think it's pretty straightforward stuff - I loved Arturo's post on it (the last link I included in this post) - but a lot of people out there are afraid of math and numbers. Maybe the next explanation should be "how correlation works".

    Fred: I thought about holding onto the ball - in the situation you cited, I don't think it quite works in the "actions that prevent your opponent from scoring points" category, because towards the end of your possession you are going to try to score. I think that would belong in the "actions that both help your team score points and prevent your opponent from scoring points" category. But I do think that if your team is holding onto the ball with no intention of scoring any points and won't turn the ball over due to the shot clock expiring, then you could say that holding onto the ball falls into the "denying opponents points" category - but that situation doesn't happen very often (maybe only when dribbling out the game at the very end).

    Austin: I think diminishing returns is a valid criticism - particularly when it comes to rebounding. But the fact is that WP explains 95% of a team's wins, so I have to side with Berri on this one. Having two good rebounders is always preferable to having one good rebounder, and while the two may not get as many playing on the same team as they would on different teams, there are still plenty of rebounds available. A typical team only grabs 70%-75% of the available defensive rebounds and 25%-30% of the available offensive rebounds. Having more good rebounders mostly cuts into opponents' rebounds. Also, the null position on this question would be "the diminishing effects on defensive rebounding are not large", and so it would be up to you (or someone else) to prove otherwise.

    The other point - that shot creation is "more" valuable - again, I think you have a point. Arturo recently commented on the fact that players who are surrounded by good players often see their productivity increase, and I'm sure at some point in the future he'll do a post on that. But I wouldn't say it's limited to shot creation.

    jbrett: you had me there for half a second. :)

  6. Thanks for the reply.

    Why is the null position that diminishing returns are not large? Conventional NBA wisdom seems to say the opposite, and it's implicit in the way position roles are often handled. Not to mention, I think that lack is even more telling when you compare defensive rebounds to offensive rebounds, since teammates are more likely to grab a missed defensive rebound than a missed o. rebound (meaning o. rebs are more valuable).

    In any case, the only real study of that I have seen is this: so if there's anything that refutes that study specifically, I'd be interested.

    Good to know re: Arturo's line of thinking, and it's definitely not limited to shot creation, I agree.

    Good stuff all around, many thanks.

  7. I may have a better way of thinking about clock management. Tempo either helps you score points and helps your opponent score points (fast tempo) or prevents you from scoring points and prevents your opponents from scoring points (holding the ball/slow tempo).

  8. Austin: The null position is usually the position opposite to that which you are trying to prove. If you are trying to prove that there are diminishing returns with rebounding, then your null hypothesis should be that there are no diminishing returns. It's a statistical thing...usually it's also assumed that there are no relationships between variables until there is evidence to the contrary (which is why you'll find a lot of non-religious statisticians out there).

    I looked at that link you sent - interesting, but not overly convincing, given all of the admitted weaknesses of the analysis. I can't think of a good way of testing diminishing returns...I guess that's why I'm not a statistician. But there's one thing I do know: WP predicts wins with 95% accuracy. That means that the absolute largest the effect of diminishing returns can be is only 5%, because WP doesn't include a "diminishing rebound correction".

    Fred: now you're getting somewhere! Both of those actions would do one positive thing and one negative thing - you just proved that tempo is mostly neutral.

  9. I understand the statistical concept of the null hypothesis...I just feel (even as a 'stats guy' among basketball fans) that the onus is on statisticians to disprove widely held beliefs among 'basketball people'. *shrug* not a big deal.

    And yeah, wins produced does predict wins 95% accurately. BUT that's at the team level. At the team level there are no diminishing returns for defensive rebounds. And most team builders take into account the diminishing returns at the individual level, so you rarely have a lineup with a predicted 90% d.reb rate. But it matters a lot in the evaluation of individual players, which is what WP48 is commonly used for.

    The team-level correlations don't always generalize perfectly to individuals, and I think that's the exact problem with defensive rebounding. It's the main weakness of WP48 on an individual level, and something that could be improved much more easily than, say, a defensive measure. So I'd like to see someone give it a shot, personally. That's all.

  10. Let me make sure I'm understanding you correctly: you assume that diminishing returns for defensive rebounds exist. Diminishing returns for defensive rebounds means that when two good rebounders are paired together, their defensive rebounding will suffer somewhat. Because of this, if you look at the WP48 of two good rebounders who are not on the same team, the WP48s would overstate how productive they would be on the same team? Meaning...WP48 isn't good at predicting future defensive rebounding performance?

    Am I close or way off?

  11. Yes, that's about what I was thinking.

    Berri has said that he has evidence that the diminishing returns are small - that's the kind of evidence I'd like to see.

  12. I have two responses then: first is that WP48 is relatively stable from year to year. If WP overvalued rebounding, we'd see a lot of players who have switched teams during the off-season show a more drastic change in their WP48s, and I don't think we see that (although I could be wrong, because I haven't checked it out). One anecdote that contradicts your position is (because I just did a post on him) Reggie Evans. In the 06-07 season Evans played with Nene and Camby and had a rebounding rate of 23%. The next season, on the 76ers - with really only Dalembert to steal rebounds from him - he posted a rebound rate of 19.3%. The following year in PHI he was also at 19.0% and last season in Toronto - a notoriously poor rebounding team - he managed 19.9%.
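
To make the Evans numbers easier to compare, here is a quick sketch of the season-to-season changes. It uses only the four rebound rates quoted above; the labels and the `season_changes` helper are just for illustration, and four data points from one player are an anecdote, not a test.

```python
# Rebound rates (%) as quoted in the comment above, in chronological order.
rebound_rates = [
    ("06-07, with Nene and Camby", 23.0),
    ("next season, 76ers", 19.3),
    ("following year, PHI", 19.0),
    ("last season, Toronto", 19.9),
]


def season_changes(rates):
    """Change in rebound rate (percentage points) from each season to the next."""
    return [
        round(current - previous, 1)
        for (_, previous), (_, current) in zip(rates, rates[1:])
    ]


changes = season_changes(rebound_rates)
for (label, _), change in zip(rebound_rates[1:], changes):
    print(f"{label}: {change:+.1f} percentage points")
```

His highest rate in this stretch came in the season he shared the floor with two other strong rebounders.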

    Second is that WP is not a way of measuring "how good" a player is; it's actually a way of measuring "how well" a player performed during the season(s) in question. It measures a player's contributions to winning. There are some players who have the ability and tools to be very good players - think Allen Iverson, Carmelo Anthony, and others - but, for whatever reason (probably financial incentives) they never actually perform well. Another example - we know that, once upon a time, Raptor favourite Vince Carter had hops. He could've been a really good rebounder, but as he focused more and more on shooting and scoring, his rebounding suffered. Does that mean he was a "bad rebounder"? No, it means that, during games, he didn't rebound all that well.