08 November 2017

Coaches, managers, players, and fans all have good reasons to care about statistics. Gathering the right kind of information about when and how often something happens (a team winning, a player achieving a record, or a group of players adopting a new technique) promises insight about why it happens, and what its downstream effects might be. For a fan, this kind of understanding is valuable in itself; for someone with skin in the game, it can provide the control necessary for building a winning team. But moving from correlation to causation is famously fraught.

In a type of statistical puzzle known as Simpson’s Paradox, the correlations themselves look contradictory. The popular mathematician Martin Gardner describes it this way: “the data will confirm each of two hypotheses, but disconfirm the two taken together.” Basketball examples provide a way to make this abstract idea more concrete.

Consider field goals, which can be divided into two-point field goals and three-point field goals. To determine a player’s talent for scoring field goals, we might consider what percentage of their attempts are successful. According to the WNBA statistics for the 2017 regular season, Sue Bird outperformed Sancho Lyttle in three-point field goals, succeeding at 39.3% of her attempts compared to Lyttle’s 25.0%. We can calculate that Bird also outperformed Lyttle in two-point field goals, 46.8% to 44.4%. But Lyttle outperformed Bird in field goals overall, 43.5% to 42.7%.

In other words, Lyttle appears to be worse than Bird at scoring two-point field goals, and worse than Bird at scoring three-point field goals, but somehow better at scoring field goals overall. What should we make of this confusing data?

The solution is that three-point field goals are harder to make than two-point field goals, and Bird attempted a greater percentage of her shots from behind the three-point line. If you had the two players shoot from the same place on the court, you would expect Bird to succeed more often. But since they don’t shoot from the same place on the court, and Lyttle takes easier shots a greater percentage of the time, she ends up succeeding at more of her attempts.
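The arithmetic behind this can be sketched in a few lines. A player's overall percentage is a weighted average of her two-point and three-point percentages, weighted by how often she takes each kind of shot. The sketch below uses only the percentages quoted above; the function names are my own, and because the published percentages are rounded, the implied shot mixes are approximate.

```python
def implied_three_point_share(two_pct, three_pct, overall_pct):
    """Solve overall = (1 - s) * two_pct + s * three_pct for s,
    the fraction of field goal attempts taken from three-point range."""
    return (two_pct - overall_pct) / (two_pct - three_pct)


def blended_pct(two_pct, three_pct, three_share):
    """Overall success percentage for a given mix of twos and threes."""
    return (1 - three_share) * two_pct + three_share * three_pct


# Percentages quoted in the text (2017 regular season).
players = {
    "Bird": {"two": 46.8, "three": 39.3, "overall": 42.7},
    "Lyttle": {"two": 44.4, "three": 25.0, "overall": 43.5},
}

shares = {
    name: implied_three_point_share(p["two"], p["three"], p["overall"])
    for name, p in players.items()
}

for name, share in shares.items():
    print(f"{name}: about {share:.0%} of attempts were three-pointers")
# Bird: about 55% of attempts were three-pointers
# Lyttle: about 5% of attempts were three-pointers

# Counterfactual: Bird's own percentages, but with Lyttle's shot mix.
bird_with_lyttle_mix = blended_pct(
    players["Bird"]["two"], players["Bird"]["three"], shares["Lyttle"]
)
print(f"Bird with Lyttle's shot mix: about {bird_with_lyttle_mix:.1f}%")
# Bird with Lyttle's shot mix: about 46.5%
```

Under these assumptions, giving Bird the same shot mix as Lyttle puts her ahead overall, which is what we would expect from a player who is better from both ranges.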

Another example of Simpson’s Paradox involves the “hot hand” phenomenon. Fans have long held that players experience runs of success or failure. If a player succeeds at sinking a free throw, the theory goes, they’re having a successful streak, which makes them more likely to sink the next free throw. In 1985, the psychologists Gilovich, Vallone, and Tversky published a statistical argument that the hot hand is a myth. They considered free throw data for nine major players for the Philadelphia 76ers in the 1980-81 season, and showed that eight of them were slightly less likely to succeed after a run of successes than after a run of failures.

Statistician Robert Wardrop suggested that Simpson’s Paradox might explain why the “hot hand” phenomenon looks real, even if it’s not. Players vary in their shooting ability. If a player sinks a free throw, that player is more likely to have the skill to sink their next free throw; if a player misses, they are more likely to be a less skilled player who misses their next shot. But holding the player fixed, a successful free throw is no more likely given a previous success than it is given a previous failure.
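To see how pooling can manufacture an apparent streak, here is a minimal sketch with two hypothetical players. The hit probabilities (0.9 and 0.5) are invented for illustration and are not Wardrop's data; the sketch also assumes every shot is independent and that each player takes the same number of two-shot trips to the line.

```python
def pooled_conditionals(hit_probs):
    """Given each player's fixed free throw probability, return the pooled
    (P(second hit | first hit), P(second hit | first miss)).

    Assumes independent shots and an equal number of two-shot trips
    per player, so each player gets equal weight in the pool.
    """
    n = len(hit_probs)
    p_first_hit = sum(hit_probs) / n
    p_hit_then_hit = sum(p * p for p in hit_probs) / n
    p_miss_then_hit = sum((1 - p) * p for p in hit_probs) / n
    after_hit = p_hit_then_hit / p_first_hit
    after_miss = p_miss_then_hit / (1 - p_first_hit)
    return after_hit, after_miss


# Two hypothetical players, one strong shooter and one average shooter.
after_hit, after_miss = pooled_conditionals([0.9, 0.5])
print(f"Pooled: {after_hit:.3f} after a hit, {after_miss:.3f} after a miss")

# Holding the player fixed, the previous shot makes no difference.
solo_hit, solo_miss = pooled_conditionals([0.9])
print(f"One player: {solo_hit:.3f} after a hit, {solo_miss:.3f} after a miss")
```

In the pooled numbers a hit looks more likely after a hit than after a miss, even though neither hypothetical player has any streakiness at all. The previous shot is only acting as evidence about which player is shooting.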

(In a more recent development in the “hot hand” story, economists Miller and Sanjurjo argue that there is a flaw in Gilovich et al.’s 1985 study: even if there is no relationship between success on one shot and success on the next, selecting only the free throws that happen after a run of successes biases the sample toward free throws that miss. But that problem seems to be independent of Simpson’s Paradox.)
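The selection effect is easy to check by brute force in a simplified setting. The sketch below is my own illustration rather than Miller and Sanjurjo's analysis: it assumes a hypothetical shooter who makes each shot with probability 1/2, treats a single hit as a "run", and enumerates every equally likely sequence of shots. For each sequence it computes the proportion of hits among shots that immediately follow a hit, then averages those proportions across sequences.

```python
from fractions import Fraction
from itertools import product


def expected_hit_rate_after_hit(n_shots):
    """Average, over all equally likely sequences of n_shots 50/50 shots,
    of the within-sequence proportion of hits among shots that immediately
    follow a hit. Sequences with no shot following a hit are skipped,
    since the proportion is undefined for them.
    """
    proportions = []
    for seq in product((0, 1), repeat=n_shots):
        followers = [seq[i + 1] for i in range(n_shots - 1) if seq[i] == 1]
        if followers:
            proportions.append(Fraction(sum(followers), len(followers)))
    return sum(proportions, Fraction(0)) / len(proportions)


print(expected_hit_rate_after_hit(3))  # 5/12
```

With three shots the average comes out to 5/12 rather than 1/2, even though every shot is a fair coin flip. A study that compares per-player proportions like these against 50% can therefore mistake pure chance for a "cold hand".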

Is the moral just that correlation is not causation? I think there’s more to it than that. Looking at correlations between variables can tell a causal story, but that story may not be a simple one. Our WNBA example showed that a player’s success at field goals depends on at least two factors: where on the court she shoots from, and her aim on those shots. Computer scientists have developed a more general theory about how to uncover causal information based on correlations, which handles cases like our basketball examples. Maybe all that data can tell us something useful, not just about who wins basketball games, but about why.

(Thanks to Reuben Stern for sports discussion, and to Jeremy Lizakowski for teaching me how to use Ruby to search WNBA data tables.)
