How soon do you know when an NBA game is done?

Using play-by-play data from every game of every season from 2004-05 through 2015-16, I determined, for every second of game time, whether the team leading at that moment went on to win the game. I then averaged those results across every game (regular season and postseason) to come up with the following visualization:

I then averaged each minute to come up with this visualization:

The data can be downloaded here.
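For readers who want to try something similar, here is a minimal Python sketch of the calculation described above. It is not the code behind the charts. The input layout (a list of score-change events per game plus the eventual winner), the function names, and the decisions to skip tied moments and ignore overtime are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict

REGULATION_SECONDS = 48 * 60  # four 12-minute quarters; overtime is ignored in this sketch


def leader_at(times, events, second):
    """Return 'home', 'away', or None (tied) at the given elapsed second."""
    i = bisect_right(times, second) - 1  # last score change at or before `second`
    if i < 0:
        return None  # nothing scored yet, so the game is tied 0-0
    _, home, away = events[i]
    if home == away:
        return None
    return "home" if home > away else "away"


def leader_win_rate_by_second(games):
    """games: list of (events, winner); events are sorted (second, home_pts, away_pts)."""
    led = defaultdict(int)  # games with a leader at each second
    won = defaultdict(int)  # games where that leader went on to win
    for events, winner in games:
        times = [t for t, _, _ in events]
        for second in range(REGULATION_SECONDS):
            leader = leader_at(times, events, second)
            if leader is None:
                continue  # assumption: tied moments are left out of the average
            led[second] += 1
            won[second] += leader == winner
    return {s: won[s] / led[s] for s in led}


def average_by_minute(rates):
    """Collapse the per-second rates into one mean per game minute."""
    buckets = defaultdict(list)
    for second, rate in rates.items():
        buckets[second // 60].append(rate)
    return {m: sum(v) / len(v) for m, v in sorted(buckets.items())}


# Two made-up games, for illustration only.
toy_games = [
    ([(10, 2, 0), (100, 2, 3)], "away"),
    ([(20, 0, 3)], "away"),
]
per_second = leader_win_rate_by_second(toy_games)
per_minute = average_by_minute(per_second)
```

Skipping tied moments is only one possible choice. Counting a tie as half a win for each side would shift the early-game numbers, so the curve depends on that decision.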

NBA Shooting while Winning versus Losing

It has been hypothesized that younger players may shoot threes better when their team is winning [1]. To see if this holds at the NBA level, I analyzed every shot of every game from the 2004-05 through the 2015-16 seasons. I found no statistically significant difference in three-point shooting between losing and winning. The same was true of free throws, but I did find a significant difference (p=0.037) for two-point shots. Interestingly, the average player shoots 46.28% on twos when his team is losing, but only 45.78% when it is winning. However, the effect size is d=0.03, which is small.
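To make the comparison concrete, here is a minimal Python sketch of a two-sample t statistic and an effect size. It rests on several assumptions: the inputs are per-player shooting percentages in each situation, the test is the pooled-variance (Student) version, d is Cohen's d computed with the pooled standard deviation, and the p-value uses a normal approximation that is only reasonable when the degrees of freedom are large. The post does not say which variant was actually used, and the function names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def compare_groups(losing, winning):
    """Return (t, degrees of freedom, Cohen's d) for two independent samples."""
    n1, n2 = len(losing), len(winning)
    diff = mean(losing) - mean(winning)
    # Pooled variance: assumes both groups share a common variance.
    pooled_var = ((n1 - 1) * variance(losing) + (n2 - 1) * variance(winning)) / (n1 + n2 - 2)
    pooled_sd = sqrt(pooled_var)
    t = diff / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    d = diff / pooled_sd  # effect size in units of the pooled standard deviation
    return t, n1 + n2 - 2, d


def approx_two_sided_p(t):
    """Two-sided p-value from the standard normal; a large-sample approximation only."""
    return 2 * (1 - NormalDist().cdf(abs(t)))
```

With thousands of degrees of freedom the t distribution is nearly normal, so passing the two-point statistic reported below (t = 2.0842) to `approx_two_sided_p` gives a value of roughly 0.037, in line with the reported p-value.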

Here is a density map of the three-point shots, along with the analysis:

t = -0.80358, df = 11738, p-value = 0.4217
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.009613867 0.004023235
sample estimates:
mean of losing mean of winning
0.2918484 0.2946437

Here is a density map of the two-point shots, along with the analysis:

t = 2.0842, df = 15112, p-value = 0.03716
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.0002954158 0.0096292267
sample estimates:
mean of losing mean of winning
0.4627997 0.4578374

Here is a density map of the free throws, along with the analysis:

t = 0.38493, df = 13994, p-value = 0.7003
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.004826125 0.007184844
sample estimates:
mean of losing mean of winning
0.7277869 0.7266075

The raw data can be downloaded here.

[1] A. Glockner, Chasing Perfection: A Behind-the-Scenes Look at the High-Stakes Game of Creating an NBA Champion, first Da Capo Press edition. Boston, MA: Da Capo Press, 2016, p. 85.

Height vs Weight of NFL Pro Players Scatterplot

The following visualization was created using data from This chart includes information on every NFL player who has passed away. It shows a correlation between a high weight-to-height ratio and an early death. Of course, correlation does not imply causation, and we’ll be exploring other possible causes in later posts.

2014-2015 NBA Regular Season Team-Level Data Parallel Coordinate System

The following D3 visualization demonstrates a parallel coordinate system, which is useful for high-dimensional data. Try dragging and dropping the axes to rearrange the visualization.

The following paper provides an introduction to parallel coordinate systems:
A. Inselberg, “Multidimensional Detective,” in Proceedings of the IEEE Symposium on Information Visualization, 1997, pp. 100–107.
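The core idea of a parallel coordinate plot is easy to sketch outside D3. Each dimension gets its own vertical axis, each value is scaled to that axis's range, and each record becomes a polyline connecting its scaled values. The Python below is an illustrative sketch only, not the code behind the visualization above. The function name, the stat names, and the numbers are made up.

```python
def to_parallel_coordinates(rows, axes):
    """Turn each record into a polyline of (axis_index, scaled_value) points.

    Every axis is min-max scaled to [0, 1] independently, so dimensions with
    very different units can share one chart.
    """
    lo = {a: min(r[a] for r in rows) for a in axes}
    hi = {a: max(r[a] for r in rows) for a in axes}
    lines = []
    for r in rows:
        line = []
        for x, a in enumerate(axes):
            span = hi[a] - lo[a]
            # A constant column has no range; place it mid-axis.
            y = (r[a] - lo[a]) / span if span else 0.5
            line.append((x, y))
        lines.append(line)
    return lines


# Made-up team-level numbers, for illustration only.
teams = [
    {"pts": 100, "reb": 40},
    {"pts": 110, "reb": 50},
    {"pts": 105, "reb": 40},
]
lines = to_parallel_coordinates(teams, ["pts", "reb"])
reordered = to_parallel_coordinates(teams, ["reb", "pts"])
```

Rearranging the axes, as the drag-and-drop interaction does, amounts to changing the order of the `axes` list. The scaled values stay the same; only the order in which they are connected changes.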

Cross Country Rank-Sum Cycle Visualization

The following visualization demonstrates both the independence and intransitivity violations that can occur in rank-sum scoring, which is used at the NCAA cross country tournament. Click a bar to remove that team from the tournament or add it back. In the independence view (the default), team Y wins when team Z is included, but team X wins when team Z is removed. In the intransitivity view, team X beats team Y, team Y beats team Z, and team Z beats team X, producing a cycle when the teams are compared in pairs. This was featured in Justin Ehrlich’s talk at the 2016 Midwest Sports Analytics Meeting in Pella, Iowa:

• J. Boudreau, J. A. Ehrlich, S. Sanders, and A. Winn, “Social Choice Violations in Rank Sum Scoring: A Formalization of Conditions and Corrective Probability Computations,” Mathematical Social Sciences, vol. 71, 2014, pp. 20-29; DOI 10.1016/j.mathsocsci.2014.03.004

• J. A. Ehrlich, S. Sanders, and J. Boudreau, “Rank Sum Aggregation: Origin, Applications, and Social Choice Characteristics,” presented at the 2015 Illinois Economic Association conference, Chicago, IL, 2015.

• J. Ehrlich, “Computational/Visual Analysis of How the Scoring Method for Cross-Country Running Competitions Violates Major Social Choice Principles,” presented at the Midwest Sports Analytics Meeting, Central College, Pella, IA, 19-Nov-2016.
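Both violations can be reproduced with a few lines of Python. This is a simplified sketch, not the code behind the visualization. It assumes every runner on a team scores (three per team here, rather than the top five of seven used in real meets), that places are re-ranked after a team is removed, and that the lowest total wins. The finishing orders are constructed by hand for illustration.

```python
def rank_sum_scores(finish_order, teams):
    """Sum each team's finishing places, counting only runners from `teams`.

    finish_order lists the team label of each finisher from first to last.
    Runners from excluded teams are skipped, so everyone behind them moves up.
    """
    scores = {t: 0 for t in teams}
    place = 0
    for team in finish_order:
        if team in scores:
            place += 1
            scores[team] += place
    return scores


def winner(finish_order, teams):
    """Lowest rank sum wins (the examples below contain no ties for first)."""
    scores = rank_sum_scores(finish_order, teams)
    return min(scores, key=scores.get)


# Independence violation: removing Z changes who wins between X and Y.
race = ["X", "Y", "X", "Y", "Y", "Z", "Z", "Z", "X"]
with_z = rank_sum_scores(race, ["X", "Y", "Z"])  # X=13, Y=11, Z=21
without_z = rank_sum_scores(race, ["X", "Y"])    # X=10, Y=11

# Intransitivity: every pairwise comparison has a winner, but they form a cycle.
cycle_race = ["X", "Y", "Z", "Y", "Z", "X", "Z", "X", "Y"]
x_vs_y = winner(cycle_race, ["X", "Y"])  # X wins 10 to 11
y_vs_z = winner(cycle_race, ["Y", "Z"])  # Y wins 10 to 11
z_vs_x = winner(cycle_race, ["Z", "X"])  # Z wins 10 to 11
```

In the first race, team X's third runner finishes behind all of team Z, so including Z costs X three places and Y none. That is enough to flip the result between X and Y even though no X or Y runner ran any differently.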