MLB Playoff Teams’ WAR Proportion per Year

How do the best teams blend types of play? Do some teams invest most in offense, while others invest in pitching? What about defense? In this work we decompose Wins Above Replacement (WAR) into Defensive WAR, Offensive WAR, and Pitching WAR for the 2008-2018 seasons. The interactive dashboard below shows the proportions of team WAR for the 2015 playoff teams. The figure reveals a considerable amount of variation in how these playoff teams accumulated WAR: St. Louis achieved 0.37 of its total WAR through offense, while Toronto achieved 0.66, nearly double. Use the bottom slider to change the season.
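The proportions discussed above are each component's share of a team's total WAR. The following is a minimal sketch of that arithmetic; the function name and the input numbers are made up for illustration and are not taken from the dashboard's data.

```python
def war_shares(offense, defense, pitching):
    """Return each component's share of total team WAR.

    Assumes total WAR is simply the sum of the three components
    (an assumption of this sketch, not a statement about the dashboard).
    """
    total = offense + defense + pitching
    if total == 0:
        raise ValueError("total WAR is zero; shares are undefined")
    return {
        "offense": offense / total,
        "defense": defense / total,
        "pitching": pitching / total,
    }


# Hypothetical team: 20 offensive, 5 defensive, 15 pitching WAR.
shares = war_shares(20.0, 5.0, 15.0)
print(shares)
```

If a component is negative, the shares no longer behave like parts of a whole, so a real analysis would need to decide how to handle that case.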

The Moneyball Effect: Incentives and Output in the NBA Three Point Shootout

What happens when stakes increase during competition? The NBA Three Point Shootout offers a perfect arena for studying this problem. The goal of each contestant is to maximize total points scored in each round, given 20 shots worth 1 point each and 5 shots worth 2 points each. The shots worth 2 points are called moneyballs. Traditional economic theory suggests that players would expend relatively more time and effort on the moneyballs and hence perform better on moneyball shots than on regular shots. However, our findings suggest the opposite: players fared relatively worse on moneyballs. The following shot chart, which we created in Tableau, shows the probability of making the moneyball (diamond) versus the single-point shot.

The paper can be downloaded here: http://sportdataviz.com/wp-content/uploads/2019/06/MoneyballNBA.pdf

J. Potter, J. Ehrlich, S. Sanders, “The Moneyball Effect: Incentives and Output in the NBA Three Point Shootout,” Southern Business & Economic Journal, May 2019
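To make the round scoring concrete (20 regular shots worth 1 point and 5 moneyballs worth 2 points, for a maximum of 30), here is a small sketch of the expected points in a round. The make probabilities passed in are hypothetical and are not estimates from the paper.

```python
def expected_round_points(p_regular, p_moneyball, n_regular=20, n_moneyball=5):
    """Expected points in one round of the format described above.

    p_regular and p_moneyball are make probabilities; the values used
    below are illustrative assumptions only.
    """
    regular_points = n_regular * 1 * p_regular
    moneyball_points = n_moneyball * 2 * p_moneyball
    return regular_points + moneyball_points


# A perfect round scores the maximum of 30 points.
print(expected_round_points(1.0, 1.0))
# Hypothetical shooter who is slightly worse on moneyballs.
print(expected_round_points(0.5, 0.4))
```

Because moneyballs account for 10 of the 30 available points, even a small drop in moneyball accuracy costs noticeably more than the same drop on a regular shot.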

Basketball Player Movement Between International Leagues and the NBA

This directed graph, created with R and igraph, shows how professional basketball players transition between leagues around the world. The size of each edge is based on the number of players who transitioned between the two nodes (leagues) it connects.

Pro Basketball Players Transitioning Around the World

The NBA G League is also an important feeder for international leagues. Below is a directed graph including the G League:

Pro Basketball Players Transitioning Around the World

The following heat map shows the same information in matrix form. It’s a little easier to compare the number of transitions on any two edges, but it is less useful for understanding the network formed by the transitions.
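The original graphs were built with R and igraph; the sketch below only illustrates, in Python, how the same edge weights and the matrix behind a heat map could be derived from a list of player moves. The moves listed are invented for the example, and the function names are hypothetical.

```python
from collections import Counter


def transition_counts(moves):
    """Count players moving along each directed (from_league, to_league) edge."""
    return Counter(moves)


def transition_matrix(counts, leagues):
    """Lay the edge counts out as rows (origin) by columns (destination)."""
    return [[counts.get((src, dst), 0) for dst in leagues] for src in leagues]


# Invented moves, one tuple per player transition.
moves = [
    ("NBA", "G League"),
    ("G League", "NBA"),
    ("G League", "NBA"),
    ("G League", "EuroLeague"),
]
leagues = ["NBA", "G League", "EuroLeague"]

counts = transition_counts(moves)
for league, row in zip(leagues, transition_matrix(counts, leagues)):
    print(league, row)
```

The counts serve as edge weights in the directed graph, and the nested list is the same information arranged for a heat map.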

These graphs were created for a talk that Shane Sanders and I conducted at the Midwest Sports Analytics Meeting 2018.

  • S. Sanders and J. Ehrlich, “Around the World: Rating Men’s Professional Basketball Player and League-Quality by Estimating Player Win-Value Changes across Leagues,” presented at the Midwest Sports Analytics Meeting, Central College, Pella, IA, 17-Nov-2018.

How soon do you know when an NBA game is done?

Using play-by-play data from every game of every season from 2004-05 to 2015-16, I determined, for every second of each game, whether the team leading at that moment went on to win. I then averaged those results at each second across all games (regular season and postseason) to produce the following visualization:

I then averaged each minute to come up with this visualization:

The data can be downloaded here.
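The per-second and per-minute averages described above can be sketched as follows. This is a simplified illustration, not the original analysis code: each game is assumed to be a list of score margins (home minus away) sampled once per second, the sign of the last margin decides the winner, and tied seconds are skipped because there is no current leader. All of those choices are assumptions of the sketch.

```python
def leader_hold_rate(games):
    """For each second, the share of games in which the current leader won."""
    n_seconds = min(len(margins) for margins in games)
    rates = []
    for s in range(n_seconds):
        held = total = 0
        for margins in games:
            if margins[s] == 0:
                continue  # tied at this second: no leader (assumption: skip)
            total += 1
            # True when the leader at second s matches the final winner.
            held += (margins[s] > 0) == (margins[-1] > 0)
        rates.append(held / total if total else None)
    return rates


def chunk_means(values, size=60):
    """Average consecutive chunks (60 seconds per minute), ignoring None."""
    means = []
    for start in range(0, len(values), size):
        chunk = [v for v in values[start:start + size] if v is not None]
        means.append(sum(chunk) / len(chunk) if chunk else None)
    return means


# Three tiny made-up "games", three seconds long each.
games = [[2, 2, 5], [-1, 3, 4], [0, -2, -3]]
per_second = leader_hold_rate(games)
print(per_second)
print(chunk_means(per_second, size=2))
```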

NBA Shooting while Winning versus Losing

It has been hypothesized that younger players may shoot threes better when their team is winning [1]. To see if this is true at the NBA level, I analyzed every shot of every game from the 2004-05 to the 2015-16 seasons. I found no statistically significant difference in three-point shooting between losing and winning. The same was true of free throws, but I did find a difference (p=0.037) for two-point shots. Interestingly, the average player shoots 46.28% on twos when the team is losing, but only 45.78% when the team is winning. However, the effect size is d=0.03, which is small.

Here is a density map of the three-point shots, along with the analysis:

t = -0.80358, df = 11738, p-value = 0.4217
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.009613867 0.004023235
sample estimates:
mean of losing mean of winning
0.2918484 0.2946437

Here is a density map of the two-point shots, along with the analysis:

t = 2.0842, df = 15112, p-value = 0.03716
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.0002954158 0.0096292267
sample estimates:
mean of losing mean of winning
0.4627997 0.4578374

Here is a density map of the free throws, along with the analysis:

t = 0.38493, df = 13994, p-value = 0.7003
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.004826125 0.007184844
sample estimates:
mean of losing mean of winning
0.7277869 0.7266075
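The outputs above have the shape of a two-sample t-test, reported alongside an effect size. The sketch below shows one way to compute such a t statistic and Cohen's d using only the Python standard library. It assumes the unequal-variance (Welch) form of the test and a pooled-standard-deviation d, which may differ from the settings used in the original analysis, and the shooting percentages are made up.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Made-up shooting percentages while losing and while winning.
losing = [0.5, 0.4, 0.6, 0.5]
winning = [0.4, 0.3, 0.5, 0.4]

t, df = welch_t(losing, winning)
print(t, df, cohens_d(losing, winning))
```

With samples as large as those reported above, a very small difference in means can reach p<0.05 while d stays near zero, which is why the effect size is worth reporting next to the p-value.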

The raw data can be downloaded here.

[1] A. Glockner, Chasing perfection: a behind-the-scenes look at the high-stakes game of creating an NBA champion, First Da Capo Press edition. Boston, MA: Da Capo Press, 2016, p. 85.

Height vs Weight of NFL Pro Players Scatterplot

The following visualization was created using data from pro-football-reference.com. The chart includes every NFL player who has passed away. You can easily see the correlation between a large weight-to-height ratio and an early death. Of course, correlation does not imply causation, and we’ll be exploring other possible causes in later posts.

2014-2015 NBA Regular Season Team-Level Data Parallel Coordinate System

The following D3 visualization demonstrates a parallel coordinate system, which is useful for high-dimensional data. Try dragging and dropping the axes to rearrange the visualization.

The following paper provides an introduction to parallel coordinate systems:
Inselberg, A. (1997), “Multidimensional detective”, Information Visualization, 1997. Proceedings., IEEE Symposium on, pp. 100–107

Cross Country Rank-Sum Cycle Visualization

The following visualization demonstrates both the independence and intransitivity violations that can occur in rank-sum scoring, which is used at the NCAA cross country tournament. Each bar can be clicked to remove the team from, or add it to, the tournament. When the independence option is displayed (the default), notice that team Y wins with team Z in the field, while team X wins without team Z. When the intransitivity label is pressed, team X beats team Y, team Y beats team Z, and team Z beats team X, causing a cycle when the teams are considered in pairs. This was featured in Justin Ehrlich’s talk at the 2016 Midwest Sports Analytics Meeting in Pella, Iowa:

• J. Boudreau, J.A. Ehrlich, S. Sanders, and A. Winn “Social Choice Violations in Rank Sum Scoring: A Formalization of Conditions and Corrective Probability Computations,” Mathematical Social Sciences, vol. 71, 2014, pp. 20-29; DOI 10.1016/j.mathsocsci.2014.03.004

• J. A. Ehrlich, S. Sanders, and J. Boudreau “Rank Sum Aggregation: Origin, Applications, and Social Choice Characteristics” presented at the 2015 Illinois Economic Association conference, Chicago, IL, 2015.

• J. Ehrlich, “Computational/Visual Analysis of How the Scoring Method for Cross-Country Running Competitions Violates Major Social Choice Principles,” presented at the Midwest Sports Analytics Meeting, Central College, Pella, IA, 19-Nov-2016.
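Both violations described in this post can be reproduced with a few lines of code. The sketch below is a simplified illustration and not the code behind the visualization: it assumes three runners per team with every runner counting, and the two finish orders are constructed examples rather than real race results. A team's score is the sum of its runners' places after removing runners from teams not being scored, and the lowest score wins.

```python
def rank_sum_scores(finish_order, teams):
    """Score only the listed teams: drop other runners, re-rank, sum places."""
    kept = [team for team in finish_order if team in teams]
    scores = {team: 0 for team in teams}
    for place, team in enumerate(kept, start=1):
        scores[team] += place
    return scores


def winner(finish_order, teams):
    """Team with the lowest rank sum (ties are not handled in this sketch)."""
    scores = rank_sum_scores(finish_order, teams)
    return min(scores, key=scores.get)


# Independence violation: removing team Z flips the result between X and Y.
independence = list("XYXYYZZZX")
print(winner(independence, ["X", "Y", "Z"]))  # Y wins with Z in the field
print(winner(independence, ["X", "Y"]))       # X wins once Z is removed

# Intransitivity: pairwise results form a cycle.
cycle = list("YZXZXYXYZ")
print(winner(cycle, ["X", "Y"]))  # X beats Y
print(winner(cycle, ["Y", "Z"]))  # Y beats Z
print(winner(cycle, ["Z", "X"]))  # Z beats X
```

In the first finish order, team Z's runners all finish ahead of X's last runner but behind all of Y's, so Z's presence adds three places to X's score and none to Y's.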