Chemical Engineering in My Hockey Stats? It’s More Likely than You Think!

(Photo Credit: Getty Images)

This will get to hockey, I promise.

Remember high school chemistry? The Ideal Gas Law, PV = nRT (piv-nert)? Between the 1660s and the early 1800s, Boyle, Charles and Avogadro each observed gases under laboratory conditions and figured out that pressure (P), volume (V), temperature (T) and the amount of gas in the system (n) all relate to each other. Their findings were eventually combined into piv-nert.

It’s called the ideal gas law because it only really applies when the gas is in ideal conditions. It soon proved (ahem) less than ideal; it was a good start but didn’t do a great job of describing gases at higher pressures.

So another scientist, Johannes van der Waals, examined gases some more and came up with the van der Waals equation (P +a*n^2/V^2…I’m losing you here. We’ll cut out the math). It was better, but still not good enough at really high pressures.

So Real Gas Laws continue to get refined, and if humans stumble across a really important gas, a study will get funded so all the thermodynamic behaviors are known.

Circling back to hockey, we’re less than 10 years into the analytics movement. This brief and scattered chemical history lesson spans over 200 years. Let’s acknowledge the limitations of what box scores and play-by-play transcripts can tell us.

My specific gripe is with zone starts. Well, not zone starts themselves, but how they are used as a quality of competition correction in the shot attempt stats. Clearly, some correction needs to take place. But zone starts have a disproportionate effect on the Corsi adjustments, in my opinion.

Zone starts are typically broken down into offensive and defensive percentages (omitting the neutral zone). Here’s a link to Blues defensemen that has the raw numbers, but it also omits neutral zone starts. I’d really like to compare total zone starts per minute played, but we’ll work with what we have.

Player OZ-Starts DZ-Starts TOI(min) OZ-Starts/min DZ-Starts/min
Bortuzzo 284 266 1,066 0.27 0.25
Bouwmeester 161 198 705 0.23 0.28
Dunn 379 267 1,292 0.29 0.21
Edmundson 407 409 1,431 0.28 0.29
Gunnarsson 258 355 1,018 0.25 0.35
Parayko 442 477 1,834 0.24 0.26
Pietrangelo 469 525 2,007 0.23 0.26
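
For anyone who wants to check the per-minute columns, here is a minimal sketch of the arithmetic. It assumes the first two numbers in each row are offensive- and defensive-zone starts (in that order) and the third is total minutes played; the names `ZONE_STARTS` and `per_minute_rates` are made up for this illustration.

```python
# Raw numbers copied from the table above: (first count, second count, total minutes).
# Assumption: the counts are offensive- and defensive-zone starts, in that order.
ZONE_STARTS = {
    "Bortuzzo": (284, 266, 1066),
    "Bouwmeester": (161, 198, 705),
    "Dunn": (379, 267, 1292),
    "Edmundson": (407, 409, 1431),
    "Gunnarsson": (258, 355, 1018),
    "Parayko": (442, 477, 1834),
    "Pietrangelo": (469, 525, 2007),
}


def per_minute_rates(oz_starts, dz_starts, minutes):
    """Return (offensive, defensive) zone starts per minute, rounded to 2 places."""
    return round(oz_starts / minutes, 2), round(dz_starts / minutes, 2)


for player, (oz_starts, dz_starts, minutes) in ZONE_STARTS.items():
    oz_rate, dz_rate = per_minute_rates(oz_starts, dz_starts, minutes)
    print(f"{player:<12} {oz_rate:.2f} {dz_rate:.2f}")
```

Each rate is just starts divided by minutes, so the last two columns add no information beyond the raw counts.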

Here is the same group of defensemen’s ice time per game:

Player TOI/G
Bortuzzo 14:48
Bouwmeester 20:08
Dunn 17:13
Edmundson 20:44
Gunnarsson 16:09
Parayko 22:36
Pietrangelo 25:44

If TOI is an indication of what Mike Yeo thought of his defensemen’s overall game, and if zone starts are his situational opinion, why does Alexander Pietrangelo have the lowest combined zone start rate (0.49) yet the most ice time? If Yeo thought so highly of Carl Gunnarsson defensively, why did he only play 16 minutes per game and get scratched a handful of times? These are questions without good answers: one season isn’t a large sample size, I question Yeo’s judgment on his players, and I don’t think hockey analytics bloggers have given adequate thought to zone starts.
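
The combined rate is simply offensive plus defensive zone starts, divided by minutes played. A rough sketch of that comparison against ice time per game follows; the column meanings are inferred from the numbers above, and `DEFENSEMEN`, `combined_rate` and `toi_minutes` are illustrative names, not anything from a stats site.

```python
# (offensive-zone starts, defensive-zone starts, total minutes, TOI per game)
# Assumption: column meanings are inferred from the post's tables.
DEFENSEMEN = {
    "Bortuzzo": (284, 266, 1066, "14:48"),
    "Bouwmeester": (161, 198, 705, "20:08"),
    "Dunn": (379, 267, 1292, "17:13"),
    "Edmundson": (407, 409, 1431, "20:44"),
    "Gunnarsson": (258, 355, 1018, "16:09"),
    "Parayko": (442, 477, 1834, "22:36"),
    "Pietrangelo": (469, 525, 2007, "25:44"),
}


def combined_rate(oz_starts, dz_starts, minutes):
    """Offensive plus defensive zone starts per minute played (unrounded)."""
    return (oz_starts + dz_starts) / minutes


def toi_minutes(mmss):
    """Convert a 'MM:SS' ice-time string to minutes as a float."""
    minutes, seconds = mmss.split(":")
    return int(minutes) + int(seconds) / 60


# Sort from lowest to highest combined zone start rate.
ranked = sorted(DEFENSEMEN.items(), key=lambda item: combined_rate(*item[1][:3]))

for player, (oz_starts, dz_starts, minutes, toi) in ranked:
    rate = combined_rate(oz_starts, dz_starts, minutes)
    print(f"{player:<12} {rate:.3f} starts/min  {toi_minutes(toi):5.2f} min/game")
```

Working from the raw counts, Pietrangelo comes out at about 0.495 rather than 0.49 (the 0.49 is the sum of the two already-rounded rates). He is still lowest, but only narrowly ahead of Dunn (0.500) and Parayko (0.501), while Gunnarsson sits highest at about 0.602.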

I get the appeal. Zone starts are a stat that is easy to calculate (was the guy on the ice when the puck dropped or not?), has a big sample size (500 statistical events in a season is a big sample size) and also gives some insight into a coach’s thoughts. I just feel that Zone Start % is a one-variable correction to a statistical problem as complex as thermodynamics.

We’re stuck on the Ideal Gas Law when the van der Waals equation will also prove inadequate.
