Thursday, July 30, 2009


I'm almost certain that this is Roger Neilson.

Kevin Leonard published this paper in 2004: Critical Success Factors Relating to Healthcare's Adoption of New Technology: A Guide to Increasing the Likelihood of Successful Implementation.

You may or may not care about management technology in health care, but Case Study 4 is hockey related.

It is excerpted below.

Roger was so far ahead of the curve it was frightening, and he had a tremendous impact on NHL coaching that is still felt today. In an age when change and quantitative player evaluation, beyond newspaper stats, was mocked in every sports league except the NFL, Roger was changing the game from the inside. And over a decade later the lessons he learned, from the implementation alone, are being used as a case study for other industries.

The consultants who built the NYR database were hired to do the same for other teams. Tonnes of people can program a hockey database, but few bring Neilson's insight with them (and even if the consultants didn't understand that insight, they had the model). So the ideas spread through the NHL pretty quickly.

In this corner of the internet, I think we're starting to catch up to the hockey stuff Roger was doing 20 years ago. More correctly, once we have fully implemented Dennis' scoring chance data, then we'll be getting close. I would like to think so, anyways.

The last sentence of the case study is interesting, from a hockey point of view. "The most important part was that the new technology not only provided insight into the team performance but also became the only way to ultimately measure the team's success over time". At a point in time when using anything other than wins and losses to assess an NHL team was considered asinine, Roger was at least two steps ahead of the pack.

MacTavish and Keenan would each have inherited this database as Rangers coaches. Much more importantly, they would have inherited access to a way of thinking about the game and evaluating players. Tom Renney too, for that matter. Unless, of course, Glen Sather cleared out all of Neilson's old stuff to make room for a new walk-in humidor.

Monday, July 27, 2009

Well kids, what did we learn today?

Duh, I dunno.

That was usually my answer, but I hope the Oilers can get better ones from Sam Gagner and Andrew Cogliano.

I'd like to look at the 5-on-5 results of these two players over the past two years to see if we can see signs that they're starting to "get it". Actually I think Vic looked at this for Gagner after Year One, but I may have dreamt that. Nevertheless, it's worth looking at now. I don't intend this to be a definitive study but I do have some assumptions and hypotheses that I am working from:

- Assumption: 20 game samples should be large enough to show something, especially if we focus on measures where the numbers are reasonably large like shots and faceoffs. It's still probably not enough but likely the best we can do.

- Hypothesis #1: Both players were really, really bad to begin with but have trended upwards ever since.

- Hypothesis #2: They're nearly capable of treading water at 5-on-5, given the liberal protection typically afforded these types of players. By this I mean favourable starting positions and actively sought easy matchups.

- Hypothesis #3: Gagner is showing better progression than Cogliano at 5-on-5. (I'm just throwing this one out there. To my eye, this is the case but as always the eyes are often deceiving.)

- Important note #1: It might be hard to isolate these two players because they have played together a lot - especially in 07/08 - so the last hypothesis will probably be the most difficult one to address with much certainty. I could attempt to further look at them with and without each other, but I think that will kill the sample sizes. For the time being, I'm willing to just accept this as a limitation.

- Important note #2: I'm going by games played and I'm not looking at icetime. This could be a big deal, but I'm blatantly avoiding it right now.

First things first, let's get an idea of how sheltered these two players were on the Oilers. Probably the best way to look at this is to check out where they have been starting their shifts relative to the team. I have the Oilers' team faceoff differential (offensive zone minus defensive zone) vs. the starting faceoff zone differential for Gagner and Cogliano for the last two seasons in 20-game increments (22 games in the last one):

Ugh, this team sucked in 07/08. But anyway, Gagner and Cogliano have been sheltered quite a bit as Oilers under Craig MacTavish, particularly to start the 07/08 season. The beginning of that year was crazy - the Oilers were pinned in their zone at 5-on-5 and yet MacT managed to get the kids an even shake. That's no small feat. Now the team is predictably still in the red in this department, but better, and the two players in question still enjoyed sheltering to some degree right up to the present.

Just to support this point a little bit, we can look at Gabriel Desjardins' Quality of Competition player rankings within the team. Among forwards in 2007-2008, Gagner was 9th and Cogliano was 12th, but they were very close. Meanwhile, last year they moved up to 8th and 9th respectively. I think we can be reasonably confident that MacTavish ensured that they both started their shifts favourably in terms of zone and opposition.

Now for some results:
At least 2007-2008 fits what we might expect and basically what I swear Vic posted last offseason. They were pretty bad to start 2007-2008 but as we can see by the zone shift chart below (which is really just graph 2 minus graph 1, or Shift End Diff. - Shift Start Diff.) these guys were treading water after their first season. It's also interesting to me that although the team's year-end performance in 2007-2008 was pretty unsustainable, the kids did finish off the year at a pretty decent clip. After looking at a decent underlying trend, can we still say MacT was totally nuts for starting to believe the hype a little bit?
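The Zone Shift arithmetic described above (Shift End Diff. minus Shift Start Diff.) can be sketched in a few lines. The shift counts here are invented for illustration, not actual Oilers data:

```python
# Zone shift = (shift-end zone differential) - (shift-start zone differential),
# where each differential is offensive-zone count minus defensive-zone count.

def zone_diff(off_zone, def_zone):
    """Offensive-zone count minus defensive-zone count."""
    return off_zone - def_zone

def zone_shift(start_off, start_def, end_off, end_def):
    """How much a player pushed the puck forward over his shifts."""
    return zone_diff(end_off, end_def) - zone_diff(start_off, start_def)

# A player who started 40 shifts in the O-zone and 60 in the D-zone,
# but ended 55 in the O-zone and 45 in the D-zone:
print(zone_shift(40, 60, 55, 45))  # (55-45) - (40-60) = 30
```

A positive number means the player's line is moving play toward the other team's net, regardless of where the coach started them.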

Gagner looks pretty reasonable after the first quarter of 2008-2009 by the Zone Shift measure despite all the bad press he was getting because he couldn't buy a goal. His line is pushing the puck forward during the first 20 games of 08/09 and that's a wonderful thing.

However, it appears that Cogliano and Gagner start to deviate at this point last year. Gagner just trends downwards after this point while Cogliano seems to be holding steady. Is this real or some kind of artifact? An injury maybe? I don't know, but it is interesting. There must be some kind of explanation and I'm taking suggestions. Including that there's not much value here.

Lastly, there's the Corsi. Frankly this is kind of all over the place and is also less flattering than the faceoff position data. It really doesn't look like they're getting better by this measure. I think this is probably just an example of how young players don't develop in a linear fashion. Furthermore, it's probably also a good example of how variable the actual level of play can be for players in their first and second years in the league. I imagine that vets see their GF/GA data move around a lot, but their Corsi and faceoffs are probably a little more consistent.

Friday, July 24, 2009

The 'Shot Quality' Fantasy

You can't swing a cat on the internet without hitting a goaltender apologist. Every time that a goaltender has a bad year in terms of save percentage, frightening and unrecognizable excuses come squirting out of every orifice of the hockey internet. A defensible rationalization would be "He was just unlucky, the shooters made their shots this year. Relax." But you never hear that one. It is always the defensemen, coaches, or perceived tactical changes that take the internet beatdown.

The question is: How much of this change was required to happen by the existence of chance in the universe, and how much was the fault of the coaches and skaters?

The Contrarian Goaltender had a very cool idea a while ago: he looked at goalies that changed teams. The thinking was that if teams were the ones impacting the save percentage, it would reveal itself when the goalies switched squads. I'm giving the same idea a more rigorous test here, using the list of every goalie's stats over the past ten NHL seasons. I'd advise that you calculate your own percentages if you are using these numbers to make a point; beyond that, they seem correct.

The varying number of 5-on-3s from team to team skews the PP save% a bit, so we'll just use even strength save% here (EVsave%). The first group of players we'll look at are the goalies that played for the same team two seasons in a row. The database lists the player's team as the one he played for when the season ended; if he started the season in Edmonton and ended up in Pittsburgh, he'll just show up as a Penguin. So rather than look up all the trade information manually, I picked seasons for goalies that qualified as playing the previous two seasons for the same team. They also had to have faced 300 EV shots in both of the seasons. 196 goalie seasons qualify.

Next, I built a model, using each player's total 1999-2009 EVsave% as the weighting of his coin, and the number of shots from each season as the number of flips. And I ran it 100 times.
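A minimal version of that weighted-coin model can be sketched in Python. The save percentages and shot counts below are invented stand-ins, not the actual 196-goalie dataset, so the number that comes out will only be in the same ballpark as the figures quoted here:

```python
import random

def simulate_season_change(career_evsv, shots_a, shots_b, rng):
    """Flip a coin weighted by the goalie's career EVsave% once per shot
    in each of two seasons; return the absolute change in EVsave%."""
    saves_a = sum(rng.random() < career_evsv for _ in range(shots_a))
    saves_b = sum(rng.random() < career_evsv for _ in range(shots_b))
    return abs(saves_a / shots_a - saves_b / shots_b)

rng = random.Random(1)

# Hypothetical goalies: (career EVsave%, season-1 EV shots, season-2 EV shots)
goalies = [(0.918, 1200, 1100), (0.905, 800, 950), (0.912, 1500, 1400)]

# One virtual season-pair per goalie, averaged, then the whole exercise
# repeated many times as the post describes.
runs = 100
avg_change = sum(
    sum(simulate_season_change(sv, a, b, rng) for sv, a, b in goalies) / len(goalies)
    for _ in range(runs)
) / runs
print(round(avg_change, 4))
```

The point of the exercise is that even a goalie whose true talent never moves an inch will show season-to-season EVsave% swings of roughly a percentage point, purely from the coin flips.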

And it turns out that the universe requires an average change, from season to season, to be .0103 for these guys. Of course it varies a bit from virtual season to virtual season because, as shown in the post below, randomness takes an awfully long time to truly settle out, especially when we're looking at such fine detail.

The actual change for these 196 goalie seasons was .0108. Even though that increase over virtual could be just chance, the distribution of the actual is also a whisker wider than the model, so I'll concede that the combined factors of coaching personnel and philosophy changes, defender changes, tactical shifts, the goaltender's own rebound control and puck handling abilities, and the 27 other notions swimming around in Kelly Hrudey's head ... they add together to create a change in 'expected goals against', from season to season, for the same goalie on the same team, of almost 1 goal. And I'm clearly being generous with that.

Now for changes in EVsave% for players who played a complete season for one team and then the complete next season for another team. To do this I had to manually go through the database and make sure that the goalies weren't traded during the hockey season. I ended up with a list of 66 guys, the same minimum 300 EV shots each season cutoff applies here also.

And if shot quality doesn't exist at all, the universe requires that the change in EVsave% as a goalie moves from team to team will average .0122. With these 66 goalies it averages .0133. There is nowhere near enough evidence here to even declare that a difference in shot quality exists at all, but again the distribution is a smidge wider for the real data than for the random, so I'll give the benefit of the doubt and say that, on average, the expected difference in shot quality between any two teams selected at random will be 2 goals on the season.

And I haven't even broached the issue of whether or not goalie sellers outperform goalie buyers on the whole (they do by the way, persistently but just by a touch) because that would carve a big chunk out of the tiny bit of evidence for shot quality that exists. I'm erring well on the side of caution here.

In short: The universe thinks that the NHL 'Shot Quality' metrics are a complete crock of shit, but I like most of the people that create them.

Wednesday, July 22, 2009

Shooting Percentage and False Perception

A lot of the time gut feel works just fine in assessing a player, and a guy's ability to finish is a big part of that. There is no doubt that Heatley has more finish than Dvorak, the memories of Dany's highlight reel goals and Dvorak's blown chances as an Oiler ... these colour our opinions, and their history of shooting percentage confirms them.

Now playing on the powerplay helps a guy's shooting percentage a lot, especially if he is the trigger man and if he gets a lot of 5-on-3 powerplay time as well. Getting a few empty net goals really helps this number too. So here we'll just look at even strength goals that happened with both teams' goalies on the ice (EVshooting%).

Now everyone knows that even the league's premier goal scorers will have cold stretches this season, almost all of them will have a month (20 to 30 EV shots) with a brutal EVshooting%. And while some fans will get excited about it and start searching for reasons, most will accept that it's just the Hockey Gods in action. If Iginla has a stretch of five weeks with just one EV goal (say 1 goal on 30 shots, 3.3% EVshooting%) most around here will chalk it up to random chance. And they're probably right. Tyler has shown that the pattern of EV shooting% is very nearly identical to that expected by chance alone, granted only four players were studied. Still, it's impressive considering that linemates, injuries and psychological elements are surely factors. So while those things are likely all in play, they are extremely difficult to detect through the noise of luck. Plus I've never read or met anyone who has been able to predict future shooting% trends.

So we're good at sensing the level of luck involved with the small samples; where the human mind lets us down is over a larger number of games and shots. It feels like a season, or certainly two seasons, should be enough to give us a good gauge of a player's true finishing ability at even strength NHL hockey. My own gut feel would be "within one or two percent" after two seasons, and I would be wrong. And going by the things I read on the internet, I suspect that most people would mentally put a narrower error band on it than me.

Below is the confidence interval for a mythical player who we know to have a natural 10% EVshooting% ability. If Tyler is right, or largely right, then this is the range of results that we should expect to see from the player 95% of the time. And if we looked at 100 identical players, after any given interval 5 of them would be outside this range of shooting%.

A top six player will probably get about 130 EV shots in a season. Less than that if he is playing on a weak team, more than that if he is playing on a territorially dominant team. And a 10% EVshooting% ability is about what you'd expect from a top-six-ice-time type of forward. You'd want more than that if the player brought little else to the table, and could live with less than that if he had a wider range of skills. But that's a reasonable midpoint.

So the chart above reflects a total of about four seasons on a good team, and about five seasons on a weak team.

After 60 even strength shots, about half a season, our man's 95% confidence interval ranges from 2.2% to 17.4%. That's a hell of a swing. That means that in 100 parallel universes, our man would be expected to have an EVshooting% in that range 95 times. And the universe requires him to be higher or lower than this 5 times on average. It's cruel in a way: if our man doesn't have a track record in the NHL, then some of the parallel-universe versions are being rewarded with rich NHL contracts, while in other parallel universes our man is being buried in the minors or shuttled off to Europe. All at the caprice of the hockey gods.
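For anyone who wants to reproduce a band like this, a normal-approximation sketch follows. It lands close to, but not exactly on, the figures above, which may come from a slightly different method (exact binomial percentiles, say):

```python
import math

def shooting_ci(p, shots, z=1.96):
    """95% normal-approximation confidence band for observed shooting%
    when the player's true ability is p and he takes `shots` shots."""
    half = z * math.sqrt(p * (1 - p) / shots)
    return max(0.0, p - half), p + half

# A true-talent 10% EV shooter after half a season (~60 EV shots):
lo, hi = shooting_ci(0.10, 60)
print(f"{lo:.1%} to {hi:.1%}")  # 2.4% to 17.6%

# After two full seasons on a good team (~260 EV shots) the band is
# still uncomfortably wide:
lo, hi = shooting_ci(0.10, 260)
print(f"{lo:.1%} to {hi:.1%}")  # 6.4% to 13.6%
```

Even after two seasons' worth of shots, the honest answer to "how good a finisher is he?" spans the difference between a fourth liner and a sniper.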

Speeds has an excellent post up on the most recent Oiler first round draft pick. Every scout seems to have seen him good, but the offensive numbers aren't where we would want them to be, it appears that he hasn't scored enough to justify the draft position. Speeds points out that he had a poor shooting%, and wisely suggests that it could be either because he doesn't have much finishing ability or that he was simply unlucky. And that we have no way of knowing which is true until time reveals more information.

And even for a guy like Cogliano, who has shot the lights out so far, he has only fired about 200 EV shots thus far in his career. So all we really know is that he is very likely (about 95% likely) to have a true EVshooting% within five percentage points of his rate so far.

So how good is Cogliano at finishing his chances? The skeptics and determinists can send their poets out to battle each other all day and night, but the fact of the matter is that we just don't know. We do know something close to the true probabilities, but it's still such a wide band that very few people would be foolish enough to bet the rent on him maintaining his current level.

The same intervals apply to on-ice EVshooting% for all players as well, though the larger sample (i.e. number of EVshots) is typically about four times higher. The same also applies to goalies and EV save percentage of course. So while we are judging goalies largely by EVsave%, which is sensible, we always have to remember just how volatile these can be in the short term. And while we can predict the behavior of the population, we can't predict which ones will throw more than their expected number of bullseyes.

This summer, teams that paid for players based on results that came from a year or two of good percentages ... they'll most likely regret it. Teams that have paid for players with good track records but a recent stretch of poor percentages, those were probably good bets.

Saturday, July 18, 2009


As most of you know, I am a goldsmith. You may not know that after the Pronger trade Oilers blogger Lain Babcock commissioned me to make eight coins. Each coin was to have the head of Joffrey Lupul on one side and Ladislav Smid on the other. He kept one for himself and sold the other seven to other Oiler bloggers. I'm still surprised that Dennis bought one, by the way.

Due to cheap materials and the hand crafted nature of these items, they weren't fair to use in a coin toss. Some tended to flip more Lupuls, others to flip more Smids.

Anyhow, shortly after their manufacture the game of Lupulsmid was born. If you click on the link you'll see the rules explained. It's a head to head coin flipping game which is always played for money. Because like horse racing and NFL football, it's pretty tedious without wagering in play.

On that link is also a table showing the weighting of the coins, which I tested before shipment. There is also a table of the history of head to head matches between the eight owners of the coins. And the final results, expressed as winning percentage, of the eight flippers. This will be different each time you load the page.

The cumulative results from 1000 parallel universes are also shown in the bottom table on lupulsmid.html, as linked above.

The question is: if we didn't know the weighting of the coins, could we figure it out from the results? In other words, how accurately can you calculate the quality of competition in Lupulsmid using just results?

Since I know the quality of competition effect for each coin flipper (remember I made and tested the coins) simple arithmetic yields this:

But if we don't know the weighting of the coins, it's trickier to do.

Let's try the Desjardins methodology:

Looking at the big sample for now, from 1000 parallel universes:

From Lain's point of view, he played 300 matches against Tyler. And Tyler had a winning percentage of .542, so we multiply 300 * .542 and get 162.6.

He played 100 matches against slipper. And slipper had a winning percentage of .524, so we multiply 100 * .524 and get 52.4.

Do the same for the other five guys that Lain played against, and divide by the total number of games and voila! His averaged competition was 49.5%, or slightly weaker coins than average. So his Desjardins Lupulsmid QualComp is -.5%
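The averaging just described can be sketched as follows. Tyler's and slipper's lines come from the numbers above; the other opponents are invented so the example has a complete schedule, and were picked so the answer lands near the -.5% quoted:

```python
# Simple Desjardins QualComp: a games-weighted average of your
# opponents' winning percentages, minus .500.

def qualcomp(opponents):
    """opponents: list of (games_played, opponent_win_pct)."""
    games = sum(g for g, _ in opponents)
    weighted = sum(g * pct for g, pct in opponents)
    return weighted / games - 0.500

lains_opponents = [
    (300, 0.542),  # Tyler (from the post)
    (100, 0.524),  # slipper (from the post)
    (150, 0.445),  # invented
    (150, 0.430),  # invented
]
print(f"{qualcomp(lains_opponents):+.1%}")  # -0.5% with these invented opponents
```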

The complete list is as follows:

The red bit is just Desjardins' methodology parlayed through. So we look at Desjardins' results and think "Damn! Rivers is better than I thought, he just foolishly chose to play too many matches against the guys with good coins (Tyler and Lain). Maybe we should bump Rivers' coin weighting guesstimate up a touch and run Desjardins' methodology again."

And the same for everyone else. And we take the results and adjust again, over and over until it doesn't seem to be making a difference. Or at least not enough difference to have a material effect on Lupulsmid wagering.
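The iterate-until-it-settles scheme just described can be sketched like this. The records and schedule are invented toy data, and the re-centring step is my own assumption, there to keep the estimates anchored at .500 as they iterate:

```python
# "Parlaying" Desjardins' method: start from raw winning percentages,
# credit each flipper for the strength of his opposition, and repeat
# with the updated strengths until the estimates stop moving.

def parlayed_qualcomp(records, schedule, iters=20):
    """records: {name: observed win%};
    schedule: {name: [(opponent, games_played)]}.
    Returns iterated strength estimates centred on .500."""
    strength = dict(records)
    for _ in range(iters):
        new = {}
        for name, opps in schedule.items():
            games = sum(g for _, g in opps)
            opp_avg = sum(strength[o] * g for o, g in opps) / games
            # A record earned against strong coins is worth more.
            new[name] = records[name] + (opp_avg - 0.500)
        # Re-centre so the average strength stays at .500 (assumption).
        mean = sum(new.values()) / len(new)
        strength = {n: v - mean + 0.500 for n, v in new.items()}
    return strength

records = {"Lain": 0.530, "Tyler": 0.542, "Rivers": 0.480}
schedule = {
    "Lain": [("Tyler", 300), ("Rivers", 100)],
    "Tyler": [("Lain", 300), ("Rivers", 200)],
    "Rivers": [("Lain", 100), ("Tyler", 200)],
}
for name, s in sorted(parlayed_qualcomp(records, schedule).items()):
    print(name, round(s, 3))
```

Note how a flipper who spends most of his nights against the good coins gets his raw record bumped up, which is exactly the "Rivers is better than I thought" adjustment described above.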

So you can see that Desjardins' method worked well, and parlayed Desjardins is damn near spot on. That's a Pearson correlation of .96 for the simple Desjardins method.

On the smaller samples, one season at a time, it's going to vary a bit from one parallel universe to the next, and the average correlation between simple Desjardins and actual is .93 over the 100 samples, with almost all of them between .90 and .96.

Parlayed Desjardins averages .99 correlation for the individual universes, with the staggering majority being .98 or better.

Now the methodologies used by both Desjardins and Willis for their NHL Quality of Competition metrics don't lend themselves to being parlayed through. But just the simple metrics they use give a damn good indication, and I can't think of any good reason that you'd need a finer measure.

Willis' results for the 07/08 season are here, by the way, and they correlate very strongly with Desjardins numbers for 07/08 on a team by team basis, as you can easily check for yourself.

Thursday, July 02, 2009

Apology Required?