Tuesday, November 17, 2009

I Hate You For Making Nilsson Our Best Winger

Carolina Hurricanes v Edmonton Oilers - Game 4

There once was a prolific commenter on the Oilogosphere who used the Internet pseudonym 'slipper'. I don't know what has happened to him; presumably he has drifted away from Oiler fandom while he waits for the organization to become a bit less demoralizing (on and off the ice).

Slipper once shared a terrific quote from a book he was reading on Roger Neilson. Roger was the coach of the Sabres at the time, and Scotty Bowman was the GM. Apparently they had a public argument in a hotel lobby, and Bowman was overheard saying "I hate you for making X our best player". I don't remember who "X" was, but he was surely a player from the Robert Nilsson tree. For that matter, my memory shouldn't be trusted; I may well be off on the details here. I welcome correction.

In any case, I hate MacTavish for making Nilsson the Oilers' best winger not named Hemsky. This for a long stretch in 07/08 and well into 08/09. This is especially unforgivable because, as it turned out, Lowe and/or Tambellini weren't as clever as Bowman. They seem to have thought Nilsson WAS that good. The same applies to Gagner and especially Cogliano, and to similar extents, but I'm in the mood to pick on Robert.

Since Roger liked to match lines based on tired legs of opponents, I picked a game where MacTavish did the same. The first game in Anaheim last season, mid October. A game on the road against a deep Ducks squad with an aggressive bench coach.

Here is the shift chart strip for the first period of that game, you'll have to click and enlarge it.

The vertical green line is an ANA goal (Marchant line vs Pisani line) and the vertical blue line is an EDM goal (Cole-Horcoff-Hemsky line vs Pahlsson line and Niedermayer/Beauchemin ... they would play against each other all night, of course).

The horizontal pale green strips, the ones between the ANA and EDM players, those are ANA power plays.

Going in, you know that Carlyle is going to be jonesing to get good offensive players out against the Kid Line. They were loved in E-Town, but opposing coaches have never shown that trio a shred of respect.

1st SHIFT: After a Pahlsson v Horcoff shift, Carlyle runs out the Getzlaf line. MacTavish waits until they have played close to a full shift and the puck is heading north (courtesy of the Moreau/Pisani/Penner line), then subs the kid line onto the ice. Carlyle keeps the Getzlaf line out there.

2nd SHIFT: After a Horcoff goal, Pisani's line wins the draw at centre ice and gets the puck in deep, then they sub off for the kid line. The puck ends up back in the Oiler end, Pronger's shot at net gets deflected into the crowd. The kids come off, Pisani's line comes on to take the defensive zone draw and finish the shift against Getzlaf.

3rd SHIFT: The fourth line gets bumped so the kids can go again. This after ANA scored on Pisani's line. They played a much shorter shift than Getzlaf's bunch a couple of minutes ago, so this is a safe zone. It ends 17 seconds later with Gagner taking an interference penalty in the Oilers end of the rink against the Morrison line.

4th SHIFT: Getzlaf, Perry and Kunitz have all played on the Ducks' power play, so another safe zone.

5th SHIFT: The Pisani line plays the first 40 seconds against the Getzlaf line and eventually gets the puck headed north. They sub off for the kid line. Carlyle chooses to leave the Getzlaf line out there again. The kids don't manage any pucks at net, but these long shifts are going to start taking their toll on the Getzlaffers.

6th SHIFT: The Pisani line plays the first 30 seconds against the Getzlaf line and eventually gets the puck into ANA territory where the puck goes into the netting. MacTavish sends out the kid line for the offensive zone draw. Carlyle chooses to keep the Getzlaf line out there. The shift ends when Nilsson takes a hooking penalty in the Oiler zone (vs Morrison line).

7th SHIFT: After the ANA power play is cut short by a Getzlaf hooking penalty, Cogliano and Gagner come on to play some 4v4 hockey vs Morrison and Marchant. Obviously the face off is in the ANA end of the rink.


And on and on it goes. And the kid line ends up with the best underlying numbers and the best counting numbers on the night. Nilsson had the best shots+/- and corsi+/- of any Oiler forward. He also finished +1 at even strength, and he and Cogliano scored assists on Souray's goal late in the second period.

The kid line would also appear to have played the toughest opposition of any Oiler group of forwards, they ended up with more ice time against Getzlaf's line than anyone else. If I remember right, Desjardins QualCOMP was showing the kids as playing the tough ice time well into November. Of course, in fact, that wasn't the case.

Penner on the night: He was -1 in EV plus/minus, -6 in shots +/-, and -10 in corsi +/-. That fat bastard! He's just not trying! I blame MacTavish, he's not motivating him. Lucky for MacT the kid line is saving his stupid ass!


Saturday, November 14, 2009

The Oilers At Even Strength So Far

alt. title: This Probably Isn't Going To End Well

The information in the table below speaks for itself, though your definition of "playing to the score" may differ from mine. I've included the data for the first nine games as well, just to put an exclamation point on Tyler's Irrational Exuberance post of October 26th.

Friday, November 13, 2009

The Oiler Power Play and Scoring Chances

The Oiler PK and Scoring Chances

Sunday, November 08, 2009

Horcoff - Penner - Hemsky

There has been much conversation over the past year regarding this line. Derek Zona has shown, beyond a shadow of a doubt, that when they played together at even strength last season they got terrific results to go along with terrific underlying numbers.

So if you had any of these three guys in your hockey pool, you were justified in screaming at your television whenever MacTavish broke up the line, which was most of the time. That's been established, I think.

The question is, how much were the other lines hurt by having most of the forward talent on one line? More specifically, we want to know the effect on the team as a whole.

To do this I wrote a script to go through the NHL.com time on ice sheets and find the games that Penner played most of his EV icetime with Horcoff and Hemsky. If that happened, the game's EV tied-score data was dumped into bin #1. If that didn't happen, and both Hemsky and Horcoff played in the game ... then the game's EV tied-score data was dumped into bin #2.

In the games with Penner-Horcoff-Hemsky as the top unit:
48.5% of the EV tied-score shots were owned by the Oilers.
48.9% of the EV scoring chances were owned by the Oilers.

In other games (83 and 10 both on the game roster):
48.0% of the EV tied-score shots were owned by the Oilers.
47.7% of the EV scoring chances were owned by the Oilers.
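
For anyone who wants to repeat the sort, here is a minimal Python sketch of the binning logic described above. The `Game` record, its field names and the reading of "most" as more than half of Penner's EV ice time are all assumptions made for illustration; parsing the NHL.com time on ice sheets is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Game:
    # Hypothetical per-game record, not the layout of the NHL.com sheets.
    penner_ev_secs: float         # Penner's EV ice time in the game
    penner_with_both_secs: float  # EV time Penner shared with Horcoff and Hemsky
    horcoff_played: bool
    hemsky_played: bool
    tied_shots_for: int           # Oiler EV shots while the score was tied
    tied_shots_against: int       # opponent EV shots while the score was tied


def bin_for(game):
    """Return 1 or 2 for the two bins, or None if the game is left out."""
    if game.penner_ev_secs > 0:
        together = game.penner_with_both_secs / game.penner_ev_secs
        if together > 0.5:  # assumed meaning of "most of his EV icetime"
            return 1
    if game.horcoff_played and game.hemsky_played:
        return 2
    return None


def tied_shot_share(games):
    """Oilers' share of EV tied-score shots over a list of games."""
    shots_for = sum(g.tied_shots_for for g in games)
    shots_against = sum(g.tied_shots_against for g in games)
    return shots_for / (shots_for + shots_against)


def share_by_bin(games):
    """Map bin number to the Oilers' tied-score shot share in that bin."""
    bins = {1: [], 2: []}
    for g in games:
        b = bin_for(g)
        if b is not None:
            bins[b].append(g)
    return {b: tied_shot_share(gs) for b, gs in bins.items() if gs}
```

The same pooling works for scoring chances: swap the two shot fields for chance counts.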

So MacTavish's decision to keep Penner off the top unit, probably due in part to his personal distaste for Dustin, seems to have had a small detrimental effect on the play of the team as a whole. Not a hell of a lot, but more than I was expecting. And, going by some of the commentary that I have read on the subject, considerably less than a lot of Oiler fans were expecting I'm sure.

With the top line loaded up or not, this was just a slightly below average skating team at even strength last year. Shuffling players around the lineup in different ways only accomplishes so much.

UPDATE: I've added scoring chances as well and, following RiversQ's comment, it appears that we should expect a shift in scoring chance percentage of greater than 1.1% (the amount that happened) about 6 times in 10 by chance alone. So the impact on the team of playing Horcoff-Penner-Hemsky as a line is so small, in terms of scoring chances, that we can't see it through the noise. It may have even had a negative effect on the team's chances of winning.
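
One way to put a number on "by chance alone" is a normal approximation for the gap between two percentages. This is only a sketch: it is not necessarily the method behind the 6-in-10 figure, it treats every scoring chance as an independent coin flip, and the sample sizes in the usage lines are made up because the post doesn't give the chance counts for each bin.

```python
import math


def prob_shift_at_least(p, n1, n2, diff):
    """Two-sided probability that two samples with the same true share p
    end up at least `diff` apart, using a normal approximation."""
    if diff <= 0:
        return 1.0
    se = math.sqrt(p * (1.0 - p) * (1.0 / n1 + 1.0 / n2))
    z = diff / se
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical inputs: a true share of 48% and made-up chance counts per bin.
chance_prob = prob_shift_at_least(p=0.48, n1=400, n2=900, diff=0.011)
print(f"probability of a gap this big by luck: {chance_prob:.2f}")
```

The smaller the bins, the more often luck alone produces a gap of that size, which is the point of the update.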

More on Dubious NHL Shot Counting

Following up on the 'Shots On Net' post below, this is a chart showing the NHL's most dubious shot counters from the 08/09 season. Click to enlarge. Calgary and Edmonton are fringe qualifiers, but they are teams of interest around here, so I included them. All this information is for when the game is tied at even strength, to try and minimize the noise.

By way of example, for Calgary:

On home ice the Flames NHL scorer recorded 50.5% of the Flames' shots-directed-at-net (Corsi) as shots on goal. And he recorded 58% for the opponent's shots/Corsi.

On the road, the aggregate totals of the NHL scorers in other rinks show the Flames with 53% shots/Corsi. And their opponents with a 61% shots/Corsi.

That's shown as a red line. The angle of the line indicates his home team bias. The nearer 45 degrees, the less that home team bias is indicated. So it looks like the person that the NHL employs to track shots-on-goal is a bit of a hard marker. But he's the same way for both teams playing on Saddledome ice. The shot bias is fairly significant. Some back of the envelope sums suggest a .004 or so hit on Kipper's save%.
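
Here is one possible version of that back of the envelope sum, as a Python sketch. It assumes the road number (61% of opponent Corsi recorded as shots) is the honest rate, that goals against are unaffected by the scorer, and it uses a hypothetical .910 baseline save% rather than Kiprusoff's actual figure. It may not be the exact sum behind the .004 estimate.

```python
def recorded_save_pct(true_save_pct, home_ratio, road_ratio):
    """Save% as it would be recorded if the home scorer writes down only
    home_ratio / road_ratio of the shots against, with goals unchanged."""
    goals_per_real_shot = 1.0 - true_save_pct
    # Fraction of "real" shots against that make it onto the scoresheet,
    # treating the road ratio as the honest one (an assumption).
    recorded_fraction = home_ratio / road_ratio
    return 1.0 - goals_per_real_shot / recorded_fraction


baseline = 0.910  # hypothetical save%, for illustration only
home_hit = baseline - recorded_save_pct(baseline, home_ratio=0.58, road_ratio=0.61)
print(f"save% hit in home games: {home_hit:.4f}")
```

With these inputs the hit in home games comes out a little under .005. Only half the schedule is played at home, so the season-long effect would be smaller than that.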

The scorer at Rexall sees the game the same way as the Flames marker. The Toronto shot tracker is an extreme version of the same.

The Chicago scorer is the polar opposite, everything looks like a shot to this cat. But at least he seems fair to both teams.

The Buffalo and Tampa Bay counters show extreme home bias, they are flattering their goalies in a pretty significant way. I don't know how much .005 or .010 difference in save percentage matters come contract negotiation time, but it looks to me like befriending the shot tracker would be very worthwhile for an NHL goalie and his agent.

The Montreal guy/girl is tough when it comes to counting a Habs shot as being "on net", but he softens up when the other teams shoot the puck. I hope that the Canadiens goalies appreciate that.

Of course there is surely a lot of noise in here, teams like EDM and CGY may well just be coincidence, or at least the magnitude might be considerably less. The others though ... those are madass swings. Maybe somebody will look at past seasons and see if these tendencies persist.

Just generally, I think it is wiser to use shots% or shot ratio, instead of shots+/-, when looking at teams like CGY, EDM and CHI.

Friday, November 06, 2009

Outshooting in Toronto

Gabe Desjardins recently listed the five worst teams that outshot their opponents in the history of the modern NHL. He ranked the post lockout Leafs as #2 on that list.

It's worth looking into why that happened, beyond the obvious (poor goaltending and PK, shooters who couldn't finish, bad luck). How the hell did the Leafs manage to outshoot their opponents at even strength last season?

Teams play differently when they are leading than when they are trailing. We saw that in the third period of game 7 of the Stanley Cup finals. With the two goal deficit Detroit was pressing and taking more risks, Pittsburgh was sitting back and playing safe hockey. Now obviously this strategy would have been disastrous for the Penguins if they had played the whole game that way, they got badly out-chanced in the third period, and even more badly outshot (7 to 1) and territorially dominated (22 to 3 in terms of Corsi). They did manage a couple of odd man rushes, capitalizing on Detroit's risky tactics, but they didn't score on either, only got a chance off of one if my memory is right.

The most likely outcome of that period, given Bylsma's tactics, was 1-0 for Detroit, which is what happened. And that's fine, because they had the two goal cushion.

During a Hockey Night In Canada broadcast, late in a game in which the Oilers held a one goal lead, Marc Crawford remarked that they should play a bit more conservative because of the score "the third forward should stay above the puck". I should freaking hope so. Hitchcock would have the third forward so high in that situation that he wouldn't have been able to read the advertising on the boards behind the net. Clearly there is a difference in the way that coaches have their teams play to the score.

So let's look at the breakdown for the Leafs last season.

Overall, at even strength and with both teams' goalies on the ice, they fired 1968 shots at the other teams' goalies and their netminders faced 1916 shots. So they owned 50.7% of the shots taken at even strength.


At EV when the score was TIED:
656 shots for, 705 shots against.
48.2% of the shots. (About the same as the Oilers, and the Leafs play in a weaker conference).

At EV when the Leafs were TRAILING BY ONE GOAL on the scoreboard:
364 shots for, 314 shots against.
53.7% of the shots.

At EV when the Leafs were TRAILING BY TWO OR MORE GOALS on the scoreboard:
445 shots for, 359 shots against.
55.3% of the shots.

At EV when the Leafs were LEADING on the scoreboard:
503 shots for, 538 shots against.
48.3% of the shots.
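
The percentages above are just shots for divided by total shots in each score state. A short Python sketch using the counts listed in this post (the variable and function names are my own):

```python
# (shots for, shots against) at EV by score state, as listed above.
leafs_ev_shots = {
    "tied": (656, 705),
    "trailing by one": (364, 314),
    "trailing by two or more": (445, 359),
    "leading": (503, 538),
}


def share_of_shots(shots_for, shots_against):
    """Fraction of all shots in a situation that belonged to the Leafs."""
    return shots_for / (shots_for + shots_against)


for state, (sf, sa) in leafs_ev_shots.items():
    print(f"{state}: {share_of_shots(sf, sa):.1%}")

total_for = sum(sf for sf, _ in leafs_ev_shots.values())
total_against = sum(sa for _, sa in leafs_ev_shots.values())
print(f"overall: {share_of_shots(total_for, total_against):.1%}")
```

The four score states add back up to the 1968 for and 1916 against quoted earlier, so the overall 50.7% is a blend that leans heavily on the trailing minutes.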


So, the Leafs' overall even strength shot totals, for and against, are not an indication of a good skating team. Rather they are a reflection of the fact that they:
1. Didn't play to the score at all when they had the lead.
2. Trailed in hockey games a lot more often than they were leading.

That might have something to do with Wilson's philosophy, we didn't see that in San Jose though. More likely it speaks to a team that had no notions of being competitive, starting right from training camp. If we look at the teams this year that are out of the playoffs and are selling at the trade deadline ... for the rest of the year they'll all probably stop playing to the score. Both by eye and by the numbers.

The Problem With "Shots on Net"

Shots on net is a terrific statistic, especially at the NHL level. We know that as early as the 1980s the big three of Roger Neilson's "second generation stats" for individual players were:
  • ice time
  • scoring chances
  • shots on net[Team]
All broken down by situation, of course (5v4PP, 5v5, EN, etc).

So, after a game against the Oilers in 1989, Lindy Ruff would get a postgame scorecard showing his ice time in different situations, how the scoring chances fell when he was out there, the scoring chances for which he was more responsible, etc. He'd also get a grade for shots-on-net[NYR] and shots-on-net[EDM], again by situation. There would be a whack of other information on there as well. You don't have to be particularly clever to start noticing patterns.

The question is, why use shots on net instead of shots AT net. Or put another way, Corsi+ & - or Shots + & -?

Intuitively you would think that Corsi would be the better indicator of territorial advantage in a single game, or any small sample, like a playoff series. To illustrate that; in the first game of the season the Oilers widely outshot the Flames, but only narrowly outchanced them and narrowly outCorsied them. The Flames missed the net a lot with their chances. Simple as that. Still, some players are far better at getting their shots through on net than other guys, and surely that has value, and that player is more likely to aid in territorial advantage than a guy whose shots tend to whistle wide or get blocked a lot more.

On a team level, they are bunched closely together in terms of "ability to get shots through". Shots+ divided by corsi+. But clearly there are different abilities at the team level as well. The strange thing is that teams who tend to get shots through also had a strong tendency to see the opponents do the same against them. That's completely counterintuitive, so I checked just using data from when the score in the game was tied ... still there.

The relationship, by way of Pearson Correlation, r=.43. Coincidence is unlikely, and though it seems that 90% of callers to sports talk radio are amateur psychologists, I personally find it hard to believe that the ability of one team to get their shots through is having an impact on the ability of the opposition to do the same at the other end of the ice. So before we start digging into the VMM (Vulcan Mind Meld) effect of two teams, we should look for a boring, rational explanation.
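
For reference, this is roughly how the two ingredients can be computed: a "shots through" rate (shots on net divided by Corsi) and the Pearson correlation between each team's rate and its opponents' rate. A minimal Python sketch; the function names are mine, and the three sample teams at the bottom are invented numbers, not the league data behind the r=.43.

```python
import math


def through_rate(shots_on_net, corsi):
    """Share of shots directed at net that were recorded as shots on net."""
    return shots_on_net / corsi


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented (shots, Corsi) pairs for three teams and for their opponents.
team = [through_rate(s, c) for s, c in [(430, 1000), (470, 1000), (520, 1000)]]
opponents = [through_rate(s, c) for s, c in [(440, 1000), (450, 1000), (510, 1000)]]
print(f"r = {pearson_r(team, opponents):.2f}")
```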

The terrific JLikens showed us that there is very measurable home recording bias in 'shots on goal', at least when you look at a large enough time frame. This even though surely some of the scorers have changed over the years.

That's understandable. I'm sure I could find a video clip of a shot at net where the shooter tried to fire it through the defenders legs, the puck glanced off of the Dman's shin pad and then the goalie snagged it with a high glove save. Was it a save or was it going wide? If you think it was wide is it a block or a miss? We could run a poll and get a spread of results from readers here, and we could find another similar clip and do the same. It's likely that the same folks would repeat their opinions.

The one thing we would all agree on, however, is that the puck was shot AT net. So let's look at just the road games.


On the road the Oilers got 43% of their shots through onto the net at EV when the score was tied. Their opponents got 45% through. The league average is 45%.

Now do the same for the other 29 teams and plot it out. Completely random. If you can see a trend you probably should get a CAT scan. r=.03, which is closer to zero than you would expect from two random sets of 30 numbers.


On Rexall ice the Oilers got 2.5% more of their shots through, so did their opponents.

The scatter plot for all 30 teams on home ice is below. Click to enlarge, you should be able to read the team names with some effort.

The pattern is obvious, there is a Pearson correlation of r=.70.

The wildly generous scorer is from Chicago. This creates the illusion that Chicago was/is a shot-happy team. In truth, in terms of total shots AT net in the games (for both teams combined), they were nearly spot on the same as the Oilers. The difference is that the Hawks play a lot more in the offensive end of the rink than the Oilers do.

I hope that the coaching staff and management of the team you cheer for realizes this. The management of the team I cheer for recently signed Khabibulin to a big ticket deal and more recently educated the local media on how Quenneville's Blackhawks rack up gaudy shot totals because "they just shoot from anywhere".

The Puck Has To Be Somewhere

There is much talk about high event and low event players around the Oilogosphere. In truth there isn't much in it beyond coincidence driven by on-ice shooting and save percentages. That's why it is so volatile from season to season. Below is a chart with total shots on net that happened when a skater was on the ice at even strength last season, this for guys who played a regular shift. These numbers come from Behind The Net and display the Western Conference only. Click to enlarge. The numbers on the bottom of the chart are the total number of shots directed at net (by either team) per hour at 5v5 hockey, we'll call that 'Total Corsi Rate' for short.
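
To make the definition concrete, here is a small Python sketch of 'Total Corsi Rate' as described above: shots directed at net by both teams, scaled to an hour of 5v5 ice time. The function names and the sample ice time and shot counts are hypothetical, chosen only to show the arithmetic.

```python
def total_corsi_rate(attempts_for, attempts_against, toi_minutes):
    """Shots directed at net by either team per 60 minutes of 5v5 ice time."""
    return (attempts_for + attempts_against) * 60.0 / toi_minutes


def corsi_plus_minus(attempts_for, attempts_against):
    """Shots directed at the opponent's net minus those directed at your own."""
    return attempts_for - attempts_against


# Hypothetical skater: 900 minutes at 5v5, 800 attempts for, 910 against.
rate = total_corsi_rate(800, 910, 900)
print(rate)
print(corsi_plus_minus(800, 910))
```

Two players can share the same rate and still sit at opposite ends of the Corsi +/- table, since the rate ignores which net the pucks are headed towards.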

Better players tend to be higher event for sure, just not by a lot. If we filtered out team effects, especially the way that teams played to the score, and also removed the last few seconds of ice time after a penalty expired, and accounted for the fact that some players start their shifts less in the neutral zone than others ... well this group would be even more tightly bunched. Some teams are a touch more shot happy as well, that contributes to the spread, just not as much as most of us imagined. Certainly much, much less than Pat Quinn imagines (unless he was blowing smoke earlier to try and make us believe that the Oilers really weren't getting outplayed and outchanced as badly as it seemed).

The Eastern Conference plays a more wide open, and generally more entertaining, brand of hockey. Their results are also tightly bunched, but shifted to the right.

The "highest event" Oiler by this metric is Ethan Moreau at 114 Total Corsi Rate.
He's just a couple of players away from Henrik Zetterberg who clocks in with a 113 score, and well clear of Rick Nash at 105.

Moreau also led the team in personal shots rate, but had a tragic Corsi +/- of -182, the worst on the team of the regular forwards, on a per game basis. There seems to be a pervasive line of reasoning amongst Oiler fans that players who aren't getting many shots just aren't shooting enough. In fact players who are shooting too much, especially from bad angles (outside the dots) are doing nothing to help their team out shoot the opposition. The opposite in fact. The puck just ends up back in their own end again.

And while a player like Cogliano doesn't get many shots, largely that's because he's playing way too much in his own end of the rink. Adding a guy like Moreau, who shoots a lot, to his line might seem like a fix on the surface, but it will just make things worse. The Total Corsi Rate is a persistent thing, it will stay about the same. The difference will be that even more of those shots at net in the game will be coming towards the Oiler goalie.

Good players tend to play more in the offensive end of the rink and to generate more chances, so their Corsi almost always looks good. And that tends to result in more shots for themselves as well. But the reverse is not true. i.e. Good EV hockey players, the type that really help you win, generally get a lot of shots at net at EV. But players that get a tonne of shots at net at EV are not necessarily the type that really help you win hockey games.

Jokinen and O'Sullivan are two of last year's poster children for the phenomenon. I think both are good NHL players, albeit overpaid. I'm not criticizing them personally. But if you rate players by the number of shots on (or at) net that they personally register ... you'll overpay for this type of player. Tambellini pursued Jokinen aggressively at the deadline according to Maloney, before being outbid by Sutter. And he was successful in landing O'Sullivan. I have this horrible feeling that Tambellini plays hockey video games, by the way.

I'm not meaning to pile on. The Oilers are ravaged with injury and sickness right now, and they had terrible roster depth to begin with. When/if their best players get fit and find their games again, the Oilers will win some games. For me, the big picture is more concerning.

Wednesday, November 04, 2009

Clutch, Baby!

In this post, I intend to use reason in order to make you believe in clutch hitting ability in major league baseball. Seriously.

Bill James, who is quite famous in the world of baseball, is an engaging writer and a critical thinker. Nobody has done more to get baseball fans thinking, though I doubt he lists math as one of his strengths. In 2004 he wrote a terrific article in The Baseball Research Journal which I stumbled across on Monday, I highly recommend it. Essentially he's questioning the validity of some of the methods used by most of the mathy baseball analysts of the world. If you consider his audience ... that takes some stones.

Not surprisingly, this piece generated a lot of commentary. Dr Jim Albert, who I think is the best baseball writer out there, chimed in as well. He offered fair, detailed and reasonable comments on James' article. The thrust of it:
"Although I agree with James’ general conclusions, unfortunately I think that he is unclear and sometimes wrong in some of his statements about chance variation."
There is a peculiar formality in the way that these baseball cats talk to each other on the internet, bless them.

After that the conversation drifts into a wandering discussion regarding the presence of clutch hitting in baseball. Prolific baseball writer Phil Birnbaum doesn't believe in clutch hitting ability at all, and he gathered a whack of clutch hitting stats to make his case. He used Late Inning Pressure Situations (LIPS) to define a clutch at-bat (AB). And he defined the clutchness as the difference in batting averages between LIPS ABs and all other ABs. So if a guy batted .340 in Late Inning Pressure Situations and .300 in other ABs, his clutchness result would be +.040. Easy as beans. That makes sense to me, so I'm going to use Phil's data and definitions for my kick at the clutchness cat, that's the LIPS and non-LIPS ABs and hits for 16 seasons. I'll sum each player's results from each season, so I'll be working with one set of clutch data for each of 553 MLB hitters.
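
The definition is simple enough to write down directly. A minimal Python sketch, with function names of my own choosing and made-up hit and AB counts picked to reproduce the .340 vs .300 example:

```python
def clutchness(lips_hits, lips_ab, other_hits, other_ab):
    """Batting average in LIPS at-bats minus batting average in all others."""
    return lips_hits / lips_ab - other_hits / other_ab


def career_clutchness(seasons):
    """Pool a player's seasons first, then take the difference.

    `seasons` is a list of (lips_hits, lips_ab, other_hits, other_ab) tuples.
    """
    lips_hits = sum(s[0] for s in seasons)
    lips_ab = sum(s[1] for s in seasons)
    other_hits = sum(s[2] for s in seasons)
    other_ab = sum(s[3] for s in seasons)
    return clutchness(lips_hits, lips_ab, other_hits, other_ab)


# Made-up counts: 34 for 100 in LIPS (.340), 150 for 500 otherwise (.300).
print(f"{clutchness(34, 100, 150, 500):+.3f}")
```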


Imagine that there was an enormous balls-up at the Elias Sports Bureau. A disgruntled employee has falsified the LIPS batting averages for all players. He's kept the number of LIPS and non-LIPS ABs the same for everyone, and he's kept the total number of hits the same for everyone. But every time a customer downloads data, the hits are sprinkled over the LIPS and non-LIPS ABs completely randomly. So in 1974 Ron Cey still had 491 non-LIPS ABs, 123 LIPS ABs and 114 total hits ... but every time I access the Elias data those hits get shuffled into the LIPS and non-LIPS ABs. The first time I check I see that Ron Cey was the clutchiest of the clutchy in 1974, I check an hour later and he was a disgraceful choker in 1974. What the hell?

I keep downloading sets of this random data, the same stuff Phil compiled. Every time it's different, and by the time I realize what's going on I've downloaded and saved a whopping 1000 sets of random data. This isn't much of a stretch by the way, I'm not a quick study.
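
The disgruntled employee is easy to imitate in code. A minimal Python sketch of one shuffle, using the Ron Cey 1974 counts quoted above; the function names are mine and the random seed is arbitrary.

```python
import random


def shuffled_lips_hits(lips_ab, other_ab, total_hits, rng):
    """Scatter a player's hits at random over all his ABs and count how
    many land in the LIPS at-bats. AB and hit totals stay fixed."""
    total_ab = lips_ab + other_ab
    # Number the ABs 0..total_ab-1 and call the first lips_ab of them LIPS.
    hit_slots = rng.sample(range(total_ab), total_hits)
    return sum(1 for slot in hit_slots if slot < lips_ab)


def shuffled_clutchness(lips_ab, other_ab, total_hits, rng):
    """Clutchness (LIPS average minus non-LIPS average) after one shuffle."""
    lips_hits = shuffled_lips_hits(lips_ab, other_ab, total_hits, rng)
    other_hits = total_hits - lips_hits
    return lips_hits / lips_ab - other_hits / other_ab


rng = random.Random(1974)  # arbitrary seed so the run can be repeated
for download in range(3):
    # Counts as quoted in the post: 123 LIPS ABs, 491 others, 114 hits.
    print(f"{shuffled_clutchness(123, 491, 114, rng):+.3f}")
```

Run it for every hitter and you have one random season; do that 1000 times and you have the pile of downloads.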

As I have these 1000 random seasons and also the real clutch data, I may as well make use of them. I plot out a bunch of them as histograms (that's just a bar chart, each bar covers a range of clutch averages, such as -.040 to -.035). The result is always a squiggly bell shape. Not too exciting. The actual clutch histogram is also a squiggly bell shape. To be expected, the universe is a squiggly place. It's also a little off centre, LIPS ABs probably come against better pitching a shade more often than not. It also looks like the bell lists to the right a bit, we'll call that left skewed.

It turns out that the actual data is spread out wider than the vast majority of the random seasons. Wider than 931 of them, in fact, using variance as the measuring stick.

Variance is a simple measure. If, using all the data for these 16 seasons, Ron Cey has a clutchness of +.020, and the overall league average is -.005, then he is .025 points from average. Square that (.025 x .025), then do the same for everyone. Take the average of the whole bunch.

Σ(abs(x-xo))/n is similar to variance, we just don't square the differences. We have to make sure they are all positive numbers though.

Σ(abs(x-xo)^3)/n is as above, except we don't square the differences, we cube them. Again, we have to make sure they are all positive numbers.
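
The three measuring sticks, sketched in Python. I'm assuming here that xo is the plain average of the players' clutchness values; the function names and the helper that counts how many random seasons are narrower than the actual one are my own additions.

```python
def spread_measures(values):
    """Return the average absolute, squared and absolute-cubed distance
    from the mean. The middle one is the variance."""
    n = len(values)
    mean = sum(values) / n  # assumed to be the "xo" in the formulas
    distances = [abs(v - mean) for v in values]
    mean_abs = sum(distances) / n
    mean_squared = sum(d ** 2 for d in distances) / n
    mean_cubed = sum(d ** 3 for d in distances) / n
    return mean_abs, mean_squared, mean_cubed


def count_narrower(actual_measure, simulated_measures):
    """How many simulated seasons have a smaller spread than the actual data."""
    return sum(1 for m in simulated_measures if m < actual_measure)


# Tiny made-up example: three clutchness values.
print(spread_measures([-0.030, -0.005, 0.020]))
```

A count of 931 out of 1000 from `count_narrower`, using the variance, would correspond to the "wider than 931 of them" result quoted earlier.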

Jim Albert's equation from the article above says that the sum of the variances of the luck and ability distributions equals the variance of the actual distribution.
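
Rearranged, that relationship gives the share of the observed spread that is left over for ability. A minimal Python sketch, assuming luck and ability are independent so that their variances add; the function name is mine, and the two variances in the usage line are placeholders scaled for illustration, not the values from Phil's data.

```python
def skill_share(var_observed, var_luck):
    """Fraction of the observed variance attributed to ability, given
    var_observed = var_luck + var_ability."""
    var_ability = var_observed - var_luck
    return var_ability / var_observed


# Placeholder variances: luck accounts for 89.6% of the observed spread.
print(f"{skill_share(var_observed=1.0, var_luck=0.896):.1%}")
```

Here var_luck would come from the random seasons and var_observed from the actual data.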

Hitting clutchness, as defined by Phil Birnbaum and using his data, was 10.4% skill and 89.6% luck.

That strikes me as a naive assumption though, nature probably hasn't been kind enough to distribute clutch ability in Gaussian (Normal Distribution) fashion throughout the rosters of MLB.

We can build our own model for ability, parlay it through the luck distribution (the average of the 1000 random seasons) and see how close we come to the actual, or observed, distribution that Phil provided.

Trial 1: Assume that most hitters have no clutch or choking qualities. Apply .010 of added clutchness to 100 random hitters. Deduct .010 of clutchness from 100 random hitters. Run 1000 simulations.
Result: It's an improvement over the assumption that no clutch ability exists, this by all three measures above, but not enough. We need to bump it up a bit more.

Trial 2: Same as trial 1 but crank it up to .020 points added or deducted to clutchness.
Result: It's an improvement over Trial 1 by all three measures above, we're getting close to the 50th percentile by all three measures above. But still not enough. We need to bump it up just a shade more.

Trial 3: Same as trial 2 but crank it up to .025 points added or deducted to clutchness.
Result: Now we've gone much too far. In around the 35th percentile range for the three measures I'm using.

Trial 4: Let's try .022 as an adjustment.
Result: Ah, that's the stuff. The result we create with the model matches all three measures very closely. All measures would rank close to 500th when compared to the 1000 random seasons.
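
For readers who want to try the trials themselves, here is a rough Python sketch of the procedure. It is not the code used for the results above. In particular, tacking the adjustment straight onto a hitter's shuffled clutchness is my guess at a simple implementation, the comparison uses variance only, and the function names and player counts you feed it are your own to supply.

```python
import random


def shuffled_clutchness(lips_ab, other_ab, hits, rng):
    """Clutchness after scattering a hitter's hits at random over his ABs."""
    slots = rng.sample(range(lips_ab + other_ab), hits)
    lips_hits = sum(1 for s in slots if s < lips_ab)
    return lips_hits / lips_ab - (hits - lips_hits) / other_ab


def variance(values):
    """Average squared distance from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def simulate_season(players, adjustment, n_adjusted, rng):
    """One model season: pure luck for everyone, plus `adjustment` for
    n_adjusted random hitters and minus `adjustment` for n_adjusted others.

    `players` is a list of (lips_ab, other_ab, hits) tuples.
    """
    picked = rng.sample(range(len(players)), 2 * n_adjusted)
    shift = [0.0] * len(players)
    for i in picked[:n_adjusted]:
        shift[i] = adjustment       # the clutch hitters
    for i in picked[n_adjusted:]:
        shift[i] = -adjustment      # the chokers
    return [shuffled_clutchness(*p, rng) + s for p, s in zip(players, shift)]


def rank_of_actual(actual_variance, players, adjustment, n_adjusted, n_sims, rng):
    """Count the model seasons whose variance is below the actual one.
    A rank near n_sims / 2 means the model's spread matches the real data."""
    sims = [variance(simulate_season(players, adjustment, n_adjusted, rng))
            for _ in range(n_sims)]
    return sum(1 for v in sims if v < actual_variance)
```

Trial 1 above would be `rank_of_actual(actual, players, 0.010, 100, 1000, rng)`, with `players` built from Phil's data and `actual` the variance of the real clutchness values. A fuller version would track the absolute and cubed measures as well.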

That's all that I have done. No more or less. From here out it's straightforward though, we can refine the ability distribution to give a perfect result if we try. I wouldn't bother at this point, though. Firstly because my implementation here was a bit heavy handed, clutchness should be added into the ability distribution in a different way. Secondly because the difference between the actual data and the random data could still be the product of randomness. Or, equally likely, clutch ability is larger than I'm indicating here. Randomness is the essence of the universe, after all. Best to run the same procedure on several different sets of data, methinks. We're still painting with a big brush at this point.

* I may well have made a mistake along the way, either in logic or in coding, so please do not use this information for the purpose of wagering.

* None of the 1000 sample seasons resulted in a wider spread of results by all three measurements (absolute average difference from the mean, average squared difference from the mean and average absolute cubed difference from the mean) than the actual results, though this may be due in part to the fact that overall the players averaged a -.0057 clutchness. I don't suspect that is fatal, but this offset does escape the Lutheran philosophy of the model.