Friday, May 15, 2009

The Impact of Puck Possession and Location on Ice Hockey Strategy

That is the title of this article by Andrew C. Thomas, a must-read.

The author followed the Harvard varsity hockey team for a season and recorded some terrific data on possession and zone changes during 5v5 hockey. He then related these to goals scored.

This meshes extremely well with NHL data, but gives a finer look at the 'hows'. In terms of strategy, he found that 'dump and chase' was equivalent to carrying it over the blue line going forward (in terms of expected +/-), and that the same holds true for chipping the puck out of your own end, as opposed to trying to pass/skate it out. This suggests that the Ivy League is well coached, and that the players, on the whole, are staying within their roles and abilities (this is my conclusion on the strategy matter, not his).

You can skim over the mathy bits if that's not your thing, as they aren't particularly important. The general idea of still keeping events on the board, even after others have happened ... that's very wise. So say a team gets the puck into the offensive end, takes a shot and loses possession, forces the defender into lobbing it back into the neutral zone ... etc, etc ... and a goal is scored 32 seconds later. He counts forward from every one of those events to the goal (or to no goal if there wasn't one), as far as 40 seconds out. The thinking is that one event (such as getting offensive zone possession) affects the myriad possibilities to come, even if the chain of events doesn't point to that play directly or intuitively.
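A minimal sketch of that forward-counting idea, in Python. The event labels, timestamps and the `outcomes_within_window` helper are made up for illustration; only the 40-second window comes from the paper as described here, and this is not the author's actual code.

```python
from bisect import bisect_left

WINDOW = 40.0  # seconds to look ahead from each event


def outcomes_within_window(events, goals, window=WINDOW):
    """For each event, report the first goal scored within `window` seconds.

    events: list of (time_sec, team, label) tuples (hypothetical format)
    goals:  list of (time_sec, scoring_team) tuples, sorted by time
    Returns a list of (label, result) with result in {"for", "against", "none"},
    judged from the point of view of the team credited with the event.
    """
    goal_times = [t for t, _ in goals]
    results = []
    for time, team, label in events:
        # Index of the first goal at or after this event.
        i = bisect_left(goal_times, time)
        if i < len(goals) and goal_times[i] - time <= window:
            result = "for" if goals[i][1] == team else "against"
        else:
            result = "none"
        results.append((label, result))
    return results


# Hypothetical sequence: team A gains the zone, shoots, loses the puck,
# team B lobs it out, and A scores 32 seconds after the original entry.
events = [
    (50.0, "A", "neutral-zone possession"),
    (100.0, "A", "offensive-zone possession"),
    (108.0, "A", "shot"),
    (112.0, "B", "lob to neutral zone"),
]
goals = [(132.0, "A")]

for label, result in outcomes_within_window(events, goals):
    print(f"{label}: {result}")
```

Every event inside the window stays "on the board" and gets credited with the goal, which is the point: the zone entry counts even though three other things happened before the puck went in. The neutral-zone possession at 50 seconds is more than 40 seconds before the goal, so it gets nothing.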

That's called a Semi-Markov process. Which would make Roger Neilson's stuff from 20 years ago Benjamin Button Semi-Markov (BBSM), where he was looking at the scoring chances and working back through time. The acronym would be better if Brad Pitt's character had been named Benjamin Dutton, but you can't have everything. In fact I would guess that there are only a couple of degrees of separation from Neilson to the person who advised this author on the methodology (the Harvard coach or video coach, maybe? Or an alumnus from the professional coaching fraternity?). In any case, by and large he's looking at the right things here. Terrific stuff.

A strict Markov chain is different in that it makes one event alter the nature of the future events, disregarding the events that preceded it. Such as when the score in a hockey game changes from tied to a one-goal lead in the early third period ... conventional coaching strategy causes the expected number of total scoring chances to drop from that point forward, and the probability of the trailing team outchancing increases. The score from the first period doesn't affect this a fig, that's the distinction. Or at least that's the way I understand the Markov and Semi-Markov chains. The way of thinking is more important than its name, in any case.
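To make the "only the current state matters" part concrete, here is a toy Markov chain in Python. The three score states and every transition probability are invented for illustration (they are not from the paper or from NHL data), and the function names are my own.

```python
# Hypothetical per-minute transition probabilities between score states,
# from one team's point of view. The numbers are made up.
TRANSITIONS = {
    "tied": {"tied": 0.96, "leading": 0.02, "trailing": 0.02},
    "leading": {"tied": 0.03, "leading": 0.97, "trailing": 0.00},
    "trailing": {"tied": 0.03, "leading": 0.00, "trailing": 0.97},
}


def step(dist, transitions=TRANSITIONS):
    """Advance a probability distribution over states by one step."""
    new = {state: 0.0 for state in transitions}
    for state, p in dist.items():
        for nxt, q in transitions[state].items():
            new[nxt] += p * q
    return new


def after(start_state, n_steps):
    """Distribution over states n_steps after being in start_state."""
    dist = {state: 0.0 for state in TRANSITIONS}
    dist[start_state] = 1.0
    for _ in range(n_steps):
        dist = step(dist)
    return dist


# However the game arrived at "leading" (blown lead, comeback, whatever),
# the outlook from here is the same: the current state is the only input.
outlook = after("leading", 10)
for state, p in outlook.items():
    print(f"{state}: {p:.3f}")
```

Note that `after` takes no history argument at all. That is the whole distinction in code form: nothing about the first period can get into the calculation.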

There is much to be gleaned from this, but the big apple near the bottom of the tree is faceoff data, so I'll grab just it for now. Starting in your own end is a killer; even if you win the draw (much better than losing it, obviously) the prognosis isn't good for the rest of the shift. We all know that, I think, but this puts a 'how' with a 'how much'. And the 'how much' is spot on the same for the NHL, or at least for the first half of 07/08, which is all that I checked it against.

Interesting to note also that he acknowledges George Lindsey, an amateur baseball stats pioneer in the 1960s, for his advice. Lindsey's work in baseball is rudimentary by comparison, though it makes me wonder if he would have gone this route with his analysis if he had stuck with his baseball stats hobby. It would have been a monumental task if he had taken it on; baseball is complex.

12 Comments:

Blogger Lowetide said...

Exceptional stuff. Thanks Vic. I wonder if we can put some kind of governor on own-zone faceoffs to level the playing field? I mean this in the context of creating a number (or series of numbers) that can have some kind of meaning across seasons.

Or are we already there with GF/GA ON 5x5 with two columns (one for own zone and another for offensive zone)?

5/15/2009 10:31 am  
Blogger andy grabia said...

Thanks, Vic. This is super duper boner-riffic!!!

5/15/2009 11:26 am  
Blogger Vic Ferrari said...

Matt did something similar earlier in the year at BofA, entirely fairly I think.

So that's certainly doable, and would surely impress a lot of people, but I'm not convinced of the value. A 5v5 shift starting in your own end results, on average, in two extra on-ice goals against for every goal-for. Same in the ECAC, as you've just read. Tonnes of other good stuff in there as well, of course.

And the shots metrics follow with that, though a bit better save% behind you in your own end and more blocked and missed shots. Players are in position after all, for all the badness associated with starting in your own end of the rink ... breakaways and odd man rushes against, they aren't on the list.

I've been reading a lot of baseball stuff recently. And for me it's really driven home the danger of chopping out nuance for the sake of grand, higher correlating, all encompassing stats. Catch-alls like BaseRuns or RunsCreated, OPS, wOBA ... the list seems endless but that sort of thing.

I mean the guy who is sent over the boards for a lot of extra own zone faceoffs (say Horcoff). And should he get the puck going the right way through the neutral zone ... he's also going to be short shifted for one of the many young Oilers who are only good in one end of the rink right now.

Plus, #10 is also the guy who's going to come over the boards to break the pattern of abuse if the Oilers have been penned in their end for a couple of shifts.

So while adding .35 or so to Horcoff's corsi number accounts for the faceoffs ... players with that skill set often do the other two things as mentioned above, so a best fit is going to be close to double that on average, more in the case of Horcoff, Zetterberg, Legwand, etc.

And we're still undervaluing Horcoff and flattering a guy like Malhotra, who has had a fairly defined role with own zone faceoffs over the last couple of years in Columbus. But Hitchcock has a deeper forward roster, so he doesn't get the tough icetime through on-the-fly changes or most other situations.

If Bill James has taught us anything, it's that magic bullets sell like hotcakes, and that there are no magic bullets. And I doubt he feels conflicted.

5/15/2009 11:44 am  
Blogger sunnymehta.com said...

Stats like wOBA are accurate in explaining the contribution of a player's results. Thing is, they don't really tell you shit about a player's skill set and therefore don't have much predictive value in and of themselves.

The nuts and bolts behind, say, a batter's skills are in the stats like BB rate, K rate, various power metrics (ISO, HR/FB%), etc. After you look at those and maybe check out nuances like GB/FB/LD profile (to get an idea of expected BABIP etc), there ain't much left to stand on. You have a pretty good idea of what type of batter you're dealing with, and what range of results you might expect.

Accordingly, I commend someone like Alan Ryder who attempts to accurately quantify NHL results with systems like his Player Contribution. But I've always equated that type of analysis to wOBA and never really saw the predictive value in it. Imo we still haven't locked in on the skill set part yet. Guys like Vic are heading in the right direction just by constantly being so aware of predictive value and knowing how to test it.

From what I've gathered through all of our recent discussions, the important skill points seem to be:

controlling territory, and using territorial advantage to create scoring chances.

To me, finding metrics to evaluate how skaters contribute to the above two points are going to be our BB rate and K rate. Our nuances will be things like limiting scoring chances when being dominated territorially (i.e. "shot quality"), and adding a little boost to goal probability within a scoring chance (i.e. "finishing").

And of course, for goalies, since they have little to no say in the controlling of territory, their main function is to limit the probability of scoring chances, when they occur, becoming goals. I'd assume that save percentage will have a lot of say in that, but who knows, maybe there are other nuances (positioning, stick handling, etc).

5/16/2009 1:13 pm  
Blogger Showerhead said...

What a great piece! If you find more work of this quality, please by all means post it as well. (Obviously, I suppose!)

A lot of interesting points, but here's what my focus is on:

1) "At 40 seconds, the scoring rates beginning in each state are nearly identical".

Who was it that used to do a lot of work analyzing events as many as a few shifts before a goal? I seem to remember it as Roger Neilson but either way, I wonder what he'd think of this report.

To me, 40 seconds is a fascinating number - probably very close to the average NHL shift length, though I don't know about the ECAC. But really, I'm equally likely to score 40+ seconds from now regardless of where the puck is and who has it? Kind of makes not getting scored on in the short term when maybe you should be scored against (I'm thinking posts or unbelievable saves/misses) seem like a very big deal. Obviously he hasn't taken a sample of scoring rates in situations so specific as "guy hits post on shot from the slot" but to me any action such as a post, save, miss, or whistle that effectively "re-sets" scoring probability from very high (chance in slot) back to what is typical of game state (one or another team has the puck in your defensive zone) is a very big deal. To me, this speaks to the fact that luck is a big part of the game.

Also, how does the 40 second number feel at first glance to the rest of you? My first thought was something like "wow, hockey is a fast game" but I'm curious to what other reactions are.

And lastly, I wonder what this does to the "YOU ALWAYS SEE GOALS AT ONE END AFTER CHANCES AT THE OTHER" idea of hockey commentating.

2) "The scoring probability for “retreat” is higher than its counterpart, “pursuit”, and the scoring probability for “dumpin”, offensive non-possession, is higher than that for defensive possession. However a team employing the dump-in strategy has a greater probability of scoring, and a lesser probability of being scored upon, than a team retreating."

What I understand from this is that it's very important to have the puck, but even more important for the puck to be in the good part of the ice. A large part of my new respect for Mike Babcock is how many times he's said "we need to spend more time with the puck in their zone" or something to that effect. It's simple, it's fucking true, and it doesn't lend credence to the "Detroit has built a unique style called puck possession" idea. If you have good players, they'll spend time with the puck in the other team's zone. And given enough time, they will score. What else is there?

5/22/2009 1:11 pm  
Blogger Showerhead said...

3A) Comparing “dump and chase” vs. “carry in” strategies. The former yields a defensive bonus, the latter an offensive bonus; however, neither strategy is preferable in terms of total scoring advantage.

I think this statement is important but should be taken in context. There are game states more specific than the scope of this article that DO make dumping or "retreating" a more desirable strategy. Zetterberg at the beginning of shift? Probably best to retreat. Abdelkader at the end of his? Dump it in and get off.

While that example was specific to player quality, there are examples specific to player type as well. In the long run, a center with two Ryan Smyths on the wing probably has more success playing dump and chase than if he has two Ales Hemskys.

So while the scope of the quoted statement does need some qualifiers, I wonder if it might be put to some good use in other ways. Is "dump and chase" vs "carry it in" quantifiable binomially from team to team or coach to coach? Does this correlate with "button down" vs "run and gun" styles? It seems like the +/- of each is likely to be the same where one style would be higher event than the other. I'm sure the NHL isn't as black and white as this but I wonder if frequency of dump and chase vs carrying the puck in says anything about coaching style, as opposed to being a reflection of what type and quality of player the team happens to have.

3B) "Comparing “clear” vs. “press” strategies while on defence. Neither strategy is generally superior to the other, but “clear” is better defensively while “press” is better offensively." It seems here we have another statement to which context needs to be applied. Jason Smith better chip it out but Joni Pitkanen, maybe not. Also, who are your wingers? How long have you been stuck in your zone? etc etc, but I don't think anyone reading this site needs to be convinced of contextual importance.

Anyhow, overall, I wish there were more comments to this post! It was a simply excellent paper and probably fairly accessible too, despite its academic tone. Thoughts?

5/22/2009 1:12 pm  
Blogger Showerhead said...

And finally:

"If Bill James has taught us anything, it's that magic bullets sell like hotcakes, and that there are no magic bullets. And I doubt he feels conflicted."

Well. Fucking. Said.

5/22/2009 1:18 pm  
Blogger Bruce said...

This comment has been removed by the author.

5/22/2009 3:32 pm  
Blogger Bruce said...

"I wish there were more comments to this post!"

OK, since you asked, Showerhead ... an IOF lurker emerges.

Thanks to Vic for the link, that was a most interesting, if dry, read.

"how does the 40 second number feel at first glance to the rest of you?"

In general I'd say "sure", that a close play at one end is more likely to result in a goal at that end in 5 or 15 seconds and not too likely to impact after 40. This data appears to support that rather fundamental observation. As you say, hockey is a fast game.

I would however be interested to see further information as to game states, specifically score. i.e. if Team A is trailing in the third period they are more likely to be generating the next shot/scoring chance -- although not necessarily the next goal which is the focus of this fine paper.

"A 5v5 shift starting in your own end results, on average, in two extra on-ice goals against for every goal-for."

I think I'm following you, Vic, though I might word it differently. Certainly the ratio appears to be 2:1 against for D-zone draws, but for a given shift the numbers would be tiny fractions of a goal.

Using the figures on Page 14 of goals within 40 seconds from a draw in Team A's defensive zone, and assuming a 50/50 split of the actual draws, I derive:

Team A expected goals
(.0123 + .0149)/2 = .0136

Team B expected goals
(.0191 + .0378)/2 = .0285

Seems a little smaller than I would have guessed. Crudely stated, I would expect the percentage of ES goals after all draws to be the number of ES goals per game divided by the number of ES faceoffs per game, with a significant downward adjustment for that percentage of ES goals scored after more than 40 seconds of continuous action, and a smaller upwards adjustment for the fact the faceoff in one end does slightly favour an offensive outcome. I still would have guesstimated something north of 5% of any ES faceoff winding up in one net or the other. I have no trouble accepting the 2:1 ratio for goals off faceoffs in the attacking zone, but I would expect more goals period than that table suggests.

Perhaps I am misinterpreting those numbers, and the expected goals are listed as a percent of ALL draws, not just those won or those lost as the case may be on a given side of the table. In which case I should dump the /2 from my equations, the expected goals double to 5.7% for Team B and 2.7% for Team A. If that second interpretation is correct, the ZoneStart factor is around -.03 per draw. The effect on a guy like Shawn Horcoff with his 156 additional defensive zone draws, would be an expected ZoneStart effect of -5. Seems intuitively about right.
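For anyone who wants to check the arithmetic, here is the same calculation as a short Python sketch. The four rates are the Page 14 figures quoted above and the 156 extra defensive-zone draws is the Horcoff number from this comment; the variable names are mine, and which of the two readings the paper actually intends is still an open question.

```python
# Goals within 40 seconds of a draw in Team A's defensive zone,
# as quoted above from Page 14 of the paper.
team_a_rates = (0.0123, 0.0149)
team_b_rates = (0.0191, 0.0378)

# Reading 1: each rate is conditional on who won the draw, so with an
# assumed 50/50 split the per-draw expectation is the average of the two.
a_avg = sum(team_a_rates) / 2
b_avg = sum(team_b_rates) / 2

# Reading 2: each rate is already a share of ALL draws, so the per-draw
# expectation is the sum of the two.
a_sum = sum(team_a_rates)
b_sum = sum(team_b_rates)

# Under reading 2: goal differential per defensive-zone draw, then scaled
# by the number of extra defensive-zone draws a player takes.
zone_start_factor = a_sum - b_sum
extra_dzone_draws = 156
zone_start_effect = extra_dzone_draws * zone_start_factor

print(f"reading 1: A {a_avg:.4f}, B {b_avg:.4f}, ratio {b_avg / a_avg:.2f}:1")
print(f"reading 2: A {a_sum:.4f}, B {b_sum:.4f}")
print(f"ZoneStart factor per draw: {zone_start_factor:.4f}")
print(f"effect over {extra_dzone_draws} draws: {zone_start_effect:.2f}")
```

The ratio comes out a shade over 2:1 either way, since doubling both sides doesn't change it. The per-draw factor is just under -.03, and the 156-draw effect lands between -4 and -5 before rounding.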

Vic, I wonder if your data stripping capability enables you to comment on the duration distribution of the continuation of play prior to goals. What percentage of the time is it longer than 40 seconds?

"the important skill points seem to be: controlling territory, and using territorial advantage to create scoring chances."

Sunny: Agreed, but you could just as easily be describing a chess match. Is that really what hockey has become?

That said, there are different ways of controlling territory: different spaces on the chess board have different values. A defensive team playing a certain system may tend to collapse down low, giving up outside territory, and shots, in the interest of closing down the slot and the higher quality shot. Another team may place a higher emphasis on preventing entry to the zone, play more aggressively to the puck, etc. Thus Corsi, while probably the best we've got, remains an imperfect indicator of attack zone time, possession time, and (quality of) scoring opportunities. It tells us quite a bit, but it's no magic bullet. :)

5/22/2009 3:53 pm  
Blogger Brian said...

How much of the result on carry-ins vs. dump-ins is merely a reflection of the fact that more offensively talented players tend to carry the puck in?

Moreover, this would be at least partially determined by the score. Teams protecting a lead will frequently dump the puck in without any real intention of retrieving it, but instead use the opportunity to set up a defensive formation, lessening both goals for and goals against. I have to think that controlling for dump-ins which are earnest attempts to gain offensive-zone possession would lessen the difference. The same issue exists with clearing vs. pressing. Teams that are trailing will frequently try to carry the puck out of the defensive zone in situations where, normally, they would settle for clearing it. Since this, obviously, will result in more turnovers and more goals against, the figures will be skewed.

Contrary to Vic, my assumption is that a team "staying within their roles and abilities" will, all things being equal, see a better +/- when they carry the puck out rather than clear. However, game situations can necessitate that players operate outside of their normal roles.

5/23/2009 1:29 pm  
Blogger R O said...

"Teams protecting a lead will frequently dump the puck in without any real intention of retrieving it, but instead use the opportunity to set up a defensive formation, lessening both goals for and goals against."

That seems like the author's point though. Dump-ins result in less GF and less GA but the decrease in both is the same, resulting in the same +/- as a carry-in.

5/23/2009 9:30 pm  
Blogger Brian said...

"That seems like the author's point though. Dump-ins result in less GF and less GA but the decrease in both is the same, resulting in the same +/- as a carry-in."

Well, yeah. I’m just saying that all dump-ins are not created equal, and determining the +/- of dump-ins in general does not necessarily provide a solid guide for using the strategy in specific instances. If the study had been restricted to dump-ins accompanied by an aggressive forecheck, it would surely have found less of a decrease in goals for and against. Perhaps the +/- of aggressive dump-ins and passive ones would turn out to be the same, but I see no reason to assume as much.

In the interest of objectivity, the author has refrained from making evaluations on the intent of a team in performing actions, sticking to things that are plain to see and largely beyond debate. However, I think that sort of subjective interpretation is needed for the findings to be truly useful.

5/24/2009 2:17 pm  
