Tuesday, May 05, 2009

Scoring Chances: Part IV of Many. Anomaly

If you select 38 games from the Oilers season, and then you compare them to 38 other games from the same season, patterns should emerge.

We expect things that are driven by ability to repeat, and correlation is a convenient way to compare the two sets of numbers. In baseball, this simple repeatability correlation meshes really well with the more complex solutions to the problem of separating luck from ability. And the same should hold true for hockey or any other sport.

If you do this with on-ice EVsave% (using shots directed at net instead of shots on net here, for convenience), there is negligible correlation: r = 0.02. Nothing there.

So we would expect that the number of scoring chances against, per 100 shots directed at your net ... we'd expect that to not repeat at all. But it does a bit: r = 0.35. Go figure. It's not a big number but it's persistent over 1000 simulations, and that is a lot.

And we would expect that the number of goals against, per 100 scoring chances against ... we'd expect that to not repeat at all. But it does a bit: r = -0.24. Go figure. It's not a big number but it's persistent. And it's negative. Meaning that if you were giving up more goals per scoring chance in one random half of the season (implying that you're giving up higher quality chances) it is likely that the pendulum will swing the other way for you in the other random selection of games. i.e. if a player is giving up more goals per 100 scoring chances in 38 randomly selected games, then there is a better than 50/50 chance that he will give up fewer goals per 100 scoring chances in the other 38 randomly selected games. That doesn't make any sense at all to me.

Use smaller game sets (20 random games vs 20 different random games) and the numbers get smaller but they still persist. r=0.13 for chances per corsi, and r=-0.13 for goals per scoring chance.
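The split-half comparison described above can be sketched roughly as follows. This is not the script behind the numbers in this post. It is a minimal illustration that assumes a hypothetical `games` list, one dict per game, mapping each player to a `(numerator, denominator)` pair such as (scoring chances against, shots directed at net against). The function names are made up for the example.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation; 0.0 if there is too little data or no spread."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def rate(games, idx, player):
    """Events per 100 opportunities for one player over the chosen games."""
    num = sum(games[i].get(player, (0, 0))[0] for i in idx)
    den = sum(games[i].get(player, (0, 0))[1] for i in idx)
    return 100.0 * num / den if den else None

def split_half_r(games, half_size, rng):
    """Draw two disjoint random sets of games and correlate player rates."""
    picked = rng.sample(range(len(games)), 2 * half_size)
    a, b = picked[:half_size], picked[half_size:]
    players = sorted({p for g in games for p in g})
    pairs = [(rate(games, a, p), rate(games, b, p)) for p in players]
    # Drop players with no opportunities in one of the halves.
    pairs = [(x, y) for x, y in pairs if x is not None and y is not None]
    return pearson([x for x, _ in pairs], [y for _, y in pairs])

def mean_split_half_r(games, half_size=38, trials=1000, seed=0):
    """Average the split-half correlation over many random splits."""
    rng = random.Random(seed)
    return sum(split_half_r(games, half_size, rng) for _ in range(trials)) / trials
```

Calling `mean_split_half_r(games, half_size=20)` would correspond to the 20-versus-20 variant. Averaging r over the trials is one choice among several; a median or a histogram of the per-split values would show the same thing.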

Maybe I've just made a mistake in my scripting, but I don't think so. And it probably doesn't matter much to end results. Fine brushstrokes I know. Still, there may be some truth about hockey, something that isn't obvious to the eye, living in this phenomenon.

I'm lost for an explanation.

I suspect that I am about to go on a posting hiatus; hopefully others will pick up the scoring chance data from Scott and test some theories.


Blogger Vic Ferrari said...

I think I have it figured. The Oilers were blown out 3 games this year, BUF, CHI and DET. And in those games a whack of the goals were scored at evens, and on relatively few scoring chances. The opponents just made their shots on the night.

Anyhow, because of the way I did it (two populations) there had to be either 2 (75% of the time) or 3 (25% of the time) of those games on one side of the ledger, and 1 or 0 of them on the other side, respectively.
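As a quick check on that arithmetic: if each blowout lands on either side like a coin flip, all three end up together 2 x (1/2)^3 = 25% of the time, which matches the figures above. With two fixed halves of 38 the number shifts slightly. The sketch below assumes a pool of exactly 76 games and that all three blowouts are among them; both are assumptions for illustration, not details confirmed by the data.

```python
from math import comb

def p_same_half(n_games=76, half=38, k=3):
    """Chance that k particular games all end up in the same half."""
    # Ways to fill one half so that it contains all k games, over all ways to fill it.
    one_side = comb(n_games - k, half - k) / comb(n_games, half)
    return 2 * one_side  # either half can be the unlucky one

p30 = p_same_half()
print(f"3-0 split: {p30:.2f}, 2-1 split: {1 - p30:.2f}")
# prints "3-0 split: 0.24, 2-1 split: 0.76"
```

Either way the conclusion is the same: the three games can never be split evenly, so one side always carries more of them.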

So those samples had roughly the same number of scoring chances against, but got a big bump in goals against from the blowouts.

Filter out those three games and there is nothing left in it. The goals/chance and goals/corsi don't repeat worth a damn.

The other graph isn't affected by the same phenomenon. The redline (relationship of scoring chances to goals) actually grows a bit stronger, goes up a couple of line widths. But it's not enough to bother changing it.

Carry on.

5/05/2009 7:16 pm  
Blogger PDO said...

"And in those games a whack of the goals were scored at evens, and on relatively few scoring chances. The opponents just made their shots on the night."

I was called an insane MacT apologist when I pointed this out to many...

Great work here Vic.

5/05/2009 9:47 pm  
Blogger Scott said...

Out of curiosity, does the data get even closer together if you take out the Oiler blowout wins as well? The games at Colorado, at Columbus and vs Montreal come to mind. Do you think it should? Or should these things work themselves out with other games in which teams just don't make their shots?

5/06/2009 12:09 pm  
Blogger Vic Ferrari said...

I'm just looking at chances against here, Scott, so they aren't influencing anything.

As far as affecting the relationship of goals per scoring chance and goals per corsi ... blowouts (for or against) don't make a material difference. The graph below in "possession is everything" changes negligibly with these games removed.

But if you make two random sample subsets (over and over) and compare them, then you get the problem of three games influencing one subset heavily. Especially when all three blowouts happen to fall into the same subset.

That's why there was a persistent negative relationship with goals per scoring chance. That can't happen naturally in the world. Not unless all of our futures are predetermined AND God has a sense of irony.

5/06/2009 12:26 pm  
Blogger Scott said...

Thanks Vic. That makes sense. I hadn't read closely enough (obviously) but I still do wonder about the anti-blowout effect. Maybe the game Roloson stole in San Jose or shutouts or similar. Shouldn't those games even out the blowouts? It will still depend on where they fall in the random samples of course and I do understand that the odd number skews things but I would think the blowouts are just a run of luck that should even out through the rest of the sample.

If things don't really even out would we see that in other things? So, for example, goal differential being more predictive of results if games with a differential of 4 or more are excluded?

5/06/2009 6:21 pm  
Blogger Vic Ferrari said...

It's like rolling a die. If we look at stretches of "how many sixes can you roll in 20 tries", then the sixes are relatively rare, relative to the dice rolls. Five to one. And if I made you answer a trivia question before permitting you to roll, then bigger still.

You will have some terrific stretches of sixes in there, stuff happens. But it will be there in the right measure, in number and streakiness. And you will have had some bitchin' stretches, and some nightmarish ones, randomness be randomness, after all.

Now if I gather your results, and 10 or 20 of your friends doing the same, into a spreadsheet or array, and randomly group them into two bunches, well then if I'm looking at anything using goals as the numerator ... the ratios are going to be too high for one group, and by definition, lower for the other. Creating the illusion that dice gods:
1. Exist.
2. Punish previous success.
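The "by definition" part is easy to see in a toy version of the dice experiment. This is only a sketch with made-up parameters (20 players, 20 rolls each, an arbitrary seed): once the results are fixed and split into two equal groups, the two group rates must sit on opposite sides of the pooled rate.

```python
import random

def split_demo(n_players=20, rolls=20, seed=1):
    """Roll dice for everyone, then split the fixed results into two equal groups."""
    rng = random.Random(seed)
    # Number of sixes each player rolled.
    sixes = [sum(rng.randint(1, 6) == 6 for _ in range(rolls))
             for _ in range(n_players)]
    order = list(range(n_players))
    rng.shuffle(order)
    half = n_players // 2
    group_a, group_b = order[:half], order[half:]
    pooled = sum(sixes) / (n_players * rolls)
    rate_a = sum(sixes[i] for i in group_a) / (half * rolls)
    rate_b = sum(sixes[i] for i in group_b) / ((n_players - half) * rolls)
    return pooled, rate_a, rate_b

pooled, rate_a, rate_b = split_demo()
# With equal group sizes the two deviations cancel exactly:
# (rate_a - pooled) == -(rate_b - pooled), up to floating point error.
print(pooled, rate_a - pooled, rate_b - pooled)
```

Nothing about the dice changed between the groups; the opposite signs come purely from splitting one fixed set of results in two.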

Really, my original model wasn't very smart. If I was more clever I would have seen this coming. But there ya go.

On the general topic of the diminishing value of goals in a blowout ... your intuition is right. In hockey for certain and, according to Jeff Sagarin, in all sports.

5/06/2009 6:54 pm  
Blogger Scott said...

Thanks Vic. The dice/trivia explanations always help me to sort things through, probably because I'm getting used to them.

5/07/2009 8:51 am  
Blogger Showerhead said...

Holy hell Vic, you've been on an absolute tear lately - great news and great reading for any mathematically thinking hockey fan. I can't imagine how different my perspective of the game would be had I never stumbled across good ol' HFBoards in the lockout season. This goes as a compliment to you as well as the usual suspects who always get thanked in praising posts like this one but also to the especially thoughtful people who have been replying in these last few threads. Everyone has added, questioned, or clarified in some way.

It would appear that I just wrote the Oilogosphere equivalent of "All four lines are firing". Help me God.

Anyhow, one last piece of praise, a quick quote and then a quick question. I think my favourite hockey articles and discussions on the web are the ones I have the least to add to. A slightly ironic point of view for a blogger to take but hey. I don't mind quietly reading and learning - life is about balance and I'm loud enough in other places.

"It was one of those rare smiles with a quality of eternal reassurance in it, that you may come across four or five times in life. It faced - or seemed to face - the whole external world for an instant, and then concentrated on you with an irresistible prejudice in your favor. It understood you just so far as you wanted to be understood, believed in you as you would like to believe in yourself and assured you that it had precisely the same impression of you that, at your best, you hoped to convey.

Precisely at that point it vanished."
-from The Great Gatsby by F. Scott Fitzgerald. This blog may or may not have anything to do with smiling but Vic, you tend to disappear for a while after these great bursts of posting and so I was reminded of one of my favourite paragraphs in that book.

And FINALLY a question about hockey - how much do we prospectively have to gain from an actual time count of possession? Ever since I first played NHL 94 as a pup I have sworn by time of possession as the most important stat and I'm thinking that with PVR's and the ability to send large files quickly via the Al Gore we can't be too far away from real numbers. I'd even go so far as to volunteer for the stopwatch role if ever the need (and .AVI's) arose.

5/07/2009 10:29 am  
Blogger Vic Ferrari said...

Wow, thanks.

I hope you, and other guys too, pick up Dennis' scoring chance data from Scott and have a go. Test a theory, crush a myth, whatever strikes you.

On the zone time thing, I'm sure that the NHL still records it, they just don't publish it. I remember Moores quoting it a couple of times during the season. Apparently the Oilers had crazy good 5v5 zone time in the first two periods of the game against the Habs in Montreal. Then again, Billy only ever gives us the MFDA (most flattering data available) though. If the scoring chances had outperformed zone time then ... that's what we would have gotten from him I'm sure.

What we're really looking for is territorial advantage, meaningful possession. And the shots directed at net thing seems to work awfully well, and also meshes with zone time (overall including special teams) very strongly in the years that they recorded it, or at least the one I checked. The post is on here somewhere.

Scoring chances are the bomb. And if other people really do start tracking them for other teams next year ... we'll really move this thing forward.

As a 'hard count' thing ... knowing how many offsides happened for each team in a game would enhance the corsi metric. I think that would be really valuable. Again, it would be easier if the NHL just published it (they note offsides in the PBP but don't tell us which team. They also record the position on the ice of every faceoff, but they publish only the zone) ... if they decide to give us either of those then we can get that info without actual work, which is always favourite.

Failing that, we could tally them at the same time as the scoring chances I suppose.

5/07/2009 11:32 am  
