Monday, March 24, 2008

He Was Fuckin' Lucky


That was the line that changed the way that I looked at the game, almost exactly twenty years ago. Anyone can blow smoke out of their ass and act confident, but when someone pays you a whack of the cut (for skills you only pretended you had) and has a staggering amount of money on the line every night, well these are people whose opinion carries merit, whether they respect you or not.

I think, or hope, that I have a long history of not being an "I told you so" guy. Really it's just a matter of betting with the odds in any case. So long as you're more likely to win than lose, above the hold, it's a good bet.

So when I wrote last month that ... "I was going to write a bit on Mike Richards' fluky season, but there is a rookie defenceman on the Thrashers named Enstrom, and he clearly has made a deal with the devil. Both bubbles that just have to burst eventually."

Combined, they were EV+/- +15 to that point. In the six weeks since then they have been EV+/- -15. Neither is getting the love from TSN anymore.

Now nobody in the world could have pegged that exactly, but sensible people everywhere would have sensed that when the puck is in the wrong end that much when you are on the ice, as evidenced by where the faceoffs ended and where the shots-directed-at-net happen, well that will end badly eventually.

They were fuckin' lucky. Simple as that.

You could do the same with a bunch of stuff that Tyler or slipper have written lately, they bet with the odds. Or, alternatively, if you are feeling like you have become their internet bitches, then you could search out the times when the cookie crumbled the other way, and berate them in a sad way of making yourself feel more important. More power to you, btw. It's the same to me either way.

.

Causation, a mathy aside: This has been bastardized recently on the blogosphere. We could look at EV+/-, and believe what Roger Neilson told us, that it's the principal driver of wins, and that's good. But when we have spotty college kids blindly applying principles of causation from correlation ... God help us all, it makes me want to shove a knife in my eye. For Christmas sakes, EV+/- is defined as EVshooting% * EVshotsFor - (1 - EVsave%) * EVshotsAgainst (all while-you-were-on-the-ice numbers) ... the sum total of causation from these actual elements of the damn equation is around 25%.
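A rough sketch of the arithmetic behind that definition: goals for are on-ice shooting percentage times shots for, and goals against are the unsaved share of shots against. The function name and the sample numbers are illustrative assumptions, not figures for any real player.

```python
def ev_plus_minus(shooting_pct, shots_for, save_pct, shots_against):
    """On-ice EV+/- rebuilt from its four even-strength, on-ice components."""
    goals_for = shooting_pct * shots_for
    goals_against = (1.0 - save_pct) * shots_against
    return goals_for - goals_against


# Hypothetical player: identical shot counts, only the percentages differ.
hot = ev_plus_minus(0.11, 400, 0.93, 400)      # percentages running hot
average = ev_plus_minus(0.08, 400, 0.92, 400)  # percentages that cancel out
```

With the shot counts held fixed, the whole gap between the two results comes from the percentages, which is the part of the equation most exposed to the bounces.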

Where does the other 75% come from?

I'll cut off the former goalies right here by saying that I can't actually disprove your argument that it comes from a parallel universe where everyone thinks like Glenn Healy and acts like Brian Burke, and that this information flows to us through the consciousness of Nick Kypreos.




The fact is, as the delightful graphics show: Overwhelmingly, the ability to control possession predicts future success at EV+/-. Over any given week or two, the shit-happens numbers, the bounces, they tell the tale. In the long run "playing hockey well" actually predicts future success, who'da thunk it?
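To put a rough number on how loud the bounces are over a short stretch, here is a minimal sketch that treats every shot as an independent coin flip with the same scoring probability. That is a deliberate simplification, and the 8% figure and shot counts are hypothetical.

```python
from math import sqrt


def shooting_pct_sd(true_pct, shots):
    """Standard deviation of the observed shooting% over `shots` independent shots."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    return sqrt(true_pct * (1.0 - true_pct) / shots)


# A hypothetical 8% true on-ice shooting percentage.
short_stretch = shooting_pct_sd(0.08, 100)   # roughly 0.027, about 2.7 points
long_stretch = shooting_pct_sd(0.08, 2000)   # roughly 0.006, about 0.6 points
```

Under that assumption the noise shrinks with the square root of the shot count, so a short run of games can sit a long way from the underlying rate while a long one cannot.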

Some of you are probably thinking that this should be too obvious to state. But it had to be said.

20 Comments:

Blogger PDO said...

Too obvious to be said.

But.

Are there any sites yet, that I don't know of, where I can easily find the EV SV% or SV S% numbers? They're ones that always interest me, that I can never find, creating a bit of a problem....

The biggest reason for this?

Mike Ribeiro has pretty much stopped shooting the puck altogether, allowing his S% to stay nearly double his career average....

But the guy is still picking up assists at an ungodly rate, to the point that he's still 12th in scoring, and he will in all likelihood finish a PPG player. Looking around at his linemates, I don't see anything in particular that screams he's getting lucky here either....

3/24/2008 4:51 pm  
Blogger PDO said...

Sorry, that should say "EV SV% or EV S%."

3/24/2008 4:52 pm  
Blogger YKOil said...

Thank you for that Vic Ferrari. Thank you for that.

Great post.

But... ummm... put the knife away. We need you out here in the wilderness.

3/24/2008 10:43 pm  
Blogger Showerhead said...

Evolution, an inquisitive aside:* when you started upon using offensive minus defensive zone draws as a metric of team success, were you...

a) Looking for good indicators of future success and found a stat...

b) Looking at stats and finding a good indicator of future success...

or

c) Looking for a quantitative measure of where the puck is on the ice because it's mostly god damn obvious that possession of the puck in dangerous areas drives results?

*my imitation is absolutely not meant to be mockery: I find the questions you ask and your methods of approaching them to be among the best hockey reads available. I think my appreciation too should be obvious but I don't post much of substance these days and reckon my opinion on many topics isn't especially known.

3/25/2008 1:32 am  
Blogger Vic Ferrari said...

PDO:

The On Ice EVsave% and Shooting% are listed at timeonice.com

www.timeonice.com/teamshots.php?team=DAL&first=20100&last=20350

for example, will give you a bunch of stuff, pretty much the raw number version of behindthenet.ca stuff. The last two columns are what you are after.

That url would be for Dallas, games that number in between 20100 and 20350 from the NHL.

3/25/2008 3:03 pm  
Blogger Vic Ferrari said...

YKoil.

Fair point. I didn't realize that the tone of this post was so mean spirited until I reread it later. :)

Love your 'nothing to see here' blog btw, glad to see you're back to posting in recent weeks. Outstanding stuff.

3/25/2008 3:05 pm  
Blogger Vic Ferrari said...

Mostly C) and a bit of A).

A year or two ago a very intelligent stats blogger did a lot of math and deduced that the zone that a faceoff is taken in is immaterial to scoring. It struck me as madness, so when I saw that the NHL play-by-play sheets started listing every player on the ice for a faceoff, I decided to scrape it off.

Turns out there's even more value to it than I thought there would be. Players who are taking way more than their share of defensive zone draws are almost always struggling to keep their +/-, and their underlying numbers, in the black.

And teams that are good at it invariably have good shots +/-, Fenwick and Corsi as well. I mean some teams are a bit more shot happy than others, but the correlation between the two, at the team level, is still very, very strong.

Just substitute "faceoffs" in place of "shots" in the URL above to get that stuff. Or "xfaceoffs" to see where the shifts are ending.
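For anyone scripting those lookups, a small sketch of a URL builder that follows the pattern quoted above. The helper name, the `http://` scheme and the page names `teamfaceoffs.php` and `teamxfaceoffs.php` are assumptions inferred from the substitution described here, not checked against the site.

```python
from urllib.parse import urlencode

BASE = "http://www.timeonice.com"  # scheme is an assumption


def team_report_url(team, first_game, last_game, report="shots"):
    """Build a team report URL for a range of NHL game numbers.

    report: "shots", "faceoffs" or "xfaceoffs" (page names are inferred).
    """
    if report not in ("shots", "faceoffs", "xfaceoffs"):
        raise ValueError(f"unknown report: {report}")
    query = urlencode({"team": team, "first": first_game, "last": last_game})
    return f"{BASE}/team{report}.php?{query}"


# Dallas, games 20100 through 20350, as in the example above.
dallas_shots = team_report_url("DAL", 20100, 20350)
dallas_faceoffs = team_report_url("DAL", 20100, 20350, report="faceoffs")
```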

3/25/2008 3:13 pm  
Blogger Oilman said...

I'll cut off the former goalies right here by saying that I can't actually disprove your argument that it comes from a parallel universe where everyone thinks like Glenn Healy and acts like Brian Burke, and that this information flows to us through the consciousness of Nick Kypreos

Ah ha!:o)

3/26/2008 9:17 am  
Blogger PerformanceOil said...

Vic,

Missed your last post on LT's blog, but had a hunch you might have posted there after reading this. I'll reply to both together.

Nice work avoiding my points in regards to stats in general (correlations vs. significance, why the four values you list would have to have 100% correlation).

At least you aren't still using significance values as a measure of correlation; that's a step in the right direction.

As far as BtN's Corsi numbers:

First, you didn't mention the specific flaws in the LT post, rather you said they were Fenwick numbers "last time (you) checked." But hey, if you want to be a prick about it because you weren't clear (or maybe were confused?) that's fine. As for using your site, I tried. I got errors when I used Slipper's links. I could go to the home page for the site, but there are no links to the stats there.

Not that it matters, because as far as I can tell, BtN is correct.

Using Rory Fitzpatrick as an example, I went to his game-by-game logs, and compared one to the actual play-by-play file at NHL.com. The events were logged correctly. Next, I took the totals for his game-by-game logs, and manually calculated his rates for GF/GA, SF/SA, MF/MA. Finally, I added (GF+SF+MF)-(GA+SA+MA) to get a corsi number of -10.08. At BtN, his Corsi is listed as -10.2. Allowing for rounding variance, that's close enough for me. Blocked shots aren't even part of the equation (and adding them in will give a number far from 10.2).

So, in conclusion, the Corsi number for All-Star R.F. is correct. I have no reason to doubt any of the other Corsi numbers, so unless you have evidence that there is a problem, I'll just go on trusting BtN, thanks.
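A minimal sketch of the sum described above, with the blocked-attempt terms made optional because whether they belong in the total is exactly what is being disputed in this thread. The function names, the meaning given to the blocked parameters and every number are hypothetical, not Fitzpatrick's actual figures.

```python
def shot_differential(gf, sf, mf, ga, sa, ma, blocked_for=0, blocked_against=0):
    """Shot-attempt differential while a player is on the ice.

    gf/ga: goals, sf/sa: saved shots, mf/ma: missed shots (for / against).
    blocked_for: the player's team's attempts blocked by the opposition.
    blocked_against: opposition attempts blocked by the player's team.
    With both blocked counts left at 0 this is (GF+SF+MF)-(GA+SA+MA).
    """
    return (gf + sf + mf + blocked_for) - (ga + sa + ma + blocked_against)


def per_60(value, minutes):
    """Scale a raw count or differential to a per-60-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return value * 60.0 / minutes


# Made-up season totals over 600 even-strength minutes.
without_blocks = shot_differential(10, 200, 80, 14, 230, 96)
with_blocks = shot_differential(10, 200, 80, 14, 230, 96,
                                blocked_for=70, blocked_against=90)
rate = per_60(without_blocks, 600)
```

Running both variants on the same totals shows how far apart the two numbers can land, which is one way to check which definition a published figure is using.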

Again, using BtN I get the following correlations (with min. games, 10, min min. 5, at the level of the player):
Corsi w/ GD: 0.18
Shot D w/ GD: 0.09
Fenwick w/ GD: 0.17
Blocked shot +/-: 0.07



None of these correlations are very good. I was shocked that Shot Differential correlated so badly, though given the seemingly random nature of shooting %'age, perhaps that makes sense. I would be interested to see what other people get using BtN's numbers, though I am quite certain my methodology is correct.

The only conclusion I can reach here is that all shots/shooters are not equal.

Now, your data:

the sum total of causation from these actual elements of the damn equation is around 25%.

I don't know what you are trying to say here. But, as far as I know, you can't use statistics to show cause, so you are incorrect. It seems you know that based on your rant in the previous paragraph, but I don't know how else to interpret this sentence. If you could clarify, that would be great.


Next, your graphs are interesting, although the fact that the data is reproducible doesn't have anything to do with correlation. One thing I am curious about: how did you calculate these numbers?

I have some problems with the second graph however.

The title is predictors of EV outscoring. This suggests you are trying to find correlations. However, based on your graph, EV +/- only correlates at a level of ~0.2 with EV outscoring. In my mind the two are synonymous, so this obviously makes little sense. Again, clarification would be great.

Anyway, if you wish to discuss this rationally, I will be happy to continue the debate. There may well be a problem with my data and/or methodology, and if you can point it out in a rational way, I am more than willing to be proven wrong. However, thus far you have shown you are capable of little other than condescension and parroting that you are right, simply because the facts are 'obvious'. Assuming that continues, I won't waste any more time debating with someone who is unwilling to reconsider their position.

3/26/2008 1:49 pm  
Blogger Vic Ferrari said...

"So, in conclusion, the Corsi number for All-Star R.F. is correct. I have no reason to doubt any of the other Corsi numbers, so unless you have evidence that there is a problem, I'll just go on trusting BtN, thanks."

Your conclusion is wrong. The error is that he's adding and subtracting the wrong things (or at least was a week or so ago when I checked your post). It's a syntax error, I make enough of them myself that I can't sensibly criticize. Plus Desjardins has added enough to the community that he deserves at least six or seven mistakes before being criticized.

As for you calling me a 'prick' for saying that Desjardins was using Fenwick in lieu of Corsi earlier, that's just wrong. But if you still think it's an elaborate conspiracy to make you look foolish? ... Email Gabe and ask him.

And frankly, how could you have missed it? And why don't those links work for you? They work for me.

3/26/2008 3:24 pm  
Blogger Vic Ferrari said...

To add

Having read the rest of your post, Performance Oil, I admire your balls.

Get the pure and simple data, and show me the best way to predict future results. And I'm not baiting you, I'm looking for better answers, if they are out there.

I mean anyone can look at short term results and conclude that inducing your teammates to shoot at a better % than Lemieux in his prime, and inducing your goaltender to stop the puck at a better clip than Hasek in his prime ... these are the keys to success!

Nobody will argue that, Performance Oil, it happens over and over and over again.

The questions are:
* Were they lucky or good during that stretch?
* ... and how much of each?
* Is it repeatable?

And if your conclusions aren't something that you would let money ride on, enough money that it would hurt if you lost ... then they just aren't good conclusions at all. Just aren't.

3/26/2008 3:55 pm  
Blogger Vic Ferrari said...

PO said:

However, based on your graph, EV +/- only correlates at a level of ~0.2 with EV outscoring. In my mind the two are synonymous, so this obviously makes little sense. Again, clarification would be great.


Sorry I didn't comment on this earlier.

The EV+/- from the first half of the season has a correlation of ~0.2 with the EV+/- in the second half of the season. In fact everything in the first graph is being correlated to the EV+/- for the second half of the season.

In the second graph everything is being correlated to the same stuff in the second half. So the EVsave% from the first half of the season was correlated to the EVsave%s for the second half of the season for the 30 NHL teams. Shots+/- from the first half to shots+/- for the second half, and so on for the rest.

The repeatability number in the second graph is identical to the first graph, of course, by definition. It's redundant, but I wanted to keep the graphs the same width without farting about, so I left it on.
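The two kinds of correlation described above can be sketched as follows: each first-half metric is correlated once against second-half EV+/- and once against its own second-half value. The dictionary layout and function names are assumptions made for illustration, and the lists are expected to hold one value per team in the same team order.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)


def split_half(first_half, second_half, target="ev_plus_minus"):
    """For each first-half metric, correlate it with the second-half target
    (prediction) and with its own second-half value (repeatability)."""
    result = {}
    for metric, values in first_half.items():
        result[metric] = {
            "predicts_target": pearson_r(values, second_half[target]),
            "repeatability": pearson_r(values, second_half[metric]),
        }
    return result
```

For the target metric itself the two numbers coincide, which is the redundancy mentioned above.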

3/27/2008 1:06 am  
Blogger Vic Ferrari said...

edit for above ... I've said "first graph" and "second graph" in reverse order.

3/27/2008 1:08 am  
Blogger Vic Ferrari said...

A quick addendum:

For the individual players (I just took the top 400 skaters in terms of shots-against for the first half, then took the top 300 of them in terms of SA for the second half ... hopefully most of them played similar numbers of games).

It is here: http://timeonice.com/EVPPR.jpg

Smaller sample size so Corsi > Fenwick > Shots+/-.

And of course the faceoff zone thing has a different meaning. From a team's perspective it's showing you how many more times they started from a standstill in the offensive end of the rink. From a player's perspective it's showing you when the coach was sending you out there.

3/27/2008 11:31 am  
Blogger Vic Ferrari said...

One more thing, factoring in offsides would probably help too, if somebody is looking for a small project.

Not that offsides are good. But at least if you are going offside you know that you were either moving forward with the puck or trying to keep the pressure on in the offensive zone. And both of those are good things in the opinions of us all, I'm sure.

I don't think the correlation will be overwhelming or anything, but it's independent enough of Corsi and Faceoff-Zone that if you added it to either it would surely make it a stronger predictor. Just add the offside+/- to either (you get a plus for going offside, and a minus when the opposition goes offside against you).

Or maybe not, just my sense of it. Makes sense, no? Another thing that I'll get around to eventually if nobody else takes it on. Hopefully someone does.
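As a starting point for that small project, a sketch of the adjustment as described: a plus for each offside your side takes, a minus for each one the opposition takes, added straight onto an existing differential. The function names and the sample counts are hypothetical, and the equal weighting of an offside and a shot attempt is an untested assumption.

```python
def offside_plus_minus(offsides_by_team, offsides_by_opponent):
    """Plus for each offside by your side, minus for each by the opposition."""
    return offsides_by_team - offsides_by_opponent


def with_offsides(differential, offsides_by_team, offsides_by_opponent):
    """Add offside+/- to an existing differential such as Corsi or faceoff zone."""
    return differential + offside_plus_minus(offsides_by_team, offsides_by_opponent)


# Made-up example: a Corsi of +12 with 30 offsides for and 25 against.
adjusted = with_offsides(12, 30, 25)
```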

3/27/2008 11:36 am  
Blogger PerformanceOil said...

Vic,

As for you calling me a 'prick'

In your last LT post you said:

Desjardins has an error in his script that reverses the blocked shot counts.

Pop quiz: Do you know why this is important.


and

I actually advised you of the things to look for, and where to get the data. You failed to do either.

and

in an effort to redeem yourself, you could do what I suggested Bruce do. It's not hard.

To me, that's being a prick about it, especially given that I had already done the things you suggest, but still disagree with you.

No matter. To repeat, I checked the data for one player, and everything is fine going from the play-by-play log to the final Corsi number. Sure, there may be other numbers which I didn't check which are incorrect, but I doubt you have double-checked all the numbers for all the players in your database. Furthermore, I say again, his current Corsi numbers (that I have checked) are not factoring in blocked shots, so mixing up BF/BA has no effect on the numbers I am using, except perhaps the Fenwick number.

Again, the numbers seem fine now - if there was a problem before, great. But it seems to be resolved. If you think there is still a problem, give me an example of a player who has error(s) in their numbers.

I mean anyone can look at short term results and conclude that inducing your teammates to shoot at a better % than Lemieux in his prime, and inducing your goaltender to stop the puck at a better clip than Hasek in his prime ... these are the keys to success!

Can you point out where I said I think I have the key to predicting a player's success? All I have said is that Corsi is not a good predictor, and this is true. It doesn't matter why it is so (although understanding that would lead to greater insight I think). If someone gave you a Corsi number, you could not predict with any degree of certainty what the player's +/- would be. You could say that it is more likely to be positive if Corsi is positive, but nothing further. You seem to be stuck on the idea that I am trying to push some alternate theory or something, or saying that possession does not matter. I'm not. All I am doing is saying that based on the data, all shots are not created equally. The data also says that if possession is a predictor of +/- (not sure if it is from the point of view of the data, since I haven't looked), then Corsi probably isn't a good predictor of possession.

From where I am sitting, you seem to develop ideas and then look for evidence to support them. Then, when confronted with data that disagrees, you talk down to people and say that the game is simple and it is just common sense. Unfortunately, if the numbers disagree with you, no amount of gnashing your teeth changes that. Rather than wasting time trying to defend a flawed assumption (Corsi is a good measure of player success), you would be better off trying to explain why Corsi and especially shot +/- are such poor predictors of +/-, which seem counter-intuitive.

3/27/2008 4:33 pm  
Blogger PerformanceOil said...

I meant to cover this as well:

The questions are:
* Were they lucky or good during that stretch?
* ... and how much of each?
* Is it repeatable?


How do you define luck? Something that appears random, isn't necessarily so.

Take Stefan's infamous open net miss. Most people would say that was horrible luck. However, was it simple chance that caused the puck to hop on him? No, it was the poor ice. It is quite possible that if you placed Patrick Stefan in that exact situation 100 times, the exact same outcome would have occurred all 100 times. Likewise, if you put any other player in the exact same circumstance, I think it is likely that some other result would have occurred. That is not chance, that is a deterministic system.

The problem is, we can't do those experiments.

You keep talking about the reproducibility (or lack thereof) of data as having great meaning. It has none. It is a given that the data will not be reproducible, since the system and its inputs will never be identical again, nor even close to identical. The good players will tend to do better over time, and the worse players will tend to do worse. You can try to find measures which indicate a good player, but unless they are able to rise above the apparent chaos of the system (which is huge in hockey, there are simply a vast number of variables which can influence the results of a game to some extent), they have no value, whatever common sense may seem to say.

Corsi may be the single best predictor of a player's +/-, but it is not a good predictor in itself. This suggests that there are either A) Many other variables that influence +/-

or

B) There are only a few other variables, but they are larger factors.

I favour 'A', which you would probably refer to as luck.

3/27/2008 5:08 pm  
Blogger Asiaoil said...

This thread is f'ing hilarious and gets to a point that is obvious (to use Vic's words) to anyone who has taken more than an intro stats course - thank you performanceoil.

The level of analysis used by Vic, MC et al. is simplistic in terms of predictive capability....period. Some interesting descriptive analysis that can help understand what happened in last night's game - but pretty limited in terms of predicting what will happen tomorrow. That's just the way it is with the simple tools applied by Vic and MC.

Instead of accepting this "obvious" truth - MC and Vic et al. usually resort to crude putdowns and mock arrogance when confronted with reality that does not agree with their analysis. Performanceoil is more than just "ballsy" Vic - he is clearly quite capable and you would be wise to have a conversation that doesn't start and end with put-downs. You might actually learn something and stop pretending you can predict next season's results with a few faceoff numbers pulled off NHL.com. A bit of humility in the face of abject failure might also be useful - but it's just so much easier to invoke luck instead of considering that you guys are not quite as smart as you think you are.

Game 80 - Oilers in 8th place - deal with it.

3/30/2008 6:51 am  
Blogger Vic Ferrari said...

Asiaoil said:

The level of analysis used by Vic, MC et al. is simplistic in terms of predictive capability.


That's not right.

Game lines are the benchmark, the frequentists like 'Performance Oil' (who I like, btw) may well make arguments that strum your strings, but you'll lose money hand over fist if you follow that reasoning.

MC79 and Jeff Sagarin are the only mathy guys inside the posts, from what I've seen (and I only started tracking Sagarin's hockey stuff since the Oilers SE conf swing, Sagarin was wrong with the results [dice will roll the way that dice will roll] but right with the odds in terms of scoring chances). Doubly odd for Tyler because lawyers are notoriously terrible bettors, having said that, the hard science guys would be worse, but they never have the stones to wager.

I like you AO, always have. You were a goalie back in the day, no? You've written good stuff and added value from what I remember. You seem to be going a little bit self-righteous/nutty lately though, at least to my eye. Hopefully that passes.

3/30/2008 8:06 pm  
Blogger Vic Ferrari said...

Performance Oil:

Stick around, you're interesting. Though I can't really follow the thread of reasoning in your argument above at a quick read.

As far as "who is more of a jerk", which is off topic, still, I'll concede that you probably have a gentler soul. Bully for you, go build yourself a statue.

And if we apply your reasoning above to a board game involving trivia knowledge and dice rolling with weighted dice ... then before you can say "I've just shown that thinking positively while you roll the dice is the principal driver of results" ... slipper owns your house.

The world is beautiful. It really is.

3/30/2008 8:19 pm  
