Tuesday, January 19, 2010

Getting Used To It Now

From the media scrum after the Oilers' bag skate today:

I remember my first year here skating without the pucks, I was really surprised. But you know this is my fourth fifth year so I wasn't really surprised today.

-Ladislav Smid


That kind of sums up the Lowe/Tambellini era I think. They're used to it now.

Saturday, January 16, 2010

Likelihood and the Way Humans Think

This is at the crux of sport's most heated (and entertaining) online debates. The Internet is good that way, because surely we all have the good sense to avoid arguing about these things in real life. It also separates two types of thinkers, and they will never cross the line to join forces with the folks who have the different mindset, so it is an endless repetition of similar arguments.

Likelihood, and the way we understand it. That's at the core of almost all of it. So I thought I would write a blurb on the subject, as a point of reference for the future.

--------------------------------------------------------

Think of an imaginary hockey player that scores 16 EV goals on 103 EV shots, this during his rookie season. That's a 15.5% clip. How good is he at finishing his chances?

Thinker A says: He's a 15.5% shooter at evens, that is precisely what the evidence tells us. And with finish like that, he should rightly get more powerplay time and EV icetime in his sophomore year, meaning more goals.

Now Thinker A doesn't expect him to shoot EXACTLY 15.5% next year. Hell, a goalpost or two here, a big save there, a soft goal allowed by the opposing goalie here, it can make a hell of a difference ... it's a fluid thing. But 15.5% is the best guess.

A university student, in the natural or social sciences usually, will often bust into the Internet conversation with some math to support his Thinker A cohort. He will use precisely the thinking, and equation, explained here to calculate the likelihood of this rookie getting this wonderful shooting% while only possessing average shooting ability. Then he'll do the same for a player with 24.5% natural ability, and 23.5% natural ability ... and so on and so on. Plot them all out and the likelihood of the player's abilities will be expressed in the red histogram (ignore the clear one for now):



That's the graphic representation of what Thinker A was feeling all along, so he embraces it. And really, it is intellectually honest. He's looking at the player in isolation. He'll also find ways to rationalize it. He's a brilliant shooter, he doesn't waste any, he always gets in close for his chances. Some of that may be true, but in my experience most of these arguments are developed after the fact. I doubt that Thinker A types are compulsive liars; they just embrace the anecdotal evidence that supports the way their brain works, as visualized in the chart above.
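For anyone who wants to rebuild the red histogram, here is a minimal sketch in Python. It assumes a plain binomial model (every shot is an independent weighted coin flip) and a grid of 1%-wide bins from 0.5% to 29.5%; the grid and the names are illustrative choices, not taken from the original chart.

```python
from math import comb

GOALS, SHOTS = 16, 103  # the imaginary rookie's EV line

def binomial_likelihood(goals, shots, ability):
    """Chance of exactly `goals` on `shots` for a shooter whose true rate is `ability`."""
    return comb(shots, goals) * ability**goals * (1 - ability) ** (shots - goals)

# Candidate abilities: bin midpoints 0.5%, 1.5%, ..., 29.5% (an assumed grid).
abilities = [(i + 0.5) / 100 for i in range(30)]

# Likelihood of the rookie's season under each candidate ability,
# rescaled so the columns add up to 1 (Thinker A's red histogram).
raw = [binomial_likelihood(GOALS, SHOTS, p) for p in abilities]
total = sum(raw)
red_histogram = [x / total for x in raw]

# Crude text plot: one row per bin.
for p, weight in zip(abilities, red_histogram):
    print(f"{p:5.1%}  {'#' * round(weight * 200)}")
```

Under these assumptions the tallest column sits at the 15.5% bin, and the 15.5% column is roughly twice the height of the 11.5% column.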

Now Thinker B enters the conversation ... this is always trouble.

He reads the discussion and can't remember the player having these special qualities. Then he looks and sees that the rookie never scored at will in either junior or college.

Then he looks at how shooting% shook out for the forwards in the entire league last season. It's a pretty wide spread of results (about half again as wide as the clear histogram in the plot above).

He also sees that a whack of guys shoot 15-16% at evens every year, but nobody has ever maintained that clip for their career (the clear histogram in the plot above is his sense of how career EV shooting% is shaking out). And that includes some guys with tremendous, proven finishing ability. It's a real oddity for someone to even manage it two years in a row.

That's his feel of the way EV shooting% works in the NHL, with some guys being clearly better than others in the long haul, but wild swings for most guys from year to year, and huge spreads from good to poor on a yearly basis ... he surmises that buddy is due for a fall, that he just had a bit of a lucky season with the goals.

So Thinker B looks at Thinker A's position (the red histogram in the top plot) and says ...

"Okay, I can understand that to an extent, but you're saying that this rookie has 1 in 6 chance of being a 15-16% guy by ability, and a 1 in 12 chance of being a 11-12% guy ... twice as likely to be in the high group, that seems fishy to me."

"In the league, we have a pretty good idea that there are 20 times as many 11-12% forwards than there are 15-16% guys, in terms of long term ability ... shouldn't we take that into account? So instead of our rookie being twice as likely to be in that 'super' pail than the merely 'decent' pail ... isn't he actually 10 times LESS likely to be in the 'super' pail than the merely 'decent' pail?"


This is usually the point where the chasm between the two mindsets becomes obvious and unbridgeable. Name calling generally starts about here as well.
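Thinker B's arithmetic in that exchange is short enough to write out. A small sketch, using only the numbers he quotes (1 in 6, 1 in 12, and 20 to 1); the variable names are made up for illustration.

```python
from fractions import Fraction

# Thinker A's red histogram, as Thinker B reads it back to him.
chance_super = Fraction(1, 6)    # rookie is a 15-16% shooter by ability
chance_decent = Fraction(1, 12)  # rookie is an 11-12% shooter by ability
evidence_ratio = chance_super / chance_decent  # 2: the season alone favours 'super'

# The league: 20 'decent' forwards for every 'super' one.
league_ratio = Fraction(1, 20)

# Weigh the season by how common each kind of shooter is.
combined_ratio = evidence_ratio * league_ratio  # 1/10
print(f"super : decent = {combined_ratio.numerator} : {combined_ratio.denominator}")
```

Twice as likely on the season's evidence, 20 times rarer in the league, so 10 times less likely overall.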

If Thinker B progresses with this thinking and multiplies each red column by the matching clear column (then rescales so the columns add up to 100%), he'll end up with the bold clear histogram in the plot below. That's his likelihood estimation of the player's ability. And it's a hell of a lot different than Thinker A's.
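The column-by-column multiplication can be sketched in a few lines of Python. The red histogram comes straight from the 16-for-103 season under a binomial model. The clear histogram here is an assumption: a bell-shaped stand-in centred on 9%, with a spread picked only so that 11-12% shooters come out about 20 times as common as 15-16% shooters, as in Thinker B's argument. It is not the real league distribution, and the 1% bins are also an illustrative choice.

```python
from math import comb, exp

GOALS, SHOTS = 16, 103
abilities = [(i + 0.5) / 100 for i in range(30)]  # assumed 1%-wide bins

def normalize(weights):
    """Rescale a list of column heights so they add up to 1."""
    total = sum(weights)
    return [w / total for w in weights]

# Red histogram: how well each candidate ability explains the rookie's season.
likelihood = normalize(
    [comb(SHOTS, GOALS) * p**GOALS * (1 - p) ** (SHOTS - GOALS) for p in abilities]
)

# Clear histogram: ASSUMED league-wide spread of ability (hypothetical numbers).
PRIOR_MEAN, PRIOR_SD = 0.09, 0.0245
prior = normalize(
    [exp(-((p - PRIOR_MEAN) ** 2) / (2 * PRIOR_SD**2)) for p in abilities]
)

# Thinker B: multiply matching columns, then rescale.
posterior = normalize([l * q for l, q in zip(likelihood, prior)])

posterior_mean = sum(p * w for p, w in zip(abilities, posterior))
print(f"raw shooting%: {GOALS / SHOTS:.1%}   Thinker B's estimate: {posterior_mean:.1%}")
```

With this made-up prior, the estimate lands between the league centre and the rookie's raw 15.5%, and the 15.5% column ends up roughly a tenth the height of the 11.5% column. A different prior would move the numbers, but not the direction.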



If you use B's histogram above, as his estimate for the rookie ... now do precisely the same for the other 250 or so forwards who registered enough shots for you to count. Add up all your histograms to build one giant one (feel free to use Lego) ... you're back to where you started with the original clear histogram in the first picture. At least you will if that clear histogram was right in the first place.

Now, let's simulate a season. Take one Lego block, at random, from the histogram for each player. Let's say you pulled a 10% Lego block for the rookie: now grab a die weighted so that a six comes up 10% of the time and roll it 103 times (the number of shots he had). Make sure to record the number of sixes you roll, mark that down as his simulated goals for that season, and figure out his shooting% as well. Now do the same for every other player ... plot out the results. Voila! The same as the spread of results for an actual NHL season.
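If you don't have a weighted die handy, the same experiment can be sketched in Python. The rookie's line (a 10% block, 103 shots) is from the example above. Everything about the league is a simplifying assumption: 250 forwards, 100 shots each, and abilities drawn from a hypothetical bell curve centred on 9% rather than from each player's own histogram.

```python
import random
from statistics import pstdev

rng = random.Random(2010)  # fixed seed so the run is repeatable

def simulate_goals(ability, shots, rng):
    """Roll the weighted die once per shot and count the goals."""
    return sum(rng.random() < ability for _ in range(shots))

# The rookie: a 10% Lego block, rolled 103 times.
rookie_goals = simulate_goals(0.10, 103, rng)
print(f"rookie: {rookie_goals} goals on 103 shots ({rookie_goals / 103:.1%})")

# The league (hypothetical): one ability per player, clipped to a sane range.
N_PLAYERS, SHOTS = 250, 100
true_abilities = [
    min(max(rng.gauss(0.09, 0.0245), 0.01), 0.30) for _ in range(N_PLAYERS)
]
season_pct = [simulate_goals(a, SHOTS, rng) / SHOTS for a in true_abilities]

print(f"spread of ability:          {pstdev(true_abilities):.3f}")
print(f"spread of season shooting%: {pstdev(season_pct):.3f}")
```

The season results come out more spread out than the abilities that produced them, which is the luck layered on top of the ability.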

Simulate another season and look at the way the individual players' numbers shifted from year to year. Some guys had madass swings, some guys stayed the same. And the real league numbers show a frighteningly similar pattern. The real world numbers will be ever so slightly wider shifts from year to year. Why? Because things like wrist injuries, playing with Thornton, emotional problems ... they really do happen, and they affect a player's EV shooting%. But the effect on the league as a whole ... it is a fraction of what hockey pundits are attributing to the variance for even one team. So the overwhelming majority of this expert insight must be completely untrue. The universe demands that.

If thinker A tries the same thing with his plot (the red one, his sense of likelihood) if he applies it to every player in the league and adds them all up ... the spread is way too wide. He has estimated tonnes more terrific shooters than the NHL produced, and tonnes more terrible shooters as well. Keep parlaying it through and in relatively few seasons you'd have a population with a bunch of guys who didn't have a hope in hell of ever scoring a goal, and a smaller group who scored on almost every shot they took.
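A crude way to watch that happen, sketched in Python. This is a stand-in for Thinker A, not his exact histogram: each season's observed shooting% is simply taken as the player's ability for the next season. The league size, shot count, number of seasons and starting spread are all hypothetical.

```python
import random
from statistics import pstdev

rng = random.Random(2010)  # fixed seed so the run is repeatable
N_PLAYERS, SHOTS, SEASONS = 250, 100, 10  # hypothetical league

def observed_pct(ability, shots, rng):
    """One simulated season: goals divided by shots."""
    goals = sum(rng.random() < ability for _ in range(shots))
    return goals / shots

# Start from an assumed bell curve of ability centred on 9%.
believed = [
    min(max(rng.gauss(0.09, 0.0245), 0.01), 0.30) for _ in range(N_PLAYERS)
]
start_spread = pstdev(believed)

# Parlay it through: last season's result becomes next season's "ability".
for _ in range(SEASONS):
    believed = [observed_pct(p, SHOTS, rng) for p in believed]

end_spread = pstdev(believed)
never_score = sum(p == 0.0 for p in believed)
print(f"spread at the start: {start_spread:.3f}   after {SEASONS} seasons: {end_spread:.3f}")
print(f"players stuck at 0%: {never_score} of {N_PLAYERS}")
```

The spread keeps growing season over season, and anyone who lands on 0% stays there for good, since a 0% shooter never scores again in this model.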

That's why A scoffs at the notion of predictive value, and will almost never wager with B.

Now obviously very few arguments on the Internet involve that kind of math; the language used is more likely a mixture of English and math, mostly the former. The format remains the same though, regardless of whether the topic is clutchness or health or save percentage or shootouts or whatever. If Google had an English2Math translator ... you'd see that a lot of guys who aren't especially mathy (slipper and Tyler come to mind) are actually often hitting us with some big, Bayesian concepts. Translated to math, they would be brutal to solve; you'd need someone with much better math skills than me.

B types often bust out the phrase "regress to the mean". That's probably a poor choice of words. It creates the impression that there is an invisible force in the universe pulling everyone and everything towards mediocrity. Really we are talking about luck driving results in the short term, but the bounces being more likely to settle out with time. There is a good chance that a forward who has completely average results over three seasons ... if he has a hard shot and quick release, he will probably get better results in the future, he will probably move AWAY from the mean. The population as a whole is regressing to the mean with time, the players are bouncing around every which way at any moment.

That is it. I don't think I have ever seen a Thinker A type convert to a Thinker B type in my life. The reverse has never happened either. Not in sports talk, not anywhere. I have to think that it's just the way our heads are wired from birth or early life. The Oilogosphere is swimming in this Thinker B type of writer and commenter, whether they are mathematically inclined or not. That is an extraordinary thing, at least to my mind.

NOTE: The clear histogram in the top picture is actually an estimate of the ability of finishing in the population. You can think of each piece of it as the average of each player over a million parallel universes, it all adds up to make this plot of shooting% ability for the NHL forwards. 'Ability' is the common term for these prior distributions in sports, 'Non-Luck' is more correct to my mind.

Quality of Teammates

A lot of people who read here are familiar with the notion that the context of a player's ice time has a huge impact on their counting stats and how good they look on the ice. By context, largely we're talking about who they are playing with, and the quality of opponent that they are generally playing against. Also, where their shifts are starting more, be it by faceoff location or on the fly. And lastly, whether or not they are being run out against opponents with tired legs. Those are the principal elements, anyways. And they aren't always easy to pin down with numbers.

Nobody has done more to popularize these ideas than Gabe Desjardins at his stats site. The faceoff zone data can't be argued with, and the quality of competition information always seemed reasonable, given enough games. The quality of teammate data, however, always seemed very dubious to me.

That's changed. Gabe has a new quality of teammate stat that is based on a player's underlying stats, specifically his Corsi number. The results seem completely sensible. They are presented in the chart below. Click to enlarge.

The old QUALTEAM numbers seemed madass to me, but I can't argue with these, they mesh with what I've seen in the games. Your mileage may vary.

Friday, January 15, 2010

Al

During the lockout I heard Darcy Regier interviewed on the radio fairly often. He was at the forefront of the rule change initiatives, so he got quite a bit of air time.

I remember an interview when he talked about Al Arbour's influence; what follows is just from memory. In his story he asked Arbour 'what were the keys to building a contender?', expecting a laundry list like "a sniping winger, two quality shut down defencemen, a powerplay quarterback, a great checking center, a top notch first line center and two power forward wingers". Arbour replied by saying "Just get good players. Keep adding good players. You'll end up with a good team."

I've searched the web for an article that relays this story, and this quote from a Steve Simmons piece in the Toronto Sun captures the essence of it.

Regier grew up playing for Al Arbour with the Islanders, learning the hockey business from Bill Torrey.

"I once asked Al, what's the secret to building a team? He said no secret, it's not complicated, get good players."

Lowe and Tambellini appear to be made from Al Arbour antimatter, and I think that's a bad thing.

Sunday, January 10, 2010

Waiting on a cure.

Lately, I feel like a bad viagra commercial.

That's right, I said what I said. Viagra. Viagra, viagra, viagra. But why?

I feel this way, you see, because lately I've got an awfully bad case of "doing things that are healthy for me". Every time the Oilers have a game I suddenly feel compelled to go for a run... or ask my girlfriend how she's feeling... or heck, even study for one of my 18 1/2 courses this semester. Anything, anything, ANYTHING but watch the Oilers.

How has this happened? How has this happened to a guy whose first hockey memories revolve around a 1990 comeback vs the Jets and a Klima goal in the finals vs the Bruins? A guy who, as a 12 year old, cried when Bill Ranford was traded? Who spent his youth doing everything in his power to mold his playing style into Toronto's high school hockey version of Ryan Smyth?

It's actually quite simple. The Oilers fucking suck. And for the time being, they're very, very good at that. Have you ever seen, in the middle of a Winnipeg (or Edmonton) winter, one of those 18 or 20 year old women who seems to inexplicably insist on going everywhere in a skirt or short shorts? Who somehow gets sluttier when the weather turns frosty despite the obvious health/death implications? Who has one or three Chinese symbol lower back tattoos???

The Edmonton Oilers suck more than she does. A lot more.

The Oilers have no Smyth. No Hemsky. No Khabibulin (which is, clearly, a shock to us all - just ask Mudcrutch.) No testicular or even ovarian fortitude. They have no gumption, spunk, pizazz, or even hutzpah (spit). No je ne sais quoi or comment on dit? Just roughly 6/12 forwards who would be strained to make a playoff team's roster and a first and second cap that are both maxed out no matter what the papers tell you.

So here I am: afflicted with going to the gym, reading, spooning, and even occasionally (and terribly briefly) "lovemaking". And it's pretty much all Kevin Lowe's fault.

I wonder if the lady and I should send him a thank you note.