This is at the crux of sport's most heated (and entertaining) online debates. The Internet is good that way, because surely we all have the good sense to avoid arguing about these things in real life. It also separates two types of thinkers, and they will never cross the line to join forces with the folks who have the different mindset, so it is an endless repetition of similar arguments.
Likelihood, and the way we understand it. That's at the core of almost all of it. So I thought I would write a blurb on the subject, as a point of reference for the future.
Think of an imaginary hockey player who scores 16 EV goals on 103 EV shots during his rookie season. That's a 15.5% clip. How good is he at finishing his chances?
Thinker A says:
He's a 15.5% shooter at evens, that is precisely what the evidence tells us. And with finish like that, he should rightly get more powerplay time and EV icetime in his sophomore year, meaning more goals.
Now Thinker A doesn't expect him to shoot EXACTLY 15.5% next year. Hell, a goalpost or two here, a big save there, a soft goal allowed by the opposing goalie here, it can make a hell of a difference ... it's a fluid thing. But 15.5% is the best guess.
A university student, in the natural or social sciences usually, will often bust into the Internet conversation with some math to support his Thinker A cohort. He will use precisely the thinking, and equation, explained here to calculate the likelihood of this rookie getting this wonderful shooting% while only possessing average shooting ability. Then he'll do the same for a player with 24.5% natural ability, and 23.5% natural ability ... and so on and so on. Plot them all out and the likelihood of the player's abilities will be expressed in the red histogram (ignore the clear one for now):
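For anyone who wants to build the red histogram themselves, here is a minimal sketch. It assumes the equation in question is the plain binomial probability (the post doesn't spell it out), and the 1%-wide grid of candidate abilities from 0.5% to 29.5% is my own choice for illustration.

```python
from math import comb

def binomial_likelihood(goals, shots, ability):
    """Chance of exactly `goals` goals on `shots` shots for a shooter
    whose true scoring rate is `ability` (assumed binomial model)."""
    return comb(shots, goals) * ability**goals * (1 - ability)**(shots - goals)

# ASSUMPTION: candidate abilities at 0.5%, 1.5%, ..., 29.5% (illustrative grid).
abilities = [(i + 0.5) / 100 for i in range(30)]

# Likelihood of the rookie's season (16 goals on 103 shots) at each ability.
raw = [binomial_likelihood(16, 103, a) for a in abilities]

# Scale the columns so they add up to 1, like a histogram.
total = sum(raw)
red_histogram = [x / total for x in raw]

# The tallest column is Thinker A's best guess.
best_guess = max(zip(red_histogram, abilities))[1]
print(f"tallest column sits at {best_guess:.1%}")
```

On this grid the tallest column lands on the pail nearest his actual 15.5% clip, which is exactly Thinker A's point.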
That's the graphic representation of what Thinker A was feeling all along, so he embraces it. And really, it is intellectually honest. He's looking at the player in isolation. He'll also find ways to rationalize it. He's a brilliant shooter, he doesn't waste any, he always gets in close for his chances. Some of that may be true, but in my experience most of these arguments are developed after the fact. I doubt that Thinker A types are compulsive liars; they just embrace the anecdotal evidence that supports the way their brain works, as visualized in the chart above.
Now Thinker B enters the conversation ... this is always trouble.
He reads the discussion and can't remember the player having these special qualities. Then he looks and sees that the rookie never scored at will in either junior or college.
Then he looks at how shooting% shook out for the forwards in the entire league last season. It's a pretty wide spread of results (about half again as wide as the clear histogram in the plot above).
He also sees that a whack of guys shoot 15-16% at evens every year, but nobody has ever maintained that clip for their career (the clear histogram in the plot above is his sense of how career EV shooting% is shaking out). And that includes some guys with tremendous, proven finishing ability. It's a real oddity for someone to even manage it two years in a row.
That's his feel of the way EV shooting% works in the NHL, with some guys being clearly better than others in the long haul, but wild swings for most guys from year to year, and huge spreads from good to poor on a yearly basis ... he surmises that buddy is due for a fall, that he just had a bit of a lucky season with the goals.
So Thinker B looks at Thinker A's position (the red histogram in the top plot) and says ... "Okay, I can understand that to an extent, but you're saying that this rookie has a 1 in 6 chance of being a 15-16% guy by ability, and a 1 in 12 chance of being an 11-12% guy ... twice as likely to be in the high group, that seems fishy to me."
"In the league, we have a pretty good idea that there are 20 times as many 11-12% forwards as there are 15-16% guys, in terms of long term ability ... shouldn't we take that into account? So instead of our rookie being twice as likely to be in that 'super' pail as in the merely 'decent' pail ... isn't he actually 10 times LESS likely to be in the 'super' pail than the merely 'decent' pail?"
This is usually the point where the chasm between the two mindsets becomes obvious and unbridgeable. Name calling generally starts about here as well.
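Thinker B's arithmetic for those two pails is short enough to check by hand, but here it is as a sketch anyway. The numbers are the ones from his quote; the variable names are mine.

```python
from fractions import Fraction

# Thinker A's red histogram: 1 in 6 for the 15-16% pail, 1 in 12 for the 11-12% pail.
likelihood_ratio = Fraction(1, 6) / Fraction(1, 12)   # 'super' vs 'decent'

# League-wide, 11-12% guys outnumber 15-16% guys 20 to 1 by long-term ability.
prior_odds = Fraction(1, 20)                          # 'super' vs 'decent'

# Multiply the two to get Thinker B's odds for this rookie.
posterior_odds = likelihood_ratio * prior_odds

print(f"evidence alone favours 'super' by {likelihood_ratio} to 1")
print(f"with the league taken into account, odds of 'super' vs 'decent': {posterior_odds}")
```

Twice as likely on the evidence, times 20 to 1 against in the population, comes out to 1 to 10.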
If Thinker B progresses with this thinking and multiplies each red column by the matching clear column, he'll end up with the bold clear histogram in the plot below. That's his likelihood estimation of the player's ability. And it's a hell of a lot different than Thinker A's.
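A rough sketch of that column-by-column multiplication follows. The clear histogram here is NOT the real one: I've assumed a made-up bell shape centred on 9% with a 2% spread purely so the code has something to multiply, and I've assumed a binomial model for the red columns.

```python
from math import comb, exp

def likelihood(goals, shots, p):
    """Binomial chance of `goals` goals on `shots` shots at true rate `p`."""
    return comb(shots, goals) * p**goals * (1 - p)**(shots - goals)

def normalize(columns):
    """Scale histogram columns so they sum to 1."""
    total = sum(columns)
    return [c / total for c in columns]

# ASSUMPTION: ability pails centred at 0.5%, 1.5%, ..., 29.5%.
abilities = [(i + 0.5) / 100 for i in range(30)]

# ASSUMPTION: hypothetical clear histogram (bell shape, centre 9%, spread 2%).
PRIOR_CENTRE, PRIOR_SPREAD = 0.09, 0.02
clear = normalize([exp(-0.5 * ((a - PRIOR_CENTRE) / PRIOR_SPREAD) ** 2)
                   for a in abilities])

# Red histogram: the rookie's 16 goals on 103 shots, viewed in isolation.
red = normalize([likelihood(16, 103, a) for a in abilities])

# Bold histogram: multiply matching columns, then rescale to sum to 1.
bold = normalize([r * c for r, c in zip(red, clear)])

def mean(hist):
    """Average ability implied by a histogram."""
    return sum(a * w for a, w in zip(abilities, hist))

print(f"Thinker A (red):  {mean(red):.1%}")
print(f"League (clear):   {mean(clear):.1%}")
print(f"Thinker B (bold): {mean(bold):.1%}")
```

Whatever shape you assume for the clear histogram, B's estimate ends up somewhere between the league and the rookie's own number; how far it gets pulled depends on how narrow the clear histogram is and how many shots the player has taken.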
If you use B's histogram above, as his estimate for the rookie ... now do precisely the same for the other 250 or so forwards who registered enough shots for you to count. Add up all your histograms to build one giant one (feel free to use Lego) ... you're back to where you started with the original clear histogram in the first picture. At least you will if that clear histogram was right in the first place.
Now, let's simulate a season. Take one Lego block, at random, from the histogram for each player (let's say you pulled a 10% Lego block for the rookie). Now grab a die weighted to come up six 10% of the time and roll it 103 times (the number of shots he had). Make sure to record the number of sixes you roll, mark that down as his simulated goals for that season, and figure out his shooting% as well. Now do the same for every other player ... plot out the results. Voila! The same as the spread of results for an actual NHL season.
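If you're short on Lego and weighted dice, the same experiment can be sketched in a few lines. Two loud assumptions: the abilities are drawn straight from a made-up population histogram (bell shape, centre 9%, spread 2%) rather than block by block from each player's own histogram, and every forward gets the rookie's 103 shots.

```python
import random
from statistics import pstdev

def simulate_season(abilities, shots, rng):
    """Roll a weighted die `shots` times per player; return observed shooting%."""
    results = []
    for p in abilities:
        goals = sum(1 for _ in range(shots) if rng.random() < p)
        results.append(goals / shots)
    return results

rng = random.Random(0)  # fixed seed so the run is repeatable

# ASSUMPTION: 250 forwards with hypothetical abilities (centre 9%, spread 2%),
# clipped to stay between 1% and 30%.
true_ability = [min(max(rng.gauss(0.09, 0.02), 0.01), 0.30) for _ in range(250)]

# ASSUMPTION: everyone takes 103 shots, same as the rookie.
season = simulate_season(true_ability, 103, rng)

print(f"spread of ability:        {pstdev(true_ability):.3f}")
print(f"spread of season results: {pstdev(season):.3f}")
```

The season results come out noticeably wider than the abilities that produced them, which is the luck piling on top of the talent.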
Simulate another season and look at the way the individual players' numbers shifted from year to year. Some guys had madass swings, some guys stayed the same. And the real league numbers show a frighteningly similar pattern. The real-world numbers will show ever so slightly wider shifts from year to year. Why? Because things like wrist injuries, playing with Thornton, emotional problems ... they really do happen, and they affect a player's EV shooting%. But the effect on the league as a whole ... it is a fraction of what hockey pundits are attributing to the variance for even one team. So the overwhelming majority of this expert insight must be completely untrue. The universe demands that.
If Thinker A tries the same thing with his plot (the red one, his sense of likelihood), applying it to every player in the league and adding them all up ... the spread is way too wide. He has estimated tonnes more terrific shooters than the NHL produced, and tonnes more terrible shooters as well. Keep parlaying it through and in relatively few seasons you'd have a population with a bunch of guys who didn't have a hope in hell of ever scoring a goal, and a smaller group who scored on almost every shot they took.
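Here is a crude sketch of that parlay. It simplifies Thinker A down to his best guess: each season's observed shooting% is taken as the player's true ability for the next season, rather than drawing from the full red histogram. The starting abilities (centre 9%, spread 2%) and the 103 shots per player are assumptions for illustration.

```python
import random
from statistics import pstdev

def play_season(abilities, shots, rng):
    """Roll `shots` weighted dice per player; return each observed scoring rate."""
    return [sum(rng.random() < p for _ in range(shots)) / shots for p in abilities]

rng = random.Random(1)  # fixed seed so the run is repeatable

# ASSUMPTION: 250 forwards, hypothetical starting abilities around 9%.
abilities = [min(max(rng.gauss(0.09, 0.02), 0.0), 1.0) for _ in range(250)]

spreads = [pstdev(abilities)]
for _ in range(5):
    # Thinker A's shortcut: last season's shooting% IS next season's ability.
    abilities = play_season(abilities, 103, rng)
    spreads.append(pstdev(abilities))

for year, spread in enumerate(spreads):
    print(f"after season {year}: spread {spread:.3f}")
```

The spread keeps growing every season, and anyone who lands on 0% is stuck there for good.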
That's why A scoffs at the notion of predictive value, and will almost never wager with B.
Now obviously very few arguments on the Internet involve that kind of math; the language used is more likely a mixture of English and math, mostly the former. The format remains the same though, regardless of whether the topic is clutchness or health or save percentage or shootouts or whatever. If Google had an English2Math translator ... you'd see that a lot of guys who aren't especially mathy (slipper and Tyler come to mind) are actually often hitting us with some big, Bayesian concepts. Translated to math, they would be brutal to solve; you'd need someone with much better math skills than me.
B types often bust out the phrase "regress to the mean". That's probably a poor choice of words. It creates the impression that there is an invisible force in the universe pulling everyone and everything towards mediocrity. Really we are talking about luck driving results in the short term, but the bounces being more likely to settle out with time. There is a good chance that a forward who has completely average results over three seasons ... if he has a hard shot and quick release, he will probably get better results in the future, he will probably move AWAY from the mean. The population as a whole is regressing to the mean with time, the players are bouncing around every which way at any moment.
That is it. I don't think I have ever seen a Thinker A type convert to a Thinker B type in my life. The reverse has never happened either. Not in sports talk, not anywhere. I have to think that it's just the way our heads are wired from birth or early life. The Oilogosphere is swimming in this Thinker B type of writer and commenter, whether they are mathematically inclined or not. That is an extraordinary thing, at least to my mind.
NOTE: The clear histogram in the top picture is actually an estimate of the ability of finishing in the population. You can think of each piece of it as the average of each player over a million parallel universes, it all adds up to make this plot of shooting% ability for the NHL forwards. 'Ability' is the common term for these prior distributions in sports, 'Non-Luck' is more correct to my mind.