Wednesday, June 16, 2010
Sunday, June 13, 2010
Excerpts and Guesses
"'Everything is a blur out there,' McCauley said almost drunkenly. He told me the concussion wasn't the first of his young life and that he wasn't sure how many times he'd had his bell rung. With McCauley that night, and Couture this evening about a decade later, I figure it's a scout's dilemma: Do you give them points for bravery, or subtract a few for recklessness?"
-excerpt from Future Greats and Heartbreaks by Gare Joyce
While there are a number of current Oilers for whom this quote would make a nice introduction, I, like most Oiler fans, am thinking about the draft more than anything right now, and hence my thoughts surround Taylor Hall and Tyler Seguin. There are a lot of arguments I'm content to ignore, and I honestly feel most of what we say as fans about prospects is just noise. If ever a scenario existed that perfectly exploited human psychology to polarize people on issues they haven't a fucking clue about, I think talking about 17- and 18-year-old hockey players blows it out of the water.
Lowetide has mentioned injury time and time again as one of the biggest possible reasons for failure to launch as an NHL player. From Brett Lindros to Doug Lynch, I've seen enough good players fall by the wayside in my own lifetime to be more wary of injury than any other factor in a player's development. Even Brett's absurdly gifted brother Eric was eventually forced to play as a shell of himself after being one of the most impressive players in the NHL for most of a decade.
So, a simple question: IF Hall is the better player, how many healthy seasons justify taking him #1 overall? I'm not saying he will get hurt and that Tyler Seguin won't get hurt. I'm just looking for a line in the sand.
Lastly, a hat tip to Vic for your recent flurry. I seem to have a lot of reading to catch up on!
Saturday, June 05, 2010
Bill James' Pythagorean Expectation and King Values
Long ago baseball writer Bill James asserted that the ratio of runs scored to runs allowed was a better indicator of a team's true ability than its win-loss record. The Pythagorean winning percentage is defined as 1 / (1 + (RA/RS)^2), where RS is runs scored and RA is runs allowed. While it is far from perfect, it is clearly a better indicator of most teams' ability to win than their win-loss record, so I'll use it here.
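For anyone who wants to try the formula, here is a minimal Python sketch of it. The function name and the sample run totals are made up for illustration; they are not taken from the 2009 standings.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2):
    """Pythagorean winning percentage: 1 / (1 + (RA/RS)^exponent)."""
    return 1.0 / (1.0 + (runs_allowed / runs_scored) ** exponent)


# Hypothetical team (illustrative numbers only): 800 runs scored, 700 allowed.
example = pythagorean_expectation(800, 700)
print(f"Pythagorean expectation: {example:.3f}")
```

A team that scores exactly as many runs as it allows comes out at .500, as you'd hope.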
I downloaded the 2009 MLB standings from the Internet, and calculated the Pythagorean expectation for each team. Then I calculated the King Expectation for each team using very nearly the same method demonstrated here. Easy as beans. These results, the King Expectations, are the revised Pythagorean Expectations, corrected for difficulty of competition. And it's stunning, or at least I think so.
These numbers reflect the expected change in winning percentage of each team, due to their schedule. By way of example, the Toronto Blue Jays would be expected to win about 12 more games had they played the St. Louis Cardinals' schedule last season. That's a whack, folks. In other terms, they would have been the best team in the National League by Pythagorean Expectation, very narrowly edging out Philly. The difference between some divisions is large, but the difference between the two leagues is absurd.
A quick check of the results:
The difference in King Expectation from NL to AL is 5.4%: the AL averages 52.9% and the NL averages 47.5%. Therefore we'd expect the AL to win about 55.4% of the inter-league games. They in fact went 137W-114L in 2009, good for a 54.6% winning clip. We'll call that close enough.
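The post doesn't spell out how the 55.4% was arrived at. One formula that happens to reproduce it from the two league averages is Bill James' log5 estimate; treat its use here as an assumption rather than the method actually used. A minimal Python sketch:

```python
def log5(p_a, p_b):
    """Estimated chance that A beats B, given each side's winning percentage."""
    return (p_a * (1 - p_b)) / (p_a * (1 - p_b) + p_b * (1 - p_a))


al_avg, nl_avg = 0.529, 0.475          # league averages quoted in the post
expected_al = log5(al_avg, nl_avg)     # assumed method, see note above
actual_al = 137 / (137 + 114)          # 2009 inter-league record quoted in the post

print(f"expected AL share: {expected_al:.3f}")
print(f"actual AL share:   {actual_al:.3f}")
```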
UPDATE: I keep forgetting that Milwaukee is in the NL, so I've changed the last paragraph a bit. i.e. I've edited the numbers to reflect this, and the AL is a tad better than in my original post, and the NL just a smidgen worse.
Wednesday, June 02, 2010
Throw A Few Pennies On The Weight
I remember a scene from a film I saw as a boy. A brash American character was walking alongside a posh Englishman through the West End of London. They were probably solving a crime, I don't recall. I do remember the American looking up at Big Ben, then checking his watch and informing his colleague that the tower clock was running a couple of minutes slow. The Englishman quipped: “I'll be sure to tell the lads to throw a couple of pennies on the weight”. I don't know why I remember that, though I'm pretty sure that shit like this is occupying the head space that I would otherwise have used to remember anniversaries.
Do pennies on a clock weight really influence the clock speed? I don't know. I do, however, think that it is a good analogy for hockey and scoring chances. If keeping fast time is a good thing, throwing Visnovsky over the boards is like adding a bunch of pennies onto the weight, and throwing Moreau over the boards a few seconds later is like taking most of them back off again. And the question becomes: how many pennies are Visnovsky and Moreau actually worth?
Consider the following information, gathered from a few games in an imaginary league where the quality of competition and linemates, just generally the context of the ice time, is the same for everyone. There are also only four defensemen per team in this league (it's my imaginary league, I can do what the hell I want). How many pennies are each of these four defenders adding or subtracting from the weight? You can click on the image to enlarge it.
What if Dennis King, going purely by instinct, told you that Smid was worth -3.6 pence per shift, Vis worth +6.3 pence, Souray +4.5 pence and Struds -8.6 pence? Would you believe him?
Personally I would tend to think he's probably right. After all, every game that someone tracks scoring chances makes them colder and more rational, and Dennis has recorded more than most people around here. Then I would remember Springsteen's sage advice “Blind faith in anything will get you killed”. And while I'm not sure how I could die from this, short of provoking Dennis to the point that he tracked me down and murdered me ... they are still wise words. I'd bust out some simple math to check on Dennis' assertion.
Using Smid first:
He obviously played all 100 shifts with himself on the ice, so -.036 x 100 = -3.6
He played 75 shifts with Vis, so .063 x 75 = 4.73
He played 15 shifts with Souray, so .045 x 15 = .68
He played 10 shifts with Struds, so -.086 x 10 = -.86
Add those up and it's predicting a scoring chance +/- of +1 for Smid. And that's what he got. It works for everyone else as well, so Dennis is probably a witch.
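The same check for Smid, as a short Python sketch. The King Values and Smid's shift counts are the ones quoted above; the variable names are just for illustration. The other three defenders' shift counts live in the chart, so only Smid's row is checked here.

```python
# King Values from Dennis' guess, converted from pence to pounds per shift.
king_values = {"smid": -0.036, "vis": 0.063, "souray": 0.045, "struds": -0.086}

# Shifts Smid was on the ice with each defender (himself included).
smid_shifts = {"smid": 100, "vis": 75, "souray": 15, "struds": 10}

# Predicted scoring chance +/- is the shift-weighted sum of the King Values.
predicted = sum(smid_shifts[name] * king_values[name] for name in smid_shifts)

print(f"predicted +/- for Smid: {predicted:.2f} (rounds to {round(predicted):+d})")
```

The unrounded sum is a hair under +1 because the hand calculation rounds each term to two decimals.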
We'll call these King Values. It would be better if Dennis had a less popular surname, such as Clutterbuck or Schultehammer, but it's still a decent name for the statistic.
And just so you know that I'm not blowing smoke up your ass, you can use this link to check for yourself. It takes the shifts-together information from above, then calculates the King Value from the scoring chance +/- data that you input in the URL.
The default URL linked above is http://timeonice.com/king.php?smid=1&vis=4&souray=-1&struds=-5. The red numbers that I've shown here are the scoring chance +/-s that I used in my example. You can change those to anything you'd like and rerun the script. You will, of course, get a new batch of King Values.
On the output, which looks like this:
The initial guess is emboldened, and is +/- in pounds sterling (so vis's 0.04 is £0.04, or 4 pence). That's the starting point. Each row of data below that represents the next iteration. In short, we took what we learned from trying the emboldened numbers as King Values, saw that it didn't give expected scoring chance +/-'s that were worth a damn, modified them rationally to try and get a better result with the next try, then had another go. Ad infinitum. Or in this case, ad ten. Of course, if you enter something absurd for player scoring chance +/-s, the whole thing will become unstable and output nonsense.
For the mathematically inclined: A quicker way of doing the checking math: If you think of the shift-together chart as matrix A, the King Values as matrix B, and the recorded scoring chance +/-s as matrix C, then A times B using =mmult() in a spreadsheet ... that should equal C.
I'm not intending this post as a mathematical exercise. My goal here is to forward the general way of thinking, and also to open the floor to considered criticism, using an example that's still on a small enough scale that spoken languages are relevant in helping us comprehend the universe. Because what comes next with this line of thinking, whether I choose to use MLB or NHL data, is going to make the world seem simpler than even the squarest of heads could have ever imagined.