Sunday, August 31, 2008

On Base Percentage

I'm not a big baseball fan any more, but this is interesting stuff. Just from reading the posts of baseball fans around the Oiler blogs, I know that on base percentage (OBP) for teams has a strong correlation to winning. This seems a bit counterintuitive, because the power numbers aren't factored in at all, especially so for someone who started watching baseball when Earl Weaver's Orioles were winning a lot of ball games on the back of three-run homers.

Just looking at overall team stats: add the hits + walks + hit-by-pitch numbers, then divide by the same total surrendered by the team's pitching staff/defense, and that gives the OBP ratio. This is the convention used by sabermetric types, no? And the correlation to winning is enormous, and the correlation to runs ratio (runs scored divided by runs surrendered) equally so. A sample correlation of .90 or so is typical for any season or portion of it. And it is obviously huge.
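By way of example, here is a minimal Python sketch of that arithmetic. The function names and the season totals are made up for illustration. Note that this is a ratio of times on base, as described above, not the textbook OBP, which also divides by plate appearances.

```python
def times_on_base(hits, walks, hit_by_pitch):
    """Hits + walks + hit-by-pitch, the total described in the post."""
    return hits + walks + hit_by_pitch


def obp_ratio(batting, pitching):
    """Times on base by the team's hitters divided by times on base allowed."""
    return times_on_base(**batting) / times_on_base(**pitching)


def runs_ratio(runs_scored, runs_allowed):
    """Runs scored divided by runs surrendered."""
    return runs_scored / runs_allowed


# Hypothetical season totals, for illustration only.
batting = {"hits": 1500, "walks": 550, "hit_by_pitch": 50}    # 2100 times on base
pitching = {"hits": 1400, "walks": 500, "hit_by_pitch": 100}  # 2000 allowed

print(obp_ratio(batting, pitching))  # 1.05
print(runs_ratio(800, 760))
```

A ratio above 1.0 means the team reached base more often than its opponents did.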

Now baseball isn't much like trivia craps at all. The better team on the day usually wins. Whereas over 40 matches of trivia craps, I would guess about a quarter of the players will actually have won more money in matches where they did worse at the trivia. Stuff happens.

My question is: how does it predict? If a team starts the year with a disappointing record, but a strong OBP ratio, should we suspect that they've just been a touch unlucky, and that they will start scoring more runs, allowing fewer, and winning more games in the future? And the answer is yes, or at least I can't find a baseball stat that has stronger predictive value, not with a quick search of the web, anyways.

For the 2003, 2005, 2006 and 2007 seasons, the predictive value of OBP ratio to runs ratio, from one half of the season to the other, averages a correlation of .43. As a point of reference for baseball fans, this is a touch better than the .38 predictive value of runs ratio to future wins ratio over the same interval (which is the essence of Bill James' Pythagorean Expectation).
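For anyone who wants to try this kind of half-season comparison, a rough sketch follows. It assumes the plain Pearson correlation is the measure in use, and the five-team numbers are invented placeholders, not real team data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson sample correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five teams, one entry per team, for illustration only.
first_half_obp_ratio = [1.08, 1.02, 0.97, 1.00, 0.93]
first_half_runs_ratio = [1.15, 0.98, 0.95, 1.05, 0.90]
second_half_obp_ratio = [1.04, 1.03, 0.99, 0.98, 0.96]
second_half_runs_ratio = [1.05, 1.04, 0.99, 0.96, 0.97]

# Correlation to immediate results: same half, OBP ratio against runs ratio.
same_half = pearson(first_half_obp_ratio, first_half_runs_ratio)
# Repeatability: the metric against itself, first half to second half.
repeatability = pearson(first_half_obp_ratio, second_half_obp_ratio)
# Prediction: first-half OBP ratio against second-half runs ratio.
prediction = pearson(first_half_obp_ratio, second_half_runs_ratio)

print(same_half, repeatability, prediction)
```

With real data there would be one entry per team per season, and the figure quoted for a span of seasons would be the average of the per-season correlations.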

I have the even strength face off zones for the NHL for these same seasons, that's why I picked them. And obviously in both sports there are trades at the deadline, injuries, young players that get better with experience, differences in schedule difficulty from one half to the next, teams that are out of the playoff picture so start playing youth in situations that they wouldn't otherwise, etc. And in hockey especially, the bounces are still going to have a big say in the results.

The chart below shows the average repeatability and predictive correlations for these four seasons. I used face off zones because to my mind it is a strong indicator of meaningful possession in hockey, and the even strength zone time ratio and corsi ratio (shots directed at net ratio) aren't available for all of these seasons.



If you sum the first half of these four seasons, and compare it to the sum of the back half of these same four seasons (which I did accidentally at first, due to a script error) then both metrics grow stronger.

In the case of OBP, repeatability climbs to .70, prediction rises to .62.

In the case of face off zone ratio, repeatability climbs to .87, prediction rises to .57.

And in the case of face offs the correlation to immediate results, averaged over these four seasons, is a fairly weak .35. With the four seasons combined, however, the benefit of the larger sample causes correlation of face off zone ratio to scoring ratio to climb to .71.

I know I'm painting with a big brush here, this is a bit rough. And I'm sure that baseball stats guys have done a tremendous amount of analysis of OBP and other team stats. Still, to my mind it's a comparison that has value. In large part because Pythagorean Expectation and OBP are fairly established metrics in the minds of a lot of the Oiler fans who hang out on the Oilogosphere.

Possession certainly isn't everything, but it's a lot of it. And it is the foundation for any sensible endeavour to place reasonably accurate expectations on hockey teams at even strength, and hockey players as well.

As an aside: retrosheet.org is a terrific resource for baseball stats nuts. Everything is laid out for you, it takes minutes to write an Excel macro that scrapes off what you need, and for any range of seasons. It does not have pitch counts by inning though, or even pitch counts by game. Does anyone know where this is available? I suspect that the reason OBP is such a strong indicator of present and future success is because it is an indicator of pitch count, and the ability of a team to get weaker pitchers (middle relievers) into the game. Though I'm certainly not sure of that.

Wednesday, August 27, 2008

Trivia Craps

This picture from my photo album is going to bring back a lot of memories for a chunk of the Oilogosphere. Pictured from left to right that's me, Pat, Matt, Dennis, Lain and Tyler. Judging by the street and the way that we're dressed, I'd peg it as North Edmonton in about 1978. We're playing trivia craps, like we did almost every warm day back then.

Those were the salad days, we'd buy a stack of sports trivia cards from the five and dime, and then go head-to-head in hour long games of trivia craps. Whoever answered the trivia question first got to roll the dice, and if they rolled an eleven they won a dollar from their opponent. Simple as that. After an hour the game was over and other guys played.

Now Lain was the bomb at sports trivia, still is. When I was squaring off against him I'd always try to drag it out, limit the number of rolls. I'd lollygag when it was my turn to shoot, and otherwise keep him engaged when it was his turn. He'd be about to throw the dice and I'd casually throw in a "Hey, that's a nice tie, Lain. Is it store-bought?", or something similar. And he'd usually bite. This was good because there was no way in hell that I was going to beat him at trivia in the long run, and keeping the number of total dice throws down, just dragging the game out, that gave Lady Luck the best chance of letting me win.

Along that vein, you can see a bit of someone's head behind Dennis there, does anyone else think that's Paul Maurice? You remember Little Paul from the neighbourhood, no? If so, let me know, because I'm thinking of suing that effer for theft of intellectual property, based on his game planning vs Detroit in the 2002 Stanley Cup Finals. I mean that's pure 'Vic v. Lain at trivia craps' I think, no mistaking it. Maybe the lawyerin' types around here can tell me if I have a case. But I digress.

Now everyone was rolling the same dice here, so it seemed to me that the obvious long term winners were going to be the guys who were good at the trivia; Lain first, then Dennis, and the rest of us about the same, and the rest of the neighbourhood well back of us. The other neighbourhood kids aren't shown here for a reason; the pictured kids already owned all of their cool stuff, such as scooters and yo-yos.

It turns out that we may all have been wrong.

In a recently published article in the prestigious Journal of Embarrassing Hobbies, it's been proven that even over a massive sample, the overwhelming driver of results at trivia craps is 'thinking positively when you roll the dice'. This is peer reviewed research, and carries merit. Forget what you thought you knew about trivia craps, people. The fact is ... it's all in the psychokinetic control of the dice.

In 50 simulated half-seasons of trivia craps, featuring 30 players, each with 41 hour-long games, positive thinking while rolling the dice is the clear and obvious driver of results.

To be completely random, and to account for the "Vic vs Lain" effect, they chose the number of dice rolls to match the shots-directed-at-net for 30 NHL teams, and used an average dice weighting to match, i.e. odds of an 11 being rolled the same as NHL goals/shots-directed-at-net (4.5%).

This is compelling stuff, people. We're talking about dice roll numbers that range from 2,712 to 3,642. That's a whack of rolls, surely to God there is no luck left in that big of a number. And the truth is living in the Positive Thinking, folks, to hell with skill at trivia, hell I'm starting to think that it doesn't even exist. Now that I think about it, I must have just imagined it. Damn! I've played, coached and watched a lot of craps, as well. I feel much shame. Bruce tells a great story about a craps player who'd twist his wrist just a little bit before he threw the dice, something the casual observer probably missed, and that's never made more sense to me than now. And now is neither the time nor the place to be talking about repeatability or predictive value, or any of that other mumbo jumbo. Just grab your dice and a bit of money out of your Mom's purse, and meet me at the corner.
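For anyone who would rather check how much luck is left in that many rolls, here is a back-of-envelope sketch. It assumes every roll is independent and every player has the same 4.5% chance of an eleven, so the spread comes from the dice alone (a plain binomial model, not the simulation described above). The function name is made up for illustration.

```python
from math import sqrt

P_ELEVEN = 0.045  # odds of an eleven per roll, as quoted in the post


def luck_sd(rolls, p=P_ELEVEN):
    """Standard deviation of the number of elevens from dice luck alone (binomial)."""
    return sqrt(rolls * p * (1 - p))


# The smallest and largest roll counts quoted in the post.
for rolls in (2712, 3642):
    expected = rolls * P_ELEVEN
    sd = luck_sd(rolls)
    print(f"{rolls} rolls: expect {expected:.0f} elevens, give or take {sd:.1f}")
```

Under those assumptions a player with 2,712 rolls expects about 122 elevens with a standard deviation of roughly 11, so two players with identical dice can easily finish 15 or 20 elevens apart.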

Thursday, August 21, 2008

Say it ain't so, Ray

I can't help but keep tabs on the Isles, and there is some good stuff on the Islanders fan boards, though Islander Mania is just too odd since the Islanders themselves became a sponsor.

At HF boards threads such as "This rebuilding plan is going to work this time" are a testimony to the fact that only natural born optimists have remained loyal to the orange and blue. The best threads, however, are the stalkerish ones. The Christie Brinkley, Hilary Duff, Ice Girl, and Player/Owner/GM sightings threads are usually a good read.

There is a thread in there from last winter discussing the Carolina Storm Squad (the Hurricanes' ice girls), mostly comparing them to the NYI Ice Girls iirc. A regular poster chimes in to point out one of the Storm Squad girls from the photo array and inform everyone that he hooked up with her on New Year's Eve a few years ago. Described it as one of the proudest moments of his life. Damn, that's wildly inappropriate. And it's pure internet gold, people.

Anyhow, right now there is a thread on the front page titled "Best/Worst guys you have met from the Islanders"; obviously it's a must-read. And while the commentary is overwhelmingly positive on the whole, it turns out that almost nobody has anything nice to say about Ray Ferraro.

Isle Junkie:
Ray Ferraro was the absolute worst. I've never seen a guy so upset to spend a few minutes signing autographs for the fans. I was probably 13 and I asked him to sign my jersey & he tsked and grabbed it out of my hands signed it & then threw it back at me. What a jerk.

Andrea5174:
Ray Ferraro (could be THE worst, actually saw him scream at a little boy, couldn't have been more than 6 for accidentally getting marker on his hand)

Ploplopfizzfizz:
There are definitely some total tools out there "chime in" Chicken parm Ferraro. He is just a rude unhappy little man.My nephew asked him for a autograph after a Isles win and he walked around him and kept walking.

Strummergas:
My friends and I used to wait afterwards for autographs and I got just about every player from those years. Never had a problem with anyone except 2.
Ray Ferraro and Vladimir Malakhov.
Ferraro was so pissed that we was signing autographs, just complaining the whole time. If he hated it that much I don't know why he didn't just refuse as opposed to *****ing the entire time.

Yeah Buddy:
Worst
Dipiahole thats what he is and Ray Ferraro.
Jason Blake is giving our man Ray a real run for his money in the superdink category, no small feat considering that Todd Bertuzzi, Mike Peca and Billy Smith played for this team.

Actually, besides Czerkawski and Guerin, nobody has anything nice to say about any of the former Oilers that played on that squad, and there have been a bunch.

Friday, August 15, 2008

Defensemen and Opportunity

So much of a defenseman's offensive stats come from PP time, and their even strength points and +/- (and the underlying numbers that drive it) seem to be mostly a reflection of the quality of opposition and context of their ice time. These are tough players to judge by the numbers.

Since the context is obviously very important to any skater, I thought I'd look at the number of extra faceoffs each Oiler defenseman took in his own end of the rink, over and above the ones in the good end. I think it is a very good measure of the toughness of ice time.

So, by way of example, Staios was on the ice for a whopping 117 more defensive zone faceoffs than ones in the offensive end of the rink. Greene was at the opposite end of the spectrum, on the ice for 14 more draws in the offensive end of the rink than the home end.

The pattern is obvious I think. More own zone draws, without offensive zone ones to balance them out ... that leads to worse numbers.

The sample correlation here between faceoff zone +/- and corsi +/- is very strong, 0.91. The only other team I checked was the Calgary Flames, and the pattern there seemed obvious to me as well, the correlation being .81 in that case. So I doubt that this is unique to the Oilers, I can't see where it would be different except possibly on teams with a wild disparity in talent levels between their blueliners.

So really the only guys who seem to be separating themselves from the pack at all here are Staios (good, this context considered) and Greene (bad, this context considered). With good marks to Pitkanen and Tarnstrom as well.

And Pitkanen and Gilbert may go on to be the type of player that creates a few extra goals at evens every year as well, time will tell. Tarnstrom has never been able to repeat his big offensive year in PIT, but he's still been useful.

BTW: You can get the faceoff data here, and the corsi data (amongst other things) here. They take about a minute to load, and you can change the team abbreviation in the URL to check for other squads.

Great Hockey Stuff on the Web. Part II


A little while ago I clicked on the name of a commenter here, a quiet and anonymous compliment of his sanity. It turns out that dude has/had an Oilers blog, one that nobody seemed to know about. The Gospel of Hockey is/was the bomb, people.

Some great Biblical oilerpretation, both New Testament ...
Oilers v. Avalanche - Matthew 26:20-25

When evening came, Kevin, the prophet, was reclining at the table with the Ten. While they were eating, he said, "I tell you the truth, one of you will betray me." The Ten were greatly distressed and began to say to him one after the other, "Surely not I." Kevin replied, "The one who has dipped his hand into the bowl with me will betray me. The great prophet of the EIG will go just as it is written about him. But woe to that man who betrays the EIG! It would be better for him if he had not been born." Then Ryan, the one who would betray him, said, "Surely not I, O prophet?" Kevin answered, "Yes, it is you."
... and Old Testament ...
Oilers v. Wild - Leviticus 5:3

If a person loses to any ceremonially unclean team—whether it be the Wild or the Ducks or the Predators who creep along the ice—even when he is overmatched, he has become unclean and is guilty.
... mixed with mathy stuff like an analysis of RFA offer sheets and a dis of IOF-pimped AHLer Jeff Tambellini that has me rethinking both.

Unfortunately Scott stopped posting shortly after he started. Did anyone else know about this place? Let's hope he fires it up again when hockey starts.

The Trap and the Trapezoid

Tom Lynn's rant about the trap, referenced below, has an interesting note in it. It has little to do with the trap, but instead talks of defensive zone play.
The larger ice surface scares coaches from letting their charges wander far; the players are instructed to bunch up in front of the same 60x45x20x45 trapezoid in front of the net that goals in any hockey game are scored from
Does that seem huge to anyone else? I've marked up a rink drawing below to reflect that. Jebus, that's a lot of ice to cover. And there are a whack of guys in the NHL who can score from outside the dots at the side down low.
We've looked at where shots go in from before, granted just with the very dodgy data from cbssportsline.com. And this little picture is starting to look like a high school gym floor at this point, with all the lines on it, but here we go, with the trapezoid added.

I may be chasing a red herring here. I mean defensive zone coverage has nothing to do with the neutral zone trap, and similar boring defensive schemes, that Lynn is defending here. So perhaps he brought up the trapezoid hoping that some of his readers would infer (wrongly) that this is where trapping hockey comes from. I dunno.

Great Hockey Stuff on the Web. Part I

The Official Minnesota Wild Blog is terrific.

All the writers are entertaining, but Doug Risebrough especially. Some insight into the life, like this ...
My oldest daughter, Allison, never really liked hockey. But, about the time she was 13, when I was playing for the Flames, she became more curious about the scores, more disappointed when the team lost. My wife and I thought she was becoming more interested in the game. She actually was preparing herself for her school day. If we lost, she was going to have a harder day at school. The same goes for a player whose name is rumored in potential trades.
And into the hockey business side, like this ...
In our evaluations of the 30 teams we project 39 openings league-wide for top-6 forwards or top-3 defensemen. But only 18 players meet the description. The demand far outweighs the supply, which should guarantee that most, if not all, of those 18 players sign contracts for more money than they merit. This lack of top-end supply is why so few players have signed extensions with their own teams the last couple of months and why some teams have been willing to give up a draft pick or two for the right to negotiate with another team's top UFA player.
Assistant GM Tom Lynn has written several informative and entertaining blog entries. This essay on RFA offer sheets will be of interest to a lot of the folks around here. His latest post defends the trap, and takes some hilariously bitchy swipes at others:
Myth #2: The “Trap” was created by Jacques Lemaire in the mid-1990s to stifle offense from either team and allow weaker teams to beat more skilled ones

Like the old cliché, this myth needs no introduction. Media and message boards connected with the Wild’s opponents have whipped this one up like the Red Scare of the 1950s. It even has some high priests among a cell of the Twin Cities media who need MapQuest to find downtown St. Paul.
As I say, dude can be pissy. Funny stuff though.
... These words were from Carl Brewer, referring to the hiring of Punch Imlach as coach of the Toronto Maple Leafs in 1957! The Leafs rode this detested defense to three straight Stanley Cups from 1962 to 1964. Unknown to its local media, the “Golden Age” of the Leafs was not all flying pucks and 7-6 scores. The only thing more awkward in Toronto may be that a team owned by the Province’s teachers is spelled incorrectly.
And though there are so many good entries that it's tough to pick just one as a must read, this is a belter. Rare insight into the summer planning of an NHL hockey operations department.

I'm jealous of Minnesota Wild fans right now, and I highly recommend reading through all of the blog posts there.

Thursday, August 14, 2008

Reasons for Optimism: Part I

Sometimes steady progress gets lost in the hot and cold streaks. Below is the rolling average of the Oilers ability to end their even strength shifts in the offensive end of the rink, as opposed to the bad end.

That's in blue with a copper trend line. The same thing for the Flames is shown in yellow and red.


Where EV Shifts Ended. 20 Game Rolling Average.
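A rolling average like the one in the chart can be sketched as below. The per-game input (say, the share of even strength shifts ending in the offensive end for each game) is an assumption about how the underlying data is laid out, and the function name is made up for illustration.

```python
def rolling_average(values, window=20):
    """Mean of each run of `window` consecutive values, oldest first."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_total = 0.0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            # Drop the game that just fell out of the window.
            running_total -= values[i - window]
        if i >= window - 1:
            averages.append(running_total / window)
    return averages


# Tiny check with a window of 2 instead of 20.
print(rolling_average([1, 2, 3, 4, 5], window=2))  # [1.5, 2.5, 3.5, 4.5]
```

With a 20 game window the first point only appears after game 20, which is why a chart like this starts a little way into the season.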

Tuesday, August 12, 2008

Home Ice Advantage

We all know that teams have a better chance of winning at home than on the road, it's been that way since Bob Cole wore short pants. The question is: Why?

I've always believed that with hockey, it is more about rest and preparation time than anything else. Which is why I think Lowe had a valid point a few years ago when he criticized the Oilers business operations people for scheduling a whack of back-to-back games on home ice. The Oilers were trying to cash in on the weekends, and the higher concession sales and increase in out-of-towners that they bring. Fair enough, but when the Oilers miss the playoffs by a point or two, as they had that season, it seems less clever.

So a look at why home ice teams did better last season. This is just for even strength play for now, and doesn't include empty-net or shootout stuff. Hopefully somebody digs up the PP stuff one day and sorts through it. Any road, here you go, just click on the image to enlarge it:

I'll leave you to do your own arithmetic, but of the +198 in goals by the home teams, I make +45 as a result of better shooting%/save%, and the remaining +153 in goals from having more shots on net. And the underlying numbers, the ones that point towards puck possession, they are strong in about the right measure.
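For anyone doing that arithmetic, here is one way to split a goal difference into a shot volume piece and a percentage piece. It is a sketch of a standard decomposition, not necessarily the method used for the +45 and +153 above, and the league totals in it are hypothetical, so they do not reproduce those figures.

```python
def split_goal_difference(home_shots, home_goals, away_shots, away_goals):
    """Split (home goals - away goals) into shot-volume and percentage parts.

    Uses the identity
        Sh*ph - Sa*pa = (Sh - Sa)*(ph + pa)/2 + (ph - pa)*(Sh + Sa)/2
    where S is shots and p is shooting percentage, so the two parts
    always add up to the total goal difference.
    """
    home_pct = home_goals / home_shots
    away_pct = away_goals / away_shots
    from_volume = (home_shots - away_shots) * (home_pct + away_pct) / 2
    from_percentage = (home_pct - away_pct) * (home_shots + away_shots) / 2
    return from_volume, from_percentage


# Hypothetical even strength league totals, for illustration only.
volume, percentage = split_goal_difference(
    home_shots=30000, home_goals=2500, away_shots=28000, away_goals=2300
)
print(volume, percentage, volume + percentage)
```

Because one team's shooting% is the other team's save% turned around, the percentage piece covers both at once.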

Obviously teams that have the pill more will usually end up getting more chances on the PP, even with Ethan Moreau's random head punching considered. Still, that seems a little steep to me, maybe the refs are getting swayed by the home crowd a bit. If you know of a site that shows 5-on-3 PP opportunities, or ice time, in home and away breakdowns, please leave a link in the comments. It wouldn't surprise me if the home squad is getting the benefit of those as well.

There are surely other things at play that haven't crossed my mind yet, and it's just one 1230 game season that I've looked at here. Still, a starting point.

Monday, August 11, 2008

Zone Time, Corsi and Correlation to Winning

The title and the graph pretty much say it all. This is just another representation of the data for the 2001/2002 season that was used in the post below.

Even though Zone Time and Corsi tell a bit of a different tale for different teams, as shown in the table in the previous post, both are clearly related to each other. And as you start adding more and more games to the pile, they more strongly relate to team goal differential (goals-for minus goals-against).

If we had faceoff zones for this season, it would follow along almost exactly the same in the chart below. And all three would be closely related, and it would require someone more clever than me with math to prise them apart.

The left axis is correlation squared, or coefficient of determination, as they tell me at Wiki. Since these things are largely independent of shooting% and save%, that's probably not a bad first conservative guess at the contribution to goal differential that possession gives us.
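To see why squaring matters, here is a small sketch with illustrative correlation values (they are not read off the graph).

```python
def coefficient_of_determination(r):
    """Square of the correlation coefficient."""
    return r * r


# Illustrative correlations only, not values taken from the chart.
for r in (0.5, 0.7, 0.9):
    print(r, round(coefficient_of_determination(r), 2))
```

A correlation of 0.7 squares to about 0.49, so roughly half of the variation is accounted for, which is why the squared number is the more conservative one to quote.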


And we haven't really looked at finding ways to build a better possession metric than Corsi by rationally combining in faceoff zones in some way. Offside zones would be gold, because they are obviously very independent of Corsi or faceoff zones, so the math would be right in my wheelhouse (addition).

This number wouldn't go down with additional information, or a means of combining the stuff we do have, it would go up. And looking at the graph, for this season at least, it doesn't look like it's topped out yet. i.e. I suspect that if the season had been 40 games longer we'd be looking at stronger ties between goal diff and Corsi and Zone Time.

Mike Babcock's point is seeming less overstated. Hell, just the difference in goaltending has to take a big bite out of whatever is left of this apple.

Note * I've shaved off the last four games of the season for everybody here, because the missing data wasn't spread around evenly amongst the teams. I'm just going as far as 78 GP for everyone in the graph above.

Sunday, August 10, 2008

Zone Time

From 1999/2000 through the 2001/2002 seasons, the NHL recorded zone times and published them on their game sheets. I don't know why the NHL stopped making this information publicly available, then again I don't understand a lot of the things that they do, I suppose they have their reasons.



Zone time and possession are terms that are often used interchangeably by hockey coaches, and I think we all realize why; having a bit more than your share of touches in the neutral ice pachinko game doesn't represent much of an advantage. But having meaningful possession moving forward, and in the offensive zone, that often ends well for you. When Mike Babcock says "Possession is everything." we know that it's hyperbole, and we know that he's not talking about one of his defencemen standing behind his own net with the puck.

Unfortunately the NHL never broke this down by game situation, so the PP stuff is in there too. Still, it's better than nothing.

I thought it would be worth looking at a season to see how shots, and shots directed at net, meshed with the zone times for the teams. Now obviously some teams play more of a possession game than others, and unbalanced scheduling means that some teams played this type of opponent more than others. Still, the connection between shots and zone time will shine through, as will the link between the ability to outscore opponents and these two things.

So I wrote a simple little Excel macro to scrape the zone time info off of the NHL.com game sheets for 2001/2002, and the shots stuff from the NHL.com event sheets. Laziness prevented me from filtering out the empty net goals, and the data is missing for 21 of the 1230 games, c'est la vie.

The results for the teams are listed in the table below. By way of example, for Atlanta the puck spent 355 more minutes in their own end of the rink than in the good end. They were outscored by 101 goals, outshot by a margin of 927 shots, adding in the missed shots they were beat to the tune of 1276, and for all shots directed at net they were outdone by 1532. And witness the Oilers reaching the very heights of mediocrity.



It's self explanatory, and as always, readers are encouraged to add common sense. A couple unusual seasons for teams in there, but for the most part it rolls out as you would expect. Clearly Babcock has a point, better teams, by and large, have better zone time, though it is a surprisingly small total over the course of a season, even for the terrific teams. Having said that, if I were to make a sweeping statement I probably would go with "Goaltending is everything."

And to try and pin values on it, the Pearson correlation coefficient between zone time and the three different shots metrics:

Shots +/-: .86
Fenwick +/-: .87
Corsi +/-: .90

Much of a muchness between the three really, this is a larger sample than a guy is usually looking at, though. And just generally the Dallas and Colorado seasons look like quirky ones this year, and if I gave even half a damn about either team, I'd probably have a closer look.
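The three shots metrics nest inside one another, which is easiest to see with the Atlanta differentials quoted earlier in this post. The sketch below assumes the usual definitions (Fenwick adds missed shots to shots on goal, Corsi adds blocked shots on top of that), so treating the remainder as blocked shots is an assumption.

```python
# Atlanta's 2001/2002 differentials as quoted in the post (negative = outdone).
shots_diff = -927     # shots on goal
fenwick_diff = -1276  # shots on goal + missed shots
corsi_diff = -1532    # all shots directed at net

# Each step adds one more kind of shot attempt, so the pieces fall out by subtraction.
missed_diff = fenwick_diff - shots_diff    # -349
blocked_diff = corsi_diff - fenwick_diff   # -256 (assumed to be blocked shots)

print(missed_diff, blocked_diff)
```

Since each metric contains the one before it, it is no surprise that all three correlate with zone time to about the same degree.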


Update: In response to a comment by Traktor, the table has been changed to reflect which teams made the playoffs. The playoff team names are highlighted in light green.

Saturday, August 02, 2008

Shooting percentages, Gerry Meehan and that other kind of promotion

Shooting percentage is a cool thing. It turns out that if you look at the NHL leader board for this stat, and scroll back through the years, you hit a bunch of great stories. This is the stuff that out-of-nowhere seasons are made of.

In 87/88 Mikko Makela of the New York Islanders finishes 3rd in the league at shooting%, surrounded on the leaderboard mostly by excellent players having good years (Simpson, Stastny, Loob, Nieuwendyk, Verbeek, Robitaille) as well as guys who played a lot with Gretzky or Hawerchuk. For the life of me, I can't remember this player. His stats show that he had a couple of decent offensive seasons prior to this one, but 87/88 saw a big jump, what the kids call a breakout year, 36 goals and 76 points coming largely on the back of shooting percentage. They went in for Mikko that year, but never again.

Then in 89/90 Lou Franceschetti, a 31 year old journeyman winger in his first year with the Leafs, leads the entire league in shooting percentage (70+ games played) during a 21 goal career year, and this without a single PP goal to his credit. He would score only two more goals in the NHL.

These two guys have something else in common, they were acquired by the Buffalo Sabres in 90/91 and, quite predictably, didn't live up to expectations. This would be Makela's last year in the NHL save an 11 game comeback attempt with the Bruins in 94/95. Franceschetti would play just one more NHL game, this for the Sabres in 91/92. Which begs the question - Who the hell was the GM of Buffalo in 1990?

The correct answer is Gerry Meehan, shown below in his playing days as the young captain of the Buffalo Sabres.

This photo has been nicked from gerrymeehan.blogspot.com. Seriously, I'm not making this up, it is a surprisingly cool site. About the blog author, Mark:
I am a lifelong Washington Capitals fan, or at least lifelong beginning in the late 1970s, when Gerry Meehan and his family moved next door to mine in Bowie, Md. That started years of playing hockey, collecting hockey cards, and amassing probably the most comprehensive collection of Gerry Meehan memorabilia anywhere.
By all accounts I've stumbled across on the web, Gerry Meehan was an honest player and is a good guy. After retiring he went back to school and earned his law degree from University at Buffalo School of Law, started working on contracts for then-Buffalo-GM Scotty Bowman, and would eventually succeed him as GM in 1986.

During his seven years as GM the Sabres were largely mediocre, and managed only one playoff series win. Meehan's best trade, and it was a belter: Hasek for Stephane Beauregard and a fourth-round draft pick. Unfortunately for Meehan, Hasek wouldn't get the starting job until the following season, when he was no longer GM.

Why was he no longer GM? Because he had been promoted, of course. From Wikipedia:
In 1993, Meehan was named executive vice president of sports operations, taking a more active role in the organization's business and legal affairs.

He resigned his position in December 1994.
That's it, just a stream of consciousness. I don't really have a point, other than maybe 'the internet is groovy'. Carry on as you were.