Wednesday, March 10, 2010

Math Atheist

Sometimes I get jealous of the blogging community and not just because 'Lowetide' draws visitors like Disneyland or 'Oilers Nation' reminds me daily that I have zero Photoshop skills. I get jealous of the intelligence of the various QWERTY mashers out there, specifically those in the "Oilogosphere". I'm impressed by the guys who have the ability to navigate the fine print of the NHL CBA and those who have mastered the art of stats analysis for use in their blog posts.

Speaking strictly for me, I understand that there is value, probably great value, in all the stats work that's done but it doesn't hold my attention at all. I'm not trying to say that David Staples' "error stat" isn't worth the time he spends with it but just that between that new stat and "Corsi numbers" and "Qual Comps" and whatever else is out there... I quickly get headaches.

So like Calvin, I'm declaring myself to be a math atheist.

I also don't like tomatoes, mayonnaise, or sushi but most people I know love all of the above. Mom would always make me try something before she'd let me declare that I didn't like it and that makes sense. I've tried all three of those items as an adult and thanks but I stand by my childhood opinion - I don't like them.

When it comes to stats and math, or the advanced math that gets tossed around so eloquently in the "Oilogosphere", part of me is intrigued even though another side has already decided - I don't like math. I've tried to wrap my head around the theories but for the most part, they put me to sleep. Again, that's my flaw and not a comment on the value of stats.

By now you're wondering what the hell I'm rambling on about. Last night (or early this morning) I noticed a small entry over at The Copper & Blue - (those guys do a ton of great work BTW).

My first reaction was to wonder if Taylor Chorney has become the new whipping boy for the stats-based bloggers. Last night I heard TEAM 1260's Corey Graham and Wil Frazier talking about Chorney's poor play in the game against Ottawa and, without having seen the game myself (I was on the air at the time), I have no reason to doubt that their opinions are accurate.

So I re-read Jonathan's snippet on Chorney's struggles at the AHL level. Something didn't feel right about it aside from how ugly that -51 looks on paper. I think it's the way that J.W. projected the numbers to a full 82-game schedule. I'm not a stats guy so I don't know but... is that fair to do? To assume that what a guy did in 32 games is what he would have done in 82? Like mayo, that doesn't work for me.

Using that example, is it then fair to suggest that if Devan Dubnyk played 40, 50 or hell, all 82 games this year... he would still have zero victories? Ales Hemsky had 22 points in 22 games when his season ended. Should we believe that he would have been an 82-point man if he'd played the entire year?
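For anyone who wants to see what that kind of projection is doing under the hood, here is a minimal sketch in Python. It assumes the simplest possible method, a straight-line pace that holds the per-game rate constant. The function name `project_pace` and the 10 games used for Dubnyk are made up for illustration and are not taken from any blogger's actual work.

```python
def project_pace(total, games_played, season_games=82):
    """Scale a running total to a full season at the same per-game rate."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return total / games_played * season_games


# Hemsky: 22 points in 22 games, as quoted in the post.
hemsky_points = project_pace(22, 22)

# Dubnyk: zero wins projects to zero wins no matter how many games
# you plug in. The 10 here is a hypothetical placeholder.
dubnyk_wins = project_pace(0, 10)

print(hemsky_points, dubnyk_wins)
```

The sketch makes the limitation plain: a straight-line pace assumes nothing about the player or the team changes over the remaining games.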

Maybe I'm being simplistic or naive but that seems flawed to me so, like a tuna fish sandwich... I passed on it.

One of the things Corey and Wil (and a caller) were discussing was how bad Chorney's plus/minus has been everywhere he goes. It doesn't matter if it was last year or this year, Edmonton or Springfield, Chorney's rating reads like a thermometer in January; -20, -11, -29. But was it always like that?

I checked his stats from North Dakota where he played for 3 years before turning pro. In 2005-06 he ended the year with a +21. He followed that up with a +2 and had a +8 as a junior. Three consecutive seasons with the Fighting Sioux where he was a combined +31.

But is that a fair way to assess him? I mean, he was playing on a pretty darn good team those years, especially in 2005-06 when UND was good enough to contend for the national title.

But maybe that's the point. On a good team he was a plus player but on the worst team in the AHL (for two years now) and the last place team in the NHL his plus/minus has been at the opposite end of the spectrum. Should I have expected anything other than that? Is it too simple to say that Chorney's numbers as a pro are in large part a reflection of how dreadfully awful the three teams he's played for have been?

I'm probably missing something and if so, this is an open invite for someone to educate me. Just make sure you dumb it down for me using small, mono-syllabic words.

And hold the mayo.

Jonathan does a superb job with his blog at The Score so if you're not a regular reader like we are, do yourself a favor and check it out.

(Photos: Springfield Falcons and Kory Wallen/FightingSioux.com)

Guy Flaming is one of the hosts of The Pipeline Show heard live every Tuesday evening from 7-9pm (MST) 9-11pm (EST) at www.thepipelineshow.com and the TEAM 1260. Flaming has covered prospects for the last decade and has contributed written projects to The Hockey News, Hockey's Future and Future Considerations among others. Guy's broadcast experience includes colour commentary for the Edmonton Oil Kings, University of Alberta Golden Bears as well as the 2004 and 2006 Viking Cup tournaments.

17 comments:

Eetu Huisman said...

The problem with ± is that it - like most numbers - is not valuable without a context. It has some value when compared to other players of the same position in the same team and even more value when the quality of a) teammates the player is on the ice with and b) opponents the player faces are taken into account.

(I don't want to say anything about Chorney because I've never seen him play. I think you should take time to read the FAQ about Statistical Analysis in the NHL by Gabriel Desjardins. It might make you a believer, or at least an agnostic :-)
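To make the point about context concrete, here is a rough sketch in Python of one way to compare a player's on-ice goal differential with his team's differential while he is off the ice. The per-60-minute scaling is an assumption of this sketch, the function names are hypothetical, and every number is invented purely for illustration. None of it is real data for Chorney or anyone else.

```python
def per_60(goals_for, goals_against, minutes):
    """Goal differential per 60 minutes of ice time."""
    return (goals_for - goals_against) * 60 / minutes


def relative_plus_minus(on_ice, off_ice):
    """On-ice differential rate minus the team's rate while the player sits.

    Each argument is a (goals_for, goals_against, minutes) tuple.
    """
    return per_60(*on_ice) - per_60(*off_ice)


# Hypothetical numbers, invented for illustration only.
on_ice = (10, 20, 600)     # team outscored 20-10 during his 600 minutes
off_ice = (40, 60, 1200)   # team outscored 60-40 in 1200 minutes without him

print(relative_plus_minus(on_ice, off_ice))
```

In this made-up case the player is a raw -10, but the team bleeds goals at the same rate without him, so his relative number is zero.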

Guy Flaming said...

Relative +/- adjusts a player's on-ice +/- relative to his team's +/- while he was off the ice.

Thanks for the link but I'll have to try reading it again tomorrow after I get some sleep. I got as far as the line above and felt something pop behind my eyebrow. It may have been a synapse trying to fire.

:)

GSC said...

Guy, great article.

One of the biggest issues I have with the math bloggers' "analysis" is that their inherent bias tends to show up, despite their best efforts to mask it.

You provided the Chorney example, showing that a certain blogger used a 32-game sampling to predict how his season would turn out. Now, if someone did that for Shawn Horcoff this season (as many did), the bloggers would be up in arms (they were) suggesting that the sampling size is "too small."

I don't care what method of reasoning or theory one uses, there's always going to be bias no matter how one tries to hide it. Statistics are so easily manipulated that it allows for such a practice.

dstaples said...

Good post, Guy.

And don't worry, you'll come around on the error stat and true plus/minus soon enough ;)

You see, it's not math.

It's the plays where a guy helps score a goal for his team compared to the plays where he helps cause a goal against.

It's so profoundly simple-minded even I could think it up.

dstaples said...

You know, I've spent many, many hours -- way too many -- trying to figure out what is good and bad about all the new stats.
A big part of the problem is that the math guys -- as brilliant as they are -- can get pretty haughty and testy when you ask them to explain themselves.

Of course, if you come up with some new concept, I see it as your job to explain it, but that's not the consensus, apparently.

In any case, over time some of the new stats will be accepted, others won't.

For now, if you're watching the game and are an expert fan and are paying close attention, I'm sure your take on the Oilers is as valuable as any stats guy's.

Guy Flaming said...

Thanks David, for not taking offence.

Here's another thought: After 35 games, what would Dustin Penner's performance have projected to end up as for 82 games? For me, current events tell me that it's difficult to do that with enough accuracy to be a reliable method.

But I have no problem at all with being wrong on this.

godot10 said...

The main problem that I see with stats guys is that they only do half the work.

They use stats to justify opinions. They don't use stats to prove anything, which would be to do the other half of the work and demonstrate the statistical significance of the statistics they generate.

A statistic is "meaningless" unless you stick an error bar on the result. It really isn't math or statistics without the error bar, it is just statistical dilettantes messing around.

So they do the easy part, which gives them talking points, but they don't do the hard work, which would actually prove something.

i.e. They generate hypotheses (based on statistical observations) but then never prove the hypotheses by doing rigorous statistical significance analysis of their statistical observations.

I do believe they have demonstrated that serious statistical analysis of hockey is lacking, and would be of benefit, though most of the advocates are unwilling to go beyond statistical handwaving, because rigorous statistical analysis would be a mammoth undertaking.
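As a rough illustration of what sticking an error bar on a projection could look like, here is a minimal Python sketch using the 22-points-in-22-games example from the post. It assumes the point total behaves like a Poisson count, so its standard deviation is about the square root of the total. That is a modelling assumption made for this sketch, not a method anyone in this thread has endorsed, and the function name is hypothetical.

```python
import math


def projected_with_error_bar(total, games_played, season_games=82, z=1.96):
    """Project a count to a full season with a rough 95% margin.

    Assumes the count is roughly Poisson, so its standard deviation
    is about sqrt(total). This is an assumption, not an established fact.
    """
    scale = season_games / games_played
    projection = total * scale
    margin = z * math.sqrt(total) * scale
    return projection, margin


# 22 points in 22 games, as quoted in the post.
projection, margin = projected_with_error_bar(22, 22)
print(f"{projection:.0f} +/- {margin:.0f} points")
```

Under that assumption the projection works out to roughly 82 points, give or take about 34, which shows how wide the uncertainty is after only 22 games.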

Jonathan Willis said...

Good article, Guy.

When I was writing that piece I originally left the projections out, and just put up plus/minus like this:

2009-10: -0.622 per game

I did the same thing with the other numbers, but looking at decimals bothers me, and it's tough to read. So I projected over 82 games, with my sole motive (and I'll swear to this) being to show those decimals in a more readable format.

Or in other words, to make it more readable for people who don't like math ;)

As for Chorney himself, aside from 2005-06 (where I think it's probably fair to say he was playing a sheltered role as a rookie) his plus/minus ranked below the team average each and every season. That holds true in North Dakota, in Springfield, and in Edmonton.

I also think it comes across watching him play. His cameo in two games against Calgary last season was awful, and he's been both tentative in his own end this year and prone to gambling.

Coach pb9617 said...

One of the biggest issues I have with the math bloggers' "analysis" is that their inherent bias tends to show up, despite their best efforts to mask it.

You provided the Chorney example, showing that a certain blogger used a 32-game sampling to predict how his season would turn out. Now, if someone did that for Shawn Horcoff this season (as many did), the bloggers would be up in arms (they were) suggesting that the sampling size is "too small."

You mean like using a sample size of one to attack all math-based articles?

Guy Flaming said...

For the most part this hasn't turned into a pissing match between guys who use and like stats and those who do not. I'm glad because that was not the purpose for this post.

I think stats have their place but it's just not the angle I come from. But just because it's not my bag that doesn't mean that it's unworthy. Far from it. Some people love mayo which makes me puke.

Jonathan: lol, I think I actually would have understood the -.622 stat better than what you had. Go figure.

Jfry said...

i think jon's larger thesis was that for two years straight, chorney has put up horrendous minor league numbers that don't really jibe with the NHL opportunity he's been given.

putting it in 60min chunks or 82 game chunks, or whatever, is just a way of presenting info on a scale people are used to...

i think jon also lets some of the numbers make his point...sort of a "picture" is worth a thousand words, except here it's "stats".

Ribs said...

You mean like using a sample size of one to attack all math-based articles?

Hee Hee.

Hawerchuk said...

Guy,

Interesting post. One thing I wonder is what you mean by "math". Things like "Corsi number" or "Scoring chances" involve no math. It's just somebody sitting there and counting shots and making a judgment call. Not very mathematical.

The next level up is things like Quality of Competition. But Quality of Competition doesn't really involve much more than arithmetic. I've met with various hockey people to describe the method and they usually say: "Where's the math?"

Or do you mean things like Vic's beta distributions?

Guy Flaming said...

Math is being used as a broad, simplified term for stats/numbers.

mc79hockey said...

With all due respect, I sort of think that declaring yourself to be a math/stats/numbers atheist is basically tantamount to saying that you can't be bothered to take a serious part in the discussion.

I understand what you and some of the other prospophiles do Guy. I get that there are people who are interested in the narrative and the story and less so in the nuts and bolts of the game. I'm not here to tell people that they have to understand the game a certain way.

What fucking galls me though is that by limiting yourself in this way, you're basically checking out when it comes to the sanity check "Is this true?" factual type stuff that the numbers give us. You wrote a series of pieces last year about KP in which you kind of tagged MacT. I went to the numbers which kind of made your case look a little weak, in my opinion. This season's kind of been a data point that way too.

Again, like I say, enjoy the game however you like to enjoy the game, but if you're choosing to ignore this stuff, you're basically opening yourself up to all sorts of errors. It's valuable info.

Guy Flaming said...

It is valuable info; I've made that point several times. If I was running an NHL team I'd have someone like you on staff to chime in from that stats analysis angle.

That's not my department though. I didn't get the value in Jonathan's analysis in that one small example. That's it.

I have no problem with admitting stats have their place. It's just not something you'll find coming from me very often.

I leave that to you guys because you do it way better than I possibly can.

D said...

Hi Guy
I always enjoy your work and I enjoyed your humour in this post (love Calvin & Hobbes); I got what you were doing and I didn't think you were dissing anyone. Just my impression.

I also appreciate what Tyler et al. do. I don't understand it all well, but I do try; I have enough intelligence to get the gist of it but not enough to join conversations. I've done some stats (in another field) and understand they are or can be possible indicators, trends, etc. There's always a % of error but the info is still useful in the big picture.

Anyway, hope to see Eberle play a few games, maybe the Worlds too before summer.