Sabermetric Research: Factors influencing home field advantage

The more games in a season, the more likely the best teams will rise to the top, and the worse teams will fall to the bottom. That's just common sense, and the law of large numbers.

Similarly, the more innings in a game, the more likely the best team will win. If you put the whole season into a single 1,458-inning game, there's no doubt that (for instance) the Yankees would beat the Twins.

Home field advantage (HFA) is one of those things that makes teams better. And so, the longer the game, the more likely the home team's advantage will show up in the results. HFA for a 3-inning game would be smaller than HFA for a 9-inning game.

So, it's easy to compare HFA within one given sport. But how do you compare two? According to "Scorecasting," over the ten seasons ending in 2009, home teams went .605 in the NBA and .557 in the NHL. Why the difference?

In the past, I've used an argument that I now somewhat regret. It went something like this: "We know that a longer game means a higher HFA. Therefore, basketball must be 'longer' than hockey in some sense. Perhaps there are more confrontations between players, or something, which allows the HFA to expose itself more easily."

But now, I think, that line of thinking is too vague. It's almost a circular argument. "Why is the NBA higher?" "Because the game is longer." "What do you mean by longer?" "I don't know exactly, but it's the attribute of NBA games that makes home field advantage bigger."

It's like, suppose we don't know what causes lung cancer, except smoking. And then we find a country that has a high rate of lung cancer, even though they don't smoke much. Do we say, "that country must be somehow 'cigarettier'?" That would be silly.

And, in any case, we can do better. There are identifiable reasons why the NBA home record is higher than the NHL home record. They don't solve the problem entirely, but at least they're concrete factors.

-------

I'm going to start by calculating the theoretical HFA for the National Hockey League, step by step.

From 1980-81 to 1984-85, the home team outscored the visiting team by .70619 goals per game.

The home team scored an average of 4.264 goals per game. Since it's typically assumed that goals have a Poisson distribution, the SD of goals per game is the square root of that, or 2.065. (It's a property of the Poisson distribution that the SD is the square root of the mean.)

The visiting team scored an average of 3.557 goals, for an SD of 1.886.

So, since the variances of two independent scores add, the SD of (home team - visiting team) is the square root of (4.264 + 3.557), or 2.797.

Therefore, if the two teams were exactly equal, the goal differential would be a normal curve with mean 0, and SD 2.797.

But the HFA makes those two teams unequal, by .70619 goals. Divide that by 2.797 and we see that they're unequal by 0.252 of an SD. Therefore, the home team wins if the random outcome is greater than -0.252 SDs.

Going to a normal distribution table, that probability is 0.600.

In those NHL games, the home team actually had a winning percentage of .592. Not bad!
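Here's a minimal Python sketch of that calculation, for anyone who wants to rerun it. It uses only the two rounded scoring averages quoted above (their difference is about 0.707 goals), and the function names are my own, chosen for illustration. Like the calculation itself, it assumes independent Poisson scoring and a normal approximation to the goal differential.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def theoretical_hfa(home_mean, away_mean):
    """Home winning percentage under the Poisson-plus-normal approximation."""
    # For a Poisson variable the variance equals the mean, and the
    # variances of independent scores add, so
    # SD(home - away) = sqrt(home_mean + away_mean).
    sd_diff = sqrt(home_mean + away_mean)
    # The home edge, expressed in SDs of the goal differential.
    z = (home_mean - away_mean) / sd_diff
    return normal_cdf(z)

home_mean, away_mean = 4.264, 3.557  # goals per game, from the text

print("SD of home goals:    ", round(sqrt(home_mean), 3))
print("SD of visiting goals:", round(sqrt(away_mean), 3))
print("SD of differential:  ", round(sqrt(home_mean + away_mean), 3))
print("Theoretical HFA:     ", round(theoretical_hfa(home_mean, away_mean), 3))
```

Two evenly matched teams are built into this: the only thing separating the two scoring rates is the home edge.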

Why is our theoretical estimate too high? Well, one reason is that our calculation assumed two equally talented teams. But in real life, there are always differences in talent, sometimes large ones. And HFA decreases as the talent gets more uneven. (If a .000 team plays a 1.000 team, the HFA is obviously zero.)

So, that's one reason our estimate is too high. It's probably not all of it.

-------

Now, let's go back to game length. We all agree that if we increased the length of an NHL game, say from 60 minutes to 120, the HFA would increase.

Suppose the league does that. But, at the same time, it decides to also reduce the number of goals scored. Now, every time a goal is scored in the six-period game, the referee flips a coin. If it's heads, the goal stands. If it's tails, the goal doesn't count.

That means the average game score is the same. The distribution of goal differential is the same. The only thing that's different is the length of the game -- the number of confrontations between players, and the length of time one team has to show it's superior to the other team.

So, we should expect the HFA to go up, right?

It doesn't. It stays the same. (Actually, it goes down a bit, but never mind.)

In the old NHL, home goals had a Poisson distribution with mean 4.264. And in the new NHL, home goals *also* have a Poisson distribution with mean 4.264. The distributions are identical, because Poisson applies (as an approximation) to any rare events. Whether it's over 60 minutes or 120 minutes, 4.264 goals qualifies as rare.

So if we repeat the calculation for HFA, every step is exactly the same as before! And so we get the same answer.
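The coin-flip claim can be checked directly. The sketch below assumes that raw goals in the 120-minute game are Poisson with twice the 60-minute mean (that doubling is my assumption, as are the function names), then applies the referee's fair coin to each goal and compares the result with the original 60-minute distribution, using the home scoring rate of 4.264 from the earlier calculation.

```python
from math import comb, exp, factorial

def poisson_pmf(k, mean):
    """Probability of exactly k goals when goals are Poisson(mean)."""
    return exp(-mean) * mean**k / factorial(k)

def thinned_pmf(k, raw_mean, keep=0.5, max_n=80):
    """Probability that exactly k goals stand when raw goals are
    Poisson(raw_mean) and each one counts with probability `keep`."""
    # Sum over the possible raw totals n; given n raw goals, the number
    # that survive the coin flips is Binomial(n, keep). The sum is cut
    # off at max_n, far out in the tail for these means.
    return sum(
        poisson_pmf(n, raw_mean) * comb(n, k) * keep**k * (1 - keep)**(n - k)
        for n in range(k, max_n)
    )

old_mean = 4.264          # home goals per 60-minute game, from the text
raw_mean = 2 * old_mean   # assumption: twice as many raw goals in 120 minutes

print("goals  60-minute game  120 minutes + coin flip")
for k in range(8):
    print(k, round(poisson_pmf(k, old_mean), 6), round(thinned_pmf(k, raw_mean), 6))
```

The two columns agree, which is the standard "thinning" property of the Poisson distribution. Under these assumptions the match is exact, so the small decline mentioned above has to come from something this simple model leaves out.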

-------

So if "length of the game," in terms of confrontations or action, doesn't matter, what *does* matter?

Goals. The more goals scored, the higher the Poisson mean, and so the higher the SD of game results. That means more randomness, a wider spread. If there's a wider spread, that means the HFA of 0.706 goals is smaller relative to luck. And so, it has less opportunity to express itself, and we get a lower HFA.

Just to give you an example: suppose the average goals per team increases to 6. That means the SD of a game difference is 3.46. The difference of 0.706 goals is now only .204 of an SD, which gives you 58.1 percent of a normal curve. So the HFA drops to .581.
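To see the pattern across more than one scoring level, here is a short sketch that holds the home edge fixed at 0.706 goals and varies the average goals per team. The scoring levels other than 6 are arbitrary values picked for illustration, and the function name is hypothetical. It uses the same Poisson and normal approximations as the rest of the post.

```python
from math import erf, sqrt

def hfa_from_scoring(goal_edge, goals_per_team):
    """Home winning percentage when each team averages about
    `goals_per_team` Poisson goals and the home side is `goal_edge` better."""
    sd_diff = sqrt(2 * goals_per_team)      # variances of the two scores add
    z = goal_edge / sd_diff                 # home edge in SDs
    return 0.5 * (1 + erf(z / sqrt(2)))     # standard normal CDF

# Illustrative scoring levels; only 6.0 comes from the example in the text.
for goals in (2.5, 4.0, 6.0, 10.0):
    print(goals, round(hfa_from_scoring(0.706, goals), 3))
```

The winning percentage falls steadily as scoring rises, because the same 0.706-goal edge becomes a smaller fraction of the spread.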

Goal difference is part of the reason that I got an actual HFA of .592, but "Scorecasting" showed an actual HFA of only .557. The Scorecasting study used the ten seasons ending 2009, when scoring was historically low. I used 1980-81 to 1984-85, when scoring was historically high.

-------

So, have we found an answer? Can we say that one reason why basketball HFA is higher than hockey HFA, is that basketball has so much more scoring? Well, yes and no. Yes, scoring is part of it, but, no, we can't use this particular argument, because basketball is not Poisson.

Indeed, non-Poisson-ness is one of the factors boosting the NBA home field advantage. As it turns out, the farther the distribution gets from ideal Poisson, the lower the random variance. And lower randomness boosts HFA, by providing less noise to drown out the HFA's signal.

If the NBA switched to Poisson, by making the game 20 times longer, and making baskets 20 times harder to achieve, HFA would go down, even though scoring would stay the same.

Well, not necessarily. It depends whether teams change the way they play under the new system. The home advantage in the NBA is, what, 3 or 4 points a game? If the hoop became 20 times harder to hit, that 3 or 4 points might change to something else entirely, and we'd have to recalculate.
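As a rough illustration of the non-Poisson effect, here is a toy model with every number assumed rather than measured: each team gets 100 possessions, each worth 2 points with probability 0.5, and the home edge is fixed at 3.5 points. The "stretched" version has 20 times the possessions and one-twentieth the chance of scoring, so average scoring is unchanged. It deliberately ignores the caveat just mentioned, that the point edge itself might change.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def hfa_binomial(point_edge, possessions, make_prob, points_per_make=2):
    """Home winning percentage when each team's score is
    points_per_make * Binomial(possessions, make_prob)."""
    # Binomial variance is n * p * (1 - p); scaling by the points per
    # make multiplies the variance by points_per_make squared.
    var_team = points_per_make**2 * possessions * make_prob * (1 - make_prob)
    sd_diff = sqrt(2 * var_team)            # two independent team scores
    return normal_cdf(point_edge / sd_diff)

edge = 3.5  # assumed home edge in points (the post says "3 or 4")

# Assumed toy game: 100 possessions per team, 50% chance of 2 points each.
today = hfa_binomial(edge, possessions=100, make_prob=0.5)
# Same average score, but 20x the possessions and 1/20 the success rate,
# which pushes the score distribution close to Poisson.
stretched = hfa_binomial(edge, possessions=2000, make_prob=0.025)

print("toy NBA:      ", round(today, 3))
print("stretched NBA:", round(stretched, 3))
```

In this toy setup the stretched game has nearly double the variance per team, so the same 3.5 points buys a smaller winning percentage.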

--------

So we have two factors affecting HFA so far, all else being equal:

1. Non-Poisson-ness increases HFA.
2. More scoring decreases HFA.

--------

You can probably think of more factors. I've got a couple I'll save for a future post.

Labels: basketball, hockey, home field advantage, NBA, NHL, statistics

7 Comments:

At Thursday, May 10, 2012 9:49:00 AM, Alex said...: The home field work I'm aware of (namely http://www.advancednflstats.com/2008/09/home-field-advantage-by-quarter.html) points out that HFA starts high and decreases over the course of the game. Doesn't that suggest that shorter games should give the home team a better chance of winning, not long games?
At Friday, May 11, 2012 8:32:00 AM, Phil Birnbaum said...: HFA starts high and decreases on a per-quarter basis, but it's still positive in every quarter. So, the HFA over the whole game is still larger than the HFA for the first quarter alone.

For instance, if the HFA in quarters is 2, 1, 1, and 1 point, the HFA after the first quarter is 2 points, but after the whole game it's 5 points.

That's the HFA in terms of points. In terms of winning percentage, you need to compare the points to the SD of score differential.

If I've got this right ... as long as the HFA in points rises faster than the square root of game length, the HFA of winning percentage rises too.

In this case, points rise by a factor of 2.5 (from 2 points to 5 points), but square root of length rises by 2 (from 1 quarter to 4 quarters). So, HFA in terms of winning percentage rises too.
At Friday, May 11, 2012 9:04:00 PM, Anonymous said...: You seem to be assuming that total scoring can increase without changing the amount the home team outscores the visitor. Isn't that rather unlikely? At least it would be nice to see some historical data from high scoring and low scoring periods to see if the home team still holds the same absolute advantage.
At Friday, May 11, 2012 10:05:00 PM, Phil Birnbaum said...: Yes, that's true. I'd assume that for Poisson sports, the absolute HFA would increase with scoring.

So the higher absolute advantage offsets the lower translation into HFA winning percentage.
At Wednesday, May 16, 2012 12:16:00 AM, Chris Phillips said...: Didn't hockey have ties during the time period you're looking at? I would think that that would bring the HFA closer to .500.
At Wednesday, May 16, 2012 12:19:00 AM, Phil Birnbaum said...: Chris: I counted a tie as half a win when calculating winning percentage.
At Wednesday, May 16, 2012 9:56:00 AM, Chris Phillips said...: Yes, but you've theorized that the longer a team plays, the more likely the better team will win. In those years in hockey, the weaker team could hang on for half a win and improve their win percentage, while the better team only got credit for half a win and worsened their percentage.

Also, how did the win percentage of the best teams in hockey with ties counted as half a win compare with the best win percentage of teams in sports with no ties?


Tuesday, May 08, 2012
