The Bradbury aging study, re-explained (Part II)
This is a follow-up to my previous post on J.C. Bradbury's aging study ... check out that previous post first if you haven't already.
My argument was that players with shorter careers should peak earlier than players with longer careers. Bradbury disagreed. He reran his study with a lower minimum, 1000 PA instead of 5000. He found that there was "no drop".
I decided to try to run his study myself, the part where he looks at batter performance in Linear Weights. I think my results are close enough to his that they can be trusted. Skip the details unless you're really interested. I'll put them in a quote box so you can ignore them if you choose.
-----
Technical details:
Here's what I did. I took all players whose careers began in 1921 or later, and looked at their stats until the end of 2008 (even if they were still active). They had to have had a plate appearance in each of at least ten separate seasons. In seasons in which their age was 24 to 35 (as of July 1), they had to have had at least 5000 plate appearances.
Any player who did not meet the above criteria was not included in the regression. Also, the regression included only seasons from age 24-35 in which the player had at least 300 PA.
Each of those seasons was a row in the regression. The model I used was:
Z-score this season = a * age this season + b * age^2 this season + c * career average Z-score + d * player dummy + constant + error term
I didn't include dummy variables for individual seasons (Bradbury's "D" term, if you look at his paper) or park factors. I think those would change the results only slightly.
Another difference I noticed later is that when I calculated the Z-scores, I used the standard deviation only of players who were 24-35 and had 300 PA. Bradbury, I believe, used the SD of all players, regardless of PA. Again, I don't think that affects the results much (although it makes his coefficients about twice as big as mine).
Finally, I'm not 100% sure that I did exactly what Bradbury did in other respects. The study is vague about the details of the selection criteria. For instance, I'm not sure if any ten seasons qualified a player, or only ten seasons of only 300 PA. I'm not sure if the player need 300 PA every season between 24 and 35, or if that didn't matter as long as the total was over 5000. So I guessed. Also, for Linear Weights, I used a version that adjusts the out for the specific season, whereas Bradbury used -0.25 for all seasons (and compensated somewhat by having a dummy variable for league/season).
-----
Anyway, here is my best-fit equation, followed by Bradbury's:
Mine: Z = 0.760 * age - 0.0133 * age^2 - 0.901 * mean - 10.6802 + dummies
J.C.: Z = 1.322 * age - 0.0224 * age^2 - 1.205 * mean + other stuff + dummies
These equations look different, but that's mostly because Bradbury used a different definition of the Z-score. If you look at the significance levels, they're similar: for mine, about 12 SDs; for Bradbury, about 11 SDs. Bradbury might be smaller because his regression was more sophisticated, with certain corrections that likely brought the significance down.
More importantly, our estimates of peak age, which can be calculated as - ( coeff for age ) / ( 2 * coeff for age^2 ):
Mine: 28.62 peak age
J.C.: 29.41 peak age
Why the difference? My guess is that there was something different about our criteria for selecting players for the sample. Again, I don't think the difference affects the arguments to follow.
Now, this is where J.C. says he ran the regression again, for 1000PA and no 10-year-requirement, and got no difference in peak age. I did the same thing, and I *did* get a difference:
Mine, for 5000 PA: 28.62
Mine, for 1000 PA: 28.06
It looks like a small difference, only .56 years -- and the total of 28.06 is still above the previous studies' conclusion that the peak is in the 27s. However, as it turns out, the way the study is structured, that small difference is really a big difference. Let me show you.
First, I ran the same regression, but this time only for players with 3000-5000 PA:
3000-5000 PA: 27.61
So, these guys with shorter careers did have an earlier peak, about a year earlier than the guys with the longer careers. What if we now look at the guys with really short careers, 1000-3000 PA?
1000-3000 PA: 147.00
That's not a misprint: the peak came out to age 147! But the coefficients of the age curve were not close to statistical significance -- neither the age, nor the age-squared. Effectively, these guys performed almost the same regardless of age. They didn't peak at 29, but neither did they peak at 27. They just didn't peak.
And so, it's reasonable to conclude that one of the reasons the peak age dropped so little, when we added more players like Bradbury did, is that the regression wasn't able to find the peak for the players with the shorter careers. And so the sample still consists of mostly players with longer careers.
------
Can we solve this problem? Yes, I think so. The procedure cut off the sample of players at 24 and 35 years of age. If we eliminate the cutoff, the results start to work.
I reran the regression with no age restrictions: players had to have 5000 or 1000 PA anywhere in their careers, not just between 24 and 35. Also, I considered all seasons in which they had 300 PA, regardless of how old they were that year. The numbers are similar:
28.97 for 5000 PA+
28.66 for 1000 PA+
The difference is smaller now, 0.31 years. But the important result is the breakdown of the 1000+ group:
28.97 for 5000 PA+
27.72 for 3000-5000 PA
26.61 for 1000-3000 PA (now significant)
----------------------------------------
28.66 for the overall sample
It seems like the shorter the career, the earlier the peak.
But, still, the overall average seems to only have dropped 0.31 of a year, and it's still around 29 years. Isn't that still evidence against the 27 theory?
No, it's not.
Take a look at the above table again: we have three peaks, 28.97, 27.72, and 26.61. Those three numbers average to 27.77. Why, then, is the "overall" number so much higher, at 28.66?
It's because there were a lot more datapoints in the 5000 PA+ category than the others. And that makes sense. The more PA, the more seasons played. And each season gets a datapoint. So the top category is full of batters with 10 or more seasons, while the bottom category is full of batters with only a few seasons. In fact, some of them may have only 1-2 qualifying seasons of 300 PA or more.
If a player has a 15-year career, with a peak at age 29, he gets fifteen "29" entries in the database. If another player has a 3-year career with a peak of 27, he gets only three "27" entries. So instead of the result working out to 28, which is truly the average peak of the two players, it works out to 28.7.
Another way to look at it: Player A has a 12-year career. Player B has a 2-year career. What's the average career? It's 7 years, right? And you get that by averaging 12 and 2.
But the way Bradbury's study is designed, it would figure the average career is 10.57 years. Instead of averaging 12 and 2, it would average 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 2, and 2. That's not the result we're looking to find.
This is less of a problem in Bradbury's original study, because, by limiting players to 12 years of their career, and requiring them to play 10 seasons, most of the batters in the study would be between 10 and 12 years, so the weightings would be closer. Still, this feature of the study means that it's probably overestimating the peak at least a little bit, even for that sample of players.
So, anyway, if 28.66 is not the right average because of the wrong weights, how can we fix it? Simple: instead of weighting by the number of regression rows in each group, we weight by the number of players in each group:
28.97 for: 640 players with 5000+PA
27.72 for: 595 players with 3000-5000 PA
26.61 for 1148 players with 1000-3000 PA
----------------------------------------
27.52 overall average
So what looked like a small drop when we added the shorter-career players -- 0.31 years -- turns into a big drop -- 1.55 years -- when we weight the data properly.
Now, this only works when there actually IS a drop between the 5000+ and the 1000+ groups. We found a drop of 0.31. But on his blog, Bradbury said that with his data, he found no drop at all.
How come? I'm not sure. But one reason might be random variation (if he used different selection criteria). Another might be his age restriction causing nonsensical results in the important 1000-3000 group. And there are his other variables for "missed information resulting from playing conditions". Or, of course, I may have done something wrong.
------
So we're down to 27.52. That's pretty close to the traditional estimates of 27ish. But I think we're not necessarily done: there are at least two factors I can think of that suggest that the real value is lower than even 27.52.
First, we showed that the regression overestimates the peak age by overweighting long careers relative to short careers. We were able to get the average to drop from 28.66 to 27.52 just by breaking the sample down and reweighting.
By the same logic, all three groups above must also be overestimates! In the middle group, players with 5000 PA are going to be weighted 67% higher than players with only 3000 PA. If we were to rerun the regression after breaking the group down further, into (say) 3000-4000 and 4000-5000, we'd get a lower estimate than 27.52. In fact, we could break those new groups down into smaller groups, and break those groups down into smaller groups, and so on. The problem is that the sample size would get too small to get reasonable results. But I'm betting the average would drop significantly.
Second, the study leaves out players with less than 1000 PA. That's probably a good thing, because with only 1 or 2 seasons, it's hard to fit a trajectory properly. Still, it seems likely that if there were a way of figuring it out, we'd find those players would peak fairly early, bringing the average down further.
------
So, in summary:
-- If we use the Bradbury model on groups of players with fewer PA, we find that those players are estimated to have lower peak age. This supports the hypothesis that choosing only 5000+ PA players biases the result too high.
-- The model used in Bradbury's study consistently overestimates peak age for another reason. That's the weighting problem -- it figures the peak for the average *season*, not for the average *player*.
-- Correcting for that shows that if we look at players with 1000 PA, instead of just players with 5000 PA, the peak age drops to the mid 27s.
-- Other corrections that we can't make, because of sample size issues, would drop the peak age even further.
-- There is good evidence that the shorter the career, the younger the peak age.
-- It doesn't seem possible, with this method, to get a precise estimate of average peak age. "Somewhere in the low 27s" is probably the best it can do, if even that.
