Monday, February 27, 2012

Are early NFL draft picks no better than late draft picks? Part I

Dave Berri thinks that NFL teams are inexplicably useless at evaluating quarterback draft choices. He bases that belief on data he presents in a 2009 study, co-written with Rob Simmons, in the Journal of Productivity Analysis. The study is called "Catching the Draft: on the process of selecting quarterbacks in the National Football League amateur draft."

The study was in the news a couple of years ago, gaining a little bit of fame in the mainstream media when bestselling author Malcolm Gladwell debated it with Steven Pinker, the noted author and evolutionary psychologist.

In his book "What the Dog Saw," Gladwell wrote,

"... Berri and Simmons found no connection between where a quarterback was taken in the draft -- that is, how highly he was rated on the basis of his college performance -- and how well he played in the pros."


Pinker, reviewing the Gladwell book in the New York Times, flatly disagreed.

"It is simply not true that a quarter­back’s rank in the draft is uncorrelated with his success in the pros."

Gladwell wrote to Pinker, asking for evidence that would contradict Berri and Simmons' peer-reviewed published study. Pinker referred Gladwell to some internet analyses, one of which was from Steve Sailer. Gladwell was not convinced, but responded mostly with ad hominem attacks and appeals to credentials:

"Sailer, for the uninitiated, is a California blogger with a marketing background who is best known for his belief that black people are intellectually inferior to white people. Sailer’s “proof” of the connection between draft position and performance is, I’m sure Pinker would agree, crude: his key variable is how many times a player has been named to the Pro Bowl. Pinker’s second source was a blog post, based on four years of data, written by someone who runs a pre-employment testing company, who also failed to appreciate—as far as I can tell (the key part of the blog post is only a paragraph long)—the distinction between aggregate and per-play performance. Pinker’s third source was an article in the Columbia Journalism Review, prompted by my essay, that made an argument partly based on a link to a blog called “Niners Nation." I have enormous respect for Professor Pinker, and his description of me as “minor genius” made even my mother blush. But maybe on the question of subjects like quarterbacks, we should agree that our differences owe less to what can be found in the scientific literature than they do to what can be found on Google."

Pinker replied:

"Gladwell is right, of course, to privilege peer-reviewed articles over blogs. But sports is a topic in which any academic must answer to an army of statistics-savvy amateurs, and in this instance, I judged, the bloggers were correct."


And, yes, the bloggers *were* correct. They pointed out a huge, huge problem with the Berri/Simmons study. It ignored QBs who didn't play.

----

As you'd expect, the early draft choices got a lot more playing time than the later ones. Even disregarding seasons where they didn't play at all, and even *games* where they didn't play at all, the late choices were only involved in 1/4 as many plays as the early choices. Berri and Simmons don't think that's a problem. They argue -- as does Gladwell -- that we should just assume the guys who played less, or didn't play at all, are just as good as the guys who did play. We should just disregard the opinions of the coaches, who decided they weren't good enough.

That's silly, isn't it? I mean, it's not logically impossible, but it defies common sense. At the very least, you'd want some evidence for it, instead of just blithely accepting it as a given.

And, in any case, there's an obvious, reasonable alternative model that doesn't force you to second-guess the professionals quite as much. That is: maybe early draft choices aren't taken because they're expected to be *better* superstars, but because they're expected to be *more likely* to be superstars.

Suppose there is a two-round draft, and a bunch of lottery tickets. Half the tickets have a 20% chance of winning $10, and the other half have a 5% chance of winning $10. If the scouts are good at identifying the better tickets, everyone will get a 20% ticket in the first round, and a 5% ticket in the second round.

Obviously, the first round is better than the second round. It has four times as many winners. But, just as obviously, if you look at only the tickets that win, they look equal -- they were worth $10 each.

Similarly for quarterbacks. Suppose, in the first round, you get 5 superstar quarterbacks and 5 good ones. In the last round, you get only one of each. By Berri's logic, the first round is no better than the last round! Because the 10 guys from the first round had exactly the same aggregate per-play statistics as the 2 guys from the last round.
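To make the selection bias concrete, here's a minimal simulation of the lottery-ticket version of the argument (the 20 percent and 5 percent hit rates and the $10 payoff are the numbers from the example above; the sample size and everything else are just illustrative):

import random

random.seed(1)

N = 100_000                   # tickets (or draft picks) per round
P_WIN = {1: 0.20, 2: 0.05}    # round 1 tickets hit 20% of the time, round 2 tickets 5%
PAYOFF = 10                   # every winning ticket is worth the same $10

for rnd, p in P_WIN.items():
    outcomes = [PAYOFF if random.random() < p else 0 for _ in range(N)]
    winners = [x for x in outcomes if x > 0]

    # Unconditional value: round 1 tickets are worth about four times as much as round 2 tickets.
    print(f"Round {rnd}: value per ticket = ${sum(outcomes) / N:.2f}, winners = {len(winners)}")

    # Conditional on winning -- the selectively sampled group -- the rounds look identical:
    # every winner is worth exactly $10.
    print(f"         value per winning ticket = ${sum(winners) / len(winners):.2f}")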

I don't see why Gladwell doesn't get it, that the results are tainted by the selective sampling.

Anyway, others have written about this better than I have. Brian Burke, for instance, has a nice summary.

Also, Google "Berri Gladwell" for more of the debate.

-----

The reason I bring this up now is that, a couple of days ago, Berri reiterated his findings on "Freakonomics":

"We should certainly expect that if [Andrew] Luck and [Robert] Griffin III are taken in the first few picks of the draft, they will get to play more than those taken later. But when we consider per-play performance (or when we control for the added playing time top picks receive), where a quarterback is drafted doesn’t seem to predict future performance."


What he's saying is that Andrew Luck, who is widely considered to be the best QB prospect in the world, is not likely to perform much better than a last-round QB pick, if only you gave that late-round pick some playing time.

Presumably, Berri would jump at the chance to trade Luck for two last-round picks. That's the logical consequence of what he's arguing.

-----

Anyway, I actually hadn't looked at Berri's paper (.PDF) until a couple of days ago, when that Freakonomics post came out. Now that I've looked at the data, I see there are other arguments to be made. That is: even if, against your better judgment, you accept that the unknowns who never got to play are just as good as the ones who did ... well, even then, Berri and Simmons's data STILL don't show that late picks are as good as early picks.

I'll get into the details next post.

-----

UPDATE: That next post, Part II, is here. Part III is here.




16 Comments:

At Tuesday, February 28, 2012 1:30:00 AM, Anonymous Anonymous said...

Of course the only thing that a team should care about is the total "major league" contribution of a draftee, and not how he plays if and when he plays.

However...

I would also expect that, if draft round is a proxy for talent or potential talent, the "per play" performance (rate of performance in the "major leagues") of early draftees should exceed that of late round draftees. It just makes no sense for draft evaluators to be able to identify "talent" only in terms of chances of making it to the "majors" but not in terms of the quality of that play in the majors.

For MLB, it is definitely true that the earlier the draft pick, the better the players play if and when they make it to the majors, although research shows that that relationship is not linear and in fact there may be little correlation at all beyond the first couple of rounds.

So if it is true that Berri et al. found that the "per play" performance is the same no matter where and when you were drafted, that is indeed an interesting and surprising finding, although I would definitely not go so far as to conclude that the players who did not play or played very little would have been just as good as if they had played as much as the ones who were drafted early and ended up playing a lot (in the "majors").

Plus, if they found that players (QB I guess, in their study) who only played a little (and were drafted late) played just as well as those who played a lot (and were drafted early), why are these players not playing more? Certainly in baseball, part-time players, or players who only played for a brief period of time, play a lot worse on a "per play" basis than regulars.

So this whole thing makes little sense to me. I'll have to re-read Berri's study though and the rebuttals.

MGL

 
At Tuesday, February 28, 2012 1:41:00 AM, Anonymous Anonymous said...

Why did Gladwell say this?

"I have enormous respect for Professor Pinker, and his description of me as “minor genius” made even my mother blush."

This is what Pinker wrote. He was not referring to Gladwell.

"A third of the essays are portraits of “minor geniuses” — impassioned oddballs loosely connected to cultural trends."

BTW, can the captchas be any more difficult for this blog, Phil? I mean, you would think that this was WikiLeaks or something...

MGL

 
At Tuesday, February 28, 2012 1:43:00 AM, Anonymous Anonymous said...

OK, never mind the last post. Pinker also wrote:

"The themes of the collection are a good way to characterize Gladwell himself: a minor genius who unwittingly demonstrates the hazards of statistical reasoning and who occasionally blunders into spectacular failures."

 
At Tuesday, February 28, 2012 10:10:00 AM, Blogger Micah said...

Does the Berri/Simmons paper argue that draft position does not predict performance? I have only so far read the abstract, which mentions a "loose correlation."

I think you're right that playing time is forgotten as a proxy for talent; if you ignore the late-drafted players who don't play then the Tom Brady (Trent Green, Marc Bulger, etc) type players may skew the data with a survivorship bias.

I'll try to look more at this, but some basic metrics for QBs selected in the draft 1994-2001: 222 total QBs. 12.5 Median per year. By round: 45, 20, 22, 24, 29, 42. This by round breakdown is very interesting: a sharp decline from round 1 to round 2, with a steady increase to the later rounds. In aggregate, most QBs are selected in rounds 6-7 (by a large margin). In the end, I'm afraid there may not be enough data to reject the null hypothesis (since so many drafted QBs do not play).

If the Berri/Simmons null hypothesis is that QBs drafted higher have higher value/should have better production, I would like to see how this is rejected. If the null hypothesis is that all QBs have equivalent talent but playing time is correlated to draft position (and success/performance), then this may be difficult to disprove but I would like to see the logic behind this H0. I'm not convinced that it makes much sense at face value.

 
At Tuesday, February 28, 2012 11:18:00 AM, Anonymous Alex said...

If early round picks are generally better than late round picks, shouldn't teams with more early round picks win more? Because I have not found that to be true.

 
At Tuesday, February 28, 2012 2:34:00 PM, Blogger K. Medvedovsky said...

re: Alex. I'm confused - shouldn't every team have more or less the same number of early round picks as late round picks? And when a team lacks an early round pick, it is usually because they traded it (presumably for fair value), so you still wouldn't expect to see a relationship.

But maybe I misunderstood you?

 
At Tuesday, February 28, 2012 4:06:00 PM, Anonymous Anonymous said...

"I don't see why Gladwell doesn't get it, that the results are tainted by the selective sampling."

You mean you haven't realized that Gladwell's grasp of scientific principles is, um, tenuous? Pinker's full quotation sums it up pretty nicely. The "minor genius" Gladwell possesses is his ability to write in a lively way at a 6th-grade level, and to brand things in a way that sticks in people's minds.

 
At Tuesday, February 28, 2012 4:15:00 PM, Anonymous Alex said...

K - Teams might end up with about the same number in each round, but they can also trade picks and acquire players drafted in earlier seasons, so you end up with some variability across teams for where their players came from in the draft. To be more specific, I was talking about a post I did on my blog: http://sportskeptic.wordpress.com/2011/12/08/what-nfl-winners-look-like/ . Basically it looks like teams get players (through the draft or not), and then play the ones that help them regardless of where they came from. I think both sides are probably right: players taken early in the draft are broadly better than those taken later. But once those players are evaluated and assembled into teams, you basically just have the guys who can play regardless of where they were drafted and so draft position doesn't matter on aggregate.

 
At Tuesday, February 28, 2012 5:54:00 PM, Blogger Don Coffin said...

The analysis in the paper is one of the clearest examples I've seen of selection bias. By confining the analysis only to QBs who actually played (enough) to pass the threshold for their analysis, they have, as you point out, tilted the scales in favor of finding that draft position does not matter.

Doing the analysis properly is considerably more difficult, but not impossible.

 
At Tuesday, February 28, 2012 6:24:00 PM, Blogger Micah said...

Alex, yes, I think your default position has to be that higher draft picks have higher value (are better). That may sound tautological, but let me offer two lines of reasoning for this null hypothesis.

1. Draft picks are exclusive. Once you make a selection, no other team may replicate your selection. As a selector, your highest preferences (values) must go first.
2. Higher draft picks get paid more. You pay more for the things you value more.

Now, whether or not teams are stupid (or only some teams are) is a different conversation, but there is no reason to expect that the null hypothesis is anything other than: the best players (talent/value/performance) are selected highest in the draft.

 
At Tuesday, February 28, 2012 6:49:00 PM, Anonymous Anonymous said...

The principal argument against Berri’s thesis is this:

For the players who do play and meet Berri’s min amount of playing time to be included in the data, it makes no difference where they were drafted, but not because scouts can’t discern their talent, but only because most of the later round draft choices never get any (or meet the min) playing time (which is likely true), and…

That players from the later rounds who are deemed good enough to play at least at Berri’s minimum threshold end up playing just as well as early round draft picks who also end up playing.

That is the part that I take umbrage with. I would expect that if scouts could discern talent in the draft, the later round picks who do play should play worse than the early round picks who also play, AND they will play less of course (again, Berri is NOT arguing that they don’t play less; he is arguing that when they do play, they play just as well - which brings up an interesting question of why are they not playing more).

So I looked at this in baseball. I am not sure why it should be any different than football. If the Berri naysayers are right, later round draft picks in baseball who play some minimum amount of time should play just as well as early round draft picks who also play the same minimum amount.

So I looked at all pitcher (later I looked at batters) draft picks since 2000 and their performance in the majors if they pitched at least 50 IP in a season in the majors. I chose 50 since it is similar to Berri’s “4 game” minimum in football. I also scaled relievers and starters to the same baseline by adding .82 runs to their ERC (component ERA) for all relief appearances. ERC was also normalized to 4.00 for each year. IOW, 4.00 is always league average for NL and for AL in each year.
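Roughly, the adjustment described above amounts to something like the following sketch (the +0.82 relief bump and the 4.00 league normalization are from the description; pro-rating the bump by relief innings for pitchers who both started and relieved, and applying the bump before normalizing, are assumptions made for the sketch):

def adjusted_erc(erc, ip_total, ip_relief, league_erc):
    # Add 0.82 runs for the relief portion of a pitcher-season (full +0.82 for a
    # pure reliever -- pro-rating mixed starter/reliever seasons is an assumption),
    # then rescale so that league-average ERC is always 4.00.
    relief_bump = 0.82 * (ip_relief / ip_total)
    return (erc + relief_bump) * (4.00 / league_erc)

# A pure reliever with a 3.50 ERC in a league where average ERC is 4.20
# comes out to about 4.11 on this scale.
print(round(adjusted_erc(3.50, ip_total=60, ip_relief=60, league_erc=4.20), 2))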

 
At Tuesday, February 28, 2012 6:50:00 PM, Anonymous Anonymous said...

Here are the results when I weight by IP.

First 10 picks
N=146 IP= 22720 ERC=3.96

Next 10 picks
N=113 IP= 15894 ERC=4.13

Rest of 1st round
N=167 IP= 20181 ERC=4.18

2nd round
N=156 IP= 17516 ERC=4.29

3rd round
N=111 IP= 14321 ERC=4.40

4th round or higher
N=138 IP= 15302 ERC=4.36

1st 10 picks: 3.96
Everyone else: 4.26

Here are the results when I don’t weight by IP, but give each pitcher equal weight:

First 10 picks
N=146 IP= 22720 ERC=4.17

Next 10 picks
N=113 IP= 15894 ERC=4.22

Rest of 1st round
N=167 IP= 20181 ERC=4.23

2nd round
N=156 IP= 17516 ERC=4.37

3rd round
N=111 IP= 14321 ERC=4.54

4th round or higher
N=138 IP= 15302 ERC=4.37

1st 10 picks: 4.17
Everyone else: 4.34

Let’s do the same thing for batters. Since the draft is much more of a crap shoot for pitchers, I would expect to see more of a correlation between “per play” performance and draft order for those batters, as opposed to pitchers, who did play some minimum amount of time.

For batters, I only compiled those who had at least 100 PA in a season.

Here are the results, in WOBA, weighted by PA:

First 10 picks
N=207 PA= 98869 WOBA=.360

Next 10 picks
N=143 PA= 62637 WOBA=.342

Rest of 1st round
N=152 PA= 57878 WOBA=.335

2nd round
N=209 PA= 82281 WOBA=.346

3rd round
N=133 PA= 50350 WOBA=.339

4th round or higher
N=201 PA= 69874 WOBA=.334

1st 10 picks: .360
Everyone else: .340

Here are the results, in WOBA, not weighted by PA, each player getting the same weight:

First 10 picks
N=207 PA= 98869 WOBA=.350

Next 10 picks
N=143 PA= 62637 WOBA=.335

Rest of 1st round
N=152 PA= 57878 WOBA=.323

2nd round
N=209 PA= 82281 WOBA=.334

3rd round
N=133 PA= 50350 WOBA=.332

4th round or higher
N=201 PA= 69874 WOBA=.326

1st 10 picks: .350
Everyone else: .330

So clearly the first 10 picks for batters, and to some extent pitchers, perform, on a “per play” basis, significantly better than the rest of the draft picks, and especially the 4th round and higher.

This is true even though I included the “top 10%” (or whatever the percentage is) of late round picks, the ones that made it to the majors and pitched at least 50 innings or batted for at least 100 PA, per season in the seasons I included.

So the argument in football, which rebuts Berri’s work, that, “Sure, the players who were drafted in the later rounds, yet who managed to make it to the NFL and play some minimum amount of time (4 games per season in Berri’s case) are expected to play as well, on a per-play basis, as the ones who were drafted in the earlier rounds (and who also played of course),” apparently does not hold any water for baseball hitters or pitchers.

Why would we expect it to be true in football but not in baseball, unless the scouts are NOT doing such a good job in evaluating players for the draft, which is Berri’s thesis?

MGL

 
At Tuesday, February 28, 2012 7:26:00 PM, Blogger Don Coffin said...

Let's suppose that scouting (i.e., observation in a non-professional-performance setting) is accurate enough to identify correctly the mean of different groups of players. Within each group there is some variation which is unobservable at the level of the individual player. And let's suppose that these distributions overlap. For simplicity, let's have just two groups.

We have to make the assumptions of variation within groups and overlap between groups. Without those assumptions, then scouting would always correctly identify player talent and correctly predict player performance, and the whole question would go away.

Then what happens? Most of the "best" players will be in the identified high-talent group. But because of the overlap, some of the "best" players will actually have been identified as being in the low-talent group.

Now all we need to do is assume that teams know that scouting does not perfectly identify talent or perfectly predict performance, so they give players in the low-talent group a chance. (In baseball, this chance comes mostly, but not entirely, in minor league play. In football and basketball, the chances come in other ways.)

From the low-talent group, then, through a process of additional observation in a structured setting, those players with actual high-but-unobserved-talent emerge. Similarly, players identified as high-talent, but who were actually low-talent, will also reveal themselves.

All of this will take time, and because of commitment effects may take a long time for players who had been identified as really high-talent players (Ryan Leaf leaps to mind).

Far as I can see all we have here is decision-making under uncertainty, with opportunities to acquire and assess more information.

Now, if you told me that success rates of high-round and low-round draftees were the same (not that the performance of successful high-round and low-round draftees were the same), I'd be shocked.
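A quick simulation of this two-group setup illustrates both effects (the group means, the spread, and the "good enough to make it" cutoff are all invented for illustration): the high-rated group succeeds far more often, and its successes are still somewhat better, but the gap among the successes is much smaller than the gap in success rates.

import random

random.seed(2)

N = 100_000
CUTOFF = 1.0   # arbitrary "good enough to make it" talent threshold

def simulate(group_mean):
    # True talent is Normal(group_mean, 1); scouts see only the group, not the individual.
    talents = [random.gauss(group_mean, 1.0) for _ in range(N)]
    survivors = [t for t in talents if t > CUTOFF]   # the players who "make it"
    return len(survivors) / N, sum(survivors) / len(survivors)

high_rate, high_perf = simulate(0.5)    # group the scouts rate highly
low_rate, low_perf = simulate(-0.5)     # group the scouts rate poorly

print(f"high-rated group: {high_rate:.1%} make it, average talent of those who do = {high_perf:.2f}")
print(f"low-rated group:  {low_rate:.1%} make it, average talent of those who do = {low_perf:.2f}")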

 
At Wednesday, February 29, 2012 1:41:00 AM, Anonymous Anonymous said...

"Now, if you told me that success rates of high-round and low-round draftees were the same (not that the performance of successful high-round and low-round drafteer were the same), I'd be shocked."

You should be shocked at both. As I've shown in the baseball study, and as you would expect from your scenario, which is correct, you should find that later-round picks not only make it to the "majors" less often, but that when they do, they perform worse.

There is absolutely no reason why they should not perform worse, even with substantial overlap between the two groups.

Now, the reason why they don't perform substantially worse is because of the overlap and the uncertainty, but scouts would have to do a really bad job if you could not see a differentiation in performance.

I would have been shocked if I didn't see at least the differentiation that I saw in the baseball data. In fact, I would have assumed that I made a mistake in the research. There is simply NO way that a baseball scout can look at high school and college performance, body type and physical skills (and whatever else they look at), and have no idea who is going to perform better if and when they make it to MLB. No chance. Zero.

So I don't know why this would be the case in the NFL unless, as Berri suggests, they are doing a poor job and/or looking at the wrong things and/or it is much more difficult to predict NFL performance than other sports...

 
At Wednesday, February 29, 2012 10:17:00 AM, Anonymous Alex said...

anonymous - half of Berri's paper addresses exactly that issue. Very few college/combine stats predict NFL performance, but draft position is based on some of those numbers. So players are drafted according to what they do in the combine, for example, even though that has nothing to do with how they'll actually play in the league (the Harvard sports blog just did a couple posts on the combine as well). If baseball does a better job of drafting according to predictive numbers, then you should get the results you report.

 
At Wednesday, February 29, 2012 1:24:00 PM, Anonymous Guy said...

Alex:
The problem is that Berri's Combine analysis has the same selective sampling bias. The scouting data has no correlation with per-play NFL stats, because only those QBs who prove to be good get to meet Berri's playing time threshold. But you would find that those same stats are very powerful predictors of how likely a player is to have a substantial NFL career, or make the Pro Bowl, etc. So in fact, teams are correct to pay attention to these factors.

Berri's exercise would be like studying the correlation between weight and performance by NFL linemen, discovering little correlation, and then concluding teams are foolish to care about weight in drafting linemen.

 
