Papers around the world, from the Chicago Tribune to the Daily Telegraph to the Times of India, along with respectable blogs such as Science Daily, have likely terrified hundreds, if not thousands, of parents. All are reporting on an empirical paper making an eye-catching claim: give your male child an unpopular name, and you have condemned him to a life of crime. Amazon must be doing brisk business selling baby-naming books today.
To all those anxious parents and parents-to-be I can say only this: put those books down. Everything is going to be fine. The paper — "First Names and Crime: Does Unpopularity Spell Trouble?," written by two economists at Shippensburg University, David Kalist and Daniel Lee — is so flawed that it should never have been published and, once published, should have been ignored. But it has been published and, thanks in part to a marketing campaign by Social Science Quarterly, the journal that published it, not ignored. That raises two problems.
First, the paper's conclusion is sufficiently simple and punchy, and perhaps panders just enough to lurking prejudices in people's minds, that it has the potential to spread like wildfire. It is essential to douse these flames quickly. And second, its growing success provides a classic example of what happens when bad-but-catchy science comes in contact with a scientifically unsophisticated media. This case may be a rather small one, but it is indicative of a much bigger problem that must be fixed.
I could spend hours wonkishly pointing out the flaws in the article, but that would likely bore 99% of you. So here's the nickel tour of its problems. The paper calculates two frequencies — the frequency of males born in an unnamed state between 1987 and 1991 with a particular name, and the frequency of juvenile defendants successfully prosecuted between 1997 and 2005 in a subset of counties in that state who have the same name — and then computes the correlation between the two. That's it. The authors purport to find that a 10% decrease in a name's frequency among all births corresponds to roughly a 3% increase in its frequency among juvenile delinquents.
Now I can guess what many of you are thinking: unusual names are not randomly given out. Isn't it possible that people with highly unusual names tend to come from groups that might have higher crime rates, like the poor, or single-parent households, or households with younger parents? Absolutely. And Kalist and Lee find that they do: children with unpopular names are more likely to come from single-parent families and to live in counties with higher unemployment rates, higher per capita state assistance payments, and lower per capita incomes. But — without ever saying why — the authors never use any of that information when computing the importance of a person's name. And this omission renders their study effectively meaningless.* It is impossible to know how to interpret the correlation they produce.
The authors try to have it both ways. On the one hand, they acknowledge that the child's name itself may not be driving the behavior, but may reflect these socio-economic conditions that also influence the choice of names (in which case they have found nothing more than a spurious correlation). Yet they also hypothesize that having an unpopular name could constrain employment options or affect self-worth and thus actually cause criminal behavior directly (but this is the very effect their statistical errors prevent them from observing).
It is, of course, possible that the choice of name directly influences criminal behavior, or does so by, say, limiting employment options. But the simple correlation in this paper by no means establishes such a fact. And it certainly does not justify the authors' suggestion that authorities should consider profiling by name. Put aside any moral concerns you may have with profiling. Just from a practical perspective, this simple test does not indicate that names are an effective indicator of future criminality.**
That a journal publishes a weak study, that peer review once again fails to separate the wheat from the chaff, is nothing new. So why single out this one study? Because it points to a deeper problem.
Universities and think tanks are producing more empirical work today than ever before. And on the one hand, this can be a tremendous boon to us: how can we understand how the world operates without scientific evidence? But there is an ugly downside. As the volume of empirical work has grown, the ratio of good studies to bad has almost surely fallen. The good stuff has never been better, but much of the bad stuff has never been worse, and the bad is growing faster than the good.
The result is a baffled populace. Does wine cause or cure cancer? Does the death penalty deter? Should I take hormone replacements or not? Should I choose a boring, common name for my child? It is impossible for the layperson to ever know what to believe, and it must undermine her confidence in the sciences, not to mention in her ability to make choices about how to live her life, from what to eat for dinner to what to name a child. Who is to blame for this?
It would be easy to blame the media. It is, after all, the press that elevated this article to national attention, and it is our newspapers that tell us on Monday that something causes cancer and then on Wednesday that it cures it. And, as this little episode indicates, reporters certainly did plenty wrong. The headlines, for example, have played up the causal angle: "Adolescents with Unpopular Names More Prone to Committing Crime," or "Unpopular Names Tied to Criminal Acts." Even if the story later notes the study's nuances, the headline is what sticks with people. And there were other red flags clearly flying: a press release couched in anything but tentative language, and an empirical model so primitive that any journalist on a science-related beat should have sensed that its conclusions far outran its method.
But in the end, I think blame must rest with the empirical fields themselves. There are two essential tasks that empiricists, especially social scientists, do surprisingly poorly. First, no one has developed effective and rigorous means for assessing whether a study is good or bad. Fields such as medicine and epidemiology have taken great strides in this direction (as demonstrated by the evidence-based medicine movement), but the social sciences have largely ignored this revolution.
Second, no one produces comprehensive assessments of what an entire collection of articles tells us as a whole. This is unfortunate, since a single article tells us very little: knowledge comes from an entire literature. But most social scientists are content to focus on the claims made by a handful of studies they personally like.
It's not surprising, then, that the press falls into the "scientific finding of the week" trap. That is, after all, how the social sciences themselves report what is going on. With poor measures of quality, bad studies easily work their way into journals. And with a culture that focuses on the individual study, these papers can quickly take on large lives of their own. And when that happens, it is tough to put the genie back in the bottle.
The solution to the bad-science-story problem, then, rests in the hands of the social sciences themselves: we must get our own house in order. But in the meantime, the press should approach empirical work with more jaundiced eyes and less sensational headlines. And Alec Baldwin's and Malcolm Gladwell's parents should sleep a little more easily tonight.
*Here's a simple example to show the problem. Imagine that some trait x is highly correlated with criminal behavior: 80% of those with x commit crime, but only 10% of those without it commit crime. Also imagine there is a trait y that has no effect on criminality at all but is highly correlated with x: 75% of those with x also have y, but 0% of those without x have y. If we have a population of 1000 people, split 500/500 between those with and without x, we will see the following. Of the 500 with x, 400 (80%) will commit crimes, and 300 (75%) of these will also have y. Of the 500 without x, only 50 (10%) will commit crimes, and none of them will have y. Thus of the 450 criminals we see, 300 have y, and of the 550 non-criminals only 75 (75% of the 100 x's who do not commit crimes) have y. If we ignore x, just as the authors ignore other factors correlated with names that predict criminal conduct, it looks like y is strongly correlated with criminal behavior, even though it has no effect on crime whatsoever.
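The footnote's arithmetic can be run directly as a small simulation:

```python
# The footnote's example, enumerated. Trait x drives crime; trait y does
# nothing but is correlated with x. Ignoring x makes y look criminogenic.
# Counts follow the footnote exactly: y is held by 75% of the x-group,
# within both its criminal and non-criminal halves.
pop  = [dict(x=True,  crime=True,  y=True)]  * 300   # x, criminal, y
pop += [dict(x=True,  crime=True,  y=False)] * 100   # x, criminal, no y
pop += [dict(x=True,  crime=False, y=True)]  * 75    # x, law-abiding, y
pop += [dict(x=True,  crime=False, y=False)] * 25    # x, law-abiding, no y
pop += [dict(x=False, crime=True,  y=False)] * 50    # no x, criminal
pop += [dict(x=False, crime=False, y=False)] * 450   # no x, law-abiding

def crime_rate(people):
    return sum(p["crime"] for p in people) / len(people)

# Marginal comparison (x ignored): y looks strongly predictive of crime.
rate_given_y    = crime_rate([p for p in pop if p["y"]])      # 300/375 = 0.80
rate_given_no_y = crime_rate([p for p in pop if not p["y"]])  # 150/625 = 0.24

# Conditional on x, y makes no difference at all.
x_group = [p for p in pop if p["x"]]
rate_x_y    = crime_rate([p for p in x_group if p["y"]])      # 300/375 = 0.80
rate_x_no_y = crime_rate([p for p in x_group if not p["y"]])  # 100/125 = 0.80

print(rate_given_y, rate_given_no_y, rate_x_y, rate_x_no_y)
```

Holding x fixed, the crime rate is identical whether or not a person has y; only by dropping x from the analysis does the spurious 80%-versus-24% gap appear.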
**That boys named Garland (one of the rarest names) are more likely to commit crime than those named Michael (the most popular) tells us nothing about the absolute fraction of those named Garland who commit crimes. Without that information, which the paper does not provide, it is impossible to know if there is any value at all to profiling by name.
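A quick illustration with hypothetical numbers (none of these figures come from the paper): a rare name can be several times overrepresented among delinquents while virtually all of its bearers are law-abiding, which is what makes relative risk useless for profiling without the base rates.

```python
# Made-up counts to separate relative risk from absolute risk.
garland_total, michael_total   = 200, 50000  # hypothetical birth counts
garland_delinq, michael_delinq = 4, 250      # hypothetical delinquent counts

garland_rate = garland_delinq / garland_total  # 4/200    = 2.0% of Garlands
michael_rate = michael_delinq / michael_total  # 250/50000 = 0.5% of Michaels

print(f"relative risk: {garland_rate / michael_rate:.1f}x")
print(f"absolute risk for a Garland: {garland_rate:.1%}")
```

Here a Garland is four times as likely to be a delinquent as a Michael, yet 98% of Garlands commit no crime at all; the headline-friendly ratio says nothing about whether flagging Garlands would ever be worthwhile.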