Retrospective Economic Voting in the 19th Century?

A quick blog request as I listen to Larry Bartels give a summary talk on economic voting and the recent recession*: does anyone know of any studies on economic voting / retrospective voting in the 19th century? Bartels mentioned that a few papers attempted to test the idea that voters focus primarily on recent economic performance back into the 19th century, but didn’t give any names, and I’m having trouble tracking them down via Google Scholar.


* The answer is: economic voting models do really well, the 2008 and 2012 elections were very close to the historical trend line and not unusual in any way, and the same findings hold more or less across Europe.


Charisma, a New Network for Consumer Market Studies

I’m not quite sure when it launched, but the new(ish) Charisma Network for scholars of consumer markets has a website chock full of interesting tidbits – from job postings to CFPs to links to blog posts of interest. Network organizers include many of the folks behind the (recently quite quiet) SocFinance blog*. Here’s how the network describes itself:

The site offers an interdisciplinary approach to consumer market research. It is guided by the view that to properly understand the mix of devices and desires that drive markets, consumer market studies should be open to a variety of techniques, methods, theories and perspectives. Charisma therefore has an interdisciplinary, applied focus and will host a range of content including news items, events and announcements, commentaries and working papers as well as photo essays and data visualisations.

Recommended for sociologists, etc. interested in consumer markets, and especially the intersection of STS approaches with consumption.

*As a contributor, I share some of the blame for its torpor. Apologies!

Regnerus: Not Just for the Supreme Court Anymore!

Right now, the Minnesota State House is debating a same-sex marriage bill. According to this liveblog coverage from Minnesota Public Radio, one opponent of marriage equality invoked Mark Regnerus’s controversial/discredited study on the House floor, describing its results as “conclusively proven.” The study was apparently also cited in a Minnesota Senate hearing by an academic.*

I’m not sure how much it would matter, but the fact that Regnerus is still getting a lot of political play suggests that calls for a retraction might still be a reasonable political move for sociology. Opponents can, and do, call out the study’s flaws in the MN House debate, but it’s a lot easier to point to the journal and say the study was so flawed it was retracted.

Update: The bill just passed the House!

H/T to MM for the pointer.

*Does anyone know who the University of Minnesota faculty member who cited the Regnerus study in the Senate hearing is? There’s not much information in the MPR story. Update: The Regnerus study appears to have been discussed in the Senate by Dr. Thomas Nevins, a University of Minnesota pediatrician speaking in his individual capacity. Here’s the meeting minutes. If you go to about 1:24:35 in the associated audio clip, you can hear him speak entirely about the Regnerus study, describing it as “the most scientifically sound study” and “published in a peer-reviewed journal.”

Debating the Wrong Reinhart + Rogoff

On April 15, three UMass Amherst economists (graduate student Thomas Herndon and his advisors, Michael Ash and Robert Pollin) published a critique of an influential paper, “Growth in a Time of Debt,” written by Harvard economists Carmen Reinhart and Kenneth Rogoff (R+R 2010 for short). Since then, basically everyone has weighed in on the debate – even Stephen Colbert, in this hilarious bit (and a follow-up interview with Herndon). Here’s one of many summaries of the affair.

Much of the subsequent debate has attempted to assess the original paper and the critique. I want to argue that, in some sense, this is the wrong debate. There was nothing wrong with the original R+R 2010 piece. Oh, sure, there were Excel coding errors and a questionable weighting scheme, but as many commenters have since noted, these are common in lots of early-stage research. You try things out, make mistakes, show it to your colleagues, go back and improve your methods, and science progresses. This is the argument advanced by defenders of R+R on all sides, from Greg Mankiw to Betsey Stevenson and Justin Wolfers to Jeff Smith (and even some sociologists). On a purely academic level, I agree with this argument.

But Reinhart and Rogoff didn’t just write a short conference paper. They wrote op-eds in prominent places. They spoke to policymakers. They argued that because of what they’d found in that short, not peer-reviewed piece, policymakers should fear a 90% debt/GDP threshold or cliff. They drew on their academic credibility – both personal, and in the research they had published – to try to influence policy quite directly. And for that reason I disagree strongly with Jeff Smith when he argues that Herndon, Ash, and Pollin should have shared their critique with Reinhart and Rogoff before going public and given them a chance to respond. And similarly, in some sense I agree with Greg Mankiw when he (and many others) writes that the spreadsheet errors have gotten too much attention. At least, I would agree if this were purely an academic debate. But if this is a political fight, then Reinhart and Rogoff’s credibility as policy experts is exactly what’s at stake. They are tenured economists at Harvard, so they start off with a lot of credibility. The spreadsheet error is important because it shows just how sloppy the research they were shilling in 2010-2011 was. At the time, Krugman wrote of R+R 2010, “this just isn’t careful work.” The Excel spreadsheet error is the smoking gun proof of that.

So, yes, Herndon, Ash, and Pollin (HAP, for short) were explicitly critiquing R+R 2010, but the critique matters because of the editorials and policy tracts that R+R wrote in 2010-2012 that explicitly invoked R+R 2010 to argue for cuts in government spending or at least more attention paid to rising debt. These editorials, and similar advice given by R+R themselves directly to policymakers, shifted the terrain of the debate: this is not just an academic dispute that can take place on the usual academic time scales and with the usual academic norms, but rather an explicitly political fight about government spending in a time of recession. Here’s one example that will hopefully serve as a case-in-point, a 2011 editorial published by Bloomberg titled Too Much Debt Means the Economy Can’t Grow. Here are the key invocations of the 2010 paper:

Our empirical research on the history of financial crises and the relationship between growth and public liabilities supports the view that current debt trajectories are a risk to long-term growth and stability, with many advanced economies already reaching or exceeding the important marker of 90 percent of GDP.


In our study “Growth in a Time of Debt,” we found relatively little association between public liabilities and growth for debt levels of less than 90 percent of GDP. But burdens above 90 percent are associated with 1 percent lower median growth. Our results are based on a data set of public debt covering 44 countries for up to 200 years. The annual data set incorporates more than 3,700 observations spanning a wide range of political and historical circumstances, legal structures and monetary regimes.

We aren’t suggesting there is a bright red line at 90 percent; our results don’t imply that 89 percent is a safe debt level, or that 91 percent is necessarily catastrophic. Anyone familiar with doing empirical research understands that vulnerability to crises and anemic growth seldom depends on a single factor such as public debt. However, our study of crises shows that public obligations are often hidden and significantly larger than official figures suggest.

“Growth in a Time of Debt” was a brief, not peer-reviewed paper. That’s their evidentiary basis. And yes, they back away somewhat from the strong threshold claim – but only in terms of sensitivity, as in: we know high debt somewhere around here kills growth, we just aren’t sure it’s exactly 90%. And, as they later note in their own defense, they do emphasize the median claim (which is robust to the Excel and weighting critiques of HAP). But the causal claim is right in the friggin’ title* and it had already been disputed by many prominent critics who read the original paper (e.g. Krugman here). Suppose you’re a Harvard professor – would you publish an op-ed that relied this heavily on a not peer-reviewed study that had already received substantial critiques for overstating its causal claims? If you did, would you expect the same kind of courtesy from your colleagues as if you’d just posted a paper on SSRN or NBER?

To me, Herndon, Ash, and Pollin are responding to this op-ed (and others like it) as much as or more than they are responding to R+R 2010 itself. Responding to an academic working paper may carry some norms of fair warning. Responding to a series of hackish op-eds drawing legitimacy from a working paper that looks like a publication doesn’t have the same norms or goals. It’s about destroying credibility, not improving flawed methods. Social science always has an element of politics – we are, after all, making knowledge about people. But R+R, much like the equally contentious Regnerus affair, was politicized in a more transparent, more partisan way. Regnerus at least somehow managed to get his study through the normal peer-review process, and he left it to others to make fools of themselves by citing it in public for political effect, playing up a (questionably derived) correlation into a strong causal statement. That left him in the somewhat defensible position of never having publicly taken a position on the political issue, nor of having made the bad causal claim himself (a defense later undermined somewhat by behind-the-scenes evidence). Reinhart and Rogoff did no such thing – they took their preliminary results straight into the heart of an important political debate and made (academic) fools of themselves.

There’s always something messy and frustrating around these explicit interweavings of high-stakes politics and “normal” social science that twists everything up. But in the end, I think Mankiw, Stevenson and Wolfers, Smith, and other academics, give Reinhart and Rogoff too much credit by treating this incident as a purely academic, normal science debate. The error wasn’t (just) in the spreadsheet, it was in the attempt to claim policy relevant expertise based on the spreadsheet.

* As Mike argues in the comments, op-eds receive their final titles from editors rather than authors, so it’s hard to know who picked that title out. That said, R+R make the causal version of the claim in various places in this piece and elsewhere. As O’Brien notes, “R-R whisper “correlation” to other economists, but say “causation” to everyone else.” (This footnote was added after Mike’s comment.)

ArXiv for the Social Sciences?

In a comment thread on Scatterplot, Neal Caren pines, “I wish there was an arXiv for the social sciences.” I wish this as well! I am still shocked that economics is light years ahead of sociology in circulating working papers (e.g. NBER, IZA discussion papers, people just posting nice LaTeX’d versions on their personal academic sites, etc.). Kevin Bryan of A Fine Theorem links to an ungated version of every Econ paper he reviews – and notes wistfully the lack of such papers when he touches on Sociology.

So, my question is: what would it take to make a social science arXiv happen? What if, instead of founding the next journal, a group of editors/scholars got together – perhaps with the backing of a scholarly publishing office of a university amenable to such things?* – and put something like this together? Perhaps relatedly, why is SSRN not like arXiv for the social sciences? What is it missing – is it just a lack of norms around posting working papers there, or some key features around commenting, etc.?

* arXiv is hosted by Cornell, for example.

Visualizing Inequality in the US, 1947-2011

How can we best understand trends in postwar income inequality in the United States? What data are available for understanding these trends? What is the best way to represent these trends visually? In this post, I want to argue that the basic facts of income inequality over the last 65 years require a minimum of two graphs drawing on two data sources. First, I’m going to say a bit about the data, then a bit about the trends, and finally I’m going to show a few possible graphs which cover parts of the story (but none of which is perfect on its own).

Data on Income Distribution

The United States has surprisingly poor historical data about income distribution (and thus, income inequality). More recent years are covered by comprehensive survey datasets like the Panel Study of Income Dynamics. But the crucial period from the end of World War II to the 1960s is covered in only two big datasets[1]: first, the now famous Piketty and Saez data on top incomes, which go back to 1913 [2], and second, Current Population Survey data, limited to measurements of family rather than household income, that go back to 1947. For whatever reason, the Census historical data on household incomes only start in 1967, presumably reflecting some change in the methodology of the CPS’s annual income supplement.[3]

My favorite dataset for understanding income distribution, the CBO’s post-tax and transfer data, only goes back to 1979. These data combine survey and income tax data in a way that is very difficult for researchers outside the government to replicate, along with estimates of government transfers, and they also attempt to adjust for household size and the nonlinear relationship between expenses and the number of people in a household. Thus, the data are probably the best available for looking at real economic outcomes from the bottom of the distribution to the top 1%. As such, these data are the base for Lane Kenworthy’s excellent “best inequality graph.” I recommend his extensive analysis and defense of the graph (the updated version of which is below). I agree that it (or something very similar) is the best graph to cover the post-1970s period, but I will argue that at least two graphs are needed to show what happened to the whole distribution from 1947 to the present.

Kenworthy (2010) Best Inequality Graph Updated

Stylized Facts of Inequality, 1947-2011
As suggested by the above graph, one of the most important (and recently discovered [4]) facts about inequality in the 20th century is the dramatic growth of incomes at the very top combined with the stagnation of real income for most of the distribution. The stagnation in wages for the middle of the distribution starts in the late 1970s/early 1980s and persists to the present. The top 20% or 10% do a bit better, and the very top (.01%) do incredibly well (as I will show in a moment). But what happened before, in the crucial postwar golden years of 1947-1978(ish)?

To me, the most salient feature of the 1940s-1970s income distribution is how every part of the distribution rose relatively equally. Specifically, between 1945 and 1978, the income thresholds for the 20th, 40th, 60th, 80th, and 95th percentiles all doubled. Fascinatingly, during this time, the incomes at the very top stagnated. These trends diverged in the 1980s – top incomes kept going up, and the very top skyrocketed, while most income stagnated.

Alright, so now we have a sense of the basic facts and the best available data. How can we best visualize them?

Two Possible Graphs

The brilliant thing about Kenworthy’s graph is that it manages to portray so viscerally the stagnation at the bottom alongside the growth at the top while using actual dollar magnitudes. When we switch to telling the whole postwar income distribution story, however, I’m not sure we can do it cleanly with actual dollar amounts. At least, the best things I’ve come up with so far involve normalizations instead. If you’d like to try your hand at it, I’m happy to provide the spreadsheet from which these graphs were generated.[5] So, the first graph tries to show the equality of gains across the distribution followed by the rupture in the late 1970s.

Source: Census.

This graph shows the threshold for the real (inflation-adjusted) 20th, 50th (median), 80th, and 95th percentiles of family income from 1947 to 2011, with 1947 set to 100. It shows the unified growth of incomes up through the late 1970s, and then the divergence as the median and 20th percentile stagnate while the top quintile continues to increase. This increase levels off for both the 80th and 95th percentiles in the late 1990s, and over the last decade incomes have been basically flat at all levels. But this paints a distorted picture of the very top of the income distribution. While the 95th percentile has tripled since 1947, and increased by about 50% since 1980, the very top have done a lot better. So, here come the Piketty and Saez data, mixed with a dash of not quite commensurable Census data:

Sources: Census, Piketty and Saez.

We borrow the median income data from the previous graph and combine it with the top income thresholds from the Piketty and Saez dataset, all inflation-adjusted, all normalized so that 1947=100. Also, these are the Piketty and Saez data excluding capital gains (which would make the picture look even more extreme, but also less comparable, as the Current Population Survey doesn’t capture capital gains well). What do we see? Median income still rises and then flattens out post-1980. The 90th percentile follows much the same trend, but flattens out a bit less. In contrast, the 99th and 99.99th percentiles behave quite differently, staying relatively flat in the 1950s-1960s and skyrocketing in the 1980s. The trend at the very top (the 99.99th percentile) is particularly striking. These very elite, top incomes didn’t budge from 1947 to 1978. Then they take off like gangbusters, increasing by a factor of six in just 30 years. The 99th percentile follows the same trend, but much less sharply.
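For anyone who wants to replicate the normalization behind both graphs, it is simple to sketch. The dollar figures below are invented placeholders for illustration, not actual Census or Piketty-Saez values:

```python
# Index each percentile's real-dollar threshold so that its 1947 value
# equals 100. All threshold figures here are made-up placeholders.
thresholds = {
    "p50 (median)": {1947: 25000, 1979: 50000, 2011: 52000},
    "p99.99":       {1947: 900000, 1979: 950000, 2011: 5700000},
}

def index_to_base(series, base_year=1947):
    """Rescale a {year: real dollars} series so that base_year = 100."""
    base = series[base_year]
    return {year: round(100 * value / base, 1) for year, value in series.items()}

indexed = {name: index_to_base(s) for name, s in thresholds.items()}
# In the real data, the median roughly doubles and then flattens, while
# the 99.99th percentile barely moves until the late 1970s, then takes off.
print(indexed)
```

The same rescaling applied to each series is what makes the very differently sized dollar magnitudes comparable on one chart.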

So, together, what do these graphs show? The postwar golden era was one of rising incomes for everyone but the superrich. The 1980s-2000s saw stagnating incomes in the middle of the distribution, small gains at the top, and massive gains at the very top.

Other Possibilities

There are lots of other ways you could graph these data. You can show actual income on regular or logged scales, you can look at simple ratios (90/50) that more directly capture our understanding of inequality, and so on. I like these graphs because they show trends very nicely, and they highlight the stylized facts that I think most usefully characterize the income distribution in this period [6]. What do you think? Suggest an alternative, or ping me for the data and plot it yourself!

Kevin Bryan, of A Fine Theorem, published a nice detailed paper on this topic in 2008. Bryan and co-author Martinez use data from the CPS, Piketty and Saez, and Social Security data which I had missed in my discussions. That paper also has some nice examples of what you can do with 90/50 and 50/10 ratios, and logged graphs. Here’s one example:

Bryan and Martinez 2008

This figure shows and then decomposes the 90/10 gap: “Figure 2 presents the evolution of log income ratios. It shows that from 1961 to 2002, the CPS March log 90-10 ratio increased from 1.23 to 1.61. The ratios computed using the CPS ORG data set behave similarly. Figure 2 also shows that the vast majority of the increase in the log 90-10 ratio is due to an increase in the 90-50 ratio.”
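The decomposition works because log ratios add: log(90/10) = log(90/50) + log(50/10), so the overall gap splits exactly into an upper-half and a lower-half piece. A quick check with hypothetical percentile incomes (not Bryan and Martinez’s actual CPS values):

```python
import math

# Hypothetical 10th, 50th, and 90th percentile incomes (illustrative only).
p10, p50, p90 = 15000, 50000, 110000

log_90_10 = math.log(p90 / p10)  # overall gap
log_90_50 = math.log(p90 / p50)  # upper-half gap
log_50_10 = math.log(p50 / p10)  # lower-half gap

# The identity log(90/10) = log(90/50) + log(50/10) holds by construction,
# which is what lets the figure attribute the rise in the overall gap
# to its upper- and lower-half components.
assert math.isclose(log_90_10, log_90_50 + log_50_10)
print(round(log_90_10, 2), round(log_90_50, 2), round(log_50_10, 2))
```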

End Notes

[1] That I know of! Experts on income data, please come forward and let me know of any that I’ve missed!
[2] When the US first started collecting income taxes, and thus generated good data on top income earners.
[3] What’s the difference between a household and a family, according to the Census? Glad you asked: “A family consists of two or more people (one of whom is the householder) related by birth, marriage, or adoption residing in the same housing unit. A household consists of all people who occupy a housing unit regardless of relationship. A household may consist of a person living alone or multiple unrelated individuals or families living together.”
[4] I am currently working on a paper / dissertation chapter on the history of income distribution data which tries to understand why it took so long for the growth in top incomes in the 1980s to become widely discussed (e.g. “the 1%” that became such a topic of academic and political interest in the 2000s). Send me an email if you’d like to read a (very) preliminary version, or attend my presentations at SASE or ASA this summer.
[5] This is the part where data vis folks can make fun of me for using Excel. I know, I’m sorry. One of these days I plan to do more than just tinker with R and actually get it to do what I want. Until then, we can all suffer through.
[6] It’s worth putting in a reminder that the thing being graphed here is the distribution of income – more specifically, the threshold needed to be in a certain part of the income distribution in different years. Individuals follow income trajectories, and don’t stay in exactly the same place over time. Questions around the stability of income within individuals and across generations are exactly what panel studies like the PSID are designed to answer. Unfortunately, as far as I know, they don’t go back much before the late 1960s. In response to criticisms along these lines, Statistics Canada has recently published some data on stability among the very top income earners (using confidential tax data) which suggests that “Four-fifths of Canadians in the top five income percentile have consistently been there in the past five years, the statistics show, and the proportion of people remaining in the upper echelons has been growing since the early 1980s.” So, the 1%, in Canada at least, is a consistent group of individuals, not simply a statistical artifact as individuals rotate in and out. The US does not publish similar official income data, nor similar data on mobility into and out of the 1%.


Kopczuk, Saez, and Song (2010) also have a nice paper on the US using Social Security data which tries to determine how much of the increase in inequality is due to transitory vs. permanent dynamics. They conclude: “the evolution of annual earnings inequality over time is very close to the evolution of inequality of longer term earnings.” (94-95) Kopczuk et al. also find that those who are in the top 1% of earners in one year are 80% likely to be in the top 1% the following year, and 60% likely five years later, again suggesting that the top 1% is a meaningful group. More broadly, it seems like Social Security data have real promise for producing income inequality measures and graphs going back to World War II, but they have thus far been used by only a handful of scholars due to their lack of public availability. One nice feature of the Social Security data is the inclusion of (a few) demographic variables, including gender and race. For example, this nice graph shows that women make up only about 14% of the top 1% of income earners, and only about 22% of the top 10%, even as they make up about 44% of all workers (all data through 2004).

Kopczuk Saez and Song (2010)

IRB QOTD: Schrag, “How Talking Became Human Subjects Research”

For better or for worse, social science research is now governed by an institutional review board system that seems to have the problems and promises of medical research, and not social science, as its priority. Zachary Schrag has an excellent article on the history of social sciences and the IRB, “How Talking Became Human Subjects Research.” Schrag summarizes the argument quite vividly:

“This article draws on previously untapped manuscript materials in the National Archives that show that regulators did indeed think about the social sciences, just not very hard. … Compared to medical experimentation, the social sciences were the Rosencrantz and Guildenstern of human subjects regulation. Peripheral to the main action, they stumbled onstage and off, neglected or despised by the main characters, and arrived at a bad end.”


Ranking Programs in Sociology

If you like quantitative rankings of sociology departments, today is like some sort of holiday.*

Over at Scatterplot, Neal Caren has an analysis of top 20 placements within Sociology. As he’s careful to note, but as ought to be stated many times over, this is only an analysis of how well departments do at placing their students in top 20 sociology programs, which is not an especially great measure of the success of a program – although it’s better than nothing, especially if you want such a job. For a program like Michigan, for example, this would miss how well (or poorly) we do at placing students in professional schools (Business, Social Work, Policy, etc.), which might be more relevant for an individual student. Neal’s results are very much in line with Burris 2004: 88% of top 20 assistant profs come from top 20 programs, as ranked by the current USNWR. Neal goes on to do some back-of-the-envelope calculations to show:

So when you start graduate school in one of these [top 20] departments, the odds of getting an early career job in a similar department is about one in twenty. In March Madness speak, think of yourself as a 4 seed trying to win the national championship.

[A]ssuming schools ranked 20 to 50 average 10 incoming students a year, that is about 300 folks a year competing for one slot [in a top 20 program]. Those are roughly the odds that an eight seed has of winning the tournament, which has happened once so far (Villanova in 1985).

Over at OrgTheory, Kieran Healy has released the results of the All Our Ideas survey of “best” sociology department. Head over there for methodological details and results. Kieran also presents some nice graphs of vote-similarity, which interestingly places Michigan most closely with Stanford (maybe Woody Powell and Jason Owen-Smith were on OrgTheory voters’ minds?).

Neal, Kieran – there’s a (potentially) interesting merging of data that could be done here. Specifically, do top 20 placements match better with the AOI rankings or the USNWR rankings? If we assume that such rankings are somehow measuring within-discipline prestige, then it seems like this would be one way to test which measure is “better.”

*Maybe Passover? Smarch Christmas? I’m not sure.

Specific Generalities: Historical vs. Sociological Generalization

What counts as a “general” story? What determines what findings are bigger or more important than others?

I think historical research and sociological research tend to answer this question in two very different ways. For sociologists, a general story, claim, finding, or whatnot is generalizable to different cases and contexts. Structural holes shape competition in inter-firm networks, but they also shape competition in interpersonal networks. And so on. Because of this, any case can be interesting if it serves as a model for how other similar cases might work.

In history, or at least my outsider impression of it, an important story is one that is empirically “big.” If a claim characterizes a long period of time or covers an event that touches a lot of people, then it’s a big, important, general claim. This doesn’t mean that you can’t study a small event – a single protest, a single court case, whatever – but you make claims about its importance by arguing that the small event characterizes a big system or process. What you don’t claim, or at least don’t always claim, is that your small event is a case of a whole class of phenomena.

So, for example, I think of my own work on the history of national income statistics as being a “big story” because national income statistics are a worldwide phenomenon and they shape our understanding of the economy as a whole, and thus they are a small part of a massive story. But what can be harder for me is to treat the history of national income statistics as a case of something else – for example, pitching my story as a case of how ideas and knowledge practices shape politics, comparable to Somers and Block’s work on Malthus and so on. In sum, two different approaches, two different kinds of generality.

Gelman’s Problems with P-Values

Andrew Gelman is a seemingly tireless crusader against the sloppy use of p-values. Today he posted a very short (4 page) new article that explains some of the problems with p-values and gives some quick examples of when they fall apart vs. when they merely do no harm. I recommend reading the whole thing, especially if you’ve recently been exposed to the standard two-semester statistics sequence in sociology or econometrics. If you’re totally unfamiliar with Bayesian analysis, some of the terms will be a bit confusing, but it’s a good opportunity to search around a bit and get a feel for the language of Bayesianism. A couple of gems:

The casual view of the P value as posterior probability of the truth of the null hypothesis is false and not even close to valid under any reasonable model, yet this misunderstanding persists even in high-stakes settings (as discussed, for example, by Greenland in 2011). The formal view of the P value as a probability conditional on the null is mathematically correct but typically irrelevant to research goals (hence, the popularity of alternative—if wrong—interpretations).

This passage, from the opening, names both the most common but wrong interpretation and identifies one source of that wrongness: what p-values actually mean is not very interesting, and so we’d much rather they mean what they don’t.

One big practical problem with P values is that they cannot easily be compared. … Consider a simple example of two independent experiments with estimates (standard error) of 25 (10) and 10 (10). The first experiment is highly statistically significant (two and a half standard errors away from zero, corresponding to a normal-theory P value of about 0.01) while the second is not significant at all. Most disturbingly here, the difference is 15 (14), which is not close to significant. The naive (and common) approach of summarizing an experiment by a P value and then contrasting results based on significance levels, fails here, in implicitly giving the imprimatur of statistical significance on a comparison that could easily be explained by chance alone.

Gelman has written about this example many times before under the heading “The difference between significant and not significant is not significant.” This is the quickest explanation I’ve seen.
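Gelman’s arithmetic is easy to check yourself. Here is a minimal sketch using only the standard library (the helper name `two_sided_p` is mine, not Gelman’s):

```python
import math

def two_sided_p(estimate, se):
    """Normal-theory two-sided p-value for a test that the true value is zero."""
    z = abs(estimate) / se
    # Standard normal CDF via the error function.
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

p1 = two_sided_p(25, 10)  # z = 2.5 -> p is about 0.012: "highly significant"
p2 = two_sided_p(10, 10)  # z = 1.0 -> p is about 0.32: "not significant at all"

# The difference between the two estimates is 15, with standard error
# sqrt(10**2 + 10**2), roughly 14 -- nowhere near significant.
p_diff = two_sided_p(25 - 10, math.hypot(10, 10))
print(round(p1, 3), round(p2, 3), round(p_diff, 3))
```

Running this reproduces Gelman’s point: each experiment’s p-value looks decisive on its own, yet the comparison between them could easily be chance.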