Somehow I’ve never written about Dave Donaldson’s incredible Indian railroad paper before; as it has a fair claim on being the best job market paper in the past few years, it’s time to rectify that. I believe Donaldson spent eight years at LSE working on his PhD, largely made up of this paper. And that time led to a well-received result: in addition to conferences, a note on the title page mentions that the paper has been presented at Berkeley, BU, Brown, Chicago, Harvard, the IMF, LSE, MIT, the Minneapolis Fed, Northwestern, Nottingham, NYU, Oxford, Penn, Penn State, the Philly Fed, Princeton, Stanford, Toronto, Toulouse, UCL, UCLA, Warwick, the World Bank and Yale! So we can safely say, this is careful and well-vetted work.
Donaldson’s study considers the importance of infrastructure to development; it is, in many ways, the opposite of the “small changes”, RCT-based development literature that was particularly en vogue in the 2000s. Intuitively, we all think infrastructure is important, both for improving total factor productivity and for improving market access. The World Bank, for instance, spends 20 percent of its funds on infrastructure, more than “education, health, and social services combined.” But how important is infrastructure spending anyway? That’s a pretty hard question to define, let alone answer.
So let’s go back to one of the great infrastructure projects in human history: the Indian railroad during the British Raj. The British built over 67,000 km of rail in a country with few navigable rivers. They also, luckily for the economist, were typically British in the enormous number of price, weather, and rail shipment statistics they collected. Problematically for the economist, these statistics tended to be hand-written in weathered documents hidden away in the back rooms of India’s bureaucratic state. Donaldson nonetheless collected almost 1.5 million individual pieces of data from these weathered tomes. Now, you might think, let’s just regress average incomes on new rail access, use some IV to make sure that rail lines weren’t endogenous, and be done with it. Not so fast! First, there’s no district-level income per capita data for India in the 1800s! And second, we can use some theory to really tease out why infrastructure matters.
Let’s use four steps. First, try to estimate how much rail access lowered trade costs per kilometer; if a good is made in only one region, then theory suggests that the trade cost between regions is just the price difference of that commodity across regions. Shipping receipts, even if we had them, wouldn’t be sufficient for this; bandits, spoilage, and all the rest of Samuelson’s famous “iceberg” raise trade costs as well. Second, check whether lowered trade costs actually increased trade volume, and at what elasticity, using rainfall as a proxy for local productivity shocks. Third, note that even though we don’t have income, theory tells us that for agricultural workers, percentage changes in total production per unit of land deflated by a local price index are equivalent to percentage changes in real income per unit of land. Therefore, we can check in a reduced form way whether new rail access increases real incomes, though we can’t say why. Fourth, in Donaldson’s theoretical model (an extension, more or less, of Eaton and Kortum’s Ricardian model), trade costs and differences in region sizes and productivity shocks in all regions all interact to affect local incomes, but they all act through a sufficient statistic: the share of consumption that consists of local products. That is, if we do our regression testing for the impact of rail access on real income changes, but control for changes in the share of consumption from within the district, we should see no effect from rail access.
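The first step can be written out compactly. As a sketch, assuming iceberg trade costs and a good produced only in origin district $o$ (the symbols below are illustrative notation, not necessarily the paper's), the log price gap between a destination $d$ and the origin identifies the log trade cost:

```latex
\ln p_{d}^{o} - \ln p_{o}^{o} = \ln \tau_{od}, \qquad \tau_{od} \ge 1
```

Here $p_{d}^{o}$ is the price in district $d$ of the good that comes only from $o$, and $\tau_{od}$ is the number of units that must be shipped from $o$ for one unit to arrive in $d$.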
Now, these stages are tough. Donaldson constructs a network of rail, road and river routes using 19th century sources linked on GIS, and traces out the least-cost paths from any one district to another. He then non-linearly estimates the relative cost per kilometer of rail, sea, river and road transport using the prices of eight types of salt, each of which was sold across British India but only produced in a single location. He then finds that lowered trade costs do appear to raise trade volumes with quite high elasticity. The reduced form regression suggests that access to the Indian railway increased local incomes by an average of 16 percent (Indian real incomes per capita increased only 22 percent during the entire period 1870 to 1930, so 16 percent locally is substantial). Using the “trade share” sufficient statistic described above, Donaldson shows that almost all of that increase was due to lowered trade costs rather than internal migration or other effects. Wonderful.
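To make the least-cost path idea concrete, here is a minimal sketch in Python. It is not Donaldson's code or data: the three-district network, the distances, and the per-kilometer cost ratios in `RELATIVE_COST_PER_KM` are all made up for illustration, and plain Dijkstra stands in for whatever routing routine the paper actually uses. The point is only that a new rail link changes the cheapest route between two districts, and so the effective trade cost between them.

```python
import heapq

# Hypothetical per-km costs relative to rail. These numbers are
# illustrative assumptions, not the estimates from the paper.
RELATIVE_COST_PER_KM = {"rail": 1.0, "sea": 2.0, "river": 3.0, "road": 6.0}


def least_cost(edges, source):
    """Dijkstra over an undirected multimodal transport network.

    edges:  list of (district_a, district_b, mode, km) tuples
    source: starting district
    Returns a dict mapping each reachable district to its lowest
    effective cost (km weighted by mode) from the source.
    """
    graph = {}
    for a, b, mode, km in edges:
        cost = RELATIVE_COST_PER_KM[mode] * km
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {source: 0.0}
    frontier = [(0.0, source)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, neighbor))
    return best


# Toy network of three hypothetical districts, before and after a rail link.
before_rail = [
    ("A", "B", "road", 100),
    ("B", "C", "road", 100),
    ("A", "C", "river", 300),
]
after_rail = before_rail + [("A", "C", "rail", 250)]

cost_before = least_cost(before_rail, "A")["C"]
cost_after = least_cost(after_rail, "A")["C"]
print(cost_before, cost_after)
```

In the paper the cost ratios are not assumed but estimated, by choosing the ones that make these network distances best fit the observed salt price gaps; the sketch above only covers the routing half of that exercise.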
This paper is a great exercise in the value of theory for empiricists. Theory is meant to be used, not tested. Here, fairly high-level trade theory (literally the cutting edge) was deployed to coax an answer to a super important question even though atheoretical data could have provided us nothing (remember, there isn’t even any data on income per capita to use!). The same theory also allowed us to explain the effect, rather than just state it, a feat far more interesting to those who care about external validity. Two more exercises would be nice, though. First, and Donaldson notes this in the conclusion, trade can also improve welfare by lowering volatility of income, particularly in agricultural areas. Is this so in the Indian data? Second, rail, like lots of infrastructure, is a network: what did the time trend in income effects look like?
September 2012 Working Paper (IDEAS version). No surprise, Donaldson’s website mentions this is forthcoming in the AER. (There is a bit of a mystery – Donaldson was on the market with this paper over four years ago. If we need four years to get even a paper of this quality through the review process, something has surely gone wrong with the review process in our field.)
Reblogged this on Saad Gulzar and commented:
As someone who has spent time trying to retrieve railway data in Pakistan, I really appreciate this effort!
“Theory is meant to be used, not tested.”
That is a hilarious statement.
*Assumptions* are meant to be used. No point developing a theory if all you’re going to do is assume whatever you want.