Every economist knows what PPP adjustments are: we adjust consumption/GDP/whatever comparisons to account for differences in the price of nontradables and to remove the effect of economically insignificant swings in market exchange rates. But how exactly is this done? Are the data reliable? What precautions should be taken? Anyone who has seen how economic data are created (I’ve worked briefly at the Fed and at the Dept of Commerce) is rightfully worried: even simple statistics in a developed country like the US are often surprisingly inaccurate. In the new AEJ: Macro, Deaton and Heston explain the procedures used in the 2005 round of the International Comparison Program (ICP), which gathers the prices used in World Bank and PWT data; you may remember that China’s GDP was nearly halved as a result of this round.
First, we don’t even have “a” definition of PPP. GEKS (usually called EKS, though Deaton and Heston think Gini should be credited for the idea as well) PPP ensures transitivity of bilateral price levels, and in a limited sense allows welfare comparisons if we assume identical preferences across any two countries, but it does not allow GDP to be disaggregated into PPP-adjusted consumption, investment, etc. Geary-Khamis (GK) PPP does allow such disaggregation, but in so doing it overstates the value of nontraded goods in poor countries and therefore overstates living standards there; further, GK has no link to welfare theory.
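To make the transitivity point concrete, here is a minimal sketch of the textbook GEKS construction: take the bilateral Fisher index for every pair of countries, then take the geometric mean of all the indirect comparisons through each third country. The country labels, prices and quantities are made-up toy numbers, and the function names `fisher` and `geks` are my own; this is not the ICP's actual code or data.

```python
import math

def fisher(p_base, q_base, p_other, q_other):
    """Bilateral Fisher price index of `other` relative to `base`."""
    # Laspeyres: price the base country's bundle at both sets of prices.
    laspeyres = (sum(po * qb for po, qb in zip(p_other, q_base))
                 / sum(pb * qb for pb, qb in zip(p_base, q_base)))
    # Paasche: price the other country's bundle at both sets of prices.
    paasche = (sum(po * qo for po, qo in zip(p_other, q_other))
               / sum(pb * qo for pb, qo in zip(p_base, q_other)))
    return math.sqrt(laspeyres * paasche)

def geks(prices, quantities):
    """GEKS index for every ordered pair (j, k) of countries.

    GEKS_jk is the geometric mean, over all countries l, of the indirect
    comparison F_jl * F_lk, where F is the bilateral Fisher index.
    """
    countries = list(prices)
    n = len(countries)
    F = {(j, k): fisher(prices[j], quantities[j], prices[k], quantities[k])
         for j in countries for k in countries}
    return {(j, k): math.prod(F[j, l] * F[l, k] for l in countries) ** (1 / n)
            for j in countries for k in countries}

# Hypothetical toy data: three countries, two goods, local-currency prices.
prices = {"A": [1.0, 2.0], "B": [3.0, 1.0], "C": [2.0, 5.0]}
quantities = {"A": [10.0, 5.0], "B": [4.0, 12.0], "C": [6.0, 3.0]}

ppp = geks(prices, quantities)
```

With these indices, going from A to C directly or via B gives the same answer (`ppp["A", "B"] * ppp["B", "C"]` equals `ppp["A", "C"]` up to floating-point error), which is not generally true of the raw bilateral Fisher indices.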
Once an index has been selected, the data themselves are problematic. How do we account for different consumption bundles in different regions (the authors use Ethiopian teff and Thai rice to illustrate the bilateral problem)? First compute PPP within regions with similar product availability, then use a “ring” of countries with good data availability to link the regions. Even if the price data are good, is the underlying GDP calculation in poor countries any good? Probably not. How do we account for services? This is generally problematic, though some “quality adjustments”, such as adjusting education for internationally comparable test scores, were being made as of 2005. Are prices nationally representative, or are only urban areas sampled? Prices are not representative in many countries, particularly China, where only 11 cities were sampled. How do we adjust for quality? Each good is very specifically described in terms of packaging and content, though this specificity leads to problems of data availability.
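The two-stage idea (regional PPPs first, then a ring to link regions) can be sketched as below. This is a deliberately simplified illustration and not the ICP's actual linking procedure: I assume each region reports PPPs in units of its own regional numeraire, that the ring countries have also been priced on a common list in units of a global numeraire, and that a region's link factor is the geometric mean of the ratio between the two across its ring countries. The function name `link_regions` and all numbers are hypothetical.

```python
import math

def link_regions(regional_ppp, ring_ppp, region_of):
    """Convert regional PPPs to a common global numeraire.

    regional_ppp: country -> local currency per unit of regional numeraire
    ring_ppp:     ring country -> local currency per unit of global numeraire
    region_of:    country -> region label
    Every region is assumed to have at least one ring country.
    """
    # Log of (regional numeraire per global numeraire) implied by each ring country.
    logs = {}
    for country, global_ppp in ring_ppp.items():
        ratio = global_ppp / regional_ppp[country]
        logs.setdefault(region_of[country], []).append(math.log(ratio))
    # Assumption: one link factor per region, the geometric mean over its ring countries.
    link = {region: math.exp(sum(v) / len(v)) for region, v in logs.items()}
    return {c: ppp * link[region_of[c]] for c, ppp in regional_ppp.items()}

# Hypothetical toy data: two regions, one ring country in each.
region_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
regional_ppp = {"a1": 1.0, "a2": 2.0, "b1": 1.0, "b2": 5.0}
ring_ppp = {"a1": 1.0, "b1": 4.0}

global_ppp = link_regions(regional_ppp, ring_ppp, region_of)
```

Note that within-region relative prices are untouched by the linking step: every country in a region is scaled by the same factor, so the ring only determines how the regions line up against one another.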
The number of problems is huge. Should we worry? In some sense, yes: when a claim takes the form “variable x is important for growth based on this regression using PPP data, and y is not,” the data problems above can obviously be very important. But I think the “smell test” generally works: I travel heavily and generally find that when countries “feel richer”, they tend to be so under PPP income per capita comparisons, so there must be some value in the exercise. On the other hand, these types of data problems are a major reason I see my future work as lying in theory rather than empirics!
http://www.princeton.edu/~deaton/downloads/deaton_heston_complete_nov10.pdf (Final WP – published in AEJ: Macro 2.4)