Essentially every economist reports Huber-White robust standard errors, rather than traditional standard errors, in their work these days, and for good reason: heteroskedasticity, or heterogeneity in error variance across observations, can lead to incorrect standard error calculations. Generally, robust standard errors are used only to check that the parameter of interest is “real” and not an artifact of random statistical variation; the parameter estimate itself is unbiased under heteroskedasticity in many models as long as the model itself is correctly specified. For example, if data is generated by the linear process y=Xb+e, then the OLS estimate of b is unbiased even if the assumption that e is homoskedastic is violated. Many researchers just tag a “,robust” onto their Stata code and hope this inoculates them from criticism about the validity of their statistical inference.
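
To make the distinction concrete, here is a minimal sketch in plain Python of the two standard error formulas for the slope in a one-regressor model y = a + b*x + e. The simulated data, the error process (standard deviation growing with x squared), and the function name are my own assumptions for illustration; the robust formula is the HC1 variant of the Huber-White estimator, which I believe is what Stata's robust option reports for OLS.

```python
import math
import random

def ols_slope_with_ses(x, y):
    """Fit y = a + b*x by OLS; return (b, classical SE, HC1 robust SE) for b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Classical: assumes one common error variance for every observation.
    s2 = sum(e * e for e in resid) / (n - 2)
    classical_se = math.sqrt(s2 / sxx)
    # Huber-White (HC1): each observation contributes its own squared residual.
    meat = sum(d * d * e * e for d, e in zip(dx, resid))
    robust_se = math.sqrt(n / (n - 2) * meat / (sxx * sxx))
    return b, classical_se, robust_se

random.seed(0)
n = 5000
true_b = 2.0
x = [random.uniform(0, 10) for _ in range(n)]
# Assumed error process (illustrative only): sd of e grows with x squared.
y = [1.0 + true_b * xi + random.gauss(0, 0.1 * xi * xi) for xi in x]

b_hat, classical_se, robust_se = ols_slope_with_ses(x, y)
print(f"slope estimate: {b_hat:.3f}")
print(f"classical SE:   {classical_se:.4f}")
print(f"robust SE:      {robust_se:.4f}")
```

The slope estimate should land close to the true value of 2 despite the heteroskedasticity, while the classical standard error understates the uncertainty relative to the robust one.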

King and Roberts point out, using three very convincing examples from published papers, that robust standard errors have another much more important use. If robust errors and traditional errors are very different, then researchers ought to try to figure out what is causing the heteroskedasticity in their data since, in general, tests like Breusch-Pagan or White’s Test cannot distinguish between model misspecification and fundamental heteroskedasticity. Apparent heteroskedasticity is common in improperly specified models, for example when OLS is estimated on data that is truncated at zero.

Nothing here should be surprising. If you are a structural economist, then surely you find the idea of estimating any function other than the precise form suggested by theory (which is rarely OLS) to be quite strange; why would anyone estimate any function other than the one directly suggested by the model, where indeed the model gives you the overall variance structure? But King and Roberts show that such advice is not often heeded.

They first look at a paper in a top International Relations journal, which suggested that small countries receive excess foreign aid (which seems believable at first glance; I spent some time in East Timor a few years ago, a tiny country which seemed to have five IGO workers for every resident). The robust and traditional standard errors diverged enormously. Foreign aid flow amounts are super skewed. Taking a Box-Cox transformation gets the data looking relatively normal again, and rerunning the estimation on the transformed data shows little difference between robust and traditional standard errors. In addition to fixing the heteroskedasticity, transforming the specified model flips the estimated parameter: small countries receive *less* foreign aid than other covariates might suggest.
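
The diagnostic logic can be sketched on synthetic data (not the aid data from the paper). I assume here that the truth is linear in log(y) with constant-variance noise, so that y in levels is heavily right-skewed, and I use the log transform, which is the lambda = 0 case of Box-Cox. The function name and all parameter values are hypothetical.

```python
import math
import random

def se_ratio(x, y):
    """Ratio of HC1 robust SE to classical SE for the slope in y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    classical_var = sum(e * e for e in resid) / (n - 2) / sxx
    robust_var = n / (n - 2) * sum(d * d * e * e for d, e in zip(dx, resid)) / sxx ** 2
    return math.sqrt(robust_var / classical_var)

random.seed(1)
n = 5000
x = [random.uniform(0, 3) for _ in range(n)]
# Assumed truth (illustrative only): log(y) is linear in x, homoskedastic noise.
log_y = [1.0 + 1.0 * xi + random.gauss(0, 0.5) for xi in x]
y = [math.exp(v) for v in log_y]

ratio_levels = se_ratio(x, y)    # misspecified: skewed y regressed in levels
ratio_logs = se_ratio(x, log_y)  # correctly specified: log(y) regressed on x
print(f"robust/classical SE ratio, levels: {ratio_levels:.2f}")
print(f"robust/classical SE ratio, logs:   {ratio_logs:.2f}")
```

In the levels regression the two standard errors should diverge noticeably, and after the transformation the ratio should sit close to one. This toy setup only illustrates the convergence of the standard errors; it is not built to reproduce the sign flip found in the paper.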

King and Roberts then examine a top political science publication (on trade agreements and foreign aid), where again robust and traditional errors diverge. Some diagnostic work finds that a particular detrending technique assumed homogeneous across countries fits much better if done heterogeneously across countries; otherwise, spurious variation over time is introduced. Changing the detrending method causes robust and traditional errors to converge again, and as in the small country aid paper above, the modified model specification completely flips the sign on the parameter of interest. A third example came from a paper using Poisson to estimate overdispersed (variance exceeds the mean) count data; replacing Poisson with the more general truncated negative binomial model again causes robust and traditional errors to converge, and again completely reverses the sign on the parameter of interest. Interesting. If you insist on estimating models that are not fully specified theoretically, then at least use the information that divergent robust standard errors give you about whether your model is sensible.
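
For readers unfamiliar with overdispersion, here is a rough sketch of what it looks like in simulated counts. A Poisson variable has variance equal to its mean, whereas a gamma-Poisson mixture (an untruncated negative binomial, which I use here as a simpler stand-in for the truncated model mentioned above) has variance mean + mean^2/shape. The sampler, the rates, and the function names are my own assumptions.

```python
import math
import random

def poisson_draw(lam):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 for Poisson data."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean

random.seed(2)
n = 5000
mean_rate = 3.0
shape = 2.0  # assumed gamma shape; smaller values mean more overdispersion

poisson_counts = [poisson_draw(mean_rate) for _ in range(n)]
# Each observation gets its own gamma-distributed rate with the same mean.
mixed_counts = [
    poisson_draw(random.gammavariate(shape, mean_rate / shape)) for _ in range(n)
]

poisson_ratio = dispersion_ratio(poisson_counts)
mixed_ratio = dispersion_ratio(mixed_counts)
print(f"variance/mean, Poisson:       {poisson_ratio:.2f}")
print(f"variance/mean, gamma-Poisson: {mixed_ratio:.2f}")
```

A variance-to-mean ratio well above one is a sign that a plain Poisson model is too restrictive for the data.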

September 2012 Working Paper (No IDEAS version)

I cannot fault the technical argument, but I find the aid example extremely hard to swallow. Have you looked at the correlation, or a scatter plot, of aid per capita (or aid as a percentage of GDP) against country size (population)? Small countries receive much more aid than large ones.

That’s a raw correlation, perhaps adding those other covariates would change things – perhaps small countries receive less aid than large, once you control for other relevant factors like poverty. But this seems unlikely to me – if anything, smaller developing countries (mainly islands) tend to be better off than large ones – so it’s not as if small countries ought to get more aid because they are poorer.

India, China, Pakistan, Bangladesh and Nigeria are the big countries, and they receive tiny quantities of aid per capita. Ethiopia and Congo are both large and receive large aid flows. But all the really aid reliant countries are tiny (Solomons, Burundi etc.).

Do you really believe these small countries are getting less aid than larger ones, relative to what the other covariates suggest?