## “At Least Do No Harm: The Use of Scarce Data,” A. Sandroni (2014)

This paper by Alvaro Sandroni in the new issue of AEJ:Micro is only four pages long, and has only one theorem whose proof is completely straightforward. Nonetheless, you might find it surprising if you don’t know the literature on expert testing.

Here’s the problem. I have some belief p about which events (perhaps only one, perhaps many) will occur in the future, but this belief is relatively uninformed. You come up to me and say, hey, I actually *know* the distribution, and it is p*. How should I incentivize you to truthfully reveal your knowledge? This problem is actually an old one: all we need is something called a proper scoring rule, the Brier Score being the most famous. If someone makes N predictions f(i) about the probability of binary events i occurring, then the Brier Score is the sum of the squared differences between each prediction and its outcome in {0,1}, divided by N. So, for example, if there are three events, you say each will independently happen with probability .5, and the actual outcomes are {0,1,0}, your score is 1/3*[(.5-1)^2+2*(.5-0)^2], or .25. The Brier Score being a proper scoring rule means that your expected score is lowest if you actually predict the true probability distribution. That being the case, all I need to do is pay you more the lower your Brier Score is, and you, being the expert, will truthfully reveal your knowledge so long as you are risk-neutral. There are more complicated scoring rules that can handle general non-binary outcomes, of course. (If you don’t know what a scoring rule is, it might be worthwhile to convince yourself why a rule equal to the summed absolute value of deviations between prediction and outcome is not proper.)
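If you would rather check the arithmetic numerically, here is a minimal Python sketch. It reproduces the three-event Brier Score above and then searches a grid of possible reports to see which one minimizes expected loss when the true probability of a single binary event is .7. The helper names (`brier_score`, `expected_loss`, `best_report`), the grid search, and the choice of .7 are my own illustrative assumptions, not anything from the paper.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((f - y) ** 2 for f, y in zip(forecasts, outcomes)) / len(forecasts)


def squared_loss(forecast, outcome):
    """Per-event loss behind the Brier Score."""
    return (forecast - outcome) ** 2


def absolute_loss(forecast, outcome):
    """Per-event loss behind the 'summed absolute deviations' rule."""
    return abs(forecast - outcome)


def expected_loss(report, true_p, loss):
    """Expected loss of announcing `report` when the event occurs with probability true_p."""
    return true_p * loss(report, 1) + (1 - true_p) * loss(report, 0)


def best_report(true_p, loss, grid_size=101):
    """Report on an evenly spaced grid in [0, 1] that minimizes expected loss."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return min(grid, key=lambda r: expected_loss(r, true_p, loss))


# The worked example from the text: three forecasts of .5, outcomes {0, 1, 0}.
example_score = brier_score([0.5, 0.5, 0.5], [0, 1, 0])

# With a true probability of .7, squared loss is minimized by reporting .7,
# while absolute loss is minimized by exaggerating all the way to 1.
best_squared = best_report(0.7, squared_loss)
best_absolute = best_report(0.7, absolute_loss)

print(example_score, best_squared, best_absolute)
```

Under absolute loss the expected penalty is linear in the report, so an expert who thinks rain is more likely than not does best by announcing certainty. That is exactly the failure of properness the parenthetical above asks you to find.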

That’s all well and good, but a literature over the past decade or so called “expert testing” has dealt with the more general problem of knowing who is actually an expert at all. It turns out that it is incredibly challenging to screen experts from charlatans when it comes to probabilistic forecasts. The basic (too basic, I’m afraid) reason is that your screening rule can only condition on realizations, but the expert is expected to know a much more complicated object, the probability distribution of each event. Imagine you want to use the following rule, called calibration, to test weathermen: a weatherman passes if, on the days where rain was predicted with p=.4, it actually rained on close to 40 percent of them, and likewise for every other forecast he uses. A charlatan has no idea whether it will rain today or tomorrow, but after making a year of predictions, notices that most of his predictions are “too low”. When rain was predicted with .6, it rained 80 percent of the time, and when predicted with .7, it rained 72 percent of the time, etc. What should the charlatan do? Start predicting rain with higher probabilities, to become “better calibrated”. As the number of days grows large, this trick gets the charlatan closer and closer to calibration.
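To make the calibration test concrete, here is a small sketch of the bookkeeping the tester (or the charlatan) would do: group days by the announced forecast and compare each forecast with the empirical frequency of rain on those days. The function names and the thirty days of made-up data are purely illustrative assumptions; the data are just chosen to match the .6 versus 80 percent and .7 versus 72 percent figures above.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Map each distinct forecast to (number of days, empirical rain frequency)."""
    buckets = defaultdict(list)
    for f, y in zip(forecasts, outcomes):
        buckets[f].append(y)
    return {f: (len(ys), sum(ys) / len(ys)) for f, ys in sorted(buckets.items())}


def calibration_error(forecasts, outcomes):
    """Day-weighted average gap between each forecast and its empirical frequency."""
    table = calibration_table(forecasts, outcomes)
    total_days = len(forecasts)
    return sum(n * abs(f - freq) for f, (n, freq) in table.items()) / total_days


# Hypothetical record: 5 days forecast at .6 (rain on 4 of them),
# 25 days forecast at .7 (rain on 18 of them).
forecasts = [0.6] * 5 + [0.7] * 25
outcomes = [1, 1, 1, 1, 0] + [1] * 18 + [0] * 7

table = calibration_table(forecasts, outcomes)
error = calibration_error(forecasts, outcomes)

for f, (n, freq) in table.items():
    print(f"forecast {f:.1f}: {n} days, rained {freq:.0%} of the time")
print(f"calibration error: {error:.3f}")
```

Note that nothing in this table requires knowing anything about the weather. The charlatan can compute it from his own past forecasts and the realized outcomes, which is what lets him adjust future forecasts to look calibrated.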

But, you say, surely I can notice such an obviously tricky strategy. That implicitly means you want to use a more complicated test to screen the charlatans from the experts. And a famous result of Foster and Vohra (which apparently was very hard to publish because so many referees simply didn’t believe the proof!) says that any test which passes experts with high probability for any realization of nature as the number of predictions gets large can be passed by a suitably clever and strategic charlatan with high probability. And, indeed, the proof of this turns out to be a straightforward application of an abstract minimax theorem proven by Fan in the early 1950s.

Back, now, to the original problem of this post. If I know you are an expert, I can get your information with a payment that rises as your score under a proper scoring rule falls. But what if, in addition to wanting info when it is good, I don’t want to be harmed when you are a charlatan? And further, what if only a single prediction is being made? The expert testing results mean that screening good from bad is going to be a challenge no matter how much data I have. If you are a charlatan and are always incentivized to report my prior, then I am not hurt. But if you actually know the true probabilities, I want to pay you according to a proper scoring rule. Try this payment scheme: if you predict my prior p, then you get a payment ε which does not depend on the realization of the data. If you predict anything else, you get a payment based on a proper scoring rule, calibrated so that an informed expert who reports the true distribution gets an expected payment greater than ε. So the informed expert is incentivized to report truthfully (there is a straightforward modification of the above if the informed expert is not risk-neutral). How can we get the charlatan to always report p? If the charlatan has maxmin preferences as in Gilboa-Schmeidler, so that she evaluates each report by its worst-case expected payoff, then the payoff is ε if p is reported no matter how the data realizes. If, however, the probability distribution actually is p, and the charlatan reports anything other than p, then since payoffs are based on a proper scoring rule, in that “worst-case scenario” the charlatan’s expected payoff is less than ε, hence she will never report anything other than p due to the maxmin preferences. I wouldn’t worry too much about the maxmin assumption, since it makes quite a bit of sense as a utility function for a charlatan that must decide what to announce under a complete veil of ignorance about nature’s true distribution.
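Here is one way such a scheme could look for a single binary event, as a rough sketch rather than Sandroni’s actual construction. I am assuming a particular payment: ε plus a multiple of the gap between the prior’s squared loss and the report’s squared loss. The function names, the `scale` parameter, the prior of .5, the true probability of .9, and the grid of candidate distributions are all my own illustrative choices.

```python
def payment(report, outcome, prior, eps=1.0, scale=1.0):
    """Flat eps for echoing the prior; otherwise eps plus a Brier-style bonus or penalty."""
    if report == prior:
        return eps  # does not depend on the realized outcome
    # Assumed form: reward reports that beat the prior's squared loss.
    return eps + scale * ((prior - outcome) ** 2 - (report - outcome) ** 2)


def expected_payment(report, true_p, prior, eps=1.0, scale=1.0):
    """Expected payment when the event truly occurs with probability true_p."""
    return (true_p * payment(report, 1, prior, eps, scale)
            + (1 - true_p) * payment(report, 0, prior, eps, scale))


def worst_case_payment(report, prior, eps=1.0, scale=1.0, grid_size=101):
    """Lowest expected payment over a grid of candidate true distributions."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return min(expected_payment(report, q, prior, eps, scale) for q in grid)


prior, true_p = 0.5, 0.9
reports = [i / 100 for i in range(101)]

# Informed, risk-neutral expert: maximizes expected payment under the truth.
expert_report = max(reports, key=lambda r: expected_payment(r, true_p, prior))
expert_gain = expected_payment(expert_report, true_p, prior)

# Uninformed charlatan with worst-case preferences: maximizes the minimum.
charlatan_report = max(reports, key=lambda r: worst_case_payment(r, prior))

print(expert_report, expert_gain, charlatan_report)
```

With this assumed payment, an expert who knows the true probability q earns ε plus `scale` times (p-q)^2 in expectation by telling the truth, while any report other than p earns less than ε in expectation if the truth happens to be p. The realized payment can fall below ε but never below ε minus `scale`, so choosing `scale` no larger than ε keeps it nonnegative.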

Final AEJ:Micro version, which is unfortunately behind a paywall (IDEAS page). I can’t find an ungated version of this article. It remains a mystery why the AEA is still gating articles in the AEJ journals. The gating is especially puzzling for AEJ:Micro, a society-run journal whose main competitor, Theoretical Economics, is completely open access.