(A ridiculous number of talented young game theorists are currently enjoying summer’s last gasp in the old Sicilian town of Erice for this year’s IMS Stochastic Methods in Game Theory conference. Over the next few days, I’ll be posting – from Chicago, not from the beaches of Italy, alas – about a few papers from that conference that struck me as particularly interesting. I can’t resist a good pun, so let’s start with Nicholas Klein’s expert testing paper…)
Classic mechanism design problems involve properly incentivizing agents to work when effort cannot be observed. A series of recent papers, the greatest concentration of which seems to be coming from here at Northwestern, extends mechanism design to the problem of expert testing: how do I know when an agent, who knows more about a topic than I do, is accurately reporting the results of his experiments?
Klein considers the testing problem in the form of a three-armed bandit, with one safe arm (i.e., one arm where the agent shirks and does nothing), one “cheating” arm that achieves a breakthrough with probability p, and one “research” arm that achieves a breakthrough with probability q should the principal’s hypothesis be true, and with probability 0 otherwise. Imagine that I own a pharma factory, and I think a certain chemical process will produce a useful new drug, but I don’t know how to do the experiments. If the scientist cheats and pours chemicals together in some way that is infeasible when we go to mass production, he will create the compound I want with probability p. If he uses my hypothesized technique, and my hypothesis is true, he will create the drug with probability q, but if my hypothesis is not true, he will never create the drug. The agent has until time T to experiment as he will, and the principal observes any success at the time it occurs, without knowing whether it was achieved by cheating or research. At time T, the principal receives a payoff if the research arm provided a success, so the principal wants the agent to experiment with the research arm exclusively until a success is achieved, and then to play the safe arm, for which the agent will be paid zero. There is a clear incentive problem, though. If the agent is off equilibrium, he may believe that the research arm is very, very unlikely to succeed: indeed, because of that belief, he may feel a success is more probable with the cheating arm than with the research arm. If the principal simply pays upon seeing a success, an agent with such beliefs will cheat, so the agent will be paid more than zero while the principal makes zero. Is there a better incentive system?
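To make the incentive problem concrete, here is a minimal Python sketch of my own, not taken from Klein's paper. It assumes successes arrive as a Poisson process, so p and q are treated as arrival rates per unit of time, and that the agent commits to a single arm for the whole horizon T. The function names and the numbers (a prior of 0.2 that the hypothesis is true, p = 0.5, q = 1.0, T = 2) are made up for illustration.

```python
import math


def prob_at_least_one(rate, horizon):
    """P(at least one Poisson arrival by `horizon`) at the given rate."""
    return 1.0 - math.exp(-rate * horizon)


def first_success_odds(prior, p, q, horizon):
    """Chance of seeing a success by `horizon` on each risky arm.

    Assumption: the agent sticks with one arm for the whole horizon.
    The research arm only pays off if the hypothesis is true, which the
    agent believes with probability `prior`.
    """
    cheat = prob_at_least_one(p, horizon)
    research = prior * prob_at_least_one(q, horizon)
    return cheat, research


# Hypothetical numbers: a pessimistic agent facing a research arm that
# is twice as fast as cheating when the hypothesis is true.
cheat, research = first_success_odds(prior=0.2, p=0.5, q=1.0, horizon=2.0)
print(f"cheating: {cheat:.3f}, research: {research:.3f}")
```

Under these assumed numbers, a pessimistic agent who is paid for a single success does better by cheating, even though research is the faster arm whenever the hypothesis is true.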
There is. Note that, since the problem is continuous, if the agent plays “research”, his belief about the probability the hypothesis is true will converge to the true probability. Assume that, given perfect monitoring (i.e., no possibility of cheating, but still a possibility of shirking, or playing the safe arm) and such a true probability, it is worthwhile for the principal to pay the agent to experiment. Because q>p, there must be a length of time T, and an integer m, such that if the agent is paid only after m successes, rather than after 1 success, the agent will play only the research arm: when the hypothesis is true, research produces successes faster than cheating does, so an agent with correct beliefs about the truth of the hypothesis is more likely to reach m successes with the research arm than with the cheating arm. Klein shows that optimal payments only pay agents after m>1 successes. Once we ensure that agents do not cheat, we can ensure they experiment optimally (in the sense of a traditional bandit problem) by letting the agent’s payment be conditional on the time of the second success, which will only come from research and not cheating, with such payment varying to exactly take into account the value of information from experimentation. Clearly the final payment must be higher than the first-best payment where there are no agency problems (and indeed, it must be higher than the second-best payment where only shirking is a concern).
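The logic of paying only after m successes can be sketched numerically. The Python below is my own simplification, not Klein's construction: it assumes Poisson arrivals (p and q as rates), an agent who commits to one arm for the whole horizon, and a fixed belief `prior` that the hypothesis is true. It searches for the smallest m at which research is more likely than cheating to deliver m successes by T. The helper names and parameter values are hypothetical.

```python
import math


def poisson_tail(mean, m, terms=200):
    """P(N >= m) for N ~ Poisson(mean), with mean > 0.

    The probability mass is summed directly from k = m upward, which
    stays accurate even when the tail is tiny.
    """
    # Mass at k = m, computed in logs to avoid a huge factorial.
    term = math.exp(-mean + m * math.log(mean) - math.lgamma(m + 1))
    total = 0.0
    k = m
    for _ in range(terms):
        total += term
        k += 1
        term *= mean / k  # pmf(k) = pmf(k - 1) * mean / k
    return total


def smallest_m(prior, p, q, horizon, max_m=50):
    """Smallest m for which research beats cheating at reaching m successes.

    Returns None if no such m exists up to `max_m`, which is what
    happens when research is not the faster arm.
    """
    for m in range(1, max_m + 1):
        cheat = poisson_tail(p * horizon, m)
        research = prior * poisson_tail(q * horizon, m)
        if research > cheat:
            return m
    return None


print(smallest_m(prior=0.2, p=0.5, q=1.0, horizon=2.0))
print(smallest_m(prior=0.2, p=1.0, q=0.5, horizon=2.0))
```

With these made-up numbers the threshold is m = 4: cheating is the better bet for one, two, or three successes, but research wins once four are required. When the rates are reversed so that cheating is the faster arm, no threshold exists and the function returns None. The real problem is dynamic, since the agent can switch arms and update beliefs over time, so this is only a static illustration of why the relative speed of the two arms matters.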
It is not stated in the paper, but it seems to me that if q is not greater than p, no incentive compatible scheme will elicit optimal experimentation by the agent. Perhaps we should have a General Theorem of Expert Testing: that experts can rarely be incentivized to tell the truth.
http://www.gtcenter.org/Downloads/Conf/Klein1044.pdf (Klein notes that this draft, from April, is incomplete and preliminary: I do not have a newer version)