Category Archives: Macroeconomics

Yuliy Sannikov and the Continuous Time Approach to Dynamic Contracting

The John Bates Clark Medal, given to the best economist in the United States under 40, went to Princeton’s Yuliy Sannikov today. The JBC has, in recent years, been tilted quite heavily toward applied empirical microeconomics, but the prize for Sannikov breaks that streak in striking fashion. Sannikov, it can fairly be said, is a mathematical genius and a high theorist of the first order. He is one of a very small number of people to win three gold medals at the International Math Olympiad – perhaps only Gabriel Carroll, another excellent young theorist, had an equally impressive mathematical background in his youth. Sannikov’s most famous work is in the pure theory of dynamic contracting, which I will spend most of this post discussing, but the methods he has developed turn out to have interesting uses in corporate finance and in macroeconomic models that wish to incorporate a financial sector without using linearization techniques that rob such models of much of their richness. A quick warning: Sannikov’s work is not for the faint of heart, and certainly not for those scared of an equation or two. Economists – and I count myself among this group – are generally scared of differential equations, as they don’t appear in most branches of economic theory (with exceptions, of course: Romer’s 1986 work on endogenous growth, the turnpike theorems, the theory of evolutionary games, etc.). As his work is incredibly technical, I will do my best to provide an overview of his basic technique and its uses without writing down a bunch of equations, but there really is no substitute for going to the mathematics itself if you find these ideas interesting.

The idea of dynamic contracting is an old one. Assume that a risk-neutral principal can commit to a contract that pays an agent on the basis of observed output, with that output being generated this year, next year, and so on. A risk-averse agent takes an unobservable action in every period, which affects output subject to some uncertainty. Payoffs in the future are discounted. Take the simplest possible case: there are two periods, an agent can either work hard or not, output is either 1 or 0, and the probability it is 1 is higher if the agent works hard than otherwise. The first big idea in the dynamic moral hazard literature of the late 1970s and early 1980s (in particular, Rogerson 1985 Econometrica, Lambert 1983 Bell J. Econ, Lazear and Moore 1984 QJE) is that the optimal contract will condition period 2 payoffs on whether there was a good or bad outcome in period 1; that is, payoffs are history-dependent. The idea is that you can use payoffs in period 2 to induce effort both in period 1 (because a good period 1 outcome raises the agent’s continuation value) and in period 2 (because there is a gap between the payment following good or bad outcomes in that period), getting more bang for your buck. Get your employee to work hard today by dangling a chance at a big promotion opportunity tomorrow, then actually give them the promotion if they work hard tomorrow.
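
The arithmetic behind this intuition is easy to sketch. Here is a toy calculation (all numbers are my own illustrative assumptions, not estimates from the papers above) showing how a promised continuation-value spread after a good first-period outcome substitutes for a risky first-period bonus:

```python
# Toy version of the history-dependence logic (illustrative numbers, assumed).
# Inducing effort in period 1 requires the utility the agent expects after a
# good outcome to exceed that after a bad outcome by enough to cover the
# effort cost:
#   (p_H - p_L) * (bonus_spread + delta * continuation_spread) >= c
p_H, p_L = 0.8, 0.4   # P(good outcome) with and without effort
c = 0.2               # utility cost of working hard
delta = 0.9           # discount factor

def min_bonus_spread(continuation_spread):
    """Smallest risky period-1 bonus spread that still induces effort."""
    return max(0.0, c / (p_H - p_L) - delta * continuation_spread)

static = min_bonus_spread(0.0)    # all incentives from today's bonus: 0.5
dynamic = min_bonus_spread(0.3)   # dangle a "promotion" worth 0.3 tomorrow
assert dynamic < static           # the risky current bonus can shrink
```

The promised promotion does part of the incentive work, so the risk-averse agent bears less current-pay risk for the same effort.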

The second big result is that dynamic moral hazard (caveat: at least in cases where saving isn’t possible) isn’t such a problem. In a one-shot moral hazard problem, there is a tradeoff between risk aversion and high-powered incentives. I either give you a big bonus when things go well and none if things go poorly (in which case you are induced to work hard, but may be unhappy because much of the bonus is based on things you can’t control), or I give you a fixed salary and hence you have no incentive to work hard. The reason this tradeoff disappears in a dynamic context is that when the agent takes actions over and over and over again, the principal can, using a Law of Large Numbers type argument, figure out exactly the frequency at which the agent has been slacking off. Further, when the agent isn’t slacking off, the uncertainty in output each period is just i.i.d., hence the principal can smooth out the agent’s bad luck, and hence as the discount rate goes to zero there is no tradeoff between providing incentives and the agent’s dislike of risk. Both of these results hold even in infinite-period models, where we just need to realize that all the agent cares about is her expected continuation value following every action, and hence we can analyze infinitely long problems in a very similar way to two-period problems (Spear and Srivastava 1987).
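
The Law of Large Numbers argument can be seen in a quick simulation (the probabilities and shirking frequency below are my assumptions for illustration): with enough repetitions, average output pins down exactly how often the agent shirked.

```python
# Sketch of the Law of Large Numbers argument (all parameters assumed for
# illustration): with many repetitions, average output reveals how often
# the agent has been slacking off.
import random

random.seed(0)
p_H, p_L = 0.8, 0.4      # P(output = 1) with and without effort
true_shirk_freq = 0.25   # hidden fraction of periods the agent shirks
T = 200_000

successes = 0
for _ in range(T):
    p = p_L if random.random() < true_shirk_freq else p_H
    successes += random.random() < p
mean_output = successes / T

# E[output] = p_H - shirk_freq * (p_H - p_L), so invert the sample mean:
estimated_shirk_freq = (p_H - mean_output) / (p_H - p_L)
assert abs(estimated_shirk_freq - true_shirk_freq) < 0.02
```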

Sannikov revisited this literature by solving for optimal or near-to-optimal contracts when agents take actions in continuous rather than discrete time. Note that the older literature generally used dynamic programming arguments and took the discount rate to a limit of zero in order to get interesting results. These dynamic programs generally were solved using approximations that formed linear programs, and hence precise intuition for why the model generated particular results in particular circumstances was hard to come by. Comparative statics in particular were tough – I can tell you whether an efficient contract exists, but it is hard to know how that efficient contract changes as the environment changes. Further, situations where discounting is positive are surely of independent interest – workers generally get performance reviews every year, contractors generally do not renegotiate continuously, etc. Sannikov wrote a model where the agent’s action continuously controls the drift of a Brownian output process (a nice analogue of the agent taking an action each period that generates output depending on that action plus a random term). The agent has the usual decreasing marginal utility of income, so as the agent gets richer over time, it becomes tougher to incentivize the agent with a few extra bucks of payment.

Solving for the optimal contract essentially involves solving two embedded dynamic optimization problems. The agent optimizes effort over time given the contract the principal committed to, and the principal chooses an optimal dynamic history-dependent contract given what the agent will do in response. The space of possible history-dependent contracts is enormous. Sannikov shows that you can massively simplify, and solve analytically, for the optimal contract using a three-step argument.

First, as in the discrete time approach, we can simplify things by noting that the agent only cares about her continuation value following every action she takes. The continuation value turns out to be a martingale (conditioning on history, my expectation of the continuation value tomorrow is just my continuation value today), and is basically just a ledger of the promises I have made to the agent about the future on the basis of what happened in the past. Therefore, to solve for the optimal contract, I should just solve for the optimal stochastic process that determines the continuation value over time. The Martingale Representation Theorem tells me exactly and uniquely what that stochastic process must look like, under the constraint that the continuation value accurately “tracks” past promises. This stochastic process turns out to have a particular analytic form with natural properties (e.g., if you pay flow utility today, you can pay less tomorrow) that depend on the actions the agent takes. Second, plug the agent’s incentive compatibility constraint into the equation for the stochastic process that determines the continuation value over time. Third, maximize profits for the principal given the stochastic process determining continuation payoffs that must be given to the agent. The principal’s problem determines an HJB equation which can be solved using Ito’s rule plus some effort in checking boundary conditions – I’m afraid these details are far too complex for a blog post. But the basic idea is that we wind up with an analytic expression for the optimal way to control the agent’s continuation value over time, and we can throw all sorts of comparative statics right at that equation.
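
To give a flavor of the first step, here is a discretized simulation of a continuation-value process of the Sannikov form dW = r(W − u(c) + h(a))dt + rY(dX − a dt); the particular utility function, effort cost, and controls below are placeholder assumptions for illustration, not his solved optimal policy.

```python
# Discretized continuation-value process (functional forms and the fixed
# controls c, a, Y are my placeholder assumptions, not the optimal contract):
#   dW = r*(W - u(c) + h(a)) dt + r*Y*(dX - a dt),  dX = a dt + sigma dB
import math, random

random.seed(1)
r, sigma, dt = 0.1, 1.0, 0.01
u = lambda c: math.sqrt(c)         # concave flow utility of consumption
h = lambda a: 0.5 * a * a          # convex effort cost

W = 1.0                            # initial promised utility
a, c, Y = 0.5, 0.25, 0.6           # fixed effort, consumption, sensitivity
path = [W]
for _ in range(1000):
    dB = random.gauss(0.0, math.sqrt(dt))
    dX = a * dt + sigma * dB       # observed output increment
    # The drift "keeps the books" on promises made and utility already paid;
    # the dX - a*dt term is the Brownian surprise that exposes the agent to
    # output risk and so provides incentives.
    W += r * (W - u(c) + h(a)) * dt + r * Y * (dX - a * dt)
    path.append(W)
```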

What does this method give us? Because the continuation value and the flow payoffs can be constructed analytically even for positive discount rates, we can actually answer questions like: should you use long-term incentives (continuation value) or short-term incentives (flow payoffs) more when, e.g., your workers have a good outside option? What happens as the discount rate increases? What happens if the uncertainty in the mapping between the agent’s actions and output increases? Answering questions of these types is very challenging, if not impossible, in a discrete time setting.

Though I’ve presented the basic Sannikov method in terms of incentives for workers, dynamic moral hazard – settings where unobservable actions control prices, output, or other economic outcomes, and hence where institutions and contracts shape those unobservable actions – is a widespread problem. Brunnermeier and Sannikov have a nice recent AER which builds on the intuition of Kiyotaki-Moore models of the macroeconomy with a financial accelerator. The essential idea is that small shocks in the financial sector may cause bigger real economy shocks due to deleveraging. Brunnermeier and Sannikov use the continuous-time approach to show important nonlinearities: minor financial shocks don’t do very much since investors and firms rely on their existing wealth, but major shocks off the steady state require capital sales which further depress asset prices and lead to further fire sales. A particularly interesting result is that if exogenous risk is low – the economy isn’t very volatile – then there isn’t much precautionary saving, and so a shock that hits the economy will cause major harmful deleveraging and hence endogenous risk. That is, the very calmness of the world economy since 1983 may have made the eventual recession in 2008 worse due to endogenous choices of cash versus asset holdings. Further, capital requirements may actually be harmful if they aren’t reduced following shocks, since those very capital requirements will force banks to deleverage, accelerating the downturn started by the shock.

Sannikov’s entire oeuvre is essentially a graduate course in a new technique, so if you find the results described above interesting, it is worth digging deep into his CV. He is a great choice for the Clark medal, particularly given the deep and rigorous applications of his theory he has pursued in recent years. There really is no simple version of his results, but his 2012 survey, his recent working paper on moral hazard in labor contracts, and his dissertation work published in Econometrica in 2007 are most relevant. In related work, we’ve previously discussed on this site David Rahman’s model of collusion with continuous-time information flow, a problem very much related to work by Sannikov and his coauthor Andrzej Skrzypacz, as well as Aislinn Bohren’s model of reputation, which is related to the single longest theory paper I’ve ever seen, Faingold and Sannikov’s Econometrica on the possibility of “fooling people” by pretending to be a type that you are not. I also like that this year’s JBC makes me look like a good prognosticator: Sannikov is one of a handful of names I’d listed as particularly deserving just two years ago when Gentzkow won!

“Ranking Firms Using Revealed Preference,” I. Sorkin (2015)

Roughly 20 percent of earnings inequality is not driven by your personal characteristics or the type of job you work at, but by the precise firm you work for. This is odd. In a traditional neoclassical labor market, every firm should offer the same wage to workers with the same marginal productivity. If a firm doesn’t do so, surely its workers will quit and go to firms that pay better. One explanation is that since search frictions make it hard to immediately replace workers, firms with market power will wind up sharing rents with their employees. It is costly to search for jobs, but as your career advances, you try to move “up the job ladder” from positions that pay just your marginal product to positions that pay a premium: eventually you wind up as the city bus driver with the six-figure contract, and once there you don’t leave. But is this all that is going on?

Isaac Sorkin, a job market candidate from Michigan, correctly notes that workers care about the utility their job offers, not the wage. Some jobs stink even though they pay well: 80 hour weeks, high pressure bosses, frequent business travel to the middle of nowhere, low levels of autonomy, etc. We can’t observe the utility a job offers, of course, but this is a problem that always comes up in demand analysis. If a Chipotle burrito and a kale salad cost the same, but you buy the burrito, then you have revealed that you get more utility from the former; this is the old theory of revealed preference. Even though we rarely observe a single person choosing from a set of job offers, we do observe worker flows between firms. If we can isolate workers who leave their existing job for individual reasons, as distinct from those who leave because their entire firm suffers a negative shock, then their new job is “revealed” better. Intuitively, we see a lot of lawyers quit to run a bed and breakfast in Vermont, but basically zero lawyers quitting to take a mining job that pays the same as running a B&B, hence the B&B must be a “better job” than mining, and further if we don’t see any B&B owners quitting to become lawyers, the B&B must be a “better job” than corporate law even if the pay is lower.

A sensible idea, then: the same worker may be paid different amounts in relation to marginal productivity either because they have moved up the job ladder and luckily landed at a firm with market power and hence pay above marginal product (a “good job”), or because different jobs offer different compensating differentials (in which case high paying jobs may actually be “bad jobs” with long hours and terrible work environments). To separate the two rationales, we need to identify the relative attractiveness of jobs, for which revealed preference should work. The problem in practice is both figuring out which workers are leaving for individual reasons, and getting around the problem that it is unusual to observe in the data a nonzero number of people going from firm A to firm B and vice versa.

Sorkin solves these difficulties in a very clever way. Would you believe the secret is to draw on the good old Perron-Frobenius theorem, a trusted tool of microeconomists interested in network structure? How could that be? Workers meet firms in a search process, with firms posting offers in terms of a utility bundle of wages plus amenities. Each worker also has idiosyncratic tastes about things like where to live, how they like the boss, and so on. The number of folks that move voluntarily from job A to job B depends on how big firm A is (bigger firms have more workers that might leave), how frequently A has no negative productivity shocks (in which case moves are voluntary), and the probability a worker from A is offered a job at B when matched and accepts it, which depends on the relative utilities of the two jobs including the individual idiosyncratic portion. An assumption about the distribution of idiosyncratic utility across jobs allows Sorkin to translate probabilities of accepting a job into relative utilities.

What is particularly nice is that the model gives a linear restriction on any two job pairs: the relative probability of moving from A to B instead of B to A depends on the relative utility (abstracting from idiosyncratic portions) adjusted for firm size and offer probability. That is, if M(A,B) is the number of moves from A to B, and V(A) is a (defined in the paper) function of the non-idiosyncratic utility of job A, then

M(A,B)/M(B,A) = V(B)/V(A)

and hence

M(A,B)V(A) = M(B,A)V(B)

Taking this to data is still problematic because we need to restrict to job changes that are not just “my factory went out of business”, and because M(A,B) or M(B,A) are zero for many firm pairs. The first problem is solved by estimating the probability a given job switch is voluntary using the fact that layoff probability is related to the size and growth rate of a firm. The second problem can be solved by noting that if we sum the previous equation over all firms B not equal to A, we have

sum(B!=A)M(A,B)*V(A) = sum(B!=A)M(B,A)*V(B)

and hence

V(A) = sum(B!=A)M(B,A)*V(B) / sum(B!=A)M(A,B)

The numerator is the number of hires A makes weighted by the non-idiosyncratic utility of the firms the hires come from, and the denominator is the number of people that leave firm A. There is one such linear restriction per firm, but the utility of firm A depends on the utility of all firms. How to avoid this circularity? Write the linear restrictions in matrix form, and use the Perron-Frobenius theorem to see that the relative values of V are determined by a particular eigenvector as long as the matrix of moves is strongly connected! Strongly connected just means that there is at least one chain of moves between employers that can get me from firm A to B and vice versa, for all firm pairs. All that’s left to do now is to take this to the data (not a trivial computational task, since there are so many firms in the US data that calculating eigenvectors will require some numerical techniques).
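
Here is a minimal numerical sketch of that eigenvector computation, with a made-up three-firm matrix of move counts standing in for Sorkin's data:

```python
# Minimal version of the fixed point (the 3x3 move counts are invented for
# illustration). M[a][b] counts voluntary moves from firm a to firm b; the
# restriction summed over partners is
#   V[a] * (total moves out of a) = sum_b M[b][a] * V[b].
import numpy as np

M = np.array([[0., 10., 2.],
              [4.,  0., 6.],
              [1.,  3., 0.]])
outflows = M.sum(axis=1)

# Writing U = outflows * V, each iteration below multiplies U by a
# column-stochastic matrix, so by Perron-Frobenius a strictly positive fixed
# point exists (and power iteration converges) whenever the move graph is
# strongly connected.
V = np.ones(len(M))
for _ in range(1000):
    V = (M.T @ V) / outflows
    V /= V.max()        # only relative utilities are identified

assert np.allclose(V * outflows, M.T @ V)   # the linear restrictions hold
```

With millions of firms one would use a sparse eigensolver rather than dense power iteration, but the logic is the same.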

So what do we learn? Industries like education offer high utility compared to pay, and industries like mining offer the opposite, as you’d expect. Many low paying jobs offer relatively high nonpay utility, and many female-dominated sectors do as well, implying the measured earnings inequality and gender gaps may be overstating the true extent of utility inequality. That is, a teacher making half what a miner makes is partly reflective of the fact that mining is a job that requires compensating differentials to make up for long hours in the dark and dangerous mine shaft. Further, roughly two thirds of the earnings inequality related to firms seems to be reflecting compensating differentials, and since just over 20% of earnings inequality in the US is firm related, this means that about 15% of earnings inequality is just reflecting the differential perceived quality of jobs. This is a surprising result, and it appears to be driven by differences in job amenities that are not easy to measure. Goldman Sachs is a “good job” despite relatively low pay compared to other finance firms because they offer good training and connections. This type of amenity is hard to observe, but Sorkin’s theoretical approach based on revealed preference allows the econometrician to “see” these types of differences across jobs, and hence to more properly understand which jobs are desirable. This is another great example of a question – how does the quality of jobs differ and what does that say about the nature of earnings inequality – that is fundamentally unanswerable by methodological techniques that are unwilling to inject some theoretical assumptions into the analysis.

November 2015 Working Paper. Sorkin has done some intriguing work using historical data on the minimum wage as well. Essentially, minimum wage changes that are not indexed to inflation are only temporary in real terms, so if it is costly to switch from labor to machines, you might not do so in response to a “temporary” minimum wage shock. But a permanent increase does appear to cause long run shifts away from labor, something Sorkin sees in industries from apparel in the early 20th century to fast food restaurants. Simon Jäger, a job candidate from Harvard, also has an interesting purely empirical paper about frictions in the labor market, taking advantage of early deaths of German workers. When these deaths happen, workers in similar roles at the firm see higher wages and lower separation probability for many years, whereas other coworkers see lower wages, with particularly large effects when the dead worker had unusual skills. All quite intuitive from a search model theory of labor, where workers are partial substitutes for folks with the same skills, but complements for folks with firm-specific capital but dissimilar skills. Add these papers to the evidence that efficiency in the search-and-matching process of labor to firms is a first order policy problem.

“Designing Efficient College and Tax Policies,” S. Findeisen & D. Sachs (2014)

It’s job market season, which is a great time of year for economists because we get to read lots of interesting papers. This one, by Dominik Sachs from Cologne and his coauthor Sebastian Findeisen, is particularly relevant given the recent Obama policy announcement about further subsidizing community college. The basic facts about marginal college students are fairly well-known: there is a pretty substantial wage bump for college grads (including ones who are not currently attending but who would attend if college were a little cheaper), many do not go to college even given this wage bump, there are probably externalities in both the economic and social realms from having a more educated population though these are quite hard to measure, borrowing constraints bind for some potential college students but don’t appear to be that important, and it is very hard to design policies which benefit only marginal college candidates without also subsidizing those who would go whether or not the subsidy existed.

The naive thought might be “why should we subsidize college in the absence of borrowing constraints? By revealed preference, people choose not to go to college even given the wage bump, which likely implies that for many people studying and spending time going to class gives negative utility. Given the wage bump, these people are apparently willing to pay a lot of money to avoid spending time in college. The social externalities of college probably exist, but in general equilibrium more college-educated workers might drive down the return to college for people who are currently going. Therefore, we ought not distort the market.”

However, Sachs and Findeisen point out that there is also a fiscal externality: higher wages equals higher tax revenue in the future, and only the government cares about that revenue. Even more, the government is risk-neutral, or at least less risk-averse than individuals, about that revenue; people might avoid going to college if, along with bumping up their expected future wages, college also introduces uncertainty into their future wage path. If a subsidy could be targeted largely to students on the margin rather than those currently attending college, and if those marginal students see a big wage bump, and if government revenue less transfers back to the taxpayer is high, then it may be worth it for the government to subsidize college even if there are no other social benefits!

The authors write a nice little structural model. People choose to go to college or not depending on their innate ability, their parents’ wealth, the cost of college, the wage bump they expect (and the variance thereof), and their personal taste or distaste for studying as opposed to working (“psychic costs”). All of those variables aside from personal taste and innate ability can be pulled out of U.S. longitudinal data, performance on the Armed Forces Qualifying Test can proxy for innate ability, and given distributional assumptions, we can identify the last free parameter, personal taste, by assuming that people go to college only if their lifetime discounted utility from attendance, less psychic costs, exceeds the lifetime utility from working instead. A choice model of this type seems to match data from previous studies with quasirandom variation concerning the returns to college education.

The direct fiscal effect of a subsidy policy, then, is the proportion of subsidized students who would not have gone to college but for the subsidy, times the discounted lifetime wage bump for those students, times government tax revenue as a percent of that wage bump, minus the cost of the subsidy times the number subsidized. The authors find that a general college subsidy program nearly pays for itself: if you subsidize everyone there aren’t many marginal students, but even for those students the wage bump is substantial. Targeting low income students is even better. Though the low income students affected on the margin tend to be less academically gifted, and hence to earn a lower (absolute) increase in wages from going to college, subsidies targeted at low income students do not waste as much money subsidizing students who would go to college anyway (i.e., a large percentage of high income kids). Note that the subsidies are small enough in absolute terms that the distortion on parental labor supply, from working less in order to qualify for subsidies, is of no quantitative importance, a fact the authors show rigorously. Merit-based subsidies will attract better students who have more to gain from going to college, but they also largely affect people who would go to college anyway, hence offer less bang for the buck to the government compared to need-based grants.
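
A back-of-the-envelope version of that calculation makes the logic concrete; every number below is an assumption of mine for illustration, not an estimate from the paper.

```python
# Back-of-the-envelope fiscal externality calculation (all numbers assumed).
subsidy_per_year = 2_000
years = 4
n_subsidized = 1_000
wage_bump_pv = 200_000   # discounted lifetime earnings gain per marginal student
tax_share = 0.25         # government's take of that gain

def net_fiscal_effect(frac_marginal):
    """Extra tax revenue from marginal students minus total subsidy cost."""
    cost = subsidy_per_year * years * n_subsidized
    revenue = frac_marginal * n_subsidized * wage_bump_pv * tax_share
    return revenue - cost

# A blanket subsidy with few marginal students loses money; targeting a group
# where a fifth of recipients are marginal flips the sign.
assert net_fiscal_effect(0.05) < 0 < net_fiscal_effect(0.20)
```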

The authors have a nice calibrated model in hand, so there are many more questions they ask beyond the direct partial equilibrium benefits of college attendance. For example, in general equilibrium, if we induce people to go to college, the college wage premium will fall. But note that wages for non-college-grads will rise in relative terms, so the net effect of the grants discussed in the previous paragraph on government revenue is essentially unchanged. Further, as Nate Hilger found using quasirandom variation in income due to layoffs, liquidity constraints do not appear to be terribly important for the college-going decision: it is increasing grants, not changing loan eligibility, that will do anything of importance for college attendance.

November 2014 working paper (No IDEAS version). The authors have a handful of other very interesting papers in the New Dynamic Public Finance framework, which is blazing hot right now. As far as I understand the project of NDPF, we can simplify the (technically all-but-impossible-to-solve) dynamic mechanism design problem of setting optimal taxes and subsidies under risk aversion and savings behavior to an equivalent reduced form that depends only on simple first order conditions and a handful of elasticities. Famously, in this framework it is no longer obvious that capital taxation should be zero.

Labor Unions and the Rust Belt

I’ve got two nice papers for you today, both exploring a really vexing question: why is it that union-heavy regions of the US have fared so disastrously over the past few decades? In principle, it shouldn’t matter: absent any frictions, a rational union and a profit-maximizing employer ought both desire to take whatever actions generate the most total surplus for the firm, with union power simply affecting how those rents are shared between management, labor and owners. Nonetheless, we notice empirically a couple of particularly odd facts. First, especially in the US, union-dominated firms tend to limit adoption of new, productivity-enhancing technology; the late adoption of the radial tire among U.S. firms is a nice example. Second, unions often negotiate not only about wages but about “work rules”, insisting upon conditions like inflexible employee roles. A great example here is a California longshoremen contract which insisted upon a crew whose sole job was to stand and watch while another crew did the job. Note that preference for leisure can’t explain this, since surely taking that leisure at home rather than standing around the worksite would be preferable for the employees!

What, then, might drive unions to push so hard for seemingly “irrational” contract terms, and how might union bargaining power under various informational frictions or limited commitment affect the dynamic productivity of firms? “Competition, Work Rules and Productivity” by the BEA’s Benjamin Bridgman discusses the first issue, and a new NBER working paper, “Competitive Pressure and the Decline of the Rust Belt: A Macroeconomic Analysis” by Alder, Lagakos and Ohanian covers the second; let’s examine these in turn.

First, work rules. Let a union care first about keeping all members employed, and second about keeping wages as high as possible given full employment. Assume that the union cannot negotiate the price at which products are sold. Abstractly, work rules are most like a fixed cost that is a complete waste: no matter how much we produce, we have to incur some bureaucratic cost of guys standing around and the like. Firms will set marginal revenue equal to marginal cost when deciding how much to produce, and at what price that production should be sold. Why would the union like these wasteful costs?

Let firm output given n workers just be n-F, where n is the number of employees, and F is how many of them are essentially doing nothing because of work rules. The firm chooses price p and the number of employees n given demand D(p) and wage w to maximize p*D(p)-w*n, subject to total production being feasible, D(p)=n-F. Note that, as long as total firm profits under optimal pricing exceed the cost of F, the firm stays in business and its pricing decision, letting marginal revenue equal marginal cost, is unaffected by F. That is, the optimal production quantity does not depend on F. However, the total amount of employment does depend on F, since to produce quantity D(p) you need to employ D(p)+F workers. Hence there is a tradeoff if the union only negotiates wages: to employ more people, you need a lower wage, but using wasteful work rules, employment can be kept high even when wages are raised. Note also that F is limited by the total rents earned by the firm, since if work rules are particularly onerous, firms that are barely breaking even without work rules will simply shut down. Hence in more competitive industries (formally, when demand is more elastic), work rules are less likely to be imposed by unions. Bridgman also notes that if firms can choose technology (output is An-F, where A is the level of technology), then unions will resist new technology unless they can impose more onerous work rules, since more productive technology lowers the number of employees needed to produce a given amount of output.
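
A toy monopoly version of this logic (linear demand and all parameter values are my assumptions) confirms that the chosen quantity is invariant to F while employment rises one-for-one with it:

```python
# Toy monopoly with work rules (linear demand D(p) = a - b*p; all parameter
# values assumed). Producing quantity q takes n = q + F workers, where F
# workers "stand and watch".
def optimum(w, F, a=100.0, b=1.0):
    """Grid-search the profit-maximizing price; return (profit, q, n)."""
    def outcome(p):
        q = a - b * p
        n = q + F                      # employment includes the idle crew
        return p * q - w * n, q, n
    return max((outcome(x * 0.1) for x in range(1001)), key=lambda t: t[0])

profit1, q1, n1 = optimum(w=10.0, F=15.0)
profit0, q0, n0 = optimum(w=10.0, F=0.0)

assert q1 == q0                        # optimal quantity is unaffected by F...
assert abs(n1 - (n0 + 15.0)) < 1e-9    # ...but employment rises one-for-one
assert profit1 < profit0               # work rules eat into firm rents
```

Since profit with work rules differs from profit without them by the constant w*F, the price and quantity choice is unchanged; only employment and rents move, which is exactly the union's tradeoff.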

This is a nice result. Note that the work rule requirements have nothing to do with employees not wanting to work hard, since work rules in the above model are a pure waste and generate no additional leisure time for workers. Of course, this result really hinges on limiting what unions can bargain over: if they can select the level of output, or can impose the level of employment directly, or can permit lump-sum transfers from management to labor, then unionized firms will produce at the same productivity as non-unionized firms. Information frictions, among other worries, might be a reason why we don’t see these types of contracts at some unionized firms. With this caveat in mind, let’s turn to the experience of the Rust Belt.

The U.S. Rust Belt, roughly made up of states surrounding the Great Lakes, saw a precipitous decline from the 1950s to today. Alder et al present the following stylized facts: the share of manufacturing employment in the U.S. located in the Rust Belt fell from the 1950s to the mid-1980s, there was a large wage gap between Rust Belt and other U.S. manufacturing workers during this period, Rust Belt firms were less likely to adopt new innovations, and labor productivity growth in Rust Belt states was lower than the U.S. average. After the mid-1980s, Rust Belt manufacturing firms begin to look a lot more like manufacturing firms in the rest of the U.S.: the wage gap is essentially gone, the employment share stabilizes, strikes become much less common, and productivity growth is similar. What happened?

In a nice little model, the authors point out that output market competition (do I have lots of market power?) and labor market bargaining power (are my workers powerful enough to extract a lot of my rents?) interact in an interesting way when firms invest in productivity-increasing technology and when unions cannot commit to avoid a hold-up problem by striking for a better deal after the technology investment cost is sunk. Without commitment, stronger unions will optimally bargain away some of the additional rents created by adopting an innovation; hence unions function as a type of tax on innovation. With sustained market power, firms have an ambiguous incentive to adopt new technology – on the one hand, they already have a lot of market power and hence better technology will not win them many additional sales, but on the other hand, having market power in the future makes investments today more valuable. Calibrating the model with reasonable parameters for market power, union strength, and various elasticities, the authors find that roughly 2/3 of the decline in the Rust Belt’s manufacturing share can be explained by strong unions and little output market competition decreasing the incentive to invest in upgrading technology. After the 1980s, declining union power and more foreign competition limited both disincentives, and the Rust Belt saw little further decline.

Note again that unions and firms rationally took actions that lowered the total surplus generated in their industry, and that if the union could have committed not to hold up the firm after an innovation was adopted, optimal technology adoption would have been restored. Alder et al cite some interesting quotes from union heads suggesting that the confrontational nature of U.S. management-union relations led to a belief that management’s job is to generate profits, and the union’s job is to secure part of those profits for its members. Both papers discussed here show that this type of division, by limiting the nature of bargains which can be struck, can have calamitous effects for both workers and firms.

Bridgman’s latest working paper version is here (RePEc IDEAS page); the latest version of Alder, Lagakos and Ohanian is here (RePEc IDEAS). David Lagakos in particular has a very nice set of recent papers about why services and agriculture tend to have such low productivity, particularly in the developing world; despite his macro background, I think he might be a closet microeconomist!

“Wall Street and Silicon Valley: A Delicate Interaction,” G.-M. Angeletos, G. Lorenzoni & A. Pavan (2012)

The Keynesian Beauty Contest – is there any better example of an “old” concept in economics that, when read in its original form, is just screaming out for a modern analysis? You’ve got coordination problems, higher-order beliefs, signal extraction about underlying fundamentals, optimal policy response by a planner herself informationally constrained: all of these, of course, problems that have consumed micro theorists over the past few decades. The general problem with irrational exuberance, once we start to model things formally, is that it turns out to be very difficult to generate “irrational” actions by rational, forward-looking agents. Angeletos et al have a very nice model that can generate irrational-looking asset price movements even when all agents are perfectly rational, based on the idea of information frictions between the real and financial sector.

Here is the basic plot. Entrepreneurs get an individual signal and a correlated signal about the “real” state of the economy (the correlation in error about fundamentals may be a reduced-form measure of previous herding, for instance). The entrepreneurs then make a costly investment. In the next period, some percentage of the entrepreneurs have to sell their asset on a competitive market. This may represent, say, idiosyncratic liquidity shocks, but really it is just in the model to abstract away from the finance sector learning about entrepreneur signals based on the extensive margin choice of whether to sell or not. The price paid for the asset depends on the financial sector’s beliefs about the real state of the economy, which come from a public noisy signal and the trader’s observations about how much investment was made by entrepreneurs. Note that the price traders pay is partially a function of trader beliefs about the state of the economy derived from the total investment made by entrepreneurs, and the total investment made is partially a function of the price at which entrepreneurs expect to be able to sell capital should a liquidity crisis hit a given firm. That is, higher order beliefs of both the traders and entrepreneurs about what the other aggregate class will do determine equilibrium investment and prices.

What does this imply? Capital investment is higher in the first stage if either the state of the world is believed to be good by entrepreneurs, or if the price paid in the following period for assets is expected to be high. Traders will pay a high price for an asset if the state of the world is believed to be good. These traders look at capital investment and essentially see another noisy signal about the state of the world. When an entrepreneur sees a correlated signal that is higher than his private signal, he increases investment due to a rational belief that the state of the world is better, but then increases it even more because of an endogenous strategic complementarity among the entrepreneurs, all of whom prefer higher investment by the class as a whole since that leads to more positive beliefs by traders and hence higher asset prices tomorrow. Of course, traders understand this effect, but a fixed point argument shows that even accounting for the aggregate strategic increase in investment when the correlated signal is high, aggregate capital can be read by traders precisely as a noisy signal of the actual state of the world. This means that when entrepreneurs invest partially on the basis of a signal correlated among their class (i.e., there are information spillovers), investment is based too heavily on noise. An overweighting of public signals in a type of coordination game is right along the lines of the lesson in Morris and Shin (2002). Note that the individual signals for entrepreneurs are necessary to keep the traders from being able to completely invert the information contained in capital production.
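The mechanics of why correlated noise survives aggregation while idiosyncratic noise washes out can be seen in a stylized simulation (this is my own illustration with made-up precisions, not the authors’ model):

```python
import random, statistics

# Each of N entrepreneurs sees a private signal x_i = theta + eps_i and
# a correlated signal y = theta + u shared by the whole class. With
# equal signal precisions, the Bayesian weight on each signal is 1/2.
def aggregate_investment(theta, n_entrepreneurs, sd, rng):
    y = theta + rng.gauss(0, sd)                      # common noise u
    posts = [0.5 * (theta + rng.gauss(0, sd)) + 0.5 * y
             for _ in range(n_entrepreneurs)]         # posterior means
    return sum(posts) / n_entrepreneurs               # aggregate capital K

rng = random.Random(0)
draws = [aggregate_investment(theta=0.0, n_entrepreneurs=500, sd=1.0, rng=rng)
         for _ in range(2000)]
# The 500 idiosyncratic errors average out, but the common noise u does
# not: aggregate K inherits roughly half of u's standard deviation, so
# traders reading K as a signal of theta are partly reading noise.
print(f"sd of aggregate investment: {statistics.stdev(draws):.3f}")
```

With these weights the standard deviation of aggregate investment stays near 0.5 no matter how many entrepreneurs there are, which is exactly the sense in which the correlated signal injects noise that aggregation cannot remove.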

What can a planner who doesn’t observe these signals do? Consider taxing investment as a function of asset prices, where high taxes appear when the market gets particularly frothy. This is good on the one hand: entrepreneurs build too much capital following a high correlated signal because other entrepreneurs will be doing the same and therefore traders will infer the state of the world is high and pay high prices for the asset. Taxing high asset prices lowers the incentive for entrepreneurs to shade capital production up when the correlated signal is good. But this tax will also lower the incentive to produce more capital when the actual state of the world, and not just the correlated signal, is good. The authors discuss how taxing capital and the financial sector separately can help alleviate that concern.

Proving all of this formally, it should be noted, is quite a challenge. And the formality is really a blessing, because we can see what is necessary and what is not if a beauty contest story is to explain excess aggregate volatility. First, we require some correlation in signals in the real sector to get the Morris-Shin effect operating. Second, we do not require the correlation to be on a signal about the real world; it could instead be correlation about a higher order belief held by the financial sector! The correlation merely allows entrepreneurs to figure something out about how much capital they as a class will produce, and hence about what traders in the next period will infer about the state of the world from that aggregate capital production. Instead of a signal that correlates entrepreneur beliefs about the state of the world, then, we could have a correlated signal about higher-order beliefs, say, how traders will interpret how entrepreneurs interpret how traders interpret capital production. The basic mechanism will remain: traders essentially read from aggregate actions of entrepreneurs a noisy signal about the true state of the world. And all this beauty contest logic holds in an otherwise perfectly standard New Keynesian rational expectations model!

2012 working paper (IDEAS version). This paper used to go by the title “Beauty Contests and Irrational Exuberance”; I prefer the old name!

Dale Mortensen as Micro Theorist

Northwestern’s sole Nobel Laureate in economics, Dale Mortensen, passed overnight; he remained active as a teacher and researcher over the past few years, though I had been hearing word through the grapevine about his declining health over the past few months. Surely everyone knows Mortensen the macroeconomist for his work on search models in the labor market. There is something odd here, though: Northwestern has really never been known as a hotbed of labor research. To the extent that researchers rely on their coworkers to generate and work through ideas, how exactly did Mortensen become such a productive and influential researcher?

Here’s an interpretation: Mortensen’s critical contribution to economics is as the vector by which important ideas in micro theory entered real world macro; his first well-known paper is literally published in a 1970 book called “Microeconomic Foundations of Employment and Inflation Theory.” Mortensen had the good fortune to be a labor economist working in the 1970s and 1980s at a school with a frankly incredible collection of microeconomic theorists; during those two decades, Myerson, Milgrom, Loury, Schwartz, Kamien, Judd, Matt Jackson, Kalai, Wolinsky, Satterthwaite, Reinganum and many others were associated with Northwestern. And this was a rare condition! Game theory is everywhere today, and pioneers in that field (von Neumann, Nash, Blackwell, etc.) were active in the middle of the century. Nonetheless, by the late 1970s, game theory in the social sciences was close to dead. Paul Samuelson, the great theorist, wrote essentially nothing using game theory between the early 1950s and the 1990s. Quickly scanning the American Economic Review from 1970-1974, I find, at best, one article per year that can be called game-theoretic.

What is the link between Mortensen’s work and developments in microeconomic theory? The essential labor market insight of search models (an insight which predates Mortensen) is that the number of hires and layoffs is substantial even in the depth of a recession. That is, the rise in the unemployment rate cannot simply be because the marginal revenue of the potential workers is always less than the cost, since huge numbers of the unemployed are hired during recessions (as others are fired). Therefore, a model which explains changes in churn rather than changes in the aggregate rate seems qualitatively important if we are to develop policies to address unemployment. This suggests that there might be some use in a model where workers and firms search for each other, perhaps with costs or other frictions. Early models along this line by Mortensen and others were generally one-sided and hence non-strategic: they had the flavor of optimal stopping problems.
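For readers who have never seen one, the flavor of those early one-sided models is captured by the textbook McCall search problem, solved here by value iteration (the parameters are mine and purely illustrative):

```python
# A bare-bones McCall job-search model: each period an unemployed worker
# draws a wage offer, and either accepts (value w/(1-beta) forever) or
# rejects, collecting benefit b plus the discounted continuation value.
def reservation_wage(wages, b=0.5, beta=0.95, tol=1e-10):
    v = [w / (1 - beta) for w in wages]       # initial guess: accept all
    while True:
        cont = b + beta * sum(v) / len(v)     # value of rejecting an offer
        v_new = [max(w / (1 - beta), cont) for w in wages]
        if max(abs(a - c) for a, c in zip(v, v_new)) < tol:
            break
        v = v_new
    # Indifference pins down the reservation wage: w_r/(1-beta) = cont.
    return (1 - beta) * cont

wages = [i / 10 for i in range(1, 21)]        # uniform offers on 0.1 .. 2.0
print(f"reservation wage: {reservation_wage(wages):.3f}")
```

The optimal policy is a simple stopping rule — accept any offer above the reservation wage — and, as the text notes, nothing strategic happens because the distribution of offers is taken as given rather than generated by optimizing firms.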

Unfortunately, Diamond in a 1971 JET pointed out that Nash equilibrium in two-sided search leads to a conclusion that all workers are paid their reservation wage: all employers pay the reservation wage, workers believe this to be true hence do not engage in costly search to switch jobs, hence the belief is accurate and nobody can profitably deviate. Getting around the “Diamond Paradox” involved enriching the model of who searches when and the extent to which old offers can be recovered; Mortensen’s work with Burdett is a nice example. One also might ask whether laissez faire search is efficient or not: given the contemporaneous work of micro theorists like Glenn Loury on mathematically similar problems like the patent race, you might imagine that efficient search is unlikely.

Beyond the efficiency of matches themselves is the question of how to split surplus. Consider a labor market. In the absence of search frictions, Shapley (first with Gale, later with Shubik) had shown in the 1960s and early 1970s the existence of stable two-sided matches even when “wages” are included. It turns out these stable matches are tightly linked to the cooperative idea of a core. But what if this matching is dynamic? Firms and workers meet with some probability over time. A match generates surplus. Who gets this surplus? Surely you might imagine that the firm should have to pay a higher wage (more of the surplus) to workers who expect to get good future offers if they do not accept the job today. Now we have something that sounds familiar from non-cooperative game theory: wage is based on the endogenous outside options of the two parties. It turns out that noncooperative game theory had very little to say about bargaining until Rubinstein’s famous bargaining game in 1982 and the powerful extensions by Wolinsky and his coauthors. Mortensen’s dynamic search models were a natural fit for those theoretic developments.
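The Rubinstein split itself is simple enough to compute directly: with discount factors δ1 and δ2, the proposer’s equilibrium share of a unit surplus is (1-δ2)/(1-δ1δ2). A quick sketch of how patience maps into shares:

```python
def rubinstein_share(delta1, delta2):
    """Proposer's share of a unit surplus in the unique subgame-perfect
    equilibrium of Rubinstein's alternating-offers bargaining game."""
    return (1 - delta2) / (1 - delta1 * delta2)

# Equal, high patience: nearly an even split, with a slight first-mover edge.
print(f"{rubinstein_share(0.95, 0.95):.4f}")
# A more patient responder (a worker expecting good future offers, say)
# extracts far more of the surplus.
print(f"{rubinstein_share(0.95, 0.99):.4f}")
```

This is exactly the sense in which wages in a dynamic search model depend on endogenous outside options: anything that makes the worker more willing to wait shifts the equilibrium split toward the worker.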

I imagine that when people hear “microfoundations”, they have in mind esoteric calibrated rational expectations models. But microfoundations in the style of Mortensen’s work is much more straightforward: we simply cannot understand even the qualitative nature of counterfactual policy in the absence of models that account for strategic behavior. And thus the role for even high micro theory, which investigates the nature of uniqueness of strategic outcomes (game theory) and the potential for a planner to improve welfare through alternative rules (mechanism design). Powerful tools indeed, and well used by Mortensen.

“The Great Diversification and Its Unraveling,” V. Carvalho and X. Gabaix (2013)

I rarely post about macro papers here, but I came across this interesting result by Carvalho and Gabaix in the new AER. Particularly in the mid-2000s, it was fashionable to talk about a “Great Moderation” – many measures of economic volatility fell sharply right around 1983 and stayed low. Many authors studied the potential causes. Was it a result of better monetary policy (as seemed to be the general belief when I was working at the Fed) or merely good luck? Ben Bernanke summarized the rough outline of this debate in a 2004 speech.

The last few years have been disheartening for promoters of good policy, since many measures of economic volatility have again soared since the start of the financial crisis. So now we have two facts to explain: why did volatility decline, and why did it rise again? Dupor, among others, has pointed out the difficulty of generating aggregate fluctuations from sectoral shocks: if shocks are independent across sectors, their effects wash out and generate little aggregate volatility as the number of sectors grows large. Recent work by the authors of the present paper gets around that concern by noting the granularity of important sectors (Gabaix) or the network structure of economic linkages (Carvalho).
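Dupor’s diversification point reduces to one line of algebra: with independent sectoral shocks of common volatility σ and GDP shares s_i, aggregate volatility is σ times the square root of the Herfindahl of shares. A toy comparison (the fat-tailed size distribution is my illustrative choice, not a calibration):

```python
import math

def aggregate_vol(shares, sigma=0.1):
    """Aggregate volatility from independent sectoral shocks of common
    volatility sigma, given GDP shares: sigma * sqrt(sum of s_i^2)."""
    return sigma * math.sqrt(sum(s * s for s in shares))

n = 100
equal = [1 / n] * n                       # perfectly diversified economy
raw = [1 / k for k in range(1, n + 1)]    # fat-tailed sizes, share ~ 1/rank
granular = [r / sum(raw) for r in raw]

# Equal shares give sigma/sqrt(n) = 0.01; the granular economy, with the
# same number of sectors, retains more than double that volatility.
print(f"equal:    {aggregate_vol(equal):.4f}")
print(f"granular: {aggregate_vol(granular):.4f}")
```

With equal shares, doubling the number of sectors keeps shrinking aggregate volatility toward zero; with a fat-tailed size distribution, the largest sectors never stop mattering, which is the granularity escape route.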

In this paper, the authors show theoretically that a measure of “fundamental volatility” in total factor productivity should be linked to volatility in GDP, and that this measure is essentially composed of three factors: the ratio of sectoral gross output to value added, a diversification effect, where volatility declines as value-added shares are spread across more sectors, and a compositional effect, where volatility declines when the economy contains fewer sectors with high ratios of gross output to GDP. They then compare their measure of fundamental volatility, constructed using data on 88 sectors, to overall GDP volatility, and find that it fits the data well, both in the US and overseas.

What, then, caused the shifts in fundamental volatility? The decline in the early 1980s appears to be heavily driven by a declining share of the economy in machinery, primary metals and the like. These sectors are heavily integrated into other areas of the economy as users and producers of intermediate inputs, so it is no surprise that a decline in their share of the economy will reduce overall volatility. The rise in the 2000s appears to result almost wholly from the increasing importance of the financial sector, an individually volatile sector. Given that the importance of this sector had been rising since the late 1990s, a measure of fundamental volatility (or, better, a firm-level measure, which is difficult to do given currently existing data) could have provided an “early warning” that the Great Moderation would soon come to an end.

August 2012 working paper (IDEAS version). Final paper in AER 103(5), 2013.

Financial Crisis Reading Lists

The Journal of Economic Literature is really a great service to economists. It is a journal that publishes up-to-date literature reviews and ideas for future research in many small subfields that would otherwise be impenetrable. A recent issue published two articles trying to help us non-macro and finance guys catch up on the “facts” of the 2007 financial crisis. The first is Gorton and Metrick, “Getting up to Speed on the Global Financial Crisis: A One Weekend Reader’s Guide,” and the second is Andrew Lo’s “Reading About the Financial Crisis: A Twenty-One Book Review.”

There is a lot of popular confusion about what caused the financial crisis, what amplified it, and what the responses have been. Roughly, we can all agree that first there was a massive rise in house prices, not only in the US; second, there were new, enormous pools of institutional investments looking for safe returns, with many of these pools being operated by risk-averse Asian governments; third, house prices peaked in the US and elsewhere in 2006; fourth, in August 2007, problems with mortgage-related bonds led to interbank repo funding problems, requiring massive liquidity help from central banks; fifth, in September 2008, Lehman Brothers filed for bankruptcy, leading a money market fund to “break the buck” and causing massive flight away from assets related to investment banks or assets not explicitly backed by strong governments; sixth, a sovereign debt problem has arisen in a number of periphery countries since, particularly in Europe.

Looking through the summaries provided in these two reading lists, I only see four really firm additional facts. First, as Andrew Lo has pointed out many times, leverage at investment banks was not terribly high by global standards. Second, arguments that the crisis was caused by investment banks packaging worthless securities and then fooling buyers, while containing a grain of truth, do not explain the crisis: indeed, the bigger problem was how many of these worthless securities were still on banks’ own balance sheets in 2008, explicitly or implicitly through CDOs and other instruments. Third, rising total leverage across an economy as a whole is strongly related to banking crises, a point made best in Reinhart and Rogoff’s work, but also in a new AER paper by Schularick and Taylor. Fourth, the crisis in the financial sector transmitted to the real economy principally via restrictions on credit to real economy firms. Campello, Graham and Harvey, in a 2010 JFE, used a large-scale 2008 survey of global CFOs to show how firms who were credit constrained before the financial crisis were much more likely to have to cut back on hiring and investment spending, regardless of their profitability or the usefulness of their investment opportunities. That is, savings fled to safety because of the uncertain health of banking intermediaries, which led banks to cut back commercial lending, which led to a recession in the real economy.

What’s interesting is how little the mortgage market, per se, had to do with the crisis. I was at the Fed before the crisis, and remember coauthoring in early 2007 an internal memo about the economic effects of a downturn in the housing sector. The bubble in housing prices was obvious to (almost) everyone at the Fed. But the size of the mortgage market (in terms of wealth) and the construction and home-improvement sector (in terms of employment) was simply not that big; certainly, the massive stock losses after the dot-com bubble had more real effects because of declines in total wealth. I can only imagine that everyone on Wall Street also knew this. What was unexpected was the way in which these particular losses in wealth harmed the financial health of banks; in particular, because of the huge number of novel derivatives, the location of the losses was really opaque. And we know, both then because of the very-popular-at-the-Fed theoretical work of Diamond and Dybvig, and now because of the empirical work of Reinhart and Rogoff, that bank runs, whether in the proper or in the “shadow” banking system, have real effects that are very difficult to contain.

If you’ve got some free time this weekend, particularly if you’re not a macroeconomist, it’s worth looking through the references in the Lo and Gorton/Metrick papers.

Lo’s “Twenty-One Book Review” (2012) (IDEAS version). Gorton and Metrick’s “Getting Up to Speed” (IDEAS version).

“The Credit Crisis as a Problem in the Sociology of Knowledge,” D. Mackenzie (2011)

(Tip of the hat to Dan Hirschman for pointing out Mackenzie’s article)

The financial crisis, it is quite clear by now, will be the worst worldwide economic catastrophe since the Great Depression. There are many explanations involving mistaken or misused economic theory, rapaciousness, political decisions, ignorance, and many more; two interesting examples here are Alp Simsek’s job market paper from a couple years ago on the impact of overly optimistic potential buyers who need to get loans from sedate lenders (one takeaway for me was that financial problems can’t be driven by the ignorant masses, as they have no money), and Coval, Jurek and Stafford’s brilliant 2009 AER on economic catastrophe bonds (summary here), which points out how ridiculous it is to legally define risk in terms of default risk alone, since we have known for decades in theory that Arrow-Debreu securities’ values depend both on the payoffs in future states and on the relative prices in those states. A bond whose default occurs in catastrophic states ought to be worth much less – that is, pay a much higher yield – than an otherwise identical bond whose default is negatively correlated with background risk.
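The Coval-Jurek-Stafford point fits in a two-state Arrow-Debreu toy example (the numbers are mine and purely illustrative):

```python
# Two states, equal physical probability 0.5. State prices (my numbers):
# claims on the bad state cost more because consumption is scarce there.
def bond_price(payoffs, state_prices):
    """Arrow-Debreu price: sum of state payoffs times state prices."""
    return sum(x * q for x, q in zip(payoffs, state_prices))

state_prices = [0.3, 0.6]               # [good state, bad state]

defaults_in_catastrophe = [1.0, 0.0]    # pays only if the good state occurs
defaults_in_good_times  = [0.0, 1.0]    # pays only if the bad state occurs

pa = bond_price(defaults_in_catastrophe, state_prices)
pb = bond_price(defaults_in_good_times, state_prices)
# Identical 50% default probabilities, yet the bond that fails exactly
# in the catastrophe is worth half as much.
print(pa, pb)
```

A ratings system keyed only to default probability treats these two bonds as identical, which is precisely the absurdity the paper highlights.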

But the catastrophe also involves a sociological component. Markets are made: they don’t arise from thin air. Certain markets don’t exist for reasons of repulsion, as Al Roth has mentioned in the context of organ sales. Other markets don’t exist because the value of the proposed good in that market is not clear. Removing uncertainty and clarifying the nature of a good is an important precondition for a market to function, and one that economic sociologists, including Donald Mackenzie, have discussed at great length in their work. The evaluation of new products, perhaps not surprisingly, depends both on analogies to forms a firm has seen before, and on the particular parts of the firm who handle the evaluation.

Consider the ABS CDO – a collateralized debt obligation whose underlying collateral is itself securitized assets, most commonly mortgages. The ABS CDO market grew enormously in the 2000s, yet was not understood at nearly the level of traditional CDO or ABS evaluation, topics on which there are hundreds of research papers. ABS and CDO teams tended to be quite separate in investment banks and ratings agencies, with the CDO teams generally well trained in derivatives and the highly quantitative evaluation procedures of such products. For ABSs, particularly US mortgages, the implicit government guarantee against default meant that prepayment risk was the most important factor when pricing such securities. CDO teams, often dealing with corporate debt, were used to treating default correlation between the various corporations in a given CDO as the most important metric.
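To see why correlation mattered so much to the CDO side, consider a one-factor Gaussian copula sketch of a loan pool (a standard modeling device; the parameters here are my own, purely illustrative):

```python
import random, math

def senior_tranche_hit_rate(rho, n_loans=100, attach=0.3,
                            pd_threshold=-1.2816,  # ~10% marginal default prob
                            n_sims=10000, seed=0):
    """Fraction of simulations in which pool losses pierce the senior
    tranche attachment point. Loan i defaults when
    sqrt(rho)*M + sqrt(1-rho)*Z_i < pd_threshold, with M a common factor."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        m = rng.gauss(0, 1)
        defaults = sum(
            1 for _ in range(n_loans)
            if math.sqrt(rho) * m + math.sqrt(1 - rho) * rng.gauss(0, 1)
               < pd_threshold)
        if defaults / n_loans > attach:
            hits += 1
    return hits / n_sims

# Higher correlation leaves each loan's default probability at ~10% but
# makes extreme pool-wide losses far more likely.
print(f"rho=0.05: {senior_tranche_hit_rate(0.05):.4f}")
print(f"rho=0.60: {senior_tranche_hit_rate(0.60):.4f}")
```

Every individual loan looks the same in both runs; only the correlation differs, and it alone determines whether the “safe” senior tranche is actually safe. Misjudging that one parameter is exactly the kind of evaluation failure Mackenzie describes.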

Mackenzie gives exhaustive individual detail, but roughly, he does not blame greed or malfeasance for the massive default rates on even AAA-rated ABS CDOs. Rather, he describes how evaluation of ABS CDOs by ratings agencies accustomed to dealing with either ABSs or CDOs, but not both, could lead to an utter misunderstanding of risk. While it is perfectly possible to “drill down” a complex derivative into its constituent parts, then subject the individual derivative to a stress test against some macroeconomic hypothetical, this was rarely done, particularly by individual investors. Mackenzie also gives a brief story of why these assets, revealed in 2008 to be superbly high risk, were being held by the banks at all instead of sold off to hedge funds and pensions. Apparently, the assets held were generally ones with very low return and very low perceived risk which were created as a byproduct of the bundling that created the ABS CDOs. That is, arbitrage was created when individual ABSs were bundled into an ABS CDO, the mezzanine and other tranches aside from the most senior AAA tranche were sold off, and the “basically risk-free” senior tranches were held by the bank as they would be difficult to sell directly. The evaluation of the risk, of course, was mistaken.

This is a very interesting descriptive presentation of what happened in 2007 and 2008. (Final version from the May 2011 American Journal of Sociology)

