
William Baumol: Truly Productive Entrepreneurship

It seems this weblog has become an obituary page rather than a simple research digest of late. I am not even done writing on the legacy of Ken Arrow (don’t worry – it will come!) when news arrives that yet another product of the World War 2 era in New York City, an alumnus of the CCNY system, has passed away: the great scholar of entrepreneurship and one of my absolute favorite economists, William Baumol.

But we oughtn’t draw the line on his research simply at entrepreneurship, though I will walk you through his best piece in the area, a staple of my own PhD syllabus, on “creative, unproductive, and destructive” entrepreneurship. Baumol was also a great scholar of the economics of the arts, performing and otherwise, which were the motivation for his famous cost disease argument. He was a very skilled micro theorist, a talented economic historian, and a deep reader of the history of economic thought, a nice example of which is his 2000 QJE on what we have learned since Marshall. In all of these areas, his papers are a pleasure to read, clear, with elegant turns of phrase and the casual yet erudite style of an American who’d read his PhD in London under Robbins and Viner. That he has passed without winning his Nobel Prize is a shame – how great would it have been had he shared a prize with Nate Rosenberg before it was too late for them both?

Baumol is often naively seen as a Schumpeter-esque defender of the capitalist economy and the heroic entrepreneur, and that is only half right. Personally, his politics were liberal, and as he argued in a recent interview, “I am well aware of all the very serious problems, such as inequality, unemployment, environmental damage, that beset capitalist societies. My thesis is that capitalism is a special mechanism that is uniquely effective in accomplishing one thing: creating innovations, applying those innovations and using them to stimulate growth.” That is, you can find in Baumol’s work many discussions of environmental externalities, of the role of government in funding research, of the nature of optimal taxation. You can find many quotes where Baumol expresses interest in the policy goals of the left (though often solved with the mechanism of the market, and hence the right). Yet the core running through much of Baumol’s work is a rigorous defense, historically and theoretically grounded, of the importance of getting incentives correct for socially useful innovation.

Baumol differs from many other prominent economists of innovation because he is at his core a neoclassical theorist. He is not an Austrian like Kirzner or an evolutionary economist like Sid Winter. Baumol’s work stresses that entrepreneurs and the innovations they produce are fundamental to understanding the capitalist economy and its performance relative to other economic systems, but that the best way to understand the entrepreneur methodologically was to formalize her within the context of neoclassical equilibria, with innovation rather than price alone being “the weapon of choice” for rational, competitive firms. I’ve always thought of Baumol as being the lineal descendant of Schumpeter, the original great thinker on entrepreneurship and one who, nearing the end of his life and seeing the work of his student Samuelson, was convinced that his ideas should be translated into formal neoclassical theory.

A 1968 essay in the AER P&P laid out Baumol’s basic idea that economics without the entrepreneur is, in a line he would repeat often, like Hamlet without the Prince of Denmark. He clearly understood that we did not have a suitable theory for oligopoly and entry into new markets, or for the supply of entrepreneurs, but that any general economic theory needed to be able to explain why growth is different in different countries. Solow’s famous essay convinced much of the profession that the residual, interpreted then primarily as technological improvement, was the fundamental variable explaining growth, and Baumol, like many, believed those technological improvements came mainly from entrepreneurial activity.

But what precisely should the theory look like? Ironically, Baumol made his most productive step in a beautiful 1990 paper in the JPE which contains not a single formal theorem nor statistical estimate of any kind. Let’s define entrepreneurs as “persons who are ingenious or creative in finding ways to add to their wealth, power, or prestige”. These people may introduce new goods, or new methods of production, or new markets, as Schumpeter supposed in his own definition. But are these ingenious and creative types necessarily going to do something useful for social welfare? Of course not – the norms, institutions, and incentives in a given society may be such that the entrepreneurs perform socially unproductive tasks, such as hunting for new tax loopholes, or socially destructive tasks, such as channeling their energy into ever-escalating forms of warfare.

With the distinction between productive, unproductive, and destructive entrepreneurship in mind, we might imagine that the difference in technological progress across societies may have less to do with the innate drive of the society’s members, and more to do with the incentives for different types of entrepreneurship. Consider Rome, famously wealthy yet with very little in the way of useful technological diffusion: certainly the Romans appear less innovative than either the Greeks or Europe of the Middle Ages. How can a society both invent a primitive steam engine – via Hero of Alexandria – and yet see it used for nothing other than toys and religious ceremonies? The answer, Baumol notes, is that status in Roman society required one to get rich via land ownership, usury, or war; commerce was a task primarily for slaves and former slaves! And likewise in Song dynasty China, where imperial examinations were both the source of status and the ability to expropriate any useful inventions or businesses that happened to appear. In the European Middle Ages, incentives shift for the clever from developing war implements to the diffusion of technology like the water-mill under the Cistercians, and then back to weapons. These examples were expanded to every society from Ancient Mesopotamia to the Dutch Republic to the modern United States by a series of economically-minded historians in a wonderful collection of essays called “The Invention of Enterprise” which was edited by Baumol alongside Joel Mokyr and David Landes.

Now we are approaching a sort of economic theory of entrepreneurship – no need to rely on the whims of character, but instead focus on relative incentives. But we are still far from Baumol’s 1968 goal: incorporating the entrepreneur into neoclassical theory. The closest Baumol comes is in his work in the early 1980s on contestable markets, summarized in the 1981 AEA Presidential Address. The basic idea is this. Assume industries have scale economies, so oligopoly is their natural state. How worried should we be? Well, if there are no sunk costs and no entry barriers for entrants, and if entrants can siphon off customers quicker than incumbents can respond, then Baumol and his coauthors claimed that the market was contestable: the threat of entry is sufficient to keep the incumbent from exerting their market power. On the one hand, fine, we all agree with Baumol now that industry structure is endogenous to firm behavior, and the threat of entry clearly can restrain market power. But on the other hand, is this “ultra-free entry” model the most sensible way to incorporate entry and exit into a competitive model? Why, as Dixit argued, is it quicker to enter a market than to change price? Why, as Spence argued, does the unrealized threat of entry change equilibrium behavior if the threat is truly unrealized along the equilibrium path?

It seems that what Baumol was hoping this model would lead to was a generalized theory of perfect competition that permitted competition for the market rather than just in the market, since the competition for the market is naturally the domain of the entrepreneur. Contestable markets are too flawed to get us there. But the basic idea of game-theoretic endogenous market structure, as opposed to the old-fashioned idea that industry structure affects conduct affects performance, is clearly here to stay: antitrust is essentially applied game theory today. And once you have the idea of competition for the market, the natural theoretical model is one where firms compete to innovate in order to push out incumbents, incumbents innovate to keep away from potential entrants, and profits depend on the equilibrium time until the dominant firm shifts: I speak, of course, about the neo-Schumpeterian models of Aghion and Howitt. These models, still a very active area of research, are finally allowing us to rigorously investigate the endogenous rewards to innovation via a completely neoclassical model of market structure and pricing.

I am not sure why Baumol did not find these neo-Schumpeterian models to be the Holy Grail he’d been looking for; in his final book, he credits them for being “very powerful” but in the end holding different “central concerns”. He may have been mistaken in this interpretation. It proved quite interesting to give a careful second read of Baumol’s corpus on entrepreneurship, and I have to say it disappoints in part: the questions he asked were right, the theoretical acumen he possessed was up to the task, the understanding of history and qualitative intuition was second to none, but in the end, he appears to have been just as stymied by the idea of endogenous neoclassical entrepreneurship as the many other doyens of our field who took a crack at modeling this problem without, in the end, generating the model they’d hoped they could write.

Where Baumol has more success, and again it is unusual for a theorist that his most well-known contribution is largely qualitative, is in the idea of cost disease. The concept comes from Baumol’s work with William Bowen (see also this extension with a complete model) on the economic problems of the performing arts. It is a simple idea: imagine productivity in industry rises 4% per year, but “the output per man-hour of a violinist playing a Schubert quartet in a standard concert hall” remains fixed. In order to attract workers into music rather than industry, wages must rise in music at something like the rate they rise in industry. But then costs are increasing while productivity is not, and the arts look “inefficient”. The same, of course, is said for education, and health care, and other necessarily labor-intensive industries. Baumol’s point is that rising costs in unproductive sectors reflect necessary shifts in equilibrium wages rather than, say, growing wastefulness.
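The arithmetic behind cost disease can be sketched in a few lines. This is only an illustration under stated assumptions: the 4% growth rate comes from the example above, while the function name, the normalization of all starting values to 1, and the assumption that wages in both sectors exactly track industrial productivity are mine, not Baumol and Bowen’s.

```python
def relative_unit_cost(years, industry_growth=0.04):
    """Unit labor cost of a concert relative to a manufactured good.

    Assumptions (hypothetical, for illustration only): productivity and
    wages start at 1, industrial productivity grows at a constant rate,
    and wages in both sectors track industrial productivity.
    """
    industry_productivity = (1 + industry_growth) ** years
    wage = industry_productivity          # wages rise with industry productivity
    music_productivity = 1.0              # the quartet takes as long as ever
    industry_unit_cost = wage / industry_productivity  # stays at 1.0
    music_unit_cost = wage / music_productivity        # rises with the wage
    return music_unit_cost / industry_unit_cost


for t in (0, 10, 25, 50):
    print(t, round(relative_unit_cost(t), 2))
```

Under these assumptions the relative cost of the concert rises roughly sevenfold over fifty years even though nobody in the concert hall has become any more wasteful.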

How much can cost disease explain? The concept is so widely known by now that it is, in fact, used to excuse stagnant industries. Teaching, for example, requires some labor, but does anybody believe that it is impossible for R&D and complementary inventions (like the internet, for example) to produce massive productivity improvements? Is it not true that movie theaters now show opera live from the world’s great halls on a regular basis? Is it not true that my Google Home can, activated by voice, call up two seconds from now essentially any piece of recorded music I desire, for free? Distinguishing industries that are necessarily labor-intensive (and hence grow slowly) from those with rapid technological progress is a very difficult game, and one we ought to hesitate to play. But equally, we oughtn’t forget Baumol’s lesson: in some cases, in some industries, what appears to be fixable slack is in fact simply cost disease. We may ask, how was it that Ancient Greece, with its tiny population, put on so many plays, while today we hustle ourselves to small ballrooms in New York and London? Baumol’s answer, rigorously shown: cost disease. The “opportunity cost” of recruiting a big chorus was low, as those singers would otherwise have been idle or working unproductive fields gathering olives. The difference between Athens and our era is not simply that they were “more supportive of the arts”!

Baumol was incredibly prolific, so these suggestions for further reading are but a taste: An interview by Alan Krueger is well worth the read for anecdotes alone, like the fact that apparently one used to do one’s PhD oral defense “over whiskies and sodas at the Reform Club”. I also love his defense of theory, where if he is very lucky, his initial intuition “turn[s] out to be totally wrong. Because when I turn out to be totally wrong, that’s when the best ideas come out. Because if my intuition was right, it’s almost always going to be simple and straightforward. When my intuition turns out to be wrong, then there is something less obvious to explain.” Every theorist knows this: formalization has this nasty habit of refining our intuition and convincing us our initial thoughts actually contain logical fallacies or rely on special cases! Though known as an applied micro theorist, Baumol also wrote a canonical paper, with Bradford, on optimal taxation: essentially, if you need to raise $x in tax, how should you optimally deviate from marginal cost pricing? The history of thought is nicely diagrammed, and of course this 1970 paper was very quickly followed by the classic work of Diamond and Mirrlees. Baumol wrote extensively on environmental economics, drawing in many of his papers on the role nonconvexities in the social production possibilities frontier play when they are generated by externalities – a simple example of this effect, and the limitations it imposes on Pigouvian taxation, is in the link. More recently, Baumol has been writing on international trade with Ralph Gomory (the legendary mathematician behind a critical theorem in integer programming, and later head of the Sloan Foundation); their main theorems are not terribly shocking to those used to thinking in terms of economies of scale, but the core example in the linked paper is again a great example of how nonconvexities can overturn a lot of our intuition, in this case on comparative advantage.
Finally, beyond his writing on the economics of the arts, Baumol proved that there is no area in which he personally had stagnant productivity: an art major in college, he was also a fantastic artist in his own right, picking up computer-generated art while in his 80s and teaching for many years a course on woodworking at Princeton!

“Scale versus Scope in the Diffusion of New Technology,” D. Gross (2016)

I am spending part of the fall down at Duke University visiting the well-known group of innovation folks at Fuqua and co-teaching a PhD innovation course with Wes Cohen, who you may know via his work on Absorptive Capacity (EJ, 1989), the “Carnegie Mellon” survey of inventors with Dick Nelson and John Walsh, and his cost sharing R&D argument (article gated) with Steven Klepper. Last week, the class went over a number of papers on the diffusion of technology over space and time, a topic of supreme importance in the economics of innovation.

There are some canonical ideas in diffusion. First, cumulative adoption on the extensive margin – are you or your firm using technology X – follows an S-curve, rising slowly, then rapidly, then slowly again until peak adoption is reached. This fact is known to economists thanks to Griliches 1957 but the idea was initially developed by social psychologists and sociologists. Second, there are massive gaps in the ability of firms and nations to adopt and quickly diffuse new technologies – Diego Comin and Burt Hobijn have written a great deal on this problem. Third, the reason why technologies are slow to adopt depends on many factors, including social learning (e.g., Conley and Udry on pineapple growing in Ghana), pure epidemic-style network spread (the “Bass model”), capital replacement, “appropriate technologies” arriving once conditions are appropriate, and many more.
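To see how the epidemic-style “Bass model” mentioned above generates an S-curve, here is a minimal discrete-time sketch. The adoption hazard in the Bass model is p + q·F, where F is the share that has already adopted, p captures independent adoption and q captures imitation. The parameter values, the function name, and the one-period step are illustrative assumptions of mine, not estimates from any of the papers cited here.

```python
def bass_adoption(p=0.03, q=0.38, periods=30):
    """Cumulative adoption share over time in a discrete-time Bass model.

    p (innovation) and q (imitation) are illustrative values, not
    estimates; each period the not-yet-adopted share (1 - F) adopts
    with hazard p + q * F.
    """
    share = 0.0
    path = [share]
    for _ in range(periods):
        share += (p + q * share) * (1.0 - share)
        path.append(share)
    return path


path = bass_adoption()
new_adopters = [b - a for a, b in zip(path, path[1:])]
peak = new_adopters.index(max(new_adopters))
print("period with the most new adopters:", peak)
print("cumulative share at that point:", round(path[peak + 1], 2))
```

New adoption per period first rises, as imitation kicks in, and then falls as the pool of non-adopters empties, which is exactly the slow-fast-slow pattern of the S-curve.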

One that is very much underrated, however, is that technologies diffuse because they and their complements change over time. Dan Gross from HBS, another innovation scholar who likes delving into history, has a great example: the early tractor. The tractor was, in theory, invented in the 1800s, but was uneconomical and not terribly useful. With an invention by Ford in the 1910s, tractors began to spread, particularly in the US wheat belt. The tractor eventually spread to the rest of the Midwest in the late 1920s and 1930s. A back-of-the-envelope calculation by Gross suggests the latter diffusion saved something like 10% of agricultural labor in the areas where it spread. Why, then, was there such a lag in many states?

There are many hypotheses in the literature: binding financial constraints, differences in farm sizes that make tractors feasible in one area and not another, geographic spread via social learning, and so on. Gross’ explanation is much more natural: early tractors could not work with crops like corn, and it wasn’t until after a general purpose tractor was invented in the 1920s that complementary technologies were created allowing the tractor to be used on a wide variety of farms. The charts are wholly convincing on this point: tractor diffusion time is very much linked to dominant crop, the early tractor “skipped” geographies where it was inappropriate, and farms in areas where tractors diffused late nonetheless had substantial diffusion of automobiles, suggesting capital constraints were not the binding factor.

But this leaves one more question: why didn’t someone modify the tractor to make it general purpose in the first place? Gross gives a toy model that elucidates the reason quite well. Assume there is a large firm that can innovate on a technology, and can develop either a general purpose version or applied versions of the technology. Assume that there is a fringe of firms that can develop complementary technology to the general purpose one (a corn harvester, for instance). If the large firm is constrained in how much innovation it can perform at any one time, it will first work on the project with highest return. If the large firm could appropriate the rents earned by complements – say, via a licensing fee – it would like to do so, but that licensing fee would decrease the incentive to develop the complements in the first place. Hence the large firm may first work on direct applications where it can capture a larger share of rents. This will imply that technology diffuses slowly first because applications are very specialized, then only as the high-return specialties have all been developed will it become worthwhile to shift researchers over to the general purpose technology. The general purpose technology will induce complements and hence rapid diffusion. As adoption becomes widespread, the rate of adoption slows down again. That is, the S-curve is merely an artifact of differing incentives to change the scope of an invention. Much more convincing than reliance on behavioral biases!

2016 Working Paper (RePEc IDEAS version). I have a paper with Jorge Lemus at Illinois on the problem of incentivizing firms to work on the right type of project, and the implications thereof. We didn’t think in terms of product diffusion, but the incentive to create general purpose technologies can absolutely be added straight into a model of that type.

On the economics of the Neolithic Revolution

The Industrial and Neolithic Revolutions are surely the two fundamental transitions in the economic history of mankind. The Neolithic involved permanent settlement of previously nomadic, or at best partially foraging, small bands. At least seven independent times, bands somewhere in the world adopted settled agriculture. The new settlements tended to see an increase in inequality, the beginning of privately held property, a number of new customs and social structures, and, most importantly, an absolute decrease in welfare as measured in terms of average height and an absolute increase in the length and toil of working life. Of course, in the long run, settlement led to cities which led to the great inventions that eventually pushed mankind past the Malthusian bounds into our wealthy present, but surely no nomad of ten thousand years ago could have projected that outcome.

Now this must sound strange to any economist, as we can’t help but think in terms of rational choice. Why would any band choose to settle when, as far as we can tell, settling made them worse off? There are only three types of answers compatible with rational choice: either the environment changed such that the nomads who adopted settlement would have been even worse off had they remained nomadic, settlement was a Pareto-dominated equilibrium, or our assumption that the nomads were maximizing something correlated with height is wrong. All might be possible: early 20th century scholars ascribed the initial move to settlement to humans being forced onto oases in the drying post-Ice Age Middle East, evolutionary game theorists are well aware that fitness competitions can generate inefficient Prisoner’s Dilemmas, and humans surely care about reproductive success more than they care about food intake per se.

So how can we separate these potential explanations, or provide greater clarity as to the underlying Neolithic transition mechanism? Two relatively new papers, Andrea Matranga’s “Climate-Driven Technical Change”, and Kim Sterelny’s “Optimizing Engines: Rational Choice in the Neolithic?”, discuss intriguing theories about what may have happened in the Neolithic.

Matranga writes a simple Malthusian model. The benefit of being nomadic is that you can move to places with better food supply. The benefit of being sedentary is that you use storage technology to insure yourself against lean times, even if that insurance comes at the cost of lower food intake overall. Nomadism, then, is better than settling when there are lots of nearby areas with uncorrelated food availability shocks (since otherwise why bother to move?) or when the potential shocks you might face across the whole area you travel are not that severe (in which case why bother to store food?). If fertility depends on constant access to food, then for Malthusian reasons the settled populations who store food will grow until everyone is just at subsistence, whereas the nomadic populations will eat a surplus during times when food is abundant.

It turns out that global “seasonality” – or the difference across the year in terms of temperature and rainfall – was extraordinarily high right around the time agriculture first popped up in the Fertile Crescent. Matranga uses some standard climatic datasets to show that six of the seven independent inventions of agriculture appear to have happened soon after increases in seasonality in their respective regions. This is driven by an increase in seasonality and not just an increase in rainfall or heat: agriculture appears in the cold Andes and in the hot Mideast and in the moderate Chinese heartland. Further, adoption of settlement once your neighbors are farming is most common when you live on relatively flat ground, with little opportunity to change elevation to pursue food sources as seasonality increases. Biological evidence (using something called “Harris lines” on your bones) appears to support the idea that nomads were both better fed yet more subject to seasonal shocks than settled peoples.

What’s nice is that Matranga’s hypothesis is consistent with agriculture appearing many times independently. Any thesis that relies on unique features of the immediate post-Ice Age – such as the decline in megafauna like the Woolly Mammoth due to increasing population, or the oasis theory – will have a tough time explaining the adoption of agriculture in regions like the Andes or China thousands of years after it appeared in the Fertile Crescent. Alain Testart and colleagues in the anthropology literature have made similar claims about the intersection of storage technology and seasonality being important for the gradual shift from nomadism to partial foraging to agriculture, but the Malthusian model and the empirical identification in Matranga will be much more comfortable for an economist reader.

Sterelny, writing in the journal Philosophy of Science, argues that rational choice is a useful framework to explain not only why backbreaking, calorie-reducing agriculture was adopted, but also why settled societies appeared willing to tolerate inequality which was much less common in nomadic bands, and why settled societies exerted so much effort building monuments like Gobekli Tepe, holding feasts, and participating in other seemingly wasteful activity.

Why might inequality have arisen? Settlements need to be defended from thieves, as they contain stored food. Hence settlement sizes may be larger than the size of nomadic bands. Standard repeated games with imperfect monitoring tell us that when repeated interactions become less common, cooperation norms become hard to sustain. Hence collective action can only be sustained through mechanisms other than dyadic future punishment; this is especially true if farmers have more private information about effort and productivity than a band of nomadic hunters. The rise of enforceable property rights, as Bowles and his coauthors have argued, is just such a mechanism.

What of wasteful monuments like Gobekli Tepe? Game theoretic deliberate choice provides two explanations for such seeming wastefulness. First, just as animals consume energy in ostentatious displays in order to signal their fitness (as the starving animal has no energy to generate such a display), societies may construct totems and temples in order to signal to potential thieves that they are strong and not worth trifling with. In the case of Gobekli Tepe, this doesn’t appear to be the case, as there isn’t much archaeological evidence of particular violence around the monument. A second game theoretic rationale, then, is commitment by members of a society. As Sterelny puts it, the reason a gang makes a member get a face tattoo is that, even if the member leaves the gang, the tattoo still puts that member at risk of being killed by the gang’s enemies. Hence the tattoo commits the member not to defect. Settlements around Gobekli Tepe may have contributed to its building in order to commit their members to a set of norms that the monument embodied, and hence permit trade and knowledge transfer within this in-group. I would much prefer to see a model of this hypothesis, but the general point doesn’t seem impossible. At least, Sterelny and Matranga together provide a reasonably complete possible explanation, based on rational behavior and nothing more, of the seemingly-strange transition away from nomadism that made our modern life possible.

Kim Sterelny, Optimizing Engines: Rational Choice in the Neolithic?, 2013 working paper. Final version published in the July 2015 issue of Philosophy of Science. Andrea Matranga, “Climate-driven Technical Change: Seasonality and the Invention of Agriculture”, February 2015 working paper, as yet unpublished. No RePEc IDEAS page is available for either paper.

Labor Unions and the Rust Belt

I’ve got two nice papers for you today, both exploring a really vexing question: why is it that union-heavy regions of the US have fared so disastrously over the past few decades? In principle, it shouldn’t matter: absent any frictions, a rational union and a profit-maximizing employer ought both desire to take whatever actions generate the most total surplus for the firm, with union power simply affecting how those rents are shared between management, labor and owners. Nonetheless, we notice empirically a couple of particularly odd facts. First, especially in the US, union-dominated firms tend to limit adoption of new, productivity-enhancing technology; the late adoption of the radial tire among U.S. firms is a nice example. Second, unions often negotiate not only about wages but about “work rules”, insisting upon conditions like inflexible employee roles. A great example here is a California longshoremen contract which insisted upon a crew whose sole job was to stand and watch while another crew did the job. Note that preference for leisure can’t explain this, since surely taking that leisure at home rather than standing around the worksite would be preferable for the employees!

What, then, might drive unions to push so hard for seemingly “irrational” contract terms, and how might union bargaining power under various informational frictions or limited commitment affect the dynamic productivity of firms? “Competition, Work Rules and Productivity” by the BEA’s Benjamin Bridgman discusses the first issue, and a new NBER working paper, “Competitive Pressure and the Decline of the Rust Belt: A Macroeconomic Analysis” by Alder, Lagakos and Ohanian covers the second; let’s examine these in turn.

First, work rules. Let a union care first about keeping all members employed, and about keeping wage as high as possible given full employment. Assume that the union cannot negotiate the price at which products are sold. Abstractly, work rules are most like a fixed cost that is a complete waste: no matter how much we produce, we have to incur some bureaucratic cost of guys standing around and the like. Firms will set marginal revenue equal to marginal cost when deciding how much to produce, and at what price that production should be sold. Why would the union like these wasteful costs?

Let firm output given n workers just be n-F, where n is the number of employees, and F is how many of them are essentially doing nothing because of work rules. The firm chooses price p and the number of employees n given demand D(p) and wage w to maximize p*D(p)-w*n, subject to total production being feasible, D(p)=n-F. Note that, as long as total firm profits under optimal pricing exceed the wage bill w*F of the idle workers, the firm stays in business and its pricing decision, setting marginal revenue equal to marginal cost, is unaffected by F. That is, the optimal production quantity does not depend on F. However, the total amount of employment does depend on F, since to produce quantity D(p) you need to employ n=D(p)+F workers. Hence there is a tradeoff if the union only negotiates wages: to employ more people, you need a lower wage, but using wasteful work rules, employment can be kept high even when wages are raised. Note also that F is limited by the total rents earned by the firm, since if work rules are particularly onerous, firms that are barely breaking even without work rules will simply shut down. Hence in more competitive industries (formally, when demand is more elastic), work rules are less likely to be imposed by unions. Bridgman also notes that if firms can choose technology (output is An-F, where A is the level of technology), then unions will resist new technology unless they can impose more onerous work rules, since more productive technology lowers the number of employees needed to produce a given amount of output.
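The key step, that the firm’s price does not depend on F while employment does, can be checked with a small worked example. This is a sketch under an assumption that is mine rather than Bridgman’s: linear demand D(p) = a - b*p, with hypothetical parameter names a and b. Substituting n = D(p) + F into profit gives p*D(p) - w*(D(p) + F), whose first-order condition a - 2*b*p + w*b = 0 does not involve F.

```python
def firm_choice(a, b, w, F):
    """Optimal price, output, employment and profit with work rules F.

    Assumes linear demand D(p) = a - b * p (an illustrative assumption).
    Profit is p * D(p) - w * (D(p) + F), so the first-order condition
    a - 2*b*p + w*b = 0 pins down the price independently of F.
    """
    price = (a + w * b) / (2 * b)
    output = a - b * price
    employment = output + F            # n = D(p) + F
    profit = price * output - w * employment
    if profit < 0:
        return None                    # work rules too onerous: firm shuts down
    return {"price": price, "output": output,
            "employment": employment, "profit": profit}


for F in (0, 10, 40):
    print(F, firm_choice(a=100, b=1, w=20, F=F))
print(100, firm_choice(a=100, b=1, w=20, F=100))
```

With these illustrative numbers the price and output are the same for every F the firm survives, employment rises one-for-one with F, and profit falls by w for each idle worker until the firm would rather close.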

This is a nice result. Note that the work rule requirements have nothing to do with employees not wanting to work hard, since work rules in the above model are a pure waste and generate no additional leisure time for workers. Of course, this result really hinges on limiting what unions can bargain over: if they can select the level of output, or can impose the level of employment directly, or can permit lump-sum transfers from management to labor, then unionized firms will produce at the same productivity as non-unionized firms. Information frictions, among other worries, might be a reason why we don’t see these types of contracts at some unionized firms. With this caveat in mind, let’s turn to the experience of the Rust Belt.

The U.S. Rust Belt, roughly made up of states surrounding the Great Lakes, saw a precipitous decline from the 1950s to today. Alder et al present the following stylized facts: the share of manufacturing employment in the U.S. located in the Rust Belt fell from the 1950s to the mid-1980s, there was a large wage gap between Rust Belt and other U.S. manufacturing workers during this period, Rust Belt firms were less likely to adopt new innovations, and labor productivity growth in Rust Belt states was lower than the U.S. average. After the mid-1980s, Rust Belt manufacturing firms begin to look a lot more like manufacturing firms in the rest of the U.S.: the wage gap is essentially gone, the employment share stabilizes, strikes become much less common, and productivity growth is similar. What happened?

In a nice little model, the authors point out that output competition (do I have lots of market power?) and labor market bargaining power (are my workers powerful enough to extract a lot of my rents?) interact in an interesting way when firms invest in productivity-increasing technology and when unions cannot commit to avoid a hold-up problem by striking for a better deal after the technology investment cost is sunk. Without commitment, stronger unions will optimally bargain away some of the additional rents created by adopting an innovation, hence unions function as a type of tax on innovation. With sustained market power, firms have an ambiguous incentive to adopt new technology – on the one hand, they already have a lot of market power and hence better technology will not accrue too many more sales, but on the other hand, having market power in the future makes investments today more valuable. Calibrating the model with reasonable parameters for market power, union strength, and various elasticities, the authors find that roughly 2/3 of the decline in the Rust Belt’s manufacturing share can be explained by strong unions and little output market competition decreasing the incentive to invest in upgrading technology. After the 1980s, declining union power and more foreign competition limited both disincentives and the Rust Belt saw little further decline.

Note again that unions and firms rationally took actions that lowered the total surplus generated in their industry, and that if the union could have committed not to hold up the firm after an innovation was adopted, optimal technology adoption would have been restored. Alder et al cite some interesting quotes from union heads suggesting that the confrontational nature of U.S. management-union relations led to a belief that management figures out profits, and unions figure out how to secure part of that profit for their members. Both papers discussed here show that this type of division, by limiting the nature of bargains which can be struck, can have calamitous effects for both workers and firms.

Bridgman’s latest working paper version is here (RePEc IDEAS page); the latest version of Alder, Lagakos and Ohanian is here (RePEc IDEAS). David Lagakos in particular has a very nice set of recent papers about why services and agriculture tend to have such low productivity, particularly in the developing world; despite his macro background, I think he might be a closet microeconomist!

“Immigration and the Diffusion of Technology: The Huguenot Diaspora in Prussia,” E. Hornung (2014)

Is immigration good for natives of the recipient country? This is a tough question to answer, particularly once we think about the short versus long run. Large-scale immigration might have bad short-run effects simply because more L plus fixed K means lower average incomes in essentially any economic specification, but even given that fact, immigrants bring with them tacit knowledge of techniques, ideas, and plans which might be relatively uncommon in the recipient country. Indeed, world history is filled with wise leaders who imported foreigners, occasionally by force, in order to access their knowledge. As that knowledge spreads among the domestic population, productivity increases and immigrants are in the long-run a net positive for native incomes.

How substantial can those long-run benefits be? History provides a nice experiment, described by Erik Hornung in a just-published paper. The Huguenots, French Protestants, were largely expelled from France after the Edict of Nantes was revoked by the Sun King, Louis XIV. The Huguenots were generally in the skilled trades, and their expulsion to the UK, the Netherlands and modern Germany (primarily) led to a great deal of tacit technology transfer. And, no surprise, in the late 17th century, there was very little knowledge transfer aside from face-to-face contact.

In particular, Frederick William, the Great Elector of Brandenburg, offered his estates as refuge for the fleeing Huguenots. Much of his land had been depopulated in the plagues that followed the Thirty Years’ War. The centralized textile production facilities sponsored by nobles and run by Huguenots soon after the Huguenots arrived tended to fail quickly – there simply wasn’t enough demand in a place as poor as Prussia. Nonetheless, a contemporary mentions 46 professions brought to Prussia by the Huguenots, as well as new techniques in silk production, dyeing fabrics and cotton printing. When the initial factories failed, the knowledge held by the apprentices they had hired, and the capital they had purchased, remained. Technology transfer to natives became more common as later generations integrated more tightly with natives, moving out of Huguenot settlements and intermarrying.

What’s particularly interesting with this history is that the quantitative importance of such technology transfer can be measured. In 1802, incredibly, the Prussians had a census of manufactories, or factories producing stock for a wide region, including capital and worker input data. Also, all immigrants were required to register yearly, and include their profession, in 18th century censuses. Further, Huguenots did not simply move to places with existing textile industries where their skills were most needed; indeed, they tended to be placed by the Prussians in areas which had suffered large population losses following the Thirty Years’ War. These population losses were highly localized (and don’t worry, before using population loss as an IV, Hornung makes sure that population loss from plague is not simply tracing out existing transportation highways). Using input data to estimate a Cobb-Douglas textile production function, an additional percentage point of the population with Huguenot origins in 1700 is associated with a 1.5 percentage point increase in textile productivity in 1800. This result is robust in the IV regression using wartime population loss to proxy for the percentage of Huguenot immigrants, as well as many other robustness checks. 1.5% is huge given the slow rate of growth in this era.

An interesting historical case. It is not obvious to me how relevant this estimate is to modern immigration debates; clearly it must depend on the extent to which knowledge can be written down or communicated at a distance. I would posit that the strong complementarity of factors of production (including VC funding, etc.) is much more important than the spread of tacit knowledge in modern agglomeration economies of scale, but that is surely a very difficult claim to investigate empirically using modern data.

2011 Working Paper (IDEAS version). Final paper published in the January 2014 AER.

“The ‘Industrial Revolution’ in the Home: Household Technology and Social Change in the 20th Century,” R. S. Cowan (1976)

The really fascinating thing about the “Second Industrial Revolution” (roughly 1870 until World War I) is how much of its effect is seen first for consumers and only later for production. Electricity is the famous example here; most energy-heavy industries were purposefully located near low-cost energy sources like fast-flowing water. Energy produced via transmitted electricity simply wasn’t that competitive until well into the 20th century in these industries.

Ruth Cowan, a historian, investigated how household production was affected by the introduction of electricity, which in the non-rural US roughly means between 1918 and the Great Depression; electrification rose from 25 percent to 80 percent during this period. Huge amounts of drudgery, once left to housewives and domestic workers, were eliminated. Consider the task of ironing. Before electricity (barring gas irons, which were not widespread), ironing involved heating a heavy flatiron on a stove, carrying it to the ironing board and quickly knocking out wrinkles before the heat dissipated, bringing it back to the stove, and so on. The replacement of the coal stove by central heating similarly limited tedious work, including constant cleaning of coal dust. Cowan traces diffusion of these technologies in part by examining advertisements in magazines like the Ladies’ Home Journal.

The interesting aspect of this consumer revolution, however, was that it did not in fact reduce the amount of work done by housewives. By the end of the 1920s, urban women, most affected by these technological changes, were still doing more housework per week than rural women. It appears the standard story of how Industrial Revolution technologies affected industry – more specialization, more importance of managerial talent, disappearing emotional content of work – was not true of household production. Instead, upper middle class women no longer employed specialized domestic help (and so the managerial role that supervising such help implied for the housewife disappeared), and advertisements for new consumer goods frequently emphasized the emotional content of, e.g., the improved cleanliness of modern appliances with respect to children’s health. Indeed, technological progress tended to significantly increase the number of tasks women were expected to perform within the house. There’s not much reason in economic theory for TFP improvements to lead to reductions or increases in worker skill or autonomy, so perhaps it’s no surprise that the household sector saw a different pattern from certain industrial sectors.

Final version in Technology & Culture Jan 1976. If you’re not familiar with the term “Second Industrial Revolution”, Joel Mokyr has a nice summary of this period of frequent important macro/GPT inventions. Essentially, the big inventions of the late 19th century were much more reliant on scientific knowledge, and much more connected to network effects and increasing returns to scale, than those of the late 18th and early 19th century.

“Does Knowledge Accumulation Increase the Returns to Collaboration?,” A. Agrawal, A. Goldfarb & F. Teodoridis (2012)

The size of academic research “teams” has been increasing, inexorably, in essentially every field over the past few decades. This may be because of bad incentives for researchers (as Stan Liebowitz has argued), or because more expensive capital is required for research as in particle physics, or because communication technology has decreased the cost of collaboration. A much more worrying explanation is, simply, that reaching the research frontier is getting harder. This argument is most closely associated with my adviser Ben Jones, who has noticed that while team size has increased, the average age at which star researchers do their best work has increased, the number of co-inventors per invention has increased, and the number of researchers doing work across fields has decreased. If the knowledge frontier is becoming more expensive to reach, theory suggests a role for greater subsidization of early-career researchers, and points to potential development traps due to the complementary nature of specialized fields.

Agrawal et al use a clever device to investigate whether the frontier is indeed becoming more burdensome. Note that the fact that science advances does not mean, ipso facto, that reaching the frontier is harder: new capital like computers or Google Scholar may make it easier to investigate questions or get up to date in related fields, and certain developments completely subsume previous developments (think of, say, how a user of dynamic programming essentially does not need to bother learning the calculus of variations; the easier but more powerful technique makes the harder but less powerful technique unnecessary). Agrawal et al’s trick is to look at publication trends in mathematics. During the Soviet era, mathematics within the Soviet Union was highly advanced, particularly in certain areas of functional analysis, but Soviet researchers had little ability to interact with non-Soviets and they generally published only in Russian. After the fall of the Soviet Union, there was a “shock” to the knowledge frontier in mathematics as these top Soviet researchers began interacting with other mathematicians. A paper by Borjas and Doran in the QJE last year showed that Soviet mathematics were great in some areas and pretty limited in others. This allows for a diff-in-diff strategy: look at the change in team size following 1990 in fields where Soviets were particularly strong versus fields where the Soviets were weak.

Dropping papers with a Russian-named coauthor, and classifying papers by field using data from the AMS, the authors find that papers in Soviet-heavy fields had the number of coauthors increase from 1.34 to 1.78, whereas teams in Soviet-weak fields grew only from 1.26 to 1.55. This difference appears quite robust, and is derived from hundreds of thousands of publications. To check that Soviet-rich fields actually had influence, they note that papers in Soviet-rich subfields cited Soviet-era publications at a greater rate after 1990 than Soviet-poor subfields, and that the increase in coauthoring tended to be driven by papers with a young coauthor. The story here is, roughly, that Soviet emigres would have tooled up young researchers in Soviet-rich fields, and then those young coauthors would have a lot of complementary skills which might drive collaboration with other researchers.
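The diff-in-diff arithmetic behind those team sizes is easy to check by hand. A minimal sketch in Python, using only the four averages quoted above; the variable names are mine, and this raw difference of means is only an illustration, not the authors' regression estimate:

```python
# Average coauthors per paper before and after 1990 (the figures quoted in this post).
soviet_rich = {"before": 1.34, "after": 1.78}
soviet_poor = {"before": 1.26, "after": 1.55}

change_rich = soviet_rich["after"] - soviet_rich["before"]   # growth in Soviet-rich fields
change_poor = soviet_poor["after"] - soviet_poor["before"]   # growth in Soviet-poor fields
diff_in_diff = change_rich - change_poor                     # extra growth in Soviet-rich fields

print(f"Soviet-rich change: {change_rich:.2f}")
print(f"Soviet-poor change: {change_poor:.2f}")
print(f"Difference-in-differences: {diff_in_diff:.2f}")
```

Team size grew everywhere, so the interesting number is the extra growth of roughly 0.15 coauthors per paper in the fields that received the knowledge shock.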

So it appears that the increasing burden of the knowledge frontier does drive some of the increase in team size. The relative importance of this factor, however, is something tough to tease out without some sort of structural model. Getting around the burden of knowledge by making it easier to reach the frontier is also worthy of investigation – a coauthor and I have a pretty cool new paper (still too early to make public) on exactly this topic, showing an intervention that has a social payoff an order of magnitude higher than funding new research.

Oct 2012 working paper (no IDEAS version). As a sidenote, the completely bizarre “copyright notice” on the first page is about the most ridiculous thing I have seen on a working paper recently: besides the fact that authors hold the copyright automatically without such a notice, the paper itself is literally about the social benefits of free knowledge flows! I can only hope that the copyright notice is the result of some misguided university policy.

“Path Dependence,” S. Page (2006)

When we talk about strategic equilibrium, we can talk in a very formal sense, as many refinements with their well-known epistemic conditions have been proposed, the nature of uncertainty in such equilibria has been completely described, the problems of sequential decisionmaking are properly handled, etc. So when we do analyze history, we have a useful tool to describe how changes in parameters altered the equilibrium incentives of various agents. Path dependence, the idea that past realizations of history matter (perhaps through small events, as in Brian Arthur’s work) is widespread. A typical explanation given is increasing returns. If I buy a car in 1900, I make you more likely to buy a car in 1901 by, at the margin, lowering the production cost due to increasing returns to scale or lowering the operating cost by increasing incentives for gas station operators to operate.

This is quite informal, though; worse, the explanation of increasing returns is neither necessary nor sufficient for history-dependence. How can this be? First, consider that “history-dependence” may mean (at least) six different things. History can affect either the path of history, or its long-run outcome. For example, any historical process satisfying the assumptions of the ergodic theorem can be history-dependent along a path, yet still converge to the same state (in the network diffusion paper discussed here last week, a simple property of the network structure tells me whether an epidemic will diffuse entirely in the long-run, but the exact path of that eventual diffusion clearly depends on something much more complicated). We may believe, for instance, that the early pattern of railroads affected the path of settlement of the West without believing that this pattern had much consequence for the 2010 distribution of population in California. Next, history-dependence in the long-run or short-run can depend either on a state variable (from a pre-defined set of states), the ordered set of past realizations, or the unordered set of past realizations (the latter two called path and phat dependence, respectively, since phat dependence does not depend on order). History matters in elections due to incumbent bias, but that history-dependence can basically be summed up by a single variable denoting who is the current incumbent, omitting the rest of history’s outcomes. Phat dependence is likely in simple technology diffusion: I adopt a technology as a function of which of my contacts has adopted it, regardless of the order in which they adopted. Path dependence comes up, for example, in models of learning following Aumann and Geanakoplos/Polemarchakis: consensus among a group can be broken if agents do not observe the time at which messages were sent between third parties.
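The state / phat / path distinction can be made concrete with three toy outcome rules. This is only a sketch: the rules themselves (latest outcome, most frequent outcome, first outcome to occur twice in a row) are hypothetical examples of mine, not taken from Page's essay.

```python
from collections import Counter

def state_dependent(history):
    # Only a summary state matters; here, the most recent realization.
    return history[-1]

def phat_dependent(history):
    # Only the unordered collection of past realizations matters;
    # here, the most frequent outcome (ties broken alphabetically).
    counts = Counter(history)
    return max(sorted(counts), key=lambda k: counts[k])

def path_dependent(history):
    # The full ordered sequence matters; here, the first outcome
    # that occurs twice in a row.
    for prev, cur in zip(history, history[1:]):
        if prev == cur:
            return cur
    return None

# Two histories with the same realizations (two A's, three B's) in different orders.
h1 = ["A", "A", "B", "B", "B"]
h2 = ["B", "B", "A", "B", "A"]
for rule in (state_dependent, phat_dependent, path_dependent):
    print(rule.__name__, rule(h1), rule(h2))
```

The two histories are indistinguishable to the phat-dependent rule, but the state-dependent and path-dependent rules separate them, which is exactly why the three notions need different formal treatments.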

Now consider increasing returns. For which types of history-dependence are they necessary or sufficient? It turns out the answer is, for none of them! Take again the car example, but assume there are three types of cars in 1900: steam, electric and gasoline. For the same reasons that gas-powered cars had increasing returns, steam and electric cars do as well. But the network effect for gas-powered cars is relatively stronger. Page thinks of this as a biased Polya process. I begin with five balls, 3 G, 1 S and 1 E, in an urn. I draw one at random. If I get an S or an E, I return it to the urn with another ball of the same type (thus making future draws of that type more common, hence increasing returns). If I draw a G, I return it to the urn along with 2t more G balls, where t is the time which increments by 1 after each draw. This process converges to having arbitrarily close to all balls of type G, even though S and E balls also exhibit increasing returns.
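The urn process is easy to simulate. Below is a minimal sketch in Python that follows the rules as described above; the function name, the seed, the number of draws and the choice to start the clock at t = 1 are my assumptions.

```python
import random

def biased_polya(draws, seed=0):
    """Simulate the biased urn: start with 3 G, 1 S, 1 E balls."""
    rng = random.Random(seed)
    counts = {"G": 3, "S": 1, "E": 1}
    for t in range(1, draws + 1):          # ASSUMPTION: the first draw happens at t = 1
        total = counts["G"] + counts["S"] + counts["E"]
        r = rng.randrange(total)           # pick one ball uniformly at random
        if r < counts["G"]:
            counts["G"] += 2 * t           # return the G ball plus 2t more G balls
        elif r < counts["G"] + counts["S"]:
            counts["S"] += 1               # return the S ball plus one more S
        else:
            counts["E"] += 1               # return the E ball plus one more E
    return counts

counts = biased_polya(draws=2000, seed=1)
share_g = counts["G"] / sum(counts.values())
print(counts, f"share of G: {share_g:.4f}")
```

Changing the seed changes the early draws, but not where the urn ends up: every type has increasing returns, yet the long-run outcome does not depend on history.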

What about the necessary condition? Surely, increasing returns are necessary for any type of history-dependence? Well, not really. All I need is some reason for past events to increase the likelihood of future actions of some type, in any convoluted way I choose. One simple mechanism is complementarities. If A and B are complements (adopting A makes B more valuable, and vice versa), while C and D are also complements, then we can have the following situation. An early adoption of A makes B more valuable, increasing the probability of adopting B the next period which itself makes future A more valuable, increasing the probability of adopting A the following period, and so on. Such reasoning is often implicit in the rhetoric linking a market-based middle class to a democratic political process: some event causes a private sector to emerge, which increases pressure for democratic politics, which increases protection of capitalist firms, and so on. As another example, consider the famous QWERTY keyboard, the best-known example of path dependence we have. Increasing returns – that is, the fact that owning a QWERTY keyboard makes this keyboard more valuable for both myself and others due to standardization – is not sufficient for killing the Dvorak or other keyboards. This is simple to see: the fact that QWERTY has increasing returns doesn’t mean that the diffusion of something like DVD players is history-dependent. Rather, it is the combination of increasing returns for QWERTY and a negative externality on Dvorak that leads to history-dependence for Dvorak. If preferences among QWERTY and Dvorak are Leontief, and valuations for both have increasing returns, then I merely buy the keyboard I value highest – this means that purchases of QWERTY by others lead to QWERTY lock-in by lowering the demand curve for Dvorak, not merely by raising the demand curve for QWERTY.
(And yes, if you are like me and were once told to never refer to effects mediated by the market as “externalities”, you should quibble with the vocabulary here, but the point remains the same.)

All in all interesting, and sufficient evidence that we need a better formal theory and taxonomy of history dependence than we are using now.

Final version in the QJPS (No IDEAS version). The essay is written in a very qualitative/verbal manner, but more because of the audience than the author. Page graduated here at MEDS, initially teaching at Caltech, and his CV lists quite an all-star cast of theorist advisers: Myerson, Matt Jackson, Satterthwaite and Stanley Reiter!

“Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks,” B.A. Prakash et al (2011)

No need to separate economics from the rest of the social sciences, and no need to separate social science from the rest of science: we often learn quite a bit from our compatriot fields. Here’s a great example. Consider any epidemic diffusion, where a population (of nodes) is connected to each other (along, in this case, unweighted edges, equal to 1 if and only if there is a link between the nodes). Consider the case where nodes can become “infected” – in economics, we may think of nodes as people or cities adopting a new technology, or purchasing a new product. Does a given seeding on the network lead to an “infection” that spreads across the network, or is the network fairly impervious to infections?

This seems like it must be a tricky question, for nodes can be connected to other nodes in an arbitrary fashion. Let’s make it even more challenging for the analyst: allow there to be m “susceptible” states, an “exposed” state, an “infected” state, and N “vaccinated” states, who cannot be infected. Only exposed or infected agents can propagate an infection, and do so to each of their neighbors in any given period according to probabilities a and b, independently across neighbors. Parameters tell me the probability each agent transitions from susceptible or vaccinated states to other such states.

You may know the simple SIR model – susceptible, infected, recovered. In these models, all agents begin as susceptible or infected. If my neighbor is infected and I am susceptible, he gives me the disease with probability a. If I am infected, I recover with probability c. This system spreads across the population if the first eigenvalue of the adjacency matrix (whose entries equal 1 if two people are connected, and 0 otherwise) is greater than c/a. (Incredibly, I believe this proof dates back to Kermack and McKendrick in 1927). That is, the only way the network topology matters is in a single-valued summary statistic, the first eigenvalue. Pretty incredible.

The authors of the present paper show that this is a general property. For any epidemic model in which disease spreads over a network such that, first, transmissions are independent across neighbors, and second, one can only enter the exposed or infected state from an exposed or infected neighbor, the general property is the same: the disease spreads through the population if the first eigenvalue of the adjacency matrix is larger than a constant which depends only on model parameters and not on the topology of the network (and, in fact, these parameters are easy to characterize). It is a particularly nice proof. First we compute the probabilities of transitioning from each state to any other. This gives us a discrete-time nonlinear dynamic system. Such a system is asymptotically stable at a steady state if all eigenvalues of its Jacobian, evaluated at that steady state, are less than one in absolute value. If there are no infections at all, the steady state is just the steady state of a Markov chain: only infected or exposed people can infect me, so the graph structure doesn’t matter if we assume no infections, and transitions between the susceptible and vaccinated states are just Markov by assumption. We then note that the Jacobian has a nice block structure which limits the eigenvalues to being one of two types, show that eigenvalues of the first type are always less than one in absolute value, then show that those of the second type are less than one if and only if a condition depending only on model parameters is satisfied; this property has nothing to do with the network topology.

The result tells you some interesting things as well. For example, say you wish to stop the spread of an epidemic. Should you immunize people with many friends? No – you should immunize the person who lowers the first eigenvalue of the adjacency matrix the most. This result is independent of the actual network topology or the properties of the disease (how long it incubates, how fast it transmits, how long people stay sick, how likely they are to develop natural immunity, etc.). Likewise, in the opposite problem, if you wish an innovation to diffuse through a society, how should you organize conferences or otherwise create a network? Create links between people or locations such that the first eigenvalue of the adjacency matrix increases by the highest amount. Again, this is independent of the current network topology or the properties of the particular invention you wish to diffuse. Nice.
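To make the immunization rule concrete, here is a minimal sketch in pure Python. It computes the first eigenvalue by power iteration and compares "immunize the person with the most friends" against "immunize the person whose removal lowers the first eigenvalue the most". The 12-node network (a five-person clique bridged to a hub with six otherwise isolated contacts) and all function names are hypothetical choices of mine, not taken from the paper.

```python
import math

def top_eigenvalue(adj, iters=500):
    """Largest eigenvalue of a symmetric 0/1 adjacency matrix via power iteration.

    We iterate on A + I rather than A so the method also converges on
    bipartite graphs, then report the Rayleigh quotient for A itself.
    """
    n = len(adj)
    v = [1.0] * n
    for _ in range(iters):
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = [sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * av[i] for i in range(n))

def immunize(adj, node):
    """Copy of adj with every link touching `node` removed."""
    n = len(adj)
    return [[0 if node in (i, j) else adj[i][j] for j in range(n)] for i in range(n)]

def build_example():
    """Hypothetical network: clique on nodes 0-4, bridge 4-5, hub 5 with leaves 6-11."""
    n = 12
    adj = [[0] * n for _ in range(n)]
    edges = [(i, j) for i in range(5) for j in range(i + 1, 5)]
    edges.append((4, 5))
    edges += [(5, leaf) for leaf in range(6, 12)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

adj = build_example()
degrees = [sum(row) for row in adj]
most_friends = max(range(len(adj)), key=lambda i: degrees[i])
after = [top_eigenvalue(immunize(adj, i)) for i in range(len(adj))]
best_target = min(range(len(adj)), key=lambda i: after[i])
print("most-connected node:", most_friends, "with degree", degrees[most_friends])
print("best node to immunize:", best_target)
```

In this made-up network the hub has the most friends, but immunizing it leaves the tightly knit clique intact; the larger drop in the first eigenvalue comes from removing the clique member who bridges to the hub.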

Final conference paper from ICDM2011. (No IDEAS version).

“What Determines Productivity,” C. Syverson (2011)

Chad Syverson, along with Nick Bloom, John van Reenen, Pete Klenow and many others, has been at the forefront of a really interesting new strand of the economics literature: persistent differences in productivity. Syverson looked at productivity differences within 4-digit SIC industries in the US (quite narrow industries like “Greeting Cards” or “Industrial Sealants”) a number of years back, and found that in the average industry, the 90-10 ratio of total factor productivity across plants was almost 2. That is, the top decile plant in the average industry produced twice as much output as the bottom decile plant, using exactly the same inputs! Hsieh and Klenow did a similar exercise in China and India and found even starker productivity differences, largely due to a big left tail of very low productivity firms. This basic result is robust to different measures of productivity, and to different techniques for identifying differences; you can make assumptions which let you recover a Solow residual directly, or run a regression (adjusting for differences in labor and capital quality, or not), or look at deviations like firms having higher marginal productivity of labor than the wage rate, etc. In the paper discussed in this post, Syverson summarizes the theoretical and empirical literature on persistent productivity differences.

Why aren’t low productivity firms swept from the market? We know from theory that if entry is allowed, potentially infinite and instantaneous, then no firm can remain which is less productive than the entrants. This suggests that persistence of inefficient firms must result from either limits on entry, limits on expansion by efficient firms, or non-immediate efficiency because of learning-by-doing or similar (a famous study by Benkard of a Lockheed airplane showed that a plant could produce a plane with half the labor hours after producing 30, and half again after producing 100). Why don’t inefficient firms already in the market adopt best practices? This is related to the long literature on diffusion, which Syverson doesn’t cover in much detail, but essentially it is not obvious to a firm whether a “good” management practice at another firm is actually good or not. Everett Rogers, in his famous “Diffusion of Innovations” book, refers to a great example of this from Peru in the 1950s. A public health consultant was sent for two years to a small village, and tried to convince the locals to boil their water before drinking it. The water was terribly polluted and the health consequences of not boiling were incredible. After two years, only five percent of the town adopted the “innovation” of boiling. Some didn’t adopt because it was too hard, many didn’t adopt because of a local belief system that suggested only the already-sick ought to drink boiled water, some didn’t adopt because they didn’t trust the experience of the advisor, et cetera. Diffusion is difficult.

Ok, so given that we have inefficient firms, what is the source of the inefficiency? It is difficult to decompose all of the effects. Learning-by-doing is absolutely relevant in many industries – we have plenty of evidence on this count. Nick Bloom and coauthors seem to suggest that management practices play a huge role. They have shown clear correlation between “best practice” management and high TFP across firms, and a recent randomized field experiment in India (discussed before on this site) showed massive impacts on productivity from management improvements. Regulation and labor/capital distortions also appear to play quite a big role. On this topic, James Schmitz wrote a very interesting paper, published in 2005 in the JPE, on iron ore producers. TFP in Great Lakes ore had been more or less constant for many decades, with very little entry or foreign competition until the 1980s. Once Brazil began exporting ore to the US, labor productivity doubled within a handful of years, and capital and total factor productivity also soared. A main driver of the change was more flexible workplace rules.

Final version in 2011 JEP (IDEAS version). Syverson was at Kellogg recently presenting a new paper of his, with an all-star cast of coauthors, on the medical market. It’s well worth reading. Medical productivity is similarly heterogeneous, and since the medical sector is coming up on 20% of GDP, the sources of inefficiency in medicine are particularly important!
