Category Archives: Sequential Invention

“Patents as a Spur to Subsequent Innovation: Evidence from Pharmaceuticals,” D. Gilchrist (2016)

Many economists of innovation are hostile to patents as they currently stand: they do not seem to be important drivers of R&D in most industries, the market power they lead to generates substantial deadweight loss, the legal costs around enforcing patents are incredible, and the effect on downstream innovation can be particularly harmful. The argument for patents seems most clear cut in industries where the invention requires large upfront fixed costs of R&D that are paid only by the first inventor, where the invention is clearly delineated, where novelty is easy to understand, and where alternative means of inducing innovation (such as market power in complementary markets, or a large first mover advantage) do not exist. The canonical example of an industry of this type is pharma.

Duncan Gilchrist points out that the market power a patentholder obtains also affects the rents of partial substitutes which might be invented later. Imagine there is a blockbuster statin on patent. If I invent a related drug, the high price of the existing patented drug means I can charge a fairly high price too. If the blockbuster drug were off patent, though, my competitors would be generics whose low price would limit how much I can charge. In other words, the “effective” patent strength in terms of the markup I can charge depends on whether alternatives to my new drug are on patent or are generic. Therefore, the profits I will earn from my drug will be lower when alternative generics exist, and hence my incentive to pay a fixed cost to create the new drug will also be lower.

What does this mean for welfare? A pure “me-too” imitation drug, which generates very little social value compared to the existing patented drug, will never enter if its class is going to see generics in a few years anyway; profits will be competed down to zero. That same drug might find it worthwhile to pay a fixed cost of invention and earn duopoly profits if the existing on-patent alternative had many years of patent protection remaining. On the other hand, a drug so much better than existing drugs that even at the pure monopoly price most consumers would prefer it to the existing alternative priced at marginal cost will be developed no matter what, since it faces no de facto restriction on its markup from whether the alternatives in its drug class are generics or otherwise. Therefore, longer patent protection for existing drugs increases entry of drugs in the same class, but mainly those that are only a bit better than existing drugs. This may be better or worse for welfare: there is a wasteful cost of entering with a drug only slightly better than what exists (the private return includes the business stealing, while social welfare doesn’t), but there are also lower prices and perhaps some benefit from variety.
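To make the entry logic concrete, here is a minimal sketch in Python. It is my own stylized illustration, not Gilchrist's model: it assumes unit demand, a limit-pricing rule under which the entrant can charge at most the rival's price plus its quality advantage (capped at its own monopoly price), and made-up parameter values. The names `entrant_value` and `enters`, and every number in them, are hypothetical.

```python
def entrant_value(quality_gap, years_rival_on_patent, horizon=15,
                  monopoly_price=10.0, marginal_cost=1.0, discount=0.95):
    """Discounted profit of a new drug under a stylized limit-pricing rule.

    Assumption (not from the paper): the entrant can charge at most the
    rival's price plus its own quality advantage, capped at its monopoly price.
    """
    value = 0.0
    for year in range(horizon):
        # The rival sells at the monopoly price while on patent and at
        # marginal cost once generics arrive.
        on_patent = year < years_rival_on_patent
        rival_price = monopoly_price if on_patent else marginal_cost
        price = min(monopoly_price, rival_price + quality_gap)
        value += discount ** year * (price - marginal_cost)
    return value


def enters(quality_gap, years_rival_on_patent, fixed_cost):
    """The drug is developed only if discounted profit covers the fixed R&D cost."""
    return entrant_value(quality_gap, years_rival_on_patent) >= fixed_cost


# A me-too drug (small quality gap) is sensitive to the rival's remaining
# patent life; a breakthrough (gap large enough to hit the cap) is not.
me_too_long = entrant_value(0.5, years_rival_on_patent=10)
me_too_short = entrant_value(0.5, years_rival_on_patent=2)
breakthrough_long = entrant_value(9.0, years_rival_on_patent=10)
breakthrough_short = entrant_value(9.0, years_rival_on_patent=2)
```

Under these assumed numbers, the me-too drug's value collapses once generics arrive early, while the breakthrough drug's value is identical in both cases, which is the asymmetry the argument above relies on.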

I should note a caveat that really should have been noted in the existing model: changes in de facto patent length for the first drug in class also affect the entry decision of that drug. Longer patent protection may actually cause shorter effective monopoly by inducing entry of imitators! This paper is mainly empirical, so no need for a full Aghion Howitt ’92 model of creative destruction, but it is at least worth noting that the welfare implications of changes in patent protection are somewhat misstated because of this omission.

Empirically, Gilchrist shows clearly that the beginning of new clinical trials for drugs falls rapidly as the first drug in their class has less time remaining on patent: fear of competition with generic partial substitutes dulls the incentive to innovate. The results are clear in straightforward scatterplots, but there is also an IV, to help confirm the causal interpretation, using the gap between the first potentially-defensive patent on the fulcrum patent of the eventual drug, and the beginning of clinical trials, a gap that is driven by randomness in things like unexpected delays in in-house laboratory progress. Using the fact that particularly promising drugs get priority FDA review, Gilchrist also shows that these priority review entrants do not seem to be worried at all about competition from generic substitutes: the “me-too” type of drugs are the ones for whom alternatives going off patent is most damaging to profits.

Final published version in AEJ: Applied 8(4) (No RePEc IDEAS version). Gilchrist is a rare example of a well published young economist working in the private sector; he has a JPE on social learning and a Management Science on behavioral labor in addition to the present paper, but works at robo-investor Wealthfront. In my now six-year dataset of the economics job market (which I should discuss again at some point), roughly 2% of “job market stars” wind up outside academia. Budish, Roin and Williams used the similar idea of investigating the effect of patents on innovation by taking advantage of the differing effective patent length that drugs for various maladies get as a result of differences in the length of clinical trials following the patent grant. Empirical work on the effect of patent rules is, of course, very difficult since de jure patent strength is very similar in essentially every developed country and every industry; taking advantage of differences in de facto strength is surely a trick that will be applied more broadly.

“Competition, Imitation and Growth with Step-by-Step Innovation,” P. Aghion, C. Harris, P. Howitt, & J. Vickers (2001)

(One quick PSA before I get to today’s paper: if you happen, by chance, to be a graduate student in the social sciences in Toronto, you are more than welcome to attend my PhD seminar in innovation and entrepreneurship at the Rotman school which begins on Wednesday, the 7th. I’ve put together a really wild reading list, so hopefully we’ll get some very productive discussions out of the course. The only prerequisite is that you know some basic game theory, and my number one goal is forcing the economists to read sociology, the sociologists to write formal theory, and the whole lot to understand how many modern topics in innovation have historical antecedents. Think of it as a high-variance cross-disciplinary educational lottery ticket! If interested, email me for more details.)

Back to Aghion et al. Let’s kick off 2015 with one of the nicer pieces to come out of the ridiculously productive decade or so of theoretical work on growth put together by Philippe Aghion and his coauthors; I wish I could capture the famous alacrity of Aghion’s live presentation of his work, but I fear that’s impossible to do in writing! This paper is based around writing a useful theory to speak to two of the oldest questions in the economics of innovation: is more competition in product markets good or bad for R&D, and is there something strange about giving a firm IP (literally a grant of market power meant to spur innovation via excess rents) at the same time as we enforce antitrust (generally a restriction on market power meant to reduce excess rents)?

Aghion et al. come to a few very surprising conclusions. First, the Schumpeterian idea that firms with market power do more R&D is misleading because it ignores the “escape the competition” effect whereby firms have high incentive to innovate when there is a large market that can be captured by doing so. Second, maximizing that “escape the competition” motive may involve making it not too easy to catch up to market technological leaders (by IP or other means). These two theoretical results imply that antitrust (making sure there are a lot of firms competing in a given market, spurring new innovation to take market share from rivals) and IP policy (ensuring that R&D actually needs to be performed in order to gain a lead) are in a sense complements! The fundamental theoretical driver is that the incentive to innovate depends not only on the rents of an innovation, but on the incremental rents of an innovation; if innovators include firms that are already active in an industry, policy that makes your current technological state less valuable (because you are in a more competitive market, say) or policy that makes jumping to a better technological state more valuable both increase the size of the incremental rent, and hence the incentive to perform R&D.

Here are the key aspects of a simplified version of the model. An industry is a duopoly where consumers spend exactly 1 dollar per period. The duopolists produce partially substitutable goods, where the more similar the goods the more “product market competition” there is. Each of the duopolists produces their good at a firm-specific cost, and competes in Bertrand with their duopoly rival. At the minimal amount of product market competition, each firm earns constant profit regardless of their cost or their rival’s cost. Firms can invest in R&D which gives some flow probability of lowering their unit cost. Technological laggards sometimes catch up to the unit cost of leaders with exogenous probability; lower IP protection (or more prevalent spillovers) means this probability is higher. We’ll look only at features of this model in the stochastic distribution of technological leadership and lags, which is a steady state if there are infinitely many duopolistic industries.

In a model with these features, you always want at least a little competition, essentially for Arrow (1962) reasons: the size of the market is small when market power is large because total unit sales are low, hence the benefit of reducing unit costs is low, hence no one will bother to do any innovation in the limit. More competition can also be good because it increases the probability that two firms are at similar technological levels, in which case each wants to double down on research intensity to gain a lead. At very high levels of competition, the old Schumpeterian story might bind again: goods are so substitutable that R&D to increase rents is pointless since almost all rents are competed away, especially if IP is weak so that rival firms catch up to your unit cost quickly no matter how much R&D you do. What of the optimal level of IP? It’s always best to ensure IP is not too strong, or that spillovers are not too weak, because the benefit of increased R&D effort when firms are at similar technological levels following the spillover exceeds the lost incentive to gain a lead in the first place when IP is not perfectly strong. When markets are really competitive, however, the Schumpeterian insight that some rents need to exist militates in favor of somewhat stronger IP than in less competitive product markets.

Final working paper (RePEc IDEAS) which was published in 2001 in the Review of Economic Studies. This paper is the more detailed one theoretically, but if all of the insight sounds familiar, you may already know the hugely influential follow-up paper by Aghion, Bloom, Blundell, Griffith and Howitt, “Competition and Innovation: An Inverted U Relationship”, published in the QJE in 2005. That paper gives some empirical evidence for the idea that innovation is maximized at intermediate values of product market competition; the Schumpeterian “we need some rents” motive and the “firms innovate to escape competition” motive both play a role. I am actually not a huge fan of this paper – as an empirical matter, I’m unconvinced that most cost-reducing innovation in many industries will ever show up in patent statistics (principally for reasons that Eric von Hippel made clear in The Sources of Innovation, which is freely downloadable at that link!). But this is a discussion for another day! One more related paper we have previously discussed is Goettler and Gordon’s 2012 structural work on processor chip innovation at AMD and Intel, which has a very similar within-industry motivation.

“Upstream Innovation and Product Variety in the U.S. Home PC Market,” A. Eizenberg (2014)

Who benefits from innovation? The trivial answer would be that everyone weakly benefits, but since innovation can change the incentives of firms to offer different varieties of a product, heterogeneous tastes among buyers may imply that some types of innovation make large groups of people worse off. Consider computers, a rapidly evolving technology. If Lenovo introduces a laptop with a faster processor, they may wish to discontinue production of a slower laptop, because offering both types flattens the demand curve for each, and hence lowers the profit-maximizing markup that can be charged for the better machine. This effect, combined with a fixed cost of maintaining a product line, may push firms to offer too little variety in equilibrium.

As an empirical matter, however, things may well go the other direction. Spence’s famous product selection paper suggests that firms may produce too much variety, because they don’t take into account that part of the profit they earn from a new product is just cannibalization of other firms’ existing product lines. Is it possible to separate things out from data? Note that this question has two features that essentially require a structural setup: the variable of interest is “welfare”, a completely theoretical concept, and lots of the relevant numbers like product line fixed costs are unobservable to the econometrician, hence they must be backed out from other data via theory.

There are some nice IO tricks to get this done. Using a near-universe of laptop sales in the early 2000s, Eizenberg estimates heterogeneous household demand using standard BLP-style methods. Supply is tougher. He assumes that firms get a fixed cost per product line shock, then pick their product mix each quarter, then observe consumer demand, then finally play Nash-Bertrand differentiated product pricing. The problem is that the pricing game often has multiple equilibria (e.g., with two symmetric firms, one may offer a high-end product and the other a low-end one, or vice versa). Since the pricing game equilibria are going to be used to back out fixed costs, we are in a bit of a bind. Rather than select equilibria using some ad hoc approach (how would you even do so in the symmetric case just mentioned?), Eizenberg cleverly just partially identifies fixed costs as backed out from any possible pricing game equilibrium, using bounds in the style of Pakes, Porter, Ho and Ishii. This means that welfare effects are also only partially identified.

Throwing this model at the PC data shows that the mean consumer in the early 2000s wasn’t willing to pay any extra for a laptop, but there was a ton of heterogeneity in willingness to pay both for laptops and for faster speed on those laptops. Every year, the willingness to pay for a given computer fell $257 – technology was rapidly evolving and lots of substitute computers were constantly coming onto the market.

Eizenberg uses these estimates to investigate a particularly interesting counterfactual: what was the effect of the introduction of the lighter Pentium M mobile processor? As Pentium M was introduced, older Pentium III based laptops were, over time, no longer offered by the major notebook makers. The M raised predicted notebook sales by 5.8 to 23.8%, raised mean notebook price by $43 to $86, and lowered Pentium III share in the notebook market from 16-23% down to 7.7%. Here’s what’s especially interesting, though: total consumer surplus is higher with the M available, but all of the extra consumer surplus accrues to the 20% least price-sensitive buyers (as should be intuitive, since only those with high willingness-to-pay are buying cutting edge notebooks). What if a social planner had forced firms to keep offering the Pentium III models after the M was introduced? The net change in consumer plus producer surplus may actually have been positive, and the benefits would have especially accrued to those at the bottom end of the market!

Now, as a policy matter, we are (of course) not going to force firms to offer money-losing legacy products. But this result is worth keeping in mind anyway: because firms are concerned about pricing pressure, they may not be offering a socially optimal variety of products, and this may limit the “trickle-down” benefits of high tech products.

2011 working paper (No IDEAS version). Final version in ReStud 2014 (gated).

“How do Patents Affect Follow-On Innovation: Evidence from the Human Genome,” B. Sampat & H. Williams (2014)

This paper, by Heidi Williams (who surely you know already) and Bhaven Sampat (who is perhaps best known for his almost-sociological work on the Bayh-Dole Act with Mowery), made quite a stir at the NBER last week. Heidi’s job market paper a few years ago, on the effect of openness in the Human Genome Project as compared to Celera, is often cited as an “anti-patent” paper. Essentially, she found that portions of the human genome sequenced by the HGP, which placed their sequences in the public domain, were much more likely to be studied by scientists and used in tests than portions sequenced by Celera, who initially required fairly burdensome contractual steps to be followed. This result was very much in line with research done by Fiona Murray, Jeff Furman, Scott Stern and others which also found that minor differences in openness or accessibility can have substantial impacts on follow-on use (I have a paper with Yasin Ozcan showing a similar result). Since the cumulative nature of research is thought to be critical, and since patents are a common method of “restricting openness”, you might imagine that Heidi and the rest of these economists were arguing that patents were harmful for innovation.

That may in fact be the case, but note something strange: essentially none of the earlier papers on open science are specifically about patents; rather, they are about openness. Indeed, on the theory side, Suzanne Scotchmer has a pair of very well-known papers arguing that patents effectively incentivize cumulative innovation if there are no transaction costs to licensing, no spillovers from sequential research, and no incentive for early researchers to limit licenses in order to protect their existing business (consider the case of Armstrong and the FM radio), and if potential follow-on innovators can be identified before they sink costs. That is a lot of conditions, but it’s not hard to imagine industries where inventions are clearly demarcated, where holders of basic patents are better off licensing than sitting on the patent (perhaps because potential licensees are not also competitors), and where patentholders are better off not bothering academics who technically infringe on their patent.

What industry might have such characteristics? Sampat and Williams look at gene patents. Incredibly, about 30 percent of human genes have sequences that are claimed under a patent in the United States. Are “patented genes” still used by scientists and developers of medical diagnostics after the patent grant, or is the patent enough of a burden to openness to restrict such use? What is interesting about this case is that the patentholder generally wants people to build on their patent. If academics find some interesting genotype-phenotype links based on their sequence, or if another firm develops a disease test based on the sequence, there are more rents for the patentholder to garner. In surveys, it seems that most academics simply ignore patents of this type, and most gene patentholders don’t interfere in research. Anecdotally, licenses between the sequence patentholder and follow-on innovators are frequent.

In general, though, it is really hard to know whether patents have any effect on anything; there is very little variation over time and space in patent strength. Sampat and Williams take advantage of two quasi-experiments. First, they compare applied-for-but-rejected gene patents to applied-for-and-granted patents. At least for gene patents, there is very little difference in terms of measurables before the patent office decision across the two classes. Clearly this is not true for patents as a whole – rejected patents are almost surely of worse quality – but gene patents tend to come from scientifically competent firms rather than backyard hobbyists, and tend to have fairly straightforward claims. Why are any rejected, then? The authors’ second trick is to look directly at patent examiner “leniency”. It turns out that some examiners have rejection rates much higher than others, despite roughly random assignment of patents within a technology class. Much of the difference in rejection probability is driven by the random assignment of examiners, which justifies the first rejected-vs-granted technique, and also suggests an instrumental variable to further investigate the data.

With either technique, patent status essentially generates no difference in the use of genes by scientific researchers and diagnostic test developers. Don’t interpret this result as turning over Heidi’s earlier genome paper, though! There is now a ton of evidence that minor impediments to openness are harmful to cumulative innovation. What Sampat and Williams tell us is that we need to be careful in how we think about “openness”. Patents can be open if the patentholder has no incentive to restrict further use, if downstream innovators are easy to locate, and if there is no uncertainty about the validity or scope of a patent. Indeed, in these cases the patentholder will want to make it as easy as possible for follow-on innovators to build on their patent. On the other hand, patentholders are legally allowed to put all sorts of anti-openness burdens on the use of their patented invention by anyone, including purely academic researchers. In many industries, such restrictions are in the interest of the patentholder, and hence patents serve to limit openness; this is especially true where private sector product development generates spillovers. Theory as in Scotchmer-Green has proven quite correct in this regard.

One final comment: all of these types of quasi-experimental methods are always a bit weak when it comes to the extensive margin. It may very well be that individual patents do not restrict follow-on work on that patent when licenses can be granted, but at the same time the IP system as a whole can limit work in an entire technological area. Think of something like sampling in music. Because all music labels have large teams of lawyers who want every sample to be “cleared”, hip-hop musicians stopped using sampled beats to the extent they did in the 1980s. If you investigated whether a particular sample was less likely to be used conditional on its copyright status, you very well might find no effect, as the legal burden of chatting with the lawyers and figuring out who owns what may be enough of a limit to openness that musicians give up samples altogether. Likewise, in the complete absence of gene patents, you might imagine that firms would change their behavior toward research based on sequenced genes since the entire area is more open; this is true even if the particular gene sequence they want to investigate was unpatented in the first place, since having to spend time investigating the legal status of a sequence is a burden in and of itself.

July 2014 Working Paper (No IDEAS version). Joshua Gans has also posted a very interesting interpretation of this paper in terms of Coasean contractability.

“Patents and Cumulative Innovation: Causal Evidence from the Courts,” A. Galasso & M. Schankerman (2013)

Patents may increase or hinder cumulative invention. On the one hand, a patentholder can use his patent to ensure that downstream innovators face limited competition and thus have enough rents to make it worthwhile developing their product. On the other hand, holdup and other licensing difficulties have been shown in many theoretical models to make patents counterproductive. Galasso and Schankerman use patent invalidation trials to try and separate out the effect, and the broad strokes of the theory appear to hold up: on average, patents do limit follow-up invention, but this limitation appears to solely result from patents held by large firms, used by small firms, in technologically complex areas without concentrated power.

The authors use a clever IV to generate this result. The patent trials they look at involve three judges, selected at random. Looking at other cases the individual judges have tried, we can estimate the proclivity to strike down a patent for a given judge, and thus predict the probability a certain panel in the future will strike down a certain patent. That is, the proclivity of the judges to strike down the patent is a nice IV for whether the patent is actually struck down. In the second stage of the IV, the authors investigate how this predicted probability of being invalidated, along with covariates and the pre-trial citation path, affects post-trial citations. And the impact is large: on average, citations increase 50% following an invalidation (and indeed, the Poisson IV estimate mentioned in a footnote, which seems more justified econometrically to me, is even larger).
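The logic of a judge-propensity instrument can be sketched with simulated data. What follows is a simplified illustration under my own assumptions, not the authors' actual instrument or specification: I use a leave-one-out average of each judge's invalidation rate, no covariates, and a simple ratio-of-covariances IV estimator. All function names, parameter values and the data-generating process are hypothetical.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def simulate(n_cases=40000, n_judges=30, beta=0.5, seed=1):
    """Simulated trials: unobserved patent weakness q drives both
    invalidation and later citations, which is what biases OLS."""
    rng = random.Random(seed)
    strictness = [rng.uniform(0.2, 0.8) for _ in range(n_judges)]
    panels, invalidated, citations = [], [], []
    for _ in range(n_cases):
        panel = rng.sample(range(n_judges), 3)  # random three-judge panel
        q = rng.gauss(0, 1)                     # unobserved confounder
        p = sum(strictness[j] for j in panel) / 3 + 0.15 * q
        d = 1 if rng.random() < p else 0
        y = beta * d + q + rng.gauss(0, 1)      # true effect of invalidation is beta
        panels.append(panel)
        invalidated.append(d)
        citations.append(y)
    return panels, invalidated, citations


def leave_one_out_instrument(panels, invalidated, n_judges):
    """Panel average of each judge's invalidation rate in all OTHER cases."""
    total = [0] * n_judges
    count = [0] * n_judges
    for panel, d in zip(panels, invalidated):
        for j in panel:
            total[j] += d
            count[j] += 1
    return [
        sum((total[j] - d) / (count[j] - 1) for j in panel) / len(panel)
        for panel, d in zip(panels, invalidated)
    ]


panels, d, y = simulate()
z = leave_one_out_instrument(panels, d, n_judges=30)
ols_estimate = cov(d, y) / cov(d, d)  # contaminated by the confounder q
iv_estimate = cov(z, y) / cov(z, d)   # uses only panel-driven variation
```

Dropping the case's own outcome from each judge's rate matters: otherwise the instrument would mechanically contain the very decision it is supposed to predict. In this simulation the OLS estimate is pushed away from the true effect by the unobserved patent quality, while the IV estimate is not.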

There is, however, substantial heterogeneity. Estimating a marginal treatment effect (using a trick of Heckman and Vytlacil’s) suggests the biggest impact of invalidation on patents whose unobservables make them less likely to be overturned. To investigate this heterogeneity further, the authors run their regressions again including measures of technology class concentration (what % of patents in a given subclass come from the top few patentees) and industry complexity (using the Levin survey). They also denote how many patents the patentee involved in the trial received in the years around the trial, as well as the number of patents received by those citing the patentee. The harmful effect of patents on future citations appears limited to technology classes with relatively low concentration, complex classes, large firms with the invalidated patent, and small firms doing the citing. These characteristics all match well with the type of technologies theory imagines to be linked to patent thickets, holdup potential or high licensing costs.

In the usual internal validity/external validity way, I don’t know how broadly these results generalize: even using the judges as an IV, we are still deriving treatment effects conditional on the patent being challenged in court and actually reaching a panel decision concerning invalidation; it seems reasonable to believe that the mere fact a patent is being challenged is evidence that licensing is problematic, and the mere fact that a settlement was not reached before trial even more so. The social welfare impact is also not clear to me: theory suggests that even when patents are socially optimal for cumulative invention, the primary patentholder will limit licensing to a small number of firms in order to protect their rents, hence using forward citations as a measure of cumulative invention allows no way to separate socially optimal from socially harmful limits. But this is at least some evidence that patents certainly don’t democratize invention, and that result fits squarely in with a growing literature on the dangers of even small restrictions on open science.

August 2013 working paper (No IDEAS version).

“Path Dependence,” S. Page (2006)

When we talk about strategic equilibrium, we can talk in a very formal sense, as many refinements with their well-known epistemic conditions have been proposed, the nature of uncertainty in such equilibria has been completely described, the problems of sequential decisionmaking are properly handled, etc. So when we do analyze history, we have a useful tool to describe how changes in parameters altered the equilibrium incentives of various agents. Path dependence, the idea that past realizations of history matter (perhaps through small events, as in Brian Arthur’s work) is widespread. A typical explanation given is increasing returns. If I buy a car in 1900, I make you more likely to buy a car in 1901 by, at the margin, lowering the production cost due to increasing returns to scale or lowering the operating cost by increasing incentives for gas station operators to operate.

This is quite informal, though; worse, the explanation of increasing returns is neither necessary nor sufficient for history-dependence. How can this be? First, consider that “history-dependence” may mean (at least) six different things. History can affect either the path of history, or its long-run outcome. For example, any historical process satisfying the assumptions of the ergodic theorem can be history-dependent along a path, yet still converge to the same state (in the network diffusion paper discussed here last week, a simple property of the network structure tells me whether an epidemic will diffuse entirely in the long-run, but the exact path of that eventual diffusion clearly depends on something much more complicated). We may believe, for instance, that the early pattern of railroads affected the path of settlement of the West without believing that this pattern had much consequence for the 2010 distribution of population in California. Next, history-dependence in the long-run or short-run can depend either on a state variable (from a pre-defined set of states), the ordered set of past realizations, or the unordered set of past realizations (the latter two called path and phat dependence, respectively, since phat dependence does not depend on order). History matters in elections due to incumbent bias, but that history-dependence can basically be summed up by a single variable denoting who is the current incumbent, omitting the rest of history’s outcomes. Phat dependence is likely in simple technology diffusion: I adopt a technology as a function of which of my contacts has adopted it, regardless of the order in which they adopted. Path dependence comes up, for example, in models of learning following Aumann and Geanakoplos/Polemarchakis: consensus among a group can be broken if agents do not observe the time at which messages were sent between third parties.
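The distinction between state, phat and path dependence is easy to see in a toy example. The three rules below are hypothetical illustrations I made up for the purpose, not Page’s formal definitions: each maps a history of events to an outcome, but they differ in how much of the history they actually use.

```python
from collections import Counter


def state_dependent(history):
    """Outcome depends only on the current state, here the latest event
    (think: who is the current incumbent)."""
    return history[-1]


def phat_dependent(history):
    """Outcome depends on which events occurred and how often, not on order
    (think: which technology most of my contacts have adopted)."""
    counts = Counter(history)
    top = max(counts.values())
    # Sort names so that ties are not silently broken by order of appearance.
    return sorted(k for k, v in counts.items() if v == top)


def path_dependent(history):
    """Outcome depends on order: the first option chosen twice in a row
    locks in; None if that never happens."""
    for earlier, later in zip(history, history[1:]):
        if earlier == later:
            return earlier
    return None


# Two histories containing the same events in a different order.
h1 = ["A", "B", "A"]
h2 = ["A", "A", "B"]
```

The phat rule cannot tell `h1` and `h2` apart, since they contain the same events, while the path rule can: only `h2` has two consecutive adoptions of A. The state rule ignores everything but the last event.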

Now consider increasing returns. For which types of history-dependence are increasing returns necessary or sufficient? It turns out the answer is, for none of them! Take again the car example, but assume there are three types of cars in 1900, steam, electric and gasoline. For the same reasons that gas-powered cars had increasing returns, steam and electric cars do as well. But the network effect for gas-powered cars is relatively stronger. Page thinks of this as a biased Polya process. I begin with five balls, 3 G, 1 S and 1 E, in an urn. I draw one at random. If I get an S or an E, I return it to the urn with another ball of the same type (thus making future draws of that type more common, hence increasing returns). If I draw a G, I return it to the urn along with 2t more G balls, where t is the time which increments by 1 after each draw. This process converges to having arbitrarily close to all balls of type G, even though S and E balls also exhibit increasing returns.
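The urn process described above is simple enough to simulate. This is a minimal sketch that follows the description in this post rather than the paper’s own notation; I assume the clock starts at t = 1 on the first draw, and the function names are my own.

```python
import random


def biased_polya(steps, seed=0):
    """Simulate the biased urn: start with 3 G, 1 S, 1 E.

    Drawing S or E adds one more ball of that type; drawing G at time t
    adds 2t more G balls. Assumption: t starts at 1 on the first draw.
    """
    rng = random.Random(seed)
    urn = {"G": 3, "S": 1, "E": 1}
    for t in range(1, steps + 1):
        types = list(urn)
        weights = [urn[k] for k in types]
        drawn = rng.choices(types, weights=weights)[0]
        # The drawn ball goes back in, so only the reinforcement is added.
        urn[drawn] += 2 * t if drawn == "G" else 1
    return urn


def share_g(urn):
    """Fraction of balls in the urn that are of type G."""
    return urn["G"] / sum(urn.values())


final_urn = biased_polya(2000)
final_share = share_g(final_urn)
```

Every type is reinforced when drawn, so all three exhibit increasing returns, yet the share of G balls heads toward one whatever happens in the early draws. Increasing returns alone therefore do not make the long-run outcome depend on history.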

What about the necessary condition? Surely, increasing returns are necessary for any type of history-dependence? Well, not really. All I need is some reason for past events to increase the likelihood of future actions of some type, in any convoluted way I choose. One simple mechanism is complementarities. If A and B are complements (adopting A makes B more valuable, and vice versa), while C and D are also complements, then we can have the following situation. An early adoption of A makes B more valuable, increasing the probability of adopting B the next period which itself makes future A more valuable, increasing the probability of adopting A the following period, and so on. Such reasoning is often implicit in the rhetoric linking a market-based middle class to a democratic political process: some event causes a private sector to emerge, which increases pressure for democratic politics, which increases protection of capitalist firms, and so on. As another example, consider the famous QWERTY keyboard, the best-known example of path dependence we have. Increasing returns – that is, the fact that owning a QWERTY keyboard makes this keyboard more valuable for both myself and others due to standardization – is not sufficient for killing the Dvorak or other keyboards. This is simple to see: the fact that QWERTY has increasing returns doesn’t mean that the diffusion of something like DVD players is history-dependent. Rather, it is the combination of increasing returns for QWERTY and a negative externality on Dvorak that leads to history-dependence for Dvorak. If preferences among QWERTY and Dvorak are Leontief, and valuations for both have increasing returns, then I merely buy the keyboard I value highest – this means that purchases of QWERTY by others lead to QWERTY lock-in by lowering the demand curve for Dvorak, not merely by raising the demand curve for QWERTY.
(And yes, if you are like me and were once told to never refer to effects mediated by the market as “externalities”, you should quibble with the vocabulary here, but the point remains the same.)

All in all interesting, and sufficient evidence that we need a better formal theory and taxonomy of history dependence than we are using now.

Final version in the QJPS (No IDEAS version). The essay is written in a very qualitative/verbal manner, but more because of the audience than the author. Page graduated here at MEDS, initially teaching at Caltech, and his CV lists quite an all-star cast of theorist advisers: Myerson, Matt Jackson, Satterthwaite and Stanley Reiter!

“Did AMD Spur Intel to Innovate More?,” R. Goettler & B. Gordon (2011)

The relation between competition and innovation is theoretically ambiguous. On the one hand, as Schumpeter pointed out, having market power allows you to recover rents from new product sales, so you might expect monopolies to innovate more. On the other hand, innovation is costly, so without competitive pressure, you may simply rest on your laurels and keep selling your old product.

Goettler and Gordon, in a recent JPE, use the Intel/AMD microprocessor competition to investigate this issue. Innovation is easy to measure here – we simply look at the processor speed at the frontier for each firm, and avoid any messy issues about the difference between patented inventions and “actual” inventions. We can also track for over a decade the price differences in each firm’s top chips, the speed differences, and the response. The market is also for all practical purposes a duopoly with very little attempted entry. Computers possess another interesting property, in that they are durable goods. Past products compete with future sales. You may wish to keep prices high when you have market power this period in order not to cannibalize future sales if you expect a good innovation to appear next period for which you can charge even higher prices. Many sectors of the economy involve durable goods, of course.

The authors estimate consumer preferences using a structural model with spillovers (it is harder to push the frontier than to catch up). They find that, if Intel had a monopoly, innovation would have been 4% faster, but consumer surplus would have been 4% lower due to the higher prices charged by Intel, which is the standard Schumpeterian tradeoff. They find consumer surplus is maximized in a world where Intel has some anticompetitive power, though not monopoly power. The reason is that monopoly firms in durable goods markets still need to innovate because of competition with their old products, whereas duopolists can only earn rents to cover R&D costs if the two firms are selling different technologies. There are a number of interesting comparative statics as well. If spillovers are nonexistent, then the two firms race until one has a sufficiently large technological lead, at which point the other firm gives up, and no more innovation takes place, while if spillovers are large, the returns to each firm from doing R&D are low. In both cases, monopolists in a durable goods market innovate more. If spillovers are of an intermediate level, then duopolists will innovate more. As the authors note, “such variation might be one reason cross-industry studies have difficulty identifying robust relationships.”

The estimation involves some technical difficulties which may interest the Pakes-style IO readers. I am not an IO guy myself, so perhaps a reader can comment as to the more general style of this sort of paper. While I find the theory interesting, and am impressed by the difficulty of the empirical estimation, what exactly is the value of this sort of estimation? We know from theory the important qualitative tradeoffs. The style of estimation here can really only be done ex-post – the methods here could not be used, for example, to identify contemporaneously whether anticompetitive behavior in a particular durable goods industry is harmful for social welfare. I don’t mean to single this paper out, as this comment applies to a huge number of IO articles. (Final working paper)

“Patent Reform: Aligning Reward and Contribution,” C. Shapiro (2007)

Carl Shapiro, in addition to being a bigshot in the academic study of invention, is also a member of Obama’s Council of Economic Advisers. I’m not sure how much of a role he had in advising on the Leahy-Smith patent reform act that was passed last year, but many of the reforms seem to come directly from this NBER Working Paper, so I imagine his role was a big one.

Most academic economists working on IP-related issues think, for a variety of reasons, that IP is currently far stronger than the optimal level. Indeed, many would prefer a world with no patents and copyrights at all to the current system. But let’s take the simplest possible reform: if the private benefit granted by a patent exceeds the social value created by the invention, we ought to limit the strength of the patent. You might wonder, how is it even possible for the patentholder to gain more than the social value of his invention? A standard monopolist with a patent still creates consumer surplus and some deadweight loss – that is, social value not captured by the inventor – unless the monopolist is perfectly price discriminating. Shapiro, drawing on a number of earlier papers, gives three nice examples where the return to the patentholder exceeds social value. Unless otherwise noted, we assume there is zero deadweight loss created by the patent; if there is deadweight loss, the reason for weakening the patent is even stronger.

First, we know from Loury (1979) and Tandon (1983) that if a patent gives the first firm to invent the full social value of his invention, there will be too much effort expended trying to win that prize; when each firm is deciding whether to expend more effort on R&D, they do not take into account that their increased effort lowers the probability of winning for the other firm. Tandon shows that this “patent race” effect is particularly strong for inventions that are relatively cheap to produce, such as those that are close to obvious. One way to fix this problem somewhat is to allow a second firm who independently invents at roughly the same time as the first firm to invent to sell the product without needing a license. That is, if a product is easy to invent, and two firms expend a lot of effort on it in an attempt to win the patent race, the second firm’s effort is not a total social waste since it may lead to a second independent invention, turning the eventual monopoly (with high deadweight loss) into a duopoly (with lower deadweight loss). Many economists and legal scholars have proposed allowing an independent inventor exception, but Congress has thus far shown no interest in taking up this idea. This is perhaps no surprise: Congress refused to pass the Public Domain Enhancement Act a few years back, an IP-related law that is as big a free lunch as you will ever see.

Second, probabilistic patents are often not challenged. Imagine a patent that, if challenged in court, has a 30% chance of being upheld as valid; many such weak patents exist. Assume that it is totally free to challenge the patent, meaning there are no legal or transaction costs. Shapiro shows the following example, drawing on a paper of his with Joseph Farrell. Let a patent with probability .3 of being upheld when challenged be licensed to an oligopolistic downstream industry. The patent adds $10 of value to the products of all downstream firms, so if the license royalty is greater than $3, the patentholder is earning more than the expected value of his patent. Imagine a royalty of $6. If I challenge the patent in court and my rival does not, then when I win the challenge, I and my rival in the downstream product market are both able to use the invention without paying any license fee, hence our costs are the same, and hence winning the challenge does not earn me any more profits due to competition with my rival. If I lose the challenge, then my rival pays a royalty of only $6, whereas I will have to pay $10 for each unit where I infringe, and hence I will be at a disadvantage in the downstream market. Therefore, neither firm will challenge the patent in equilibrium, and the inventor will earn more than his true social contribution.
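The per-unit arithmetic in the weak-patent example can be written out in a few lines. This is only a sketch using the numbers in the example above; the variable and function names are mine, and measuring the challenger's position by its cost relative to the rival is a simplifying assumption about how downstream competition works, not Shapiro's full model.

```python
P_UPHELD = 0.3        # chance the patent survives a challenge (from the example)
VALUE = 10.0          # per-unit value the patent adds downstream
ROYALTY = 6.0         # per-unit royalty paid by a licensee
INFRINGE_COST = 10.0  # per-unit payment by a challenger who loses

# Expected per-unit value of the patent right: anything above this is excess return.
expected_value = P_UPHELD * VALUE
excess_royalty = ROYALTY - expected_value

def cost_gap(patent_upheld):
    """Challenger's per-unit cost minus the non-challenging rival's cost."""
    if patent_upheld:
        return INFRINGE_COST - ROYALTY  # challenger pays more than the licensed rival
    return 0.0                          # patent invalidated: both firms pay nothing

# Assumption: downstream profit depends on cost relative to the rival.
expected_gap = P_UPHELD * cost_gap(True) + (1 - P_UPHELD) * cost_gap(False)

print(expected_value, excess_royalty, expected_gap)
```

The challenger never ends up with a cost advantage and sometimes ends up with a disadvantage, which is why nobody challenges even though challenging is free.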

Third, hold-up, particularly in the form of the “patent ambush,” can lead to excess returns. Imagine I can sell my product with noninfringing design A at a price of 100 dollars, or with infringing design B, for which I will need to license a previous invention, at a price of 120 dollars. The patent thus increases the value of my product by $20. If I Nash bargain with the inventor, we will split the gains from using his invention in my product, and therefore I will pay $10 to use the invention, and earn $110 per unit by producing design B. This intuition is very different if I first make investments, then learn about the patent. Imagine A and B both require 40 dollars of fixed cost per unit, each, to design. If I don’t know about the patent, I will design product B, and plan to earn 80 dollars per unit. The patentholder will then come to me and tell me I need a license or he will sue for infringement. Once the fixed cost of B is sunk, the surplus from obtaining a license is 20+40=60 dollars, since not obtaining a license means I will need to produce A, which costs another 40 dollars and sells for 20 dollars less than design B. So a Nash bargaining outcome is that I pay 30 dollars for the license and produce B. That is, the patentholder can use holdup to extract extra rents after I have made specific investments.
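The hold-up arithmetic above can be checked directly. This is a small sketch using only the numbers in the example; the names (`nash_split`, `holdup_premium`, and so on) are mine, and the equal split stands in for Nash bargaining with symmetric bargaining power.

```python
def nash_split(surplus):
    """Equal-split Nash bargaining: each side gets half of the gains from agreement."""
    return surplus / 2

PRICE_A = 100      # noninfringing design
PRICE_B = 120      # infringing design
DESIGN_COST = 40   # per-unit fixed cost of designing either product

# Bargaining before any design cost is sunk: the patent is worth the price difference.
ex_ante_surplus = PRICE_B - PRICE_A
ex_ante_fee = nash_split(ex_ante_surplus)
revenue_net_of_fee = PRICE_B - ex_ante_fee

# Bargaining after B's design cost is sunk: walking away now also means
# paying the design cost a second time to switch to A.
ex_post_surplus = (PRICE_B - PRICE_A) + DESIGN_COST
ex_post_fee = nash_split(ex_post_surplus)

# Extra rent the patentholder extracts purely from the timing of the negotiation.
holdup_premium = ex_post_fee - ex_ante_fee

print(ex_ante_fee, revenue_net_of_fee, ex_post_fee, holdup_premium)
```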

One way to fix the last two problems is to allow informal post-grant challenges to patents, perhaps by third parties. This makes weak patents in important industries less likely to cause hold-up after specific investment, and also limits the ability of patentholders to take advantage of the reluctance of licensees to challenge once license terms have been established. The new patent reform does vastly increase the scope for post-grant review.

What’s too bad about the 2011 patent reform is that the types of examples provided by Shapiro above are only the most clean-cut, overwhelmingly obvious ways to improve the efficiency of the patent system. They don’t even pretend to approach what would be necessary for an optimal IP regime. Aside from a handful of congressmen (Zoe Lofgren and Ron Wyden on the Democratic side, or Jason Chaffetz on the Republican side, among them), Congress is filled with IP maximalists. For the sake of social welfare, it’s too bad.

May 2007 NBER Working Paper (IDEAS)

Models of Innovation 2: Sequential Innovation

This post continues a series of notes on the main theoretical models of innovation. The first post covered the patent race literature. Here I’ll cover the sequential innovation literature most associated with Suzanne Scotchmer, particularly in her 1991 JEP and her 1995 RAND with Jerry Green.

Let there be two inventions instead of one, where the second builds upon the first. Let invention 1 cost c1, and invention 2 cost c2, with firm 1 having the ability to invent invention 1, and firm 2 invention 2. If only invention 1 exists, the inventing firm earns v1 (where v1 is a function of patent length T). If both invention 1 and 2 exist, and compete for sales in a market, then they earn v1c and v2c, where c stands for “compete”. If both invention 1 and 2 exist, but are sold by a monopolist, they earn v12>=v1c+v2c. With probability p, 2 will infringe on 1, and hence inventor 2 will need a license to sell product 2.

With one invention, it’s intuitive that the length of the patent should be just long enough to allow the inventor to cover the cost of that invention. This logic does not hold when inventions build on each other. Invention 1 makes invention 2 possible, so it seems we should give some of the social surplus created by invention 2 to the inventor of 1. But doing so makes it impossible to give all of the surplus created by invention 2 to inventor 2. This is a standard problem in the theory of complementary goods: if a left shoe by itself has social value 0, and a right shoe by itself has social value 0, but the two together have value 1, then the “marginal value” created by each shoe is 1. Summing the marginal values created gives us 2, but the total social value of the pair of shoes is only 1. This wedge between the partial equilibrium concept of marginal value and our intuition about general equilibrium actually comes up quite a bit: willingness-to-pay, by definition, is only meaningful in a partial equilibrium sense despite frequent misuse to the contrary.

So how should we efficiently give patent rights with sequential inventions? First assume there is no possibility to form an ex-ante license between the two firms, though of course firms can sell inventions to each other once the product is invented. Also assume that profits are divided using Nash bargaining when firms sell a patent to each other: in this case, each firm garners half of the profit earned using the patents minus the threat point representing what each firm earns in the absence of an agreement. Consider our logic from the one invention case, where we get incentives correct by making patent length just long enough to cover costs: v12(T)=c1+c2, where v12 is the revenue earned by having both products 1 and 2 in the same monopoly firm given patent length T, c1 is the cost of developing product 1, and c2 is the cost to develop product 2. Setting v12(T) just equal to c1+c2 will, in general, provide insufficient incentives for both products to be developed. That is, making patent length long enough that inventor 1 can afford to cover her costs, and the costs of inventor 2, while making precisely zero profit, is insufficient for inducing the invention of both 1 and 2. Why? One reason is that once 2 is invented, the development costs of 2 are sunk. Therefore, once 2 is invented, the licensing agreement will not take into account inventor 2’s costs. Inventor 2, knowing this, may be reluctant to invest in product 2 in the first place.
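To see the sunk-cost problem with numbers, here is a rough sketch. Everything specific in it is an assumption of mine rather than something from the Green and Scotchmer paper: the dollar values are made up, product 2 is assumed to infringe for sure, and the no-agreement payoffs are taken to be v1 for firm 1 (it sells product 1 alone) and zero for firm 2 (it cannot sell without a license).

```python
def ex_post_license(v12, v1):
    """Equal-split Nash bargaining once both inventions exist.

    Assumed threat points: firm 1 earns v1 selling product 1 alone,
    firm 2 earns nothing because it cannot sell without a license.
    """
    threat_1, threat_2 = v1, 0.0
    surplus = v12 - threat_1 - threat_2
    return threat_1 + surplus / 2, threat_2 + surplus / 2

# Hypothetical costs and values, chosen only for illustration.
c1, c2 = 60.0, 40.0
v12 = c1 + c2   # patent length set so joint monopoly profit just covers total cost
v1 = 50.0       # profit from product 1 alone at that patent length

profit_1, profit_2 = ex_post_license(v12, v1)

# Firm 2's cost is sunk by the time it bargains, so it compares its
# bargaining share with c2 before deciding whether to invest at all.
firm2_invests = profit_2 >= c2

# Firm 1 anticipates firm 2's decision when deciding on invention 1.
firm1_payoff = profit_1 if firm2_invests else v1
firm1_invests = firm1_payoff >= c1

print(profit_1, profit_2, firm2_invests, firm1_invests)
```

With these numbers firm 2's share falls short of c2, so it stays out, and firm 1 then cannot cover c1 from product 1 alone, even though joint profit exactly covers joint cost.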

How might I fix this? Allow ex-ante joint ventures. That is, let firm 1 and firm 2 form a joint venture before the costs of creating invention 2 are sunk. If ex-ante joint ventures are allowed, the optimal patent breadth is p=1: the second invention always infringes. The reason is simply that broader patents diminish the bargaining power of firm 2 at the stage in the game where the joint venture is created: 2 knows that he will be required to get an ex-post license after inventing if no joint venture is formed. The Nash bargaining share given to 2 in an ex-post license is always higher if there is a chance that 2 does not infringe because 2’s Nash threat point is higher. Therefore, the share of monopoly profit that needs to be given in an ex-ante joint venture to 2 is higher, because this share is determined in Nash bargaining by the “threat” 2 has of not signing the joint venture agreement, developing product 2, and then signing an ex-post license agreement. Since the total surplus when both products are sold by a monopoly is a fixed amount, giving more profit to 2 means giving less profit to 1. By construction, this increased profit for 2 does not change the probability that firm 2 invests in invention 2; rather, the distortion is that less profit to 1 means less incentive for firm 1 to invest in invention 1. So optimal patent breadth is always p=1: follow-up inventions should always infringe.

The intuition above has been modified in many papers. Scotchmer and Green themselves note that if the value v2c of the second invention is stochastic, and is only realized after firm 2 invests, less than perfectly broad patents can be optimal. Bessen and Maskin’s 2009 RAND, discussed previously on this site, notes that imperfect information across firms about research costs can make patents strictly worse than no patents, because with patents I will only offer joint ventures that are acceptable to low-cost researchers even when social welfare maximization would require both low and high-cost researchers to work on the next invention. A coauthor here at Northwestern and I have a result, which I’ll write up here at some point, that broad patents are not optimal when we allow for multiple paths toward future inventions. Without giving away the whole plot, the basic point is that broad patents cause distortions early on – as firms race inefficiently to get the broad patent – whereas narrow patents cause distortions later – as firms inefficiently try to invent around the patent. The second problem can be fixed with licenses granted by the patentholder, but the first cannot as the distortion occurs before there is anything to license.

The main papers discussed here are Scotchmer and Green’s 1995 RAND (Final RAND copy, IDEAS) and Scotchmer’s 1991 JEP (Final JEP copy, IDEAS). Despite the dates, the JEP was written after the RAND’s original working paper.

“Collective Invention,” R. Allen (1983)

Who invents? Standard theories usually deal just with R&D-performing firms and individual inventors. But enormous amounts of invention come as a byproduct of everyday firm investment. This type of invention tends to be incremental, and tends to be neither patentable nor held secret by the inventor. Robert Allen, in this famous paper from the early 1980s, refers to such invention as “collective invention.”

Consider the British blast furnace industry in the mid 1800s. There was certainly no meaningful corporate R&D at the time, as the world’s first corporate R&D labs were only just appearing then (in the German chemical industry). Yet the blast furnace industry in Cleveland changed enormously over a couple decades, nearly doubling the height of blast furnaces, and more than doubling temperatures. Such changes were greatly beneficial for reducing fuel consumption.

No single firm made these drastic changes overnight. Rather, furnace heights were increased incrementally by some firms when they built a new factory. Benefits in terms of lower fuel use were then made publicly available through personal correspondence, industry gatherings, and journal publications. Two factors were critical in this shift. First, the industry was rapidly adding capital. If a new plant is being built, experimentation has low costs: the cost of adding a foot to the chimney is that efficiency might be harmed slightly, and the benefit is that efficiency may be helped slightly. When an industry is not accumulating capital, this sort of minor experimentation is much more costly, since the only experimentation involves building an entirely new, not-yet-necessary factory. The second critical factor is some reason to avoid secrecy. In the blast furnaces, secrecy was more or less impossible. Builders and workers were frequently moving from plant to plant, and could simply tell their new employer what they learned. Since information is leaking out anyway, it may be an equilibrium to share information in the hopes that others will have useful information to share with me: work by von Hippel, previously discussed here, models this sort of sharing in more detail.

The reason we tend to ignore this type of public, incremental innovation is because of a bias, in popular culture and in policy, toward big technological advances. A paper of mine, which I hope to have ready to share here soon, argues that the patent bias toward technological achievement and away from incentivizing the nexus of inventions which lead to a commercially viable product can be seriously harmful. The importance of minor inventions is more than the importance of the famous ones, they shout from the rooftops!

An interesting update of Allen would be in the context of China. To the extent that industries accumulating capital quickly throw off, as a byproduct, incremental inventions, there can be rapidly increasing cost efficiency in even developed industries when some shock causes the industry to switch to a new region with little capital. Peter Hessler, National Geographic’s man in China and a great chronicler of that nation, tells a great story about technology transfer and incremental growth in his book Country Driving. I’m also curious to see how one would distinguish learning-by-doing in aggregate statistics from learning-by-sharing at the plant level.

….collinvent.pdf (Final JEBO copy. The only nongated version I can find is this Google cached article.)
