Category Archives: Sequential Invention

“Upstream Innovation and Product Variety in the U.S. Home PC Market,” A. Eizenberg (2014)

Who benefits from innovation? The trivial answer would be that everyone weakly benefits, but since innovation can change the incentives of firms to offer different varieties of a product, heterogeneous tastes among buyers may imply that some types of innovation make large groups of people worse off. Consider computers, a rapidly evolving technology. If Lenovo introduces a laptop with a faster processor, it may wish to discontinue production of a slower laptop, because offering both types flattens the demand curve for each, and hence lowers the profit-maximizing markup that can be charged for the better machine. This effect, combined with a fixed cost of maintaining a product line, may push firms to offer too little variety in equilibrium.

As an empirical matter, however, things may well go the other direction. Spence's famous product selection paper suggests that firms may produce too much variety, because they don't take into account that part of the profit they earn from a new product is just cannibalization of other firms' existing product lines. Is it possible to separate things out from data? Note that this question has two features that essentially require a structural setup: the variable of interest is "welfare", a completely theoretical concept, and many of the relevant numbers, like product-line fixed costs, are unobservable to the econometrician, hence they must be backed out from other data via theory.

There are some nice IO tricks to get this done. Using a near-universe of laptop sales in the early 2000s, Eizenberg estimates heterogeneous household demand using standard BLP-style methods. Supply is tougher. He assumes that firms first receive a fixed-cost shock for each potential product line, then pick their product mix each quarter, then observe consumer demand, and finally play Nash-Bertrand differentiated-product pricing. The problem is that the pricing game often has multiple equilibria (e.g., with two symmetric firms, one may offer a high-end product and the other a low-end one, or vice versa). Since the pricing game equilibria are going to be used to back out fixed costs, we are in a bit of a bind. Rather than select equilibria using some ad hoc approach (how would you even do so in the symmetric case just mentioned?), Eizenberg cleverly partially identifies fixed costs as those consistent with any possible equilibrium of the pricing game, using bounds in the style of Pakes, Porter, Ho and Ishii. This means that welfare effects are also only partially identified.
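
To make the bounds logic concrete, here is a minimal sketch of the revealed-preference inequalities in the Pakes-Porter-Ho-Ishii spirit. All numbers, product names, and the two-equilibrium setup are invented for illustration; in the paper the incremental variable profits come from the estimated demand system evaluated at each candidate pricing equilibrium.

```python
# Stylized sketch: bounding a firm's fixed cost for each potential product from its
# observed product choices, robustly to which pricing equilibrium generated the data.

# Incremental variable profit from having product j in the lineup, under two
# different candidate equilibria of the pricing game (made-up numbers).
incremental_profit = {              # product: [equilibrium 1, equilibrium 2]
    "fast_laptop": [120.0, 150.0],
    "slow_laptop": [60.0, 45.0],
}
offered = {"fast_laptop": True, "slow_laptop": False}   # observed product choices

bounds = {}
for j, profits in incremental_profit.items():
    if offered[j]:
        # Offering j was optimal, so the fixed cost cannot exceed the incremental
        # profit; taking the max across equilibria keeps the bound valid whichever
        # equilibrium actually generated the data.
        bounds[j] = (0.0, max(profits))
    else:
        # Not offering j was optimal, so the fixed cost is at least the forgone
        # profit; the min across equilibria is the robust lower bound.
        bounds[j] = (min(profits), float("inf"))

print(bounds)   # e.g. fast_laptop: F in [0, 150]; slow_laptop: F in [45, inf)
```

The observed product mix puts an upper bound on the fixed cost of anything offered and a lower bound on anything withheld; taking the loosest bound across pricing equilibria is what makes the identification robust to equilibrium selection, at the cost of only partially identifying the fixed costs and hence welfare.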

Throwing this model at the PC data shows that the mean consumer in the early 2000s wasn't willing to pay any premium for a laptop, but there was a ton of heterogeneity in willingness to pay, both for laptops and for processor speed on those laptops. Every year, the willingness to pay for a given computer fell by $257 – technology was rapidly evolving, and lots of substitute computers were constantly coming onto the market.

Eizenberg uses these estimates to investigate a particularly interesting counterfactual: what was the effect of the introduction of the lighter Pentium M mobile processor? As the Pentium M was introduced, older Pentium III-based laptops were, over time, no longer offered by the major notebook makers. The M raised predicted notebook sales by 5.8 to 23.8%, raised mean notebook prices by $43 to $86, and lowered the Pentium III share of the notebook market from 16-23% down to 7.7%. Here's what's especially interesting, though: total consumer surplus is higher with the M available, but all of the extra consumer surplus accrues to the 20% least price-sensitive buyers (as should be intuitive, since only those with high willingness to pay are buying cutting-edge notebooks). What if a social planner had forced firms to keep offering the Pentium III models after the M was introduced? The net change in consumer plus producer surplus may actually have been positive, and the benefits would have especially accrued to those at the bottom end of the market!

Now, as a policy matter, we are (of course) not going to force firms to offer money-losing legacy products. But this result is worth keeping in mind anyway: because firms are concerned about pricing pressure, they may not be offering a socially optimal variety of products, and this may limit the “trickle-down” benefits of high tech products.

2011 working paper (No IDEAS version). Final version in ReStud 2014 (gated).

“How do Patents Affect Follow-On Innovation: Evidence from the Human Genome,” B. Sampat & H. Williams (2014)

This paper, by Heidi Williams (who surely you know already) and Bhaven Sampat (who is perhaps best known for his almost-sociological work on the Bayh-Dole Act with Mowery), made quite a stir at the NBER last week. Heidi’s job market paper a few years ago, on the effect of openness in the Human Genome Project as compared to Celera, is often cited as an “anti-patent” paper. Essentially, she found that portions of the human genome sequenced by the HGP, which placed their sequences in the public domain, were much more likely to be studied by scientists and used in tests than portions sequenced by Celera, who initially required fairly burdensome contractual steps to be followed. This result was very much in line with research done by Fiona Murray, Jeff Furman, Scott Stern and others which also found that minor differences in openness or accessibility can have substantial impacts on follow-on use (I have a paper with Yasin Ozcan showing a similar result). Since the cumulative nature of research is thought to be critical, and since patents are a common method of “restricting openness”, you might imagine that Heidi and the rest of these economists were arguing that patents were harmful for innovation.

That may in fact be the case, but note something strange: essentially none of the earlier papers on open science are specifically about patents; rather, they are about openness. Indeed, on the theory side, Suzanne Scotchmer has a pair of very well-known papers arguing that patents effectively incentivize cumulative innovation if there are no transaction costs to licensing, no spillovers from sequential research, no incentive for early researchers to limit licenses in order to protect their existing business (consider the case of Armstrong and FM radio), and if potential follow-on innovators can be identified before they sink costs. That is a lot of conditions, but it's not hard to imagine industries where inventions are clearly demarcated, where holders of basic patents are better off licensing than sitting on the patent (perhaps because potential licensees are not also competitors), and where patentholders are better off not bothering academics who technically infringe on their patent.

What industry might have such characteristics? Sampat and Williams look at gene patents. Incredibly, about 30 percent of human genes have sequences that are claimed under a patent in the United States. Are “patented genes” still used by scientists and developers of medical diagnostics after the patent grant, or is the patent enough of a burden to openness to restrict such use? What is interesting about this case is that the patentholder generally wants people to build on their patent. If academics find some interesting genotype-phenotype links based on their sequence, or if another firm develops a disease test based on the sequence, there are more rents for the patentholder to garner. In surveys, it seems that most academics simply ignore patents of this type, and most gene patentholders don’t interfere in research. Anecdotally, licenses between the sequence patentholder and follow-on innovators are frequent.

In general, however, it is really hard to know whether patents have any effect on anything: there is very little variation over time and space in patent strength. Sampat and Williams take advantage of two quasi-experiments. First, they compare applied-for-but-rejected gene patents to applied-for-but-granted patents. At least for gene patents, there is very little difference in terms of measurables before the patent office decision across the two classes. Clearly this is not true for patents as a whole – rejected patents are almost surely of worse quality – but gene patents tend to come from scientifically competent firms rather than backyard hobbyists, and tend to have fairly straightforward claims. Why are any rejected, then? The authors' second trick is to look directly at patent examiner "leniency". It turns out that some examiners have rejection rates much higher than others, despite roughly random assignment of patents within a technology class. Much of the difference in rejection probability is driven by the random assignment of examiners, which justifies the first rejected-vs-granted technique and also suggests an instrumental variable with which to further investigate the data.
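
For readers who want the mechanics, here is a minimal sketch of the leave-one-out examiner-leniency instrument with a bare-bones 2SLS on top of it. The data, column names, and magnitudes are stubs I made up; the paper's actual specification includes art-unit and application-cohort controls and proper standard errors.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "examiner": rng.integers(0, 50, 5000),   # roughly random assignment within an art unit
    "granted": rng.integers(0, 2, 5000),     # 1 if the gene patent application was granted
    "citations": rng.poisson(3, 5000),       # later use of the gene (stub outcome)
})

# Leave-one-out grant rate of each application's examiner: the leniency instrument.
grp = df.groupby("examiner")["granted"]
df["leniency"] = (grp.transform("sum") - df["granted"]) / (grp.transform("count") - 1)

# First stage: examiner leniency should predict this application's own grant.
slope_fs, intercept_fs = np.polyfit(df["leniency"], df["granted"], 1)

# Bare-bones 2SLS second stage: regress the outcome on the fitted grant probability.
df["granted_hat"] = intercept_fs + slope_fs * df["leniency"]
slope_iv, _ = np.polyfit(df["granted_hat"], df["citations"], 1)
print(slope_fs, slope_iv)   # coefficients only; real work needs controls and clustered SEs
```

The point of the leave-one-out construction is that an application's own outcome never enters its instrument, so the instrument varies only with the luck of examiner assignment.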

With either technique, patent status essentially generates no difference in the use of genes by scientific researchers and diagnostic test developers. Don't interpret this result as overturning Heidi's earlier genome paper, though! There is now a ton of evidence that minor impediments to openness are harmful to cumulative innovation. What Sampat and Williams tell us is that we need to be careful in how we think about "openness". Patents can be open if the patentholder has no incentive to restrict further use, if downstream innovators are easy to locate, and if there is no uncertainty about the validity or scope of a patent. Indeed, in these cases the patentholder will want to make it as easy as possible for follow-on innovators to build on their patent. On the other hand, patentholders are legally allowed to put all sorts of anti-openness burdens on the use of their patented invention by anyone, including purely academic researchers. In many industries, such restrictions are in the interest of the patentholder, and hence patents serve to limit openness; this is especially true where private sector product development generates spillovers. Theory as in Scotchmer-Green has proven quite correct in this regard.

One final comment: all of these types of quasi-experimental methods are always a bit weak when it comes to the extensive margin. It may very well be that individual patents do not restrict follow-on work on that patent when licenses can be granted, but at the same time the IP system as a whole can limit work in an entire technological area. Think of something like sampling in music. Because all music labels have large teams of lawyers who want every sample to be “cleared”, hip-hop musicians stopped using sampled beats to the extent they did in the 1980s. If you investigated whether a particular sample was less likely to be used conditional on its copyright status, you very well might find no effect, as the legal burden of chatting with the lawyers and figuring out who owns what may be enough of a limit to openness that musicians give up samples altogether. Likewise, in the complete absence of gene patents, you might imagine that firms would change their behavior toward research based on sequenced genes since the entire area is more open; this is true even if the particular gene sequence they want to investigate was unpatented in the first place, since having to spend time investigating the legal status of a sequence is a burden in and of itself.

July 2014 Working Paper (No IDEAS version). Joshua Gans has also posted a very interesting interpretation of this paper in terms of Coasean contractability.

“Patents and Cumulative Innovation: Causal Evidence from the Courts,” A. Galasso & M. Schankerman (2013)

Patents may either encourage or hinder cumulative invention. On the one hand, a patentholder can use his patent to ensure that downstream innovators face limited competition and thus have enough rents to make it worthwhile to develop their product. On the other hand, holdup and other licensing difficulties have been shown in many theoretical models to make patents counterproductive. Galasso and Schankerman use patent invalidation trials to try to separate out these effects, and the broad strokes of the theory appear to hold up: on average, patents do limit follow-on invention, but this limitation appears to come entirely from patents held by large firms and built upon by small firms, in technologically complex areas without concentrated patent ownership.

The authors use a clever IV to generate this result. The patent trials they look at are decided by panels of three judges, selected at random. Looking at other cases the individual judges have decided, we can estimate each judge's proclivity to strike down a patent, and thus predict the probability that a given panel will strike down a given patent. That is, the proclivity of the judges to strike down the patent is a nice IV for whether the patent is actually struck down. In the second stage of the IV, the authors investigate how this predicted probability of being invalidated, along with covariates and the pre-trial citation path, affects post-trial citations. And the impact is large: on average, citations increase 50% following an invalidation (and indeed, the Poisson IV estimate mentioned in a footnote, which seems more justified econometrically to me, is even larger).
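
Here is a minimal sketch of how such a panel-level instrument can be built from the individual judges' behavior in other cases. The data and names are stubs of my own, and I draw judges with replacement purely to keep the example short.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Each judge's proclivity to invalidate, estimated from his or her *other* cases.
judges = pd.DataFrame({
    "judge": np.arange(12),
    "invalidation_rate_elsewhere": rng.uniform(0.2, 0.6, 12),
})

# Each trial is decided by a randomly drawn three-judge panel (drawn with
# replacement here only for brevity).
n_cases = 500
panels = rng.choice(12, size=(n_cases, 3), replace=True)
panel_leniency = judges["invalidation_rate_elsewhere"].to_numpy()[panels].mean(axis=1)

cases = pd.DataFrame({"case": np.arange(n_cases), "panel_leniency": panel_leniency})
print(cases.head())
# panel_leniency then instruments for actual invalidation in a regression of
# post-trial citations on invalidation (2SLS as in the examiner example above;
# a Poisson IV would treat the citation counts explicitly).
```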

There is, however, substantial heterogeneity. Estimating a marginal treatment effect (using a trick of Heckman and Vytlacil's) suggests the biggest impact of invalidation is on patents whose unobservables make them less likely to be overturned. To investigate this heterogeneity further, the authors run their regressions again, including measures of technology-class concentration (what percentage of patents in a given subclass come from the top few patentees) and industry complexity (using the Levin survey). They also record how many patents the patentee involved in the trial received in the years around the trial, as well as the number of patents received by those citing the patentee. The harmful effect of patents on future citations appears limited to technology classes with relatively low concentration, to complex technology classes, to cases where the invalidated patent is held by a large firm, and to cases where the citing firms are small. These characteristics all match well with the types of technologies theory links to patent thickets, holdup potential, or high licensing costs.

In the usual internal validity/external validity way, I don’t know how broadly these results generalize: even using the judges as an IV, we are still deriving treatment effects conditional on the patent being challenged in court and actually reaching a panel decision concerning invalidation; it seems reasonable to believe that the mere fact a patent is being challenged is evidence that licensing is problematic, and the mere fact that a settlement was not reached before trial even more so. The social welfare impact is also not clear to me: theory suggests that even when patents are socially optimal for cumulative invention, the primary patentholder will limit licensing to a small number of firms in order to protect their rents, hence using forward citations as a measure of cumulative invention allows no way to separate socially optimal from socially harmful limits. But this is at least some evidence that patents certainly don’t democratize invention, and that result fits squarely in with a growing literature on the dangers of even small restrictions on open science.

August 2013 working paper (No IDEAS version).

“Path Dependence,” S. Page (2006)

When we talk about strategic equilibrium, we can talk in a very formal sense: many refinements with their well-known epistemic conditions have been proposed, the nature of uncertainty in such equilibria has been completely described, the problems of sequential decisionmaking are properly handled, and so on. So when we analyze history, we have a useful tool to describe how changes in parameters altered the equilibrium incentives of various agents. Path dependence, the idea that past realizations of history matter (perhaps through small events, as in Brian Arthur's work), is widespread. A typical explanation given is increasing returns. If I buy a car in 1900, I make you more likely to buy a car in 1901 by, at the margin, lowering the production cost through increasing returns to scale, or lowering the operating cost by increasing the incentive for gas stations to open.

This is quite informal, though; worse, the explanation of increasing returns is neither necessary nor sufficient for history-dependence. How can this be? First, consider that "history-dependence" may mean (at least) six different things. History can affect either the path of history or its long-run outcome. For example, any historical process satisfying the assumptions of the ergodic theorem can be history-dependent along a path yet still converge to the same state (in the network diffusion paper discussed here last week, a simple property of the network structure tells me whether an epidemic will diffuse entirely in the long run, but the exact path of that eventual diffusion clearly depends on something much more complicated). We may believe, for instance, that the early pattern of railroads affected the path of settlement of the West without believing that this pattern had much consequence for the 2010 distribution of population in California. Next, history-dependence in the long run or the short run can depend either on a state variable (from a pre-defined set of states), on the ordered set of past realizations, or on the unordered set of past realizations (the latter two called path and phat dependence, respectively, since phat dependence does not depend on order). History matters in elections due to incumbent bias, but that history-dependence can basically be summed up by a single state variable denoting who the current incumbent is, omitting the rest of history's outcomes. Phat dependence is likely in simple technology diffusion: I adopt a technology as a function of which of my contacts have adopted it, regardless of the order in which they adopted. Path dependence comes up, for example, in models of learning following Aumann and Geanakoplos/Polemarchakis, where consensus among a group can fail if agents do not observe the time at which messages were sent between third parties.

Now consider increasing returns. For which of these types of history-dependence are increasing returns necessary or sufficient? It turns out the answer is: for none of them! Take again the car example, but assume there are three types of cars in 1900: steam, electric and gasoline. For the same reasons that gas-powered cars had increasing returns, steam and electric cars do as well. But the relative strength of the network effect for gas-powered cars is stronger. Page thinks of this as a biased Polya process. I begin with five balls, 3 G, 1 S and 1 E, in an urn. I draw one at random. If I get an S or an E, I return it to the urn with another ball of the same type (thus making future draws of that type more common, hence increasing returns). If I draw a G, I return it to the urn along with 2t more G balls, where t is the time index, which increments by 1 after each draw. This process converges to a state where arbitrarily close to all of the balls are of type G, even though S and E balls also exhibit increasing returns.
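
A quick simulation makes the point; this is just the urn process described above, with the horizon chosen arbitrarily.

```python
import random

# The biased Polya urn described above: every type has increasing returns, but a G
# draw adds 2t new G balls (t = number of draws so far) while S or E add just one.

def biased_polya(periods=2000, seed=0):
    random.seed(seed)
    urn = {"G": 3, "S": 1, "E": 1}
    for t in range(1, periods + 1):
        r = random.uniform(0, sum(urn.values()))
        if r < urn["G"]:
            urn["G"] += 2 * t                # stronger feedback for gasoline
        elif r < urn["G"] + urn["S"]:
            urn["S"] += 1
        else:
            urn["E"] += 1
    total = sum(urn.values())
    return {k: round(v / total, 4) for k, v in urn.items()}

print(biased_polya())   # G's share is essentially 1; S and E never take over
```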

What about the necessary condition? Surely increasing returns are necessary for any type of history-dependence? Well, not really. All I need is some reason for past events to increase the likelihood of future actions of some type, in any convoluted way I choose. One simple mechanism is complementarities. If A and B are complements (adopting A makes B more valuable, and vice versa), while C and D are also complements, then we can have the following situation. An early adoption of A makes B more valuable, increasing the probability of adopting B the next period, which itself makes future A more valuable, increasing the probability of adopting A the following period, and so on. Such reasoning is often implicit in the rhetoric linking a market-based middle class to a democratic political process: some event causes a private sector to emerge, which increases pressure for democratic politics, which increases protection of capitalist firms, and so on. As another example, consider the famous QWERTY keyboard, the best-known example of path dependence we have. Increasing returns – that is, the fact that owning a QWERTY keyboard makes this keyboard more valuable for both myself and others due to standardization – are not sufficient for killing the Dvorak or other keyboards. This is simple to see: the fact that QWERTY has increasing returns doesn't mean that the diffusion of something like DVD players is history-dependent. Rather, it is the combination of increasing returns for QWERTY and a negative externality on Dvorak that leads to history-dependence for Dvorak. If preferences between QWERTY and Dvorak are Leontief, and valuations for both have increasing returns, then I merely buy the keyboard I value highest – this means that purchases of QWERTY by others lead to QWERTY lock-in by lowering the demand curve for Dvorak, not merely by raising the demand curve for QWERTY. (And yes, if you are like me and were once told never to refer to effects mediated by the market as "externalities", you should quibble with the vocabulary here, but the point remains the same.)

All in all interesting, and sufficient evidence that we need a better formal theory and taxonomy of history dependence than we are using now.

Final version in the QJPS (No IDEAS version). The essay is written in a very qualitative/verbal manner, but more because of the audience than the author. Page graduated here at MEDS, initially teaching at Caltech, and his CV lists quite an all-star cast of theorist advisers: Myerson, Matt Jackson, Satterthwaite and Stanley Reiter!

“Did AMD Spur Intel to Innovate More?,” R. Goettler & B. Gordon (2011)

The relation between competition and innovation is theoretically ambiguous. On the one hand, as Schumpeter pointed out, having market power allows you to recover rents from new product sales, so you might expect monopolies to innovate more. On the other hand, innovation is costly, so without competitive pressure, you may simply rest on your laurels and keep selling your old product.

Goettler and Gordon, in a recent JPE, use the Intel/AMD microprocessor competition to investigate this issue. Innovation is easy to measure here – we simply look at the processor speed at the frontier for each firm, and avoid any messy issues about the difference between patented inventions and “actual” inventions. We can also track for over a decade the price differences in each firm’s top chips, the speed differences, and the response. The market is also for all practical purposes a duopoly with very little attempted entry. Computers possess another interesting property, in that they are durable goods. Past products compete with future sales. You may wish to keep prices high when you have market power this period in order not to cannibalize future sales if you expect a good innovation to appear next period for which you can charge even higher prices. Many sectors of the economy involve durable goods, of course.

The authors estimate consumer preferences within a structural model of dynamic competition with spillovers (it is harder to push the frontier than to catch up). They find that, if Intel had a monopoly, innovation would have been 4% faster, but consumer surplus would have been 4% lower due to the higher prices charged by Intel, which is the standard Schumpeterian tradeoff. They find consumer surplus is maximized in a world where Intel has some market power, though not monopoly power. The reason is that a monopolist in a durable goods market still needs to innovate because of competition with its own old products, whereas duopolists can only earn rents to cover R&D costs if the two firms are selling different technologies. There are a number of interesting comparative statics as well. If spillovers are nonexistent, the two firms race until one has a sufficiently large technological lead, at which point the other firm gives up and no more innovation takes place; if spillovers are large, the returns to each firm from doing R&D are low. In both cases, a monopolist in a durable goods market innovates more. If spillovers are of an intermediate level, duopolists innovate more. As the authors note, "such variation might be one reason cross-industry studies have difficulty identifying robust relationships."

The estimation involves some technical difficulties which may interest the Pakes-style IO readers. I am not an IO guy myself, so perhaps a reader can comment on the more general style of this sort of paper. While I find the theory interesting, and am impressed by the difficulty of the empirical estimation, what exactly is the value of this sort of estimation? We know from theory the important qualitative tradeoffs. The style of estimation here can really only be done ex post – the methods here could not be used, for example, to identify contemporaneously whether anticompetitive behavior in a particular durable goods industry is harmful for social welfare. I don't mean to single this paper out, as this comment applies to a huge number of IO articles.

www.columbia.edu/~brg2114/files/dynduo.pdf (Final working paper)

“Patent Reform: Aligning Reward and Contribution,” C. Shapiro (2007)

Carl Shapiro, in addition to being a bigshot in the academic study of invention, is also a member of Obama’s Council of Economic Advisers. I’m not sure how much of a role he had in advising on the Leahy-Smith patent reform act that was passed last year, but many of the reforms seem to come directly from this NBER Working Paper, so I imagine his role was a big one.

Most academic economists working on IP-related issues think, for a variety of reasons, that IP is currently far stronger than the optimal level. Indeed, many would prefer a world with no patents and copyrights at all to the current system. But let's take the simplest possible reform: if the reward granted by a patent exceeds the social value created by the invention, we ought to limit the strength of the patent. You might wonder, how is it even possible for the patentholder to gain more than the social value of his invention? A standard monopolist with a patent still creates consumer surplus and some deadweight loss – that is, social value not captured by the inventor – unless the monopolist is perfectly price discriminating. Shapiro, drawing on a number of earlier papers, gives three nice examples where the return to the patentholder exceeds the social value. Unless otherwise noted, we assume there is zero deadweight loss created by the patent; if there is deadweight loss, the reason for weakening the patent is even stronger.

First, we know from Loury (1979) and Tandon (1983) that if a patent gives the first firm to invent the full social value of its invention, there will be too much effort expended trying to win that prize; when each firm is deciding whether to expend more effort on R&D, it does not take into account that its increased effort lowers the probability of winning for the other firms. Tandon shows that this "patent race" effect is particularly strong for inventions that are relatively cheap to produce, such as those that are close to obvious. One way to partly fix this problem is to allow a second firm that independently invents at roughly the same time as the first to sell the product without needing a license. That is, if a product is easy to invent, and two firms expend a lot of effort in an attempt to win the patent race, the second firm's effort is not a total social waste, since it may lead to a second independent invention, turning the eventual monopoly (with high deadweight loss) into a duopoly (with lower deadweight loss). Many economists and legal scholars have proposed allowing an independent inventor exception, but Congress has thus far shown no interest in taking up this idea. This is perhaps no surprise: Congress refused to pass the Public Domain Enhancement Act a few years back, an IP-related law that is as big a free lunch as you will ever see.

Second, probabilistic patents are often not challenged. Imagine a patent that, if challenged in court, has a 30% chance of being upheld as valid; many such weak patents exist. Assume that it is totally free to challenge the patent, meaning there are no legal or transaction costs. Shapiro gives the following example, drawing on a paper of his with Joseph Farrell. Let a patent with probability .3 of being upheld when challenged be licensed to an oligopolistic downstream industry. The patent adds $10 of value to each downstream firm's product, so if the license royalty is greater than $3, the patentholder is earning more than the expected value of his patent. Imagine a royalty of $6. If I challenge the patent in court and my rival does not, then when I win the challenge, I and my rival in the downstream product market are both able to use the invention without paying any license fee; hence our costs are the same, and hence winning the challenge does not earn me any more profit, since I compete with my rival. If I lose the challenge, then my rival pays a royalty of only $6, whereas I will have to pay $10 for each unit where I infringe, and hence I will be at a disadvantage in the downstream market. Therefore, neither firm will challenge the patent in equilibrium, and the inventor will earn more than his true social contribution.
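
To see the asymmetry in numbers, here is a tiny calculation using the figures above. The one ingredient I add, purely for illustration and not from Farrell and Shapiro, is a downstream profit function that falls one-for-one in a firm's unit-cost disadvantage relative to its rival; any decreasing function gives the same conclusion.

```python
# Why no downstream firm challenges the weak patent (numbers from the example above).

p_valid = 0.3        # probability the patent is upheld if challenged
patent_value = 10.0  # per-unit value the patent adds downstream
royalty = 6.0        # license fee, well above the expected value 0.3 * 10 = 3
base_profit = 50.0   # profit when my costs equal my rival's (arbitrary level)

def profit(cost_disadvantage):
    # Assumed for illustration: profit falls one-for-one in my cost disadvantage.
    return base_profit - cost_disadvantage

pay_the_royalty = profit(0.0)   # my rival pays the same royalty, so costs are symmetric

# Challenge alone: with prob 0.7 the patent dies and both firms use it freely
# (symmetric again); with prob 0.3 I owe $10 per infringing unit while my rival
# still pays only the $6 royalty, a $4 disadvantage.
challenge = (1 - p_valid) * profit(0.0) + p_valid * profit(patent_value - royalty)

print(pay_the_royalty, challenge)   # 50.0 vs 48.8: challenging is a pure loss
```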

Third, hold-up, particularly in the form of the "patent ambush," can lead to excess returns. Imagine I can sell my product with noninfringing design A at a price of 100 dollars, or with infringing design B, for which I will need to license a previous invention, at a price of 120 dollars. The patent thus increases the value of my product by $20. If I Nash bargain with the inventor before making any investment, we will split the gains from using his invention in my product, so I will pay $10 to use the invention and earn $110 per unit by producing design B. The intuition is very different if I first make investments and only then learn about the patent. Imagine designing either A or B requires a fixed cost of 40 dollars (expressed per unit, to keep the arithmetic simple). If I don't know about the patent, I will design product B and plan to earn 80 dollars per unit. The patentholder will then come to me and tell me I need a license or he will sue for infringement. Once the fixed cost of B is sunk, the surplus from obtaining a license is 20+40=60 dollars, since not obtaining a license means I will need to produce A, which costs another 40 dollars to design and sells for 20 dollars less than design B. So a Nash bargaining outcome is that I pay 30 dollars for the license and produce B. That is, the patentholder can use holdup to extract extra rents after I have made specific investments.
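
Here is the hold-up arithmetic laid out step by step, with the values straight from the example above and the 50/50 Nash split as described.

```python
# Hold-up arithmetic, per unit, with a 50/50 Nash bargaining split of the surplus.

price_A, price_B = 100.0, 120.0   # noninfringing design A, infringing design B
design_cost = 40.0                # fixed design cost of either design (per unit here)
patent_value = price_B - price_A  # the patent adds $20 of value

# Bargaining before any design cost is sunk: the threat point is "design A instead",
# so only the $20 of extra value is on the table.
royalty_ex_ante = patent_value / 2                   # $10
earnings_ex_ante = price_B - royalty_ex_ante         # $110 per unit, as in the text

# Bargaining after the $40 design cost of B is sunk: walking away now means paying
# another $40 to design A and selling for $20 less, so the surplus is 20 + 40 = 60.
royalty_ex_post = (patent_value + design_cost) / 2   # $30
earnings_ex_post = price_B - royalty_ex_post         # $90 per unit, before the sunk $40

print(royalty_ex_ante, royalty_ex_post)   # hold-up triples the royalty
```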

One way to fix the last two problems is to allow informal post-grant challenges to patents, perhaps by third parties. This makes weak patents in important industries less likely to cause hold-up after specific investment, and also limits the ability of patentholders to take advantage of the reluctance of licensees to challenge once license terms have been established. The new patent reform does vastly increase the scope for post-grant review.

What's too bad about the 2011 patent reform is that the types of examples provided by Shapiro above are only the most clean-cut, overwhelmingly obvious ways to improve the efficiency of the patent system. They don't even pretend to approach what would be necessary for an optimal IP regime. Aside from a handful of members of Congress (Zoe Lofgren and Ron Wyden on the Democratic side, and Jason Chaffetz on the Republican side, among them), Congress is filled with IP maximalists. For the sake of social welfare, it's too bad.

May 2007 NBER Working Paper (IDEAS)

Models of Innovation 2: Sequential Innovation

This post continues a series of notes on the main theoretical models of innovation. The first post covered the patent race literature. Here I’ll cover the sequential innovation literature most associated with Suzanne Scotchmer, particularly in her 1991 JEP and her 1995 RAND with Jerry Green.

Let there be two inventions instead of one, where the second builds upon the first. Let invention 1 cost c1, and invention 2 cost c2, with firm 1 having the ability to invent invention 1, and firm 2 invention 2. If only invention 1 exists, the inventing firm earns v1 (where v1 is a function of patent length T). If both invention 1 and 2 exist, and compete for sales in a market, then they earn v1c and v2c, where c stands for “compete”. If both invention 1 and 2 exist, but are sold by a monopolist, they earn v12>=v1c+v2c. With probability p, 2 will infringe on 1, and hence inventor 2 will need a license to sell product 2.

With one invention, it's intuitive that the length of the patent should be just long enough to allow the inventor to cover the cost of that invention. This logic does not hold when inventions build on each other. Invention 1 makes invention 2 possible, so it seems we should give some of the social surplus created by invention 2 to the inventor of 1. But doing so makes it impossible to give all of the surplus created by invention 2 to inventor 2. This is a standard problem in the theory of complementary goods: if a left shoe by itself has social value 0, and a right shoe by itself has social value 0, but the two together have value 1, then the "marginal value" created by each shoe is 1. Summing the marginal values created gives us 2, but the total social value of the pair of shoes is only 1. This wedge between the partial equilibrium concept of marginal value and our intuition about general equilibrium actually comes up quite a bit: willingness-to-pay, by definition, is only meaningful in a partial equilibrium sense, despite frequent misuse to the contrary.

So how should we assign patent rights efficiently when inventions are sequential? First assume there is no possibility of forming an ex-ante license between the two firms, though of course firms can sell inventions to each other once the product is invented. Also assume that profits are divided using Nash bargaining when firms sell a patent to each other: each firm garners its threat point – what it would earn in the absence of an agreement – plus half of the joint profit in excess of the two threat points. Consider our logic from the one-invention case, where we get incentives correct by making patent length just long enough to cover costs: v12(T)=c1+c2, where v12 is the revenue earned by having both products 1 and 2 in the same monopoly firm given patent length T, c1 is the cost of developing product 1, and c2 is the cost of developing product 2. Setting v12(T) just equal to c1+c2 will, in general, provide insufficient incentives for both products to be developed. That is, making patent length long enough that inventor 1 can cover her own costs and the costs of inventor 2, while making precisely zero profit, is insufficient for inducing the invention of both 1 and 2. Why? One reason is that once 2 is invented, the development costs of 2 are sunk. Therefore, once 2 is invented, the licensing agreement will not take into account inventor 2's costs. Inventor 2, knowing this, may be reluctant to invest in product 2 in the first place.
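
A quick numerical check of inventor 2's participation constraint makes this concrete. The numbers below are mine, purely for illustration, not Scotchmer-Green's; the split follows the Nash bargaining rule above (threat point plus half the surplus over the threat points).

```python
# Why v12(T) = c1 + c2 can be too short a patent once c2 is sunk at the licensing stage.

c1, c2 = 30.0, 30.0        # development costs
v1 = 40.0                  # firm 1's profit if only product 1 exists
v1c, v2c = 25.0, 15.0      # profits if the two products compete
v12 = c1 + c2              # patent length set so joint monopoly profit just covers costs
p = 1.0                    # maximal breadth: product 2 infringes for sure

# If 2 infringes, its threat point is 0 and firm 1's is v1, so an ex-post license
# gives firm 2 half of (v12 - v1). If 2 does not infringe, the threat points are the
# duopoly profits and the firms split the gain from merging the two products.
payoff_2_infringe = 0.0 + 0.5 * (v12 - v1 - 0.0)   # = 10
payoff_2_clear = v2c + 0.5 * (v12 - v1c - v2c)     # = 25

expected_payoff_2 = p * payoff_2_infringe + (1 - p) * payoff_2_clear
print(expected_payoff_2, expected_payoff_2 >= c2)  # 10.0 False: firm 2 never invests
```

Even though the joint monopoly profit exactly covers the two development costs, the ex-post bargain hands inventor 2 far less than c2, so invention 2 never happens.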

How might we fix this? Allow ex-ante joint ventures. That is, let firm 1 and firm 2 form a joint venture before the costs of creating invention 2 are sunk. If ex-ante joint ventures are allowed, the optimal patent breadth is p=1: the second invention always infringes. The reason is that broader patents diminish the bargaining power of firm 2 at the stage where the joint venture is formed: firm 2 knows that it will be required to get an ex-post license after inventing if no joint venture is formed. The Nash bargaining share given to 2 in an ex-post license is higher whenever there is a chance that 2 does not infringe, because 2's threat point is higher. Therefore, the share of monopoly profit that must be given to 2 in an ex-ante joint venture is higher, because this share is determined in Nash bargaining by the "threat" 2 has of not signing the joint venture agreement, developing product 2, and then signing an ex-post license agreement. Since the total surplus when both products are sold by a monopolist is a fixed amount, giving more profit to 2 means giving less profit to 1. By construction, this increased profit for 2 does not change whether firm 2 invests in invention 2; rather, the distortion is that less profit to 1 means less incentive for firm 1 to invest in invention 1. So optimal patent breadth is always p=1: follow-on inventions should always infringe.

The intuition above has been modified in many papers. Scotchmer and Green themselves note that if the value v2c of the second invention is stochastic, and is only realized after firm 2 invests, less-than-perfectly-broad patents can be optimal. Bessen and Maskin's 2009 RAND, discussed previously on this site, notes that imperfect information across firms about research costs can make patents strictly worse than no patents, because with patents I will only offer joint ventures that are acceptable to low-cost researchers, even when social welfare maximization would require both low- and high-cost researchers to work on the next invention. A coauthor here at Northwestern and I have a result, which I'll write up here at some point, that broad patents are not optimal when we allow for multiple paths toward future inventions. Without giving away the whole plot, the basic point is that broad patents cause distortions early on – as firms race inefficiently to get the broad patent – whereas narrow patents cause distortions later – as firms inefficiently try to invent around the patent. The second problem can be fixed with licenses granted by the patentholder, but the first cannot, as the distortion occurs before there is anything to license.

The main papers discussed here are Scotchmer and Green’s 1995 RAND (Final RAND copy, IDEAS) and Scotchmer’s 1991 JEP (Final JEP copy, IDEAS). Despite the dates, the JEP was written after the RAND’s original working paper.

“Collective Invention,” R. Allen (1983)

Who invents? Standard theories usually deal just with R&D-performing firms and individual inventors. But enormous amounts of invention come as a byproduct of everyday firm investment. This type of invention tends to be incremental, and tends to be neither patentable nor held secret by the inventor. Robert Allen, in this famous paper from the early 1980s, refers to such invention as “collective invention.”

Consider the British blast furnace industry in the mid-1800s. There was certainly no meaningful corporate R&D at the time, as the world's first corporate R&D labs were only just appearing then (in the German chemical industry). Yet the blast furnace industry in the Cleveland district of northeast England changed enormously over a couple of decades, nearly doubling the height of blast furnaces and more than doubling furnace temperatures. Such changes were greatly beneficial for reducing fuel consumption.

No single firm made these drastic changes overnight. Rather, furnace heights were increased incrementally by some firms when they built a new factory. Benefits in terms of lower fuel use were then made publicly available through personal correspondence, industry gatherings, and journal publications. Two factors were critical in this shift. First, the industry was rapidly adding capital. If a new plant is being built, experimentation has low costs: the cost of adding a foot to the chimney is that efficiency might be harmed slightly, and the benefit is that efficiency may be helped slightly. When an industry is not accumulating capital, this sort of minor experimentation is much more costly, since the only way to experiment is to build an entirely new, not-yet-necessary factory. The second critical factor is some reason to avoid secrecy. In the blast furnace industry, secrecy was more or less impossible. Builders and workers were frequently moving from plant to plant, and could simply tell their new employer what they had learned. Since information leaks out anyway, it may be an equilibrium to share information in the hopes that others will have useful information to share with me: work by von Hippel, previously discussed here, models this sort of sharing in more detail.

The reason we tend to ignore this type of public, incremental innovation is a bias, in popular culture and in policy, toward big technological advances. A paper of mine, which I hope to have ready to share here soon, argues that the patent system's bias toward technological achievement, and away from incentivizing the nexus of inventions that leads to a commercially viable product, can be seriously harmful. In the aggregate, minor inventions may matter more than the famous ones that get shouted about from the rooftops!

An interesting update of Allen would be in the context of China. To the extent that industries accumulating capital quickly throw off, as a byproduct, incremental inventions, there can be rapidly increasing cost efficiency in even developed industries when some shock causes the industry to switch to a new region with little capital. Peter Hessler, National Geographic’s man in China and a great chronicler of that nation, tells a great story about technology transfer and incremental growth in his book Country Driving. I’m also curious to see how one would distinguish learning-by-doing in aggregate statistics from learning-by-sharing at the plant level.

https://docs.google.com….collinvent.pdf (Final JEBO copy. The only nongated version I can find is this Google cached article.)

“Sequential Innovation, Patents, and Imitation,” J. Bessen & E. Maskin (2009)

Back in the 70s and 80s, you could study the economics of invention as if it were static. X is out there and we want to invent it, so how can we incentivize people to do the R&D? The famous tradeoff in patenting is found in the literature: if I give a patent, I generate a deadweight loss triangle from monopoly (plus the "inefficiency rectangle", if you like your firm dynamics with an evolutionary flavor), but if I don't give a patent, no one has any reason to invest in the fixed cost of R&D. The second part of that argument weakens a bit if firms can keep their invention from being copied for a while as they extract rents – it takes time to reverse engineer – but the basic tradeoff is still there. (What about the tradeoff between secrecy and disclosure, you may be wondering? My reading of the literature is that the requirement that patents disclose how to make an invention is essentially a meaningless restriction, since firms can choose to rely on trade secrets and since, even when they patent, firms can be obtuse enough in describing certain technical aspects of the invention that it may not be possible for competitors to imitate.)

Recent work on patents, though, takes a much more dynamic attitude toward invention: "If I have seen farther, it is because I have stood…" and all that. When invention is cumulative, the relevant tradeoffs are much more subtle. For instance, if you need A to make B, and both A and B have fixed costs, then who should get the patent to B? The firm whose R&D work on A made it possible? The firm that actually invented B? Bessen and Maskin follow this line of reasoning to a surprising conclusion: in some industries, not only do patents decrease social welfare, but they aren't even in the interest of any of the firms in that industry! I also like this paper because it's nice evidence that (at least some) economists couldn't care less about credentials: James Bessen is a lecturer at a law school with no formal graduate degree of any kind, while Eric Maskin is a Nobel prize-winning economic theorist at the Institute for Advanced Study in Princeton. This paper actually needs them both. Bessen has done a lot of really interesting work on innovation policy, while the result I'll show shortly relies on some non-obvious game-theoretic reasoning about imperfect information, an area right in Maskin's wheelhouse.

The basic model is simple. There are two firms, each of which can do research either for free or at cost c. If one firm does research, the next invention in a sequential line is invented with probability p. If both firms do research, the next invention is found with probability q, where p<q<2p. Each invention in the line has value v, drawn from a known distribution, where v represents the incremental value of that new invention. After an invention is made, the other firm can imitate costlessly, in which case each firm gets payoff sv, where s<1/2 is a "share" of the full value v. Patents make imitation impossible. The social planner, of course, would have firms always do research when it is free, and have either one or both of the firms work if research is costly. The choice of zero, one or two firms working when research is not free depends on a cutoff rule on the value of each invention.

Sequential invention makes firms more likely to innovate in the absence of patents than they would be with a one-shot, static invention. The reason is simple: though I am still upset that an imitator may take some of the revenue from an invention I invest in, I also know that if I don't make that invention, the next product in the sequential line will never be possible. So I need less value from the present invention to make R&D worthwhile. And patenting can make us worse off. The problem is asymmetric information: I don't know whether the other firm has a high or a low cost of R&D. When I invent product 1 in a line, the model assumes my patent can keep you from inventing and selling product 2. Of course, I can offer you a license. If I think you have zero R&D cost, I will set the license fee equal to exactly the surplus you expect to gain from inventing the next product in the sequence. Then we'll both work on the next product, and no matter what, I will reap all the surplus. But it may be costly for you to do R&D. In order to get you to work on the next product, then, I would have to offer you a lower licensing fee that just lets you cover the cost of your R&D, but otherwise gives me all the surplus. It turns out that, for a range of values, the firm that invents product 1 would rather always charge the higher licensing fee that only zero-cost firms will accept, meaning that it never licenses to the high-cost firms, even when product values are high enough that social welfare maximization would have both firms working on research, and even though both firms would indeed do research if there were no patents. Ex ante, meaning before firms have any patents, the firms may well prefer the world without patents to the world with patents for this reason (though, of course, once it has a patent, a firm can only be made worse off by losing it).
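
To see the first mechanism concretely, here is a minimal value-iteration sketch under my own stylized assumptions (a single potential inventor, i.i.d. incremental values each period, the line dying if the current invention is not made); none of these numbers or simplifications come from the paper.

```python
import numpy as np

# Why a sequential line lowers the value an inventor needs before R&D pays, even
# when an imitator captures share (1 - s) of each invention's value.

p, s, c = 0.5, 0.4, 1.0                     # success probability, inventor's share, R&D cost
rng = np.random.default_rng(0)
v_draws = rng.uniform(0.0, 10.0, 100_000)   # i.i.d. incremental values (my assumption)

# One-shot case: invest only if the captured value covers the cost in expectation.
static_threshold = c / (p * s)              # = 5.0

# Sequential case, no patents: a success also keeps the line alive, which is worth W.
# Solve W = E[ max(0, -c + p * (s * v + W)) ] by value iteration (a contraction, p < 1).
W = 0.0
for _ in range(500):
    W = np.mean(np.maximum(0.0, -c + p * (s * v_draws + W)))

sequential_threshold = (c / p - W) / s      # invest whenever v exceeds this
print(static_threshold, round(sequential_threshold, 2))   # 5.0 vs about 4.1
```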

One final result, relevant to software and other industries: if the value of inventions is large, and if s is close to one-half (meaning that imitation is not "too" dissipating), then in both the patent and no-patent cases, firms are better off having a competitor who may imitate or become a licensee. Having a competitor means, basically, that the industry is able to invent more quickly. Even when Lotus and Microsoft keep ripping each other off, the fact that they are both inventing new features expands the size of the market, and makes both better off than in the monopolistic world where only one firm does all the R&D.

File these results under “patents can sometimes harm” and “someone tell the law & econ guys that informational frictions matter”.

http://www.sss.ias.edu/files/papers/econpaper25.pdf (2006 Working Paper. The final version is in RAND 2009. If you have access, read the RAND and not the working papers – this paper floated around for a decade or so before publication, and some early versions have a number of mathematical mistakes.)
