Category Archives: Theory of the Firm

“Organizations as Information Processing Systems,” R. Daft & R. Lengel (1983)

I don’t believe this paper is well known among economists, but it has been hugely influential in management and media studies. The theory in the paper is qualitative: it proceeds deductively much as economic theory does, but it is not mathematical. In this post, I’ll try to reinterpret the main ideas mathematically.

Firms face two primary types of uncertainty. First, the outside environment is uncertain. Second, the internal environment is uncertain. When speech is vague, a manager may misinterpret what the true state of the world is, or subordinates may misinterpret the goals of the organization. When speech is precise, it can be very costly to interpret. Indeed, precise speech about unclear goals is basically worthless: two subordinates may precisely state the answer to two different problems, both of which are different from what the manager wanted to know.

Choice of media, then, can vary. Sometimes speech within an organization is very formal: quantitative models, memos, and the like. Sometimes it is informal: face-to-face meetings, legends, company lore. Informal speech can discuss a broader set of ideas, but with greater ambiguity; formal speech can present specific ideas exactly, but nothing more. This tradeoff roughly implies the following: when the purpose of a discussion is equivocal or unclear, informal speech should be used to “get us on the same page”, while routine discussions can use precise speech. This has a number of implications. For example, informal communication will be most common at the goal-setting stage, or when two departments begin working together on a task, while formal communication will be most common within a division, after goals have been agreed upon by all parties, or when the external environment is less uncertain.

Clearly, the intersection of language and economics is far more general than this. For example, equivocality is often introduced on purpose: people speak vaguely so that common knowledge does not develop. An example, after a first date: “Would you like to come up to my apartment for some coffee?” Further, vague and precise speech are more than simply vague or precise; they are vague and precise in particular ways. Poetry is quoted rather than a meaningless stream of words, for example. Neither the authors nor I have much to say on these extensions, but it is definitely an open field right now for some interested researcher.

How might you model the ideas of the present paper mathematically? (Of course, you might ask why these ideas should be modeled mathematically at all, but I have discussed many times here why social science theory ought to be formal: to the extent that it is, the tools of mathematical logic allow the cleanest possible transmission of ideas and derivation of unexpected consequences. I won’t rehash those arguments here; indeed, the whole “should we be formal” discussion seems a bit too meta in the context of this post.) Let the relevant true state be a point in [0,1]^n. Let the cost of transmitting the exact state be increasing in its dimension, perhaps linearly, while the cost of transmitting imprecise information increases less than linearly, perhaps logarithmically. Imprecise states are interpreted by the receiver with error (something like a truncated normal distribution, to ensure we stay in [0,1]^n). The loss from the receiver’s final decision depends on its distance from the true state. What should a manager do? On simple decisions, where the relevant state is a single point on the line segment [0,1], learning the exact state is cheap, so subordinates should send the manager fairly precise information, like a statistical estimate in a memo. On complex decisions, where the relevant state is a point in the 100-dimensional hypercube [0,1]^100, learning the true state will be very expensive (it may require the manager to read a 1,000-page quantitative report, for instance), but learning an approximate state will be relatively cheap (it may involve some face-to-face conversations). Once the model is formalized like this, we can answer questions like “Should management communicate via a hierarchy or not?” I have some plans for work along these lines, using ideas about transmitting counterfactuals given a set of information partitions, and would definitely appreciate comments on how to model this type of media richness.
(Working paper)
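To make the tradeoff concrete, here is a minimal numerical sketch of the model above. All parameter values (the costs, the noise scale, the loss weight) are my own illustrative assumptions, not anything from the paper: precise transmission costs grow linearly in the dimension of the state, vague transmission costs grow logarithmically, and vague messages are decoded with clipped Gaussian error that feeds into a quadratic decision loss.

```python
import math
import random

random.seed(0)

# Illustrative parameters (my assumptions): precise transmission costs C_P
# per dimension; vague transmission costs C_V * log(1 + n); vague messages
# are decoded with Gaussian error of s.d. NOISE, clipped to stay in [0,1];
# decision loss is squared distance from the true state, scaled by LOSS_WEIGHT.
C_P, C_V, NOISE, LOSS_WEIGHT, TRIALS = 1.0, 1.0, 0.1, 50.0, 2000

def expected_vague_loss(n):
    """Monte Carlo estimate of the squared-distance decision loss when an
    n-dimensional state in [0,1]^n is communicated vaguely."""
    total = 0.0
    for _ in range(TRIALS):
        for _ in range(n):
            s = random.random()
            perceived = min(1.0, max(0.0, s + random.gauss(0.0, NOISE)))
            total += (perceived - s) ** 2
    return total / TRIALS

def total_cost(n, precise):
    """Communication cost plus expected decision loss."""
    if precise:
        return C_P * n  # exact state received: no decision loss
    return C_V * math.log(1 + n) + LOSS_WEIGHT * expected_vague_loss(n)

for n in (1, 100):
    print(n, total_cost(n, precise=True), total_cost(n, precise=False))
```

With these made-up numbers, the memo wins in one dimension and the face-to-face conversation wins in one hundred: the cost of precision grows like n, while the vague channel pays only log n in transmission plus a much flatter loss term, which is the memo-versus-meeting pattern described above.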

“Hierarchies and the Organization of Knowledge in Production,” L. Garicano (2000)

The organization of firms, principally but not always into hierarchies, is usually explained by reference to incentive problems. But that is not the whole story. Informational problems are also important: transmitting relevant information across the firm to the specific individuals who need it, and collecting knowledge at specific places within the firm. Focusing only on those issues, and ignoring incentives, what shape does the firm take?

Garicano considers the following model. A firm hires a group of homogeneous workers and faces a continuum of “problems” which workers will need to solve. If a worker knows the answer, he solves the problem himself; otherwise he asks someone else for the answer. After hiring but before production starts, workers are trained to solve certain problems, at a cost linear in the measure of solutions learned. When a solution is not evident and must be asked for, the receiver of the question bears a cost even if she does not know the answer to that question: time spent answering queries is time not spent producing output. A given worker has only a unit interval of time to spend working or answering questions. Problems arrive at the firm each period according to a known distribution F(Z), where Z is ordered so that the most common problems lie on the left side of the distribution.

In this simple model, one homogeneous group of production workers is trained to solve only the easiest problems, and other groups of workers are trained in successively harder ones. Production workers first try to solve a problem themselves, then ask the first level of non-production workers for solutions to problems they can’t solve, then the second level, and so on. This can be proven in four parts; there is nothing tricky. Essentially, divide all the workers into classes, where a class specifies what knowledge is learned and in what order other classes are asked about problems. First, show that only one class ever produces: if two groups produce output, and the output they produce is not the same, then workers in the less productive group should instead specialize in knowledge supporting the more productive group’s production. Second, knowledge never overlaps across classes, since overlapping knowledge is costly and never used. Third, production workers solve the easiest problems, the first level of management solves the slightly harder problems, and so on, with production workers asking levels of management in order until they learn the answer to a problem. If this weren’t true, we could swap an interval of the knowledge held by any two levels, keeping learning costs constant but letting easier questions be answered “earlier”, hence reducing communication costs. Fourth, the organization is a hierarchy with fewer workers at higher levels. That final result essentially comes from the fact that only the rarest problems are solved by workers at the top, hence not many of them are needed.
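The core learning-versus-asking tradeoff for a single layer can be sketched numerically. This is my simplification with illustrative parameters, not Garicano’s full model: a production worker learns the interval [0, z] of problems, and the mass of problems she cannot solve, 1 − F(z) = e^(−λz) under an exponential distribution, is passed up at a bundled communication cost k.

```python
import math

def per_worker_cost(z, c=1.0, k=5.0, lam=1.0):
    """Training cost (linear in the measure of knowledge z) plus the cost
    of passing the unsolved fraction exp(-lam*z) up the hierarchy.
    c, k, lam are illustrative parameters of my own choosing."""
    return c * z + k * math.exp(-lam * z)

def optimal_cutoff(c=1.0, k=5.0, lam=1.0):
    """First-order condition: c = k*lam*exp(-lam*z), so z* = ln(k*lam/c)/lam."""
    return math.log(k * lam / c) / lam

# A grid search over z agrees with the closed form (both near ln 5 ~ 1.609):
z_grid = min((i / 1000 for i in range(5001)), key=per_worker_cost)
print(z_grid, optimal_cutoff())
```

One comparative static falls right out: cheaper communication (lower k) shrinks the optimal cutoff, so production workers learn less and ask more, pushing knowledge up the hierarchy, in the spirit of the statics Garicano derives.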

Making assumptions about the distribution of problem difficulty, Garicano also derives a number of comparative statics for changes in learning and communication costs, and for production processes that become less predictable in the sense that the distribution of problems shifts rightward. The exact results rely partly on strong assumptions about the size of the organization, so I omit them here (essentially, everything is proven for a very large firm with exponentially distributed problem difficulty, though the intuition behind most of the results probably wouldn’t change were these assumptions loosened).

Three final thoughts, related to my current research. First, is getting knowledge to production workers really the most salient informational issue in firms? This seems backward. One might think that, to the extent firms are organized for knowledge aggregation and transmission reasons, the most important decisions are the ones faced by the boss/president/lead prosecutor, and it is she who uses information held by others in the firm to inform her decision. Second, isn’t contingency-related decisionmaking a relevant concern? Often, firms do not even know what problems will arise; generally, managers specialize in making decisions under those circumstances, solving problems whose existence was probably unknown when the firm began operation. Third, is intrafirm training, particularly at the manager level, that important in real firms? One might instead imagine that potential workers arrive at the firm endowed with certain knowledge, and are then placed into roles conditional on that knowledge. This isn’t to say that training doesn’t happen, but surely training paid for by the firm is not always the most common way relevant knowledge is acquired. (Final JPE version)

“How to Count to One Thousand,” J. Sobel (1992)

You have a stack of money, supposedly containing one thousand coins, and you want to make sure that count is accurate. However, with probability p you will make a mistake at each step of the counting, and you will know when you’ve made a mistake (“five hundred and twelve, five hundred and thirteen, five hundred and… wait, how many was I at?”). What is the optimal way to count the coins? And what does this have to do with economics?

The optimal way to count to one thousand turns out to be precisely what intuition tells you. Count a stack of coins, perhaps forty of them, set that stack aside, count another forty, set that aside, and so on, then count at the end to make sure you have twenty-five stacks. If your probability of making a mistake is very high, you may wish only to count ten coins at a time, set them aside, then count ten stacks of ten, setting those superstacks aside, then counting at the end to make sure you have ten stacks of one hundred. The higher the number of coins, and the higher your probability of making a mistake, the more “levels” you will need to build. Proving this is a rather straightforward dynamic programming exercise.
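The dynamic program is short enough to write down. One simplifying assumption of mine, flagged in the comments: I treat a possibly smaller last stack as full-sized, which slightly overstates costs but doesn’t change the qualitative picture.

```python
from functools import lru_cache

def expected_steps(s, p):
    """Expected counting steps to get s coins counted in a row when each
    step independently fails (you lose track and restart the stack) with
    probability p: the run-of-successes formula ((1-p)**-s - 1)/p."""
    q = 1.0 - p
    return (q ** (-s) - 1.0) / p

@lru_cache(maxsize=None)
def min_expected(n, p):
    """Minimum expected steps to count n objects: either count them in one
    go, or split into stacks of size s and recursively count the
    ceil(n/s) stacks.  (The possibly smaller last stack is treated as
    full-sized, a slight overestimate.)"""
    best = expected_steps(n, p)
    for s in range(2, n):
        stacks = -(-n // s)  # ceil(n/s)
        best = min(best, stacks * expected_steps(s, p) + min_expected(stacks, p))
    return best

# Subdividing is dramatically cheaper than counting 1000 coins straight through:
print(expected_steps(1000, 0.01), min_expected(1000, 0.01))
```

With p = 1%, counting a thousand coins in one unbroken run takes millions of expected steps, since a single run of a thousand error-free counts is astronomically unlikely; the optimal stacking scheme brings this down to a small multiple of a thousand.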

Imagine you’ve hired workers to perform these tasks. If tasks cannot be subdivided, the fastest workers should be assigned to count the first layer of stacks (since they will be repeating the task most often after mistakes are made) and the most accurate are assigned to do the later counts (since they “destroy more value” when a mistake is made, as in Kremer’s O-Ring paper). The counting process will suffer from decreasing returns to scale – the more coins to count, the more value is destroyed on average by a mistake. With optimal subdivision, the number of extra counts needed to make sure the number of stacks is accurate grows slower than the number of coins to be counted, and the optimal stack size is independent of the total number of coins, so counting technology has almost-constant returns to scale.

The basic idea here tells us something about the boundary and optimal organization of a firm, but in a very stylized way. If workers only imperfectly know when mistakes are made, the problem is more difficult, and is not solved by Sobel. Even if workers definitely do not know when a mistake is made, there can still be gains to subdividing. Sobel mentions a parable about prisoners told by Rubinstein. Two prisoners want to coordinate an escape 89 days from now, and both can see the sun out their windows. The odds that one of the two loses the day count over that long a stretch are quite high, causing a lack of coordination. If both prisoners can also see the moon, though, they need only count three full moons plus five days. (JSTOR gated version – I couldn’t find an ungated copy. Prof. Sobel, hire one of your students to put all of your old papers up on your website!)

“Secrets,” D. Ellsberg (2002)

Generally, the public won’t know even the most famous economists – mention Paul Samuelson to your non-economist friends and watch the blank stares – but a select few manage to enter the zeitgeist through something other than their research. Friedman had a weekly column and a TV series, Krugman is regularly in the New York Times, and Greenspan, Summers and Romer, among many others, are famous for their governmental work. These folks at least owe their fame to their economics, if not to their economic research. The truly rare trick is being both a famous economist and famous for something else entirely. I can think of two.

First is Paul Douglas, of the Cobb-Douglas production function. Douglas was a Chicago economist who went on to become a long-time U.S. Senator. MLK Jr. called Douglas “the greatest of all Senators” for his work on civil rights. In ’52, with Truman’s popularity at a nadir, Douglas was considered a prohibitive favorite for the Democratic nomination had he chosen to run. I think modern-day economists would very much like Douglas’ policies: he was a fiscally conservative, socially liberal reformer who supported Socialists, Democrats and Republicans at various times, generally preferring the least corrupt technocrat.

The other famous-for-non-economics economist, of course, is Daniel Ellsberg. Ellsberg is known to us for the Ellsberg Paradox, which in many ways has been more important than the work of Tversky and Kahneman in encouraging decision theorists to model deviations from expected utility. Ellsberg would have been a massive star had he stayed in econ: he finished his PhD in just a couple of years, published his undergrad thesis (“The Theory of the Reluctant Duelist”) in the AER and his PhD thesis in the QJE, and was elected to the Harvard Society of Fellows, joining Samuelson and Tobin in that still-elite group.

As with many of the “whiz kids” of the Kennedy and Johnson era, he consulted for the US government, both at RAND and as an assistant to the Undersecretary of Defense. Government was filled with theorists at the time – Ellsberg recounts meetings with Schelling and various cabinet members where game theoretic analyses were discussed. None of this made Ellsberg famous, however: he entered popular culture when he leaked the “Pentagon Papers” early in the Nixon presidency. These documents were a top secret, internal government report on presidential decisionmaking in Vietnam going back to Eisenhower, and showed a continuous pattern of deceit and overconfidence by presidents and their advisors.

Ellsberg’s description of why he leaked the documents, and of the consequences of doing so, is interesting in and of itself. But what interests me in this book – from the perspective of economic theory – is what the Pentagon Papers tell us about secrecy within organizations. Governments and firms regularly make decisions, as entities, where optimal decisionmaking depends on correctly aggregating information held by various employees and contractors. Standard mechanism design is actually very bad at dealing with desires for secrecy in this context. That is, imagine that I want to aggregate information but don’t want to tell my contractors what I’m going to use it for. A paper I’m currently working on says this goal is basically hopeless. A more complicated structure is one where a firm has multiple levels (in a hierarchy, let’s say), and the bosses want some group of low-level employees to take an action, but don’t want anyone outside the branch of the organizational tree containing those employees to know that the action was requested. How can the boss send the signal to the low-level employees without those employees thinking their immediate boss is undermining the CEO? Indeed, something like this problem is described in Ellsberg’s book: Nixon and Kissinger had low-level soldiers fake flight reports so that it would appear that American planes were not bombing Laos. The Secretary of Defense, Laird, did not support the policy, so Nixon and Kissinger wanted to keep it secret from him. The jig was up when a soldier on the ground contacted the Pentagon because he thought his immediate supervisors were bombing Laos against the wishes of Nixon!

In general, secrecy concerns make mechanism design problems harder because they can undermine the use of the revelation principle: we want the information transmitted without revealing our type. More on this to come. Also, if you can think of any other economists who are most famous for their non-economic work, like Douglas and Ellsberg, please post in the comments.

(No link – Secrets is a book and I don’t see it online. Amazon has a copy for just over 6 bucks right now, though).

“Who Will Monitor the Monitor?,” D. Rahman (2010)

In any organization, individuals can shirk by taking advantage of the fact that their actions are private; only a stochastic signal of effort can be observed, for instance. Because of this, firms and governments hire monitors to watch, imperfectly, what workers are doing, and to punish the workers if it is believed that they are acting contrary to what the bosses desire. Even if the monitor observes signals that are not available to the bosses, as long as that observation is free, the monitor has no incentive to lie. But what if monitoring is costly? How can we ensure the monitor has the right incentives to do his job? That is, who shall monitor the monitor? The answer, clearly, isn’t a third level of monitors, since this just pushes the problem back one more level.

In a very interesting new paper, David Rahman extends Holmstrom’s group incentives (Holmstrom, by the way, should share the next Nobel with Milgrom; it’s nuts that neither has won yet!). The idea of group incentives is simple, and it works when the monitor’s statements are verifiable. Say it costs 1 to monitor and the agent’s disutility from work is also 1. The principal doesn’t mind an equilibrium of (monitor, work), but better still would be (don’t monitor, work), since then no monitor need be paid to watch the workers. The worker will simply shirk if no one watches him, though. Group penalties fix this. Tell the monitor to check only one percent of the time. If he reports (verifiably) that the worker shirked, nobody gets paid. If he reports (verifiably) that the worker worked, the monitor gets $1.02 and the worker gets $100. By increasing the payment to the worker for “good news”, the firm can get arbitrarily close to the payoffs from the “never monitor, work” equilibrium.
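Here is the arithmetic of that example made explicit, under my stylized reading that payments are made only on a verified “worked” report, and with the worker’s wage nudged to $101 so that his incentive is strict rather than knife-edge (at exactly $100 and 1% monitoring he is indifferent):

```python
def contract_outcomes(m, w, pm, e=1.0, c=1.0):
    """Expected payoffs under the group-penalty contract, in my stylized
    reading: the monitor checks with probability m at cost c, and the
    payments (w to the worker, pm to the monitor) are made only on a
    verified report that the worker worked.  e is the worker's effort
    cost.  Returns (worker payoff if he works, if he shirks, monitor's
    payoff, principal's expected cost)."""
    worker_if_works = m * w - e   # paid w only when actually checked
    worker_if_shirks = 0.0        # caught whenever checked: no pay
    monitor = m * (pm - c)        # pm must slightly exceed the check cost c
    principal_cost = m * (w + pm)
    return worker_if_works, worker_if_shirks, monitor, principal_cost

# Check at m = 1% with the wage nudged to $101:
print(contract_outcomes(0.01, 101.0, 1.02))
# Rarer monitoring with a bigger prize drives cost toward the first-best e:
print(contract_outcomes(0.001, 1010.0, 1.02))
```

At 1% monitoring the principal’s expected cost is 0.01 × (101 + 1.02) = 1.0202; shrinking the monitoring probability while scaling the wage up drives it toward the first-best cost of 1, which is the sense in which the contract gets “arbitrarily close” to the never-monitor benchmark.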

That’s all well and good, but what about when the monitor’s reports are not verifiable? In that case, the monitor would never actually check, but would simply report that the worker worked, and the worker would always shirk. We can use the same idea as in Holmstrom, though, and sometimes ask the worker to shirk. Payments still involve group penalties, but the workers are paid only when the report matches the recommended action – that is, pay out for “monitor/shirk” and “monitor/work”. For the same reason as in the example above, the frequency of monitoring and of recommended shirking can both be made arbitrarily small, with the contract still incentive compatible (assuming risk neutrality, of course).

More generally, a nice use of the minimax theorem shows that we can check for deviations from the bosses’ recommended actions by the monitor and the agent one by one – that is, we needn’t check for all deviations simultaneously. So-called “detectable” deviations are shut down by contracts like the one in the example above. Undetectable deviations by the monitor still fulfill the monitoring role – being undetectable, the agent won’t notice the deviation either – but it turns out that finiteness of the action space is enough to save us from an infinite regress of profitable undetectable deviations, and therefore a strategy like the one in the example above does allow for “almost” optimal costly and unverifiable monitoring.

Two quick notes: First, collusion, as Rahman notes, can clearly take place in this model (each agent just tells the other when he is told to monitor or to shirk), so it really speaks only to situations where we don’t expect such collusion. Second, this model is quite nice because it clarifies, again, that monitoring power needn’t be vested in a principal. That is, the monitor here collects no residual profits or anything of that sort – he is merely a “security guard”. Separating the monitoring role of agents in a firm from the management role is particularly important when we talk about more complex organizational forms, and I think it’s clear that the question of how to do so is far from being completely answered. (WP – currently R&R at AER and presumably will wind up there…)

“Do Scientists Pay to be Scientists?,” S. Stern (2004)

Performing science as a part of the global scientific community – meaning choosing your own research agenda, publishing your results openly, and receiving prestige based on primacy of those results – is intuitively valuable to intelligent scientists. Who wouldn’t rather be a research scientist than some unheard-of engineer? But how valuable such freedom actually is turns out to be difficult to calculate. On the one hand, firms can pay scientists less if the scientists have a preference for science. On the other hand, if access to the scientific community is valuable to firms, and if potential scientists can partially expropriate that value from their firm, then simply comparing pay for research and non-research scientists will not capture the workers’ pure preference for the science job. Beyond this issue lie the standard selection problems: maybe research scientists are more capable than non-research scientists, or conversely, maybe non-research scientists take lower-paying jobs because the potential for career advancement is better than in research positions.

Scott Stern attempts to get around these issues by surveying PhD biologists on the job market. The biology job market generally produces multiple job offers per candidate, and many candidates have competing offers from research and non-research jobs, both in and out of academia. Controlling for individual fixed effects – this is where the multiple offers per candidate are needed – Stern finds the average worker takes a roughly 20% pay cut to work in a research job (one which permits publication of results). This result is robust to removing academic jobs from the sample, to considering only accepted jobs rather than all offers, and to an alternative specification in which workers are interviewed and asked to rank their job offers from best to worst along a number of dimensions (ability to follow their current research program, monetary rewards, etc.).

My intuition (perhaps incorrect, as I do not know of any empirical work on this topic) is that over time, more and more people are moving into jobs where the job itself provides utility at some level, rather than strict disutility; call this “reverse alienation”, perhaps? This has real relevance for policy. For instance, as Stern notes, to the extent that scientists are willing to produce innovation at lower wages when “the scientific community” or “publication priority” is the reward, strengthening IP or market incentives for innovation may be socially harmful. I think such effects are actually widespread in the labor market, and surely they should be no surprise to academic economists, nearly all of whom are taking huge discounts on what their non-research, or non-academic, salary would be. There are limits to this idea, of course. The 20th century made pretty clear that market incentives are often far stronger than culturally induced preferences; just look at the difference in Chinese crop output between the Cultural Revolution and the Deng era. Nonetheless… (NBER WP – final version published in Management Science)

“Discovering the Role of the Firm: The Separation Criterion and Corporate Law,” D. Spulber (2009)

(Note: After a Francophilic August hiatus, I’m back from the mountains and will be posting again roughly on a daily basis.)

There are a surprising number of very basic questions about the economic world for which the average undergraduate economics student will never encounter an explanation. For instance, why do businesses in the West sell goods at fixed prices, rather than through souq-style bargaining? Despite the Nobel-driven popularity of Oliver Williamson, I imagine that few students are given an explanation for the problem Daniel Spulber considers in this paper: why do firms exist?

Spulber’s “separation criterion” incorporates aspects of the neoclassical explanation (firms are the “owners” of a production technology, though here there is no reason why firms are entities, rather than informal groupings of individuals), the transaction cost explanation (firms, through repeated games, lower the transaction costs of producing goods; Williamson focuses particularly on the problem of hold-up when contracts cannot specify every possible contingency), and the contracts explanation (firms are a “nexus of contracts” for overcoming moral hazard and other informational problems when undertaking risky ventures). To Spulber, a firm is defined by the complete separation of owner goals and corporation goals. When this separation exists, firms will maximize profits, and owners will unanimously agree with that objective. For example, if an owner has the goal of eradicating malaria, he will, in theory, prefer the firm he owns to maximize profits, then will spend his residual claim on the firm’s profits supporting charities in line with his public health consumption goal. Many organizations are not firms. Government-run firms generally pursue goals other than profit maximization (that is, they may pursue social policy objectives which are “consumption goods” for the government, thus violating Fisher’s separation theorem). Equal share partnerships are not firms: the owner/managers will each only take actions if their share of the firm profits makes it worthwhile.

This article is published in a law and econ journal, so there is a short discussion of the legal implications of this theory of firms. In particular, policies that destroy the separation of owner and firm maximands can destroy useful efficiency properties. One such policy would be restrictions on selling shares of a firm by the owners. One final caveat: since this is a law journal, the style is very different from what we economists are used to: there is no mathematical model, many seemingly important claims are supported only through references to other papers, and there is extensive interpretive discussion. (Working Paper – Final version in Berkeley Business Law Journal 6.1-2 (2009), which was actually published in 2010 despite the officially listed year)
