Bayesian Persuasion - NAJ Economics


Bayesian Persuasion

Emir Kamenica and Matthew Gentzkow∗

University of Chicago

September 2009

Abstract

When is it possible for one person to persuade another to change her action? We take a mechanism design approach to this question. Taking preferences and initial beliefs as given, we introduce the notion of a persuasion mechanism: a game between Sender and Receiver defined by an information structure and a message technology. We derive necessary and sufficient conditions for the existence of a persuasion mechanism that strictly benefits Sender. We characterize the optimal mechanism. Finally, we analyze several examples that illustrate the applicability of our results.

JEL classification: D83, K41, L15, M37
Keywords: strategic communication, disclosure, signalling

∗ We thank Richard Holden for many important contributions to this paper. We would also like to thank Eric Budish, Navin Kartik, Canice Prendergast, Maxwell Stinchcombe, Lars Stole and participants at seminars at University of Mannheim, Duke/Northwestern/Texas IO Theory Conference, Stanford GSB, Simon Fraser University, University of British Columbia, and University of Chicago. This work is supported by the Initiative on Global Markets, the George J. Stigler Center for the Study of the Economy and the State, the James S. Kemper Foundation Faculty Research Fund, the Centel Foundation / Robert P. Reuss Faculty Research Fund, and the Neubauer Family Foundation, all at the University of Chicago Booth School of Business. E-mail: [email protected]; [email protected]

1 Introduction

Suppose one person, call him Sender, wishes to persuade another, call her Receiver, to change her action. If Receiver is a rational Bayesian, can Sender persuade her to take an action he would prefer over the action she was originally going to take? If Receiver understands that Sender chose what information to convey with the intent of manipulating her action for his own benefit, can Sender still gain from persuasion? If so, what is the optimal way to persuade? These questions are of substantial economic importance.

As McCloskey and Klamer (1995) emphasize, attempts at persuasion command a sizeable share of our resources. Persuasion, as we will define it below, plays an important role in advertising, courts, lobbying, financial disclosure, and political campaigns, among many other economic activities. Consider the example of a prosecutor trying to convince a judge that a defendant is guilty. When the defendant is indeed guilty, revealing the facts of the case will tend to help the prosecutor's case. When the defendant is innocent, revealing facts will tend to hurt the prosecutor's case. Can the prosecutor structure his arguments, selection of evidence, etc. so as to increase the probability of conviction by a rational judge on average?

Perhaps surprisingly, the answer to this question is yes. Bayes' Law restricts the expectation of posterior beliefs but puts no other constraints on their distribution. Therefore, so long as the judge's action is not linear in her beliefs, the prosecutor may benefit from persuasion.

To make this concrete, suppose the judge (Receiver) must choose one of two actions: to acquit or convict a defendant. There are two states of the world: the defendant is either guilty or innocent. The judge gets utility 1 for choosing the just action (convict when guilty and acquit when innocent) and utility 0 for choosing the unjust action (convict when innocent and acquit when guilty). The prosecutor (Sender) gets utility 1 if the judge convicts and utility 0 if the judge acquits, regardless of the state. The prosecutor and the judge share a prior belief Pr (guilty) = 0.3.

The prosecutor conducts an investigation and is required by law to report its full outcome. We can think of the choice of the investigation as consisting of the decisions on whom to subpoena, what forensic tests to conduct, what questions to ask an expert witness, etc. We formalize an investigation as distributions π (·|guilty) and π (·|innocent) on some set of signal realizations. The prosecutor chooses π and must honestly report the signal realization to the judge. Importantly, we assume that the prosecutor can choose any π whatsoever, i.e., that the space of possible investigations is arbitrarily rich.

If there is no communication (or, equivalently, if π is completely uninformative), the judge always acquits because guilt is less likely than innocence under her prior. If the prosecutor chooses a fully informative investigation, one that leaves no uncertainty about the state, the judge convicts 30 percent of the time. The prosecutor can do better, however. His uniquely optimal investigation is a binary signal:

π (i|innocent) = 4/7    π (i|guilty) = 0
π (g|innocent) = 3/7    π (g|guilty) = 1.    (1)

This leads the judge to convict with probability 60 percent. Note that the judge knows that 70 percent of defendants are innocent, yet she convicts 60 percent of all defendants! She does so even though she is fully aware that the investigation was designed to maximize the probability of conviction.

In this paper, we study the general problem of persuading a rational agent. Our approach follows the literature on mechanism design. We consider a setting with an arbitrary state space and action space, and with arbitrary state-dependent preferences for both Sender and Receiver. We introduce a broad class of "persuasion mechanisms" that encompasses cheap talk games (e.g., Crawford and Sobel 1982), persuasion games (e.g., Milgrom and Roberts 1986), and signalling games (e.g., Spence 1973), among many others.
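The arithmetic behind the example is easy to verify. The following script (our own illustrative sketch, not part of the paper) applies Bayes' rule to the investigation in equation (1) and recovers the 60 percent conviction probability:

```python
from fractions import Fraction

# Prior and the optimal binary investigation from equation (1)
prior_guilty = Fraction(3, 10)
pi = {  # pi[signal][state] = probability of that signal given the state
    "i": {"guilty": Fraction(0), "innocent": Fraction(4, 7)},
    "g": {"guilty": Fraction(1), "innocent": Fraction(3, 7)},
}

for s in ("i", "g"):
    # Unconditional probability of observing signal s
    p_s = (pi[s]["guilty"] * prior_guilty
           + pi[s]["innocent"] * (1 - prior_guilty))
    # Judge's posterior that the defendant is guilty, by Bayes' rule
    posterior = pi[s]["guilty"] * prior_guilty / p_s
    # The judge convicts iff Pr(guilty | s) >= 1/2
    action = "convict" if posterior >= Fraction(1, 2) else "acquit"
    print(s, p_s, posterior, action)
# i 2/5 0 acquit
# g 3/5 1/2 convict
```

The signal g arrives with probability 3/5 and leaves the judge exactly at the posterior 1/2 at which she is willing to convict, so conviction occurs with probability 0.6 even though the prior probability of guilt is only 0.3.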

The key distinguishing feature of a persuasion mechanism is that Sender can affect Receiver's action only by changing Receiver's beliefs. We do not allow Sender to make transfers or affect Receiver's payoffs in any way. In contrast to most other papers on strategic communication, we allow for mechanisms where Sender can fully commit on two counts: to fully disclose all he knows and to limit the extent of his private information. Given this definition, we focus on two questions: (i) when does there exist a persuasion mechanism that strictly benefits Sender, and (ii) what is an optimal mechanism from Sender's perspective?

We begin by establishing some results that simplify our analysis. We show that, without loss of generality, we can restrict attention to mechanisms where Sender learns a recommended action for Receiver, reports it truthfully, and then Receiver chooses the recommended action. In the example above, we can think of i as a recommendation to acquit and g as a recommendation to convict. We then show that we can re-express the problem of choosing such a mechanism as a search over distributions of posteriors subject to the constraint that the expected posterior is equal to the prior.

When does there exist a persuasion mechanism that strictly benefits Sender? Consider why the prosecutor in the example benefits from the opportunity to provide information to the judge. Since the judge is rational, providing information must sometimes make her more convinced and sometimes less convinced that the defendant is guilty. The former will strictly improve the prosecutor's payoff if the information is strong enough to induce conviction. The latter, however, will not reduce the prosecutor's payoff, since the judge already acquits the defendant by default. The net effect is to increase the prosecutor's payoff in expectation. We show that in general Sender benefits from persuasion whenever (i) Receiver does not take Sender's preferred action by default (in a sense we make precise below) and (ii) Receiver's action is constant in some neighborhood of beliefs around the prior. When these conditions hold, Sender can benefit by sending a signal that induces a better action with positive probability and balances this with a worse belief that leaves Receiver's action unchanged. We also show that whether Sender benefits from persuasion depends in a natural way on the concavity or convexity of Sender's payoff as a function of Receiver's beliefs.

We next turn to studying optimal mechanisms. We use tools from convex analysis to show that an optimal mechanism exists and to characterize it for any given set of preferences and initial beliefs. We show that no disclosure of information is optimal when Sender's payoff is concave in Receiver's beliefs, and full disclosure is optimal when Sender's payoff is convex in Receiver's beliefs. We also establish that an optimal mechanism need never induce more actions in equilibrium than there are states.

We then generalize three important properties of the optimal mechanism in the example above.
Notice, first, that when the judge chooses the prosecutor’s least-preferred action (acquit), she is certain of the state. That is, she never acquits guilty defendants. Otherwise, we would have π (i|guilty) > 0. But then the prosecutor could increase his payoff by decreasing π (i|guilty) and increasing π (g|guilty); this would strictly increase the probability of g and would only increase the willingness of the judge to convict when she sees g. We establish that, in general, whenever Receiver takes Sender’s least-preferred action, she knows with certainty that the state is one where this action is optimal.


Second, notice that when the judge convicts, she is exactly indifferent between convicting and acquitting. If she strictly preferred to convict upon seeing g, the prosecutor could increase his payoff by slightly decreasing π (i|innocent) and increasing π (g|innocent); this would increase the probability of g and leave the judge's optimal action given the message unchanged, thus increasing the probability of conviction. We show that, in general, whenever Receiver has an interior posterior, she is effectively indifferent between two actions.

Finally, notice that because the prosecutor's payoff is (weakly) increasing in the judge's posterior belief that the state is guilty, it is meaningful to talk about beliefs that place more weight on innocent as being "worse" from the prosecutor's perspective. A different way to look at the last two results is that the prosecutor chooses an investigation that induces the worst possible belief consistent with a given action by the judge: certainty of innocence when the action is acquit, and indifference when the action is convict. We show that, in general, when Sender's payoffs are monotonic in Receiver's beliefs, Sender typically induces the worst belief consistent with a given action.

We next apply our results to three examples. Our first example examines what type of feedback a university should provide to an assistant professor whose research effort depends on her beliefs about the chance that she will get tenure. The second example studies how preference disagreement between Sender and Receiver impacts information transmission under an optimal mechanism. Lastly, we analyze the optimal structure of informative advertisements in a setting with unit demand. These examples illustrate both the breadth of situations captured by our model and the practical applicability of our propositions.
Finally, we discuss extensions of our results to dynamic mechanisms, incomplete information on the part of Receiver, multiple Receivers, multiple Senders, limited messaging technologies, and limited commitment.

The observation that Bayesian updating only restricts the expectation of posteriors has been made before and has been utilized in a variety of contexts.1 The work most closely related to our paper is Brocas and Carrillo (2007). They analyze the gain to Sender from controlling the flow of public information in a setting with a binary state space and information that consists of a sequence of symmetric binary signals. Lewis and Sappington (1994) and Johnson and Myatt (2006) consider how much information a monopolist would want to provide to his potential customers. Carrillo and Mariotti (2000), Bodner and Prelec (2003), and Bénabou and Tirole (2002, 2003, 2004) employ a form of Bayesian persuasion to study self-signaling and self-regulation. Caillaud and Tirole (2007) rely on a similar mechanism to study persuasion in group settings. Lazear (2006) applies a closely related intuition to examine when providing information about a test increases learning. In contrast to these papers, we derive results that apply to arbitrary state spaces, information structures, preferences, and initial beliefs.2

This paper also relates to a broader literature on optimal information structures. Prendergast (1992) studies the assignment of individuals into groups (and the resulting information about their types) when individuals are risk-averse over the realization of their type. Ostrovsky and Schwarz (2008) examine the equilibrium design of grade transcripts (and the resulting information about quality of students) when schools compete to place their students in good jobs. Rayo and Segal's (2008) concurrent work characterizes the optimal disclosure policy under specific assumptions about preferences and about Receiver's outside option.

Our results also contribute to the literature on contract theory. An important aspect of our setting is that Receiver's action is not contractible. Most work in contract theory examines two remedies for such non-contractibility: payment for outcomes correlated with the action (e.g., Holmstrom 1979, Grossman and Hart 1983) and suitable allocation of property rights (e.g., Grossman and Hart 1986, Hart and Moore 1990). Our results highlight another instrument for implementing a second-best outcome, namely the control of the agent's informational environment.3 Our example on how to optimally structure the midterm review of tenure-track faculty so as to induce second-best effort illustrates this interpretation of our results.

Finally, past work has studied related questions in contexts where Receivers are not perfect Bayesians (Mullainathan, Schwartzstein, and Shleifer 2008; Ettinger and Jehiel forthcoming).4 While persuasive activities may reflect such failures of rationality, assessing the relevant evidence requires a more complete understanding of when and how persuading a fully rational Bayesian is possible.

1 The formal methods employed in our analysis are very close to Aumann and Maschler's (1995) analysis of repeated games of incomplete information. They study the value to a player of knowing which game is being played when the other player lacks this knowledge, a fixed zero-sum game is repeated ad infinitum, players maximize their long-run non-discounted average payoffs, and payoffs are not observed. The fact that the informed player's initial actions have no impact on his long-run average payoffs (and can thus be treated as just a signal), combined with a focus on Nash equilibria (which implicitly allow for commitment), makes Aumann and Maschler's problem mathematically analogous to ours.
2 Glazer and Rubinstein (2004, 2006) study related problems where the communication technology effectively limits the set of signals Sender can convey. They focus on Receiver's part of the problem, however, and their approach differs markedly from that in all of the aforementioned papers.
3 Taub (1997) analyzes the impact of information provision on incentives in a dynamic framework.

2 A model of persuasion

Receiver has a continuous utility function u (a, ω) that depends on her action a ∈ A and the state of the world ω ∈ Ω. Sender has a continuous utility function v (a, ω) that depends on Receiver's action and the state of the world. Sender and Receiver share a prior µ0 ∈ int (∆ (Ω)).5 Let a∗ (µ) be the set of actions that maximize Receiver's expected utility given that her belief is µ. We assume that there are at least two actions in A and that for any action a there exists a µ s.t. a∗ (µ) = {a}. The action space A is compact and the state space Ω is finite. The latter assumption is mainly for ease of exposition: Appendix B demonstrates that our central characterization result extends to the case where Ω is any compact metric space.

A special case of particular interest is where ω is a real-valued random variable, Receiver's action depends only on the expectation Eµ [ω], rather than the entire distribution µ, and Sender's preferences over Receiver's actions do not depend on ω. This holds, for example, if u (a, ω) = − (a − ω)² and v (a, ω) = a. When these conditions are satisfied, we will say that payoffs depend only on the expected state.

We define a persuasion mechanism (π, c) to be a combination of a signal and a message technology. Sender's private signal π consists of a finite realization space S and a family of distributions {π (·|ω)}ω∈Ω over S. A message technology c consists of a finite message space M and a family of functions c (·|s) : M → R̄+; c (m|s) denotes the cost to Sender of sending message m after receiving signal realization s.6 The assumptions that S and M are finite are without loss of generality (cf. Proposition 9) and are used solely for notational convenience.

A persuasion mechanism defines a game. The timing is as follows. First, nature selects ω from Ω according to µ0. Neither Sender nor Receiver observes nature's move. Then, Sender privately observes a realization s ∈ S from π (·|ω) and chooses a message m ∈ M. Finally, Receiver observes m and chooses an action a ∈ A. Sender's payoff is v (a, ω) − c (m|s) and Receiver's payoff is u (a, ω). We represent Sender's and Receiver's (possibly stochastic) strategies by σ and ρ, respectively. We use µ (ω|m) to denote Receiver's posterior belief that the state is ω after observing m.

A perfect Bayesian equilibrium of a persuasion mechanism is a triplet (σ∗, ρ∗, µ∗) satisfying the usual conditions. We also apply an additional equilibrium selection criterion: we focus on Sender-preferred equilibria, i.e., equilibria where the expectation of v (a, ω) − c (m|s) is greatest. The focus on Sender-preferred equilibria provides a consistent comparison across mechanisms, which prevents us from generating benefits of persuasion simply through equilibrium selection. Moreover, this particular comparison, unlike, say, comparing equilibria worst for Sender, ensures the existence of an optimal mechanism (cf. proof of Proposition 7). In the remainder of the paper, we use the term "equilibrium" to mean a Sender-preferred perfect Bayesian equilibrium of a persuasion mechanism.

Motivated by this definition of equilibria, we let â (µ) denote an element of a∗ (µ) that maximizes Sender's expected utility at belief µ. If there is more than one such action, we let â (µ) be an arbitrary element from this set.7 We refer to â (µ0) as the default action. We define the value of a mechanism to be the equilibrium expectation of v (a, ω) − c (m|s). The gain from a mechanism is the difference between its value and the equilibrium expectation of v (a, ω) when Receiver obtains no information. Sender benefits from persuasion if there is a mechanism with a strictly positive gain. A mechanism is optimal if no other mechanism has higher value.

4 Cain, Loewenstein, and Moore (2005) provide experimental results on susceptibility to persuasion.
5 int (X) denotes the interior of set X and ∆ (X) the set of all probability distributions on X.
6 R̄+ denotes the affinely extended non-negative real numbers: R̄+ = R+ ∪ {∞}. Allowing c to take on the value of ∞ is useful for characterizing the cases where Sender cannot lie and cases where he must reveal all his information.

2.1 Varieties of Persuasion Mechanisms

A few examples help clarify the varieties of games that are captured by the definition of a persuasion mechanism. If π is perfectly informative and c is constant, the mechanism is a cheap talk game as in Crawford and Sobel (1982). If π is arbitrary and c is constant, the mechanism coincides with the information-transmission game of Green and Stokey (2007). If π is perfectly informative and c (m|s) = (m − s)², the mechanism is a communication game with lying costs as developed in Kartik (forthcoming). If π is perfectly informative, M = P (Ω), and

c (m|s) = { 0 if s ∈ m; ∞ if s ∉ m },

the mechanism is a persuasion game as in Grossman (1981) and Milgrom (1981).8 If π is perfectly informative, M = R+, and c (m|s) = m/s, the mechanism is Spence's (1973) education signalling game. The model can also be easily re-interpreted to allow for multiple receivers (as in Lewis and Sappington 1994), for Receiver to be uncertain about what information Sender has (as in Shin 2003), or for Sender to have discretion over which costly information to acquire (as in Jovanovic 1982). We consider extensions to our model in Section 7 below, where we also discuss in more detail what types of games our definition rules out.

7 This allows us to use convenient notation such as v (â (µ), ω).

2.2 Honest Mechanisms

A particularly important type of persuasion mechanism is one where M = S and

c (m|s) = { k if s = m; ∞ if s ≠ m }

for some k ∈ R+. We call such mechanisms honest. In contrast to Grossman (1981) and Milgrom (1981), where Sender is simply not allowed to tell an explicit lie, under an honest mechanism Sender must tell the truth, the whole truth, and nothing but the truth. In other words, he has committed to fully disclose all his private information. Much of our analysis depends on allowing for the possibility that Sender can commit in this way.

Note that an honest mechanism can be interpreted as either (i) a choice of a signal π on S given an honest messaging technology c (m|s) = { k if s = m; ∞ if s ≠ m }, or (ii) a disclosure rule π : Ω → ∆ (S), as in Rayo and Segal (2008). While these two interpretations are formally equivalent, one or the other may be more natural in particular settings.

Examples where the first interpretation makes sense include a prosecutor requesting a forensic test or a firm conducting a public celebrity taste test against a rival product. In these settings, Sender has an ability to commit on two counts: he can choose to remain imperfectly informed and he can commit to fully disclose anything he learns. The latter commitment may arise either because Receiver directly observes the revelation of the information (as in the taste test example) or through an institutional structure that requires Sender to always report all the tests he has conducted and what their outcomes were (as in the prosecutor example).

Examples where the second interpretation is more natural include a school or a rating agency choosing a coarse grading policy. Here, Sender might be unable to avoid learning the full information about the state but can plausibly commit to an ex ante, potentially stochastic, disclosure rule that is not fully revealing.

8 P (X) denotes the set of all subsets of X.
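As a concrete illustration of the second interpretation, the sketch below (our own hypothetical example; the numbers are not from the paper) implements a stochastic disclosure rule π : Ω → ∆ (S) for a school that pools ability levels into two grades, and computes the posteriors that the coarse grades induce:

```python
from fractions import Fraction

# A school learns a student's ability omega in {low, mid, high} but commits
# to a coarse, stochastic disclosure rule with grades S = {"B", "A"}.
prior = {"low": Fraction(1, 3), "mid": Fraction(1, 3), "high": Fraction(1, 3)}
pi = {  # pi[omega][grade] = probability of the grade given the ability
    "low":  {"B": Fraction(1),    "A": Fraction(0)},
    "mid":  {"B": Fraction(1, 2), "A": Fraction(1, 2)},  # mid is pooled both ways
    "high": {"B": Fraction(0),    "A": Fraction(1)},
}

def posterior(grade):
    """Receiver's Bayesian posterior over abilities after observing the grade."""
    p_grade = sum(pi[w][grade] * prior[w] for w in prior)
    return {w: pi[w][grade] * prior[w] / p_grade for w in prior}

print(posterior("A"))  # weight only on mid (1/3) and high (2/3)
print(posterior("B"))  # weight only on low (2/3) and mid (1/3)
```

Averaging the two posteriors by the grade probabilities (each grade arrives with probability 1/2 here) recovers the uniform prior, as consistency with Bayes' rule requires.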

3 Simplifying the Problem

The class of persuasion mechanisms defined above is large, including cases where Sender's strategy might involve complex messages, signaling, lying, and so on. In this section, we show that to determine whether Sender benefits from persuasion and what an optimal mechanism is, it suffices to consider a much simpler problem.

The key intuition is the following. An equilibrium of any persuasion mechanism induces a particular distribution of Receiver's beliefs. This distribution of beliefs in turn determines a distribution over Receiver's actions. From Sender's perspective, any equilibrium that induces the same distribution of actions conditional on states must have the same value. To determine whether there exists a persuasion mechanism with some given value, therefore, it is sufficient to ask whether there exists a distribution of Receiver's beliefs that is compatible with Bayes' rule and generates expected utility for Sender equal to that value.

Let a distribution of posteriors τ be a distribution on ∆ (Ω). A persuasion mechanism induces τ if there exists an equilibrium of the mechanism such that Supp (τ) = {µm}m∈M and

(i) µm (·) = µ∗ (·|m)
(ii) τ (µm) = ∑ω∈Ω ∑s∈S σ∗ (m|s) π (s|ω) µ0 (ω).

A belief µ is induced by a mechanism if τ is induced by the mechanism and τ (µ) > 0. A distribution of posteriors is Bayes-plausible if the expected posterior probability of each state equals its prior probability:

∫ µ dτ (µ) = µ0.

Bayesian rationality requires that any equilibrium distribution of Receiver's beliefs be Bayes-plausible. Our first Proposition below shows that this is the only restriction imposed by Bayesian rationality. That is, for any Bayes-plausible distribution of posteriors there is a persuasion mechanism that induces this distribution in equilibrium.

Now, let

v̂ (µ) ≡ ∑ω∈Ω v (â (µ), ω) µ (ω).

This denotes Sender's expected utility when both he and Receiver hold belief µ. Sender's utility (gross of messaging costs) in any mechanism which induces τ is simply the expectation of v̂ under τ, Eτ v̂ (µ).

At terminal nodes of the game, Sender and Receiver may hold different beliefs, say µS and µR. For example, Sender may have observed a highly informative signal but chosen to send a message that reveals no information. Sender's payoff at such a node is neither v̂ (µS) nor v̂ (µR), but rather ∑ω∈Ω v (â (µR), ω) µS (ω). A more obvious statement, then, would have been that Sender's utility in a mechanism is the expectation of ∑ω∈Ω v (â (µR), ω) µS (ω) over the joint distribution of Sender's and Receiver's beliefs.

What allows us to collapse this potentially complicated expression to Eτ v̂ (µ) is the following observation. Because Receiver's beliefs satisfy the equilibrium condition, it must be the case that, from the ex ante perspective, before he has obtained any private information, Sender's belief conditional on learning that Receiver will have belief µ must also be µ. Hence, his ex ante expected utility from inducing µ is v̂ (µ).

Another reason why we do not need to worry about the joint distribution of Sender's and Receiver's beliefs is that we can restrict our attention, without loss of generality, to honest mechanisms where Sender's and Receiver's beliefs always coincide. In fact, we can restrict our attention even further, to a particular type of honest mechanism. Say that a mechanism is straightforward if it is honest, S ⊂ A, and Receiver's equilibrium action equals the message. In other words, in straightforward mechanisms, the signal produces a "recommended action" for Receiver, Sender reports the recommendation honestly, and Receiver takes the action recommended. Given a distribution of actions induced by any mechanism, there exists a straightforward mechanism that induces the same distribution of actions. This result is closely analogous to the revelation principle (e.g., Myerson 1979). Of course, the revelation principle applies to problems where players' information is a given, while our problem is that of designing the informational environment.

This leads us to the proposition that greatly simplifies our problem.

Proposition 1 The following are equivalent:
1. There exists a persuasion mechanism with value v∗;
2. There exists a straightforward mechanism with value v∗;
3. There exists a Bayes-plausible distribution of posteriors τ such that Eτ v̂ (µ) = v∗.

Detailed proofs of all propositions are in Appendix A. We sketch the basic argument here. That (2) implies (1) and (3) is immediate. To see that (1) implies (2), let α (·|ω) be the distribution of actions in an equilibrium of any mechanism. Consider the honest mechanism with S = A and π (a|ω) = α (a|ω). We need to show that in an equilibrium of this mechanism

ρ∗ (a|m) = { 1 if a = m; 0 if a ≠ m }.

This follows from two observations: (i) the belief induced by sending message a is a convex combination of beliefs that induced a in the original equilibrium; (ii) if an action is optimal for a set of beliefs, it is optimal for a belief that is in the convex hull of that set. Finally, that (3) implies (1) is equivalent to the claim that Bayes-plausibility is the only restriction on the equilibrium distribution of posteriors. This part of our argument is closely related to Shmaya and Yariv's (2009) concurrent work that identifies which sequences of distributions of posteriors are consistent with Bayesian rationality. Given any Bayes-plausible τ, let S index Supp (τ) and consider a signal

π (s|ω) = µs (ω) τ (µs) / µ0 (ω).

The honest mechanism with signal π induces τ.
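This construction can be checked numerically. The sketch below (our own illustration) starts from the Bayes-plausible τ of the motivating example, builds the signal π (s|ω) = µs (ω) τ (µs) / µ0 (ω), and verifies that the honest mechanism with this signal induces τ; reassuringly, it recovers exactly the investigation in equation (1):

```python
from fractions import Fraction

# Two states; identify a belief mu with the probability of state "guilty".
mu0 = {"guilty": Fraction(3, 10), "innocent": Fraction(7, 10)}

# A Bayes-plausible tau: posteriors 0 and 1/2 with weights 2/5 and 3/5.
tau = {Fraction(0): Fraction(2, 5), Fraction(1, 2): Fraction(3, 5)}
assert sum(mu * t for mu, t in tau.items()) == mu0["guilty"]  # Bayes-plausibility

def mu_s(mu, w):
    """The posterior mu expressed as a distribution over states."""
    return mu if w == "guilty" else 1 - mu

# The signal from the proof sketch: pi(s|w) = mu_s(w) * tau(mu_s) / mu0(w)
pi = {mu: {w: mu_s(mu, w) * t / mu0[w] for w in mu0} for mu, t in tau.items()}
assert pi[Fraction(0)]["innocent"] == Fraction(4, 7)  # matches equation (1)
assert pi[Fraction(1, 2)]["guilty"] == 1

# Verify that the honest mechanism with signal pi induces tau.
for mu, t in tau.items():
    p_signal = sum(pi[mu][w] * mu0[w] for w in mu0)      # Pr(signal mu)
    post = pi[mu]["guilty"] * mu0["guilty"] / p_signal   # Bayes posterior
    assert p_signal == t and post == mu
print("tau is induced by the constructed signal")
```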

The key implication of Proposition 1 is that to evaluate whether Sender benefits from persuasion and to determine the value of an optimal mechanism we need only ask how Eτ v̂ (µ) varies over the space of Bayes-plausible distributions of posteriors.

Corollary 1 Sender benefits from persuasion if and only if there exists a Bayes-plausible distribution of posteriors such that Eτ v̂ (µ) > v̂ (µ0). The value of an optimal mechanism is

max_τ Eτ v̂ (µ)   s.t.   ∫ µ dτ (µ) = µ0.

Note that Corollary 1 does not by itself tell us that an optimal mechanism exists. As we will show later, however, this is indeed always the case.

Proposition 1 implies that we can restrict our attention to mechanisms where Sender is compelled to report all he knows truthfully. Consequently, none of our results depend on the interpretation of v as Sender's utility. We could let v denote a social welfare function, for example. Our results would then identify socially-optimal rather than Sender-optimal mechanisms. Or, we could let v denote a weighted combination of Sender's and Receiver's utilities resulting from ex ante bargaining over which mechanism to use. Throughout the paper we will refer to v as Sender's utility, but one should keep in mind that our results apply for any objective function over Receiver's action and the state.

We introduce a final definition that will be useful in the analysis that follows. Let V be the concave closure of v̂:

V (µ) ≡ sup {z | (µ, z) ∈ co (v̂)},

where co (v̂) denotes the convex hull of the graph of v̂. Note that V is concave by construction. In fact, it is the smallest concave function which is everywhere weakly greater than v̂.9 Figure 1 shows an example of the construction of V. In this figure, as in all figures in the paper, we identify a distribution µ with a point in Rn−1, where n is the number of states. So in Figure 1, if Ω = {ωL, ωR}, the µ on the x-axis is the probability of one of the states, say ωL. Specifying this probability of course uniquely pins down the distribution µ.

To see why V is a useful construct, observe that if (µ0, z) ∈ co (v̂), then there exists a distribution of posteriors τ such that Eτ µ = µ0 and Eτ v̂ (µ) = z. Thus, by Proposition 1, co (v̂) is the set of (µ, z) such that if the prior is µ, there exists a mechanism with value z. Hence, V (µ) is the largest payoff Sender can achieve with any mechanism when the prior is µ.

9 Our definition of concave closure is closely related to the notion of a biconjugate function in convex analysis (Hiriart-Urruty and Lemaréchal 2004). Note that we can alternatively express V as

V (µ) = inf_{s,r} {⟨s, µ⟩ − r | ⟨s, µ′⟩ − r ≥ v̂ (µ′) ∀µ′},

where ⟨·, ·⟩ denotes inner product. Hence, V is a "concave version" of the (convex) biconjugate function defined by

v̂∗∗ (µ) ≡ sup_{s,r} {⟨s, µ⟩ − r | ⟨s, µ′⟩ − r ≤ v̂ (µ′) ∀µ′}.

Specifically, V = −((−v̂)∗∗). Aumann and Maschler (1995) refer to V as the concavification of v̂.

[Figure 1: an illustration of concave closure. The figure plots v̂ (µ), its convex hull co (v̂), and the concave closure V (µ) against µ.]

Corollary 2 The value of an optimal mechanism is V (µ0). Sender benefits from persuasion if and only if V (µ0) > v̂ (µ0).

Figure 2 shows the function v̂, the optimal mechanism, and the concave closure V in the motivating example from the introduction. In the figure, µ denotes the probability that the state is guilty. As panel (a) shows, v̂ is a step function: the prosecutor's expected payoff is 0 whenever µ is less than 0.5 (since the judge will choose acquit) and 1 whenever µ is greater than or equal to 0.5 (since the judge will choose convict). As panel (b) shows, the optimal signal induces two posterior beliefs. When the judge observes i, her posterior belief is µ = 0 and v̂ (0) = 0. When the judge observes g, her posterior belief is µ = 0.5 and v̂ (0.5) = 1. The distribution τ over these beliefs places probability 0.4 on µ = 0 and probability 0.6 on µ = 0.5. Hence, the prosecutor's expected utility is Eτ v̂ (µ) = 0.6. The distribution τ is Bayes-plausible since µ0 = 0.3 = 0.4 (0) + 0.6 (0.5). As panel (c) shows, the concave closure V is equal to 2µ when µ ≤ 0.5 and constant at 1 when µ > 0.5. It is clear that V (µ0) > v̂ (µ0) and that the value of the optimal mechanism is V (µ0).
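The concave closure can also be computed numerically: evaluate v̂ on a grid of beliefs and take the upper concave envelope of the points (µ, v̂ (µ)). The sketch below (our own, using the step-function v̂ of the example) recovers V (0.3) = 0.6:

```python
def concave_closure(points):
    """Vertices of the upper concave envelope of (mu, value) points."""
    hull = []
    for p in sorted(points):
        # Pop the last vertex while it lies on or below the chord to p,
        # so that consecutive hull slopes are strictly decreasing (concavity).
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (p[1] - y1) >= (p[0] - x1) * (y2 - y1):
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def V(mu, hull):
    """Evaluate the envelope at mu by linear interpolation between vertices."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= mu <= x2:
            return y1 + (y2 - y1) * (mu - x1) / (x2 - x1)
    raise ValueError("mu outside the grid")

# v_hat from the motivating example: 0 below 0.5, 1 at or above 0.5
pts = [(i / 1000, 0.0 if i / 1000 < 0.5 else 1.0) for i in range(1001)]
hull = concave_closure(pts)
print(hull)          # [(0.0, 0.0), (0.5, 1.0), (1.0, 1.0)]
print(V(0.3, hull))  # 0.6: the value of the optimal mechanism at mu0 = 0.3
```

The hull vertices reproduce panel (c): V rises linearly as 2µ up to µ = 0.5 and is flat at 1 thereafter.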

4  When does Sender benefit from persuasion?

Corollary 2 provides a necessary and sufficient condition for Sender to benefit from persuasion in terms of the concave closure V. In any problem where we can graph the function v̂, it is straightforward to construct V and determine the prior beliefs, if any, at which V(µ0) > v̂(µ0). In Figure 3, for example, Sender benefits from persuasion for any µ0 ∈ (µ_l, µ_h), and does not benefit from persuasion for any µ0 ≤ µ_l or µ0 ≥ µ_h. In Figure 2, Sender benefits from persuasion for any µ0 ∈ (0, .5), i.e., at any prior belief at which the judge does not convict by default. In this section, we characterize the conditions under which V(µ0) > v̂(µ0) holds, first in terms of the properties of v̂, and then in terms of the primitives of our model, namely Sender's and Receiver's preferences and initial beliefs.

[Figure 2: the motivating example. Panels: (a) function v̂, (b) the optimal mechanism, with E_τ[v̂(µ)] = 0.6 at µ0 = 0.3, (c) function V, with V(µ0) = 0.6.]

Corollaries 1 and 2 tell us that Sender benefits from persuasion if and only if there exists a τ such that

  E_τ[v̂(µ)] > v̂(E_τ[µ]).

Whether this is the case is naturally tied to the concavity or convexity of v̂. Note that since v̂ is not necessarily differentiable, we cannot speak of its convexity or concavity "at a point." The analogue, for a potentially non-differentiable function, of being weakly convex everywhere and strictly convex somewhere is that it be convex and not concave.

Proposition 2 If v̂ is concave, Sender does not benefit from persuasion for any prior. If v̂ is convex and not concave, Sender benefits from persuasion for every prior.

Observe that in the simple case where Sender's payoff does not depend on the state, v̂(µ) = v(â(µ)). The concavity or convexity of v̂ then depends on just two things: whether Receiver's action â(µ) is concave or convex in µ, and whether Sender's payoff v(a) is concave or convex in a.


[Figure 3: an illustration with an arbitrary v̂, showing v̂(µ0), V(µ0), and the beliefs µ_l, µ0, µ_h.]

If both â and v are concave, Sender does not benefit from persuasion. If both â and v are convex and at least one of them is not concave, Sender benefits from persuasion.

Consider a simple example where u(a, ω) = −(a − ω)² and v(a, ω) = a. In this case, both â and v are linear, and hence concave, so Sender does not benefit from persuasion. Moreover, since v̂(µ) = E_µ[ω] is linear in µ, Sender is completely indifferent about the information Receiver obtains. This is not because he does not care about Receiver's belief: his utility is strictly increasing in the expectation of µ. Rather, it is because he knows that in any informational environment, Receiver's beliefs will on average equal the prior, so his expected utility from Receiver's action is fixed at E_µ0[ω].^{10}

There is also a sense in which the concavity or convexity of v̂ depends on the extent to which Sender's and Receiver's preferences are aligned. At one extreme, if Sender's and Receiver's preferences are perfectly aligned (v = u), we have v̂(µ) ≡ max_a Σ_ω u(a, ω) µ(ω). Since the maximand is linear in µ, v̂ is convex. Moreover, since â(µ) is not constant, v̂ is not concave. Hence, Sender benefits from persuasion. By the same logic, if Sender's and Receiver's preferences are perfectly misaligned (v = −u), v̂ is concave and Sender can never benefit from persuasion. In Subsection 6.2, we explore alignment of preferences in more detail.

^{10} This does not mean Sender is indifferent across all mechanisms. Some mechanisms, such as a signalling game, would induce messaging costs in equilibrium and thus lead to a lower overall utility.
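The martingale logic in this passage (posterior beliefs average back to the prior under any signal) can be illustrated directly. The particular two-signal experiment below is our own choice, not the paper's:

```python
# Sketch of the property driving Sender's indifference here: under any
# signal, posteriors average back to the prior, so when v_hat(mu) = E_mu[omega]
# Sender's expected payoff is fixed at E_mu0[omega] regardless of the signal.

mu0 = 0.3                                  # prior Pr(omega = 1), states {0, 1}
pi = {(0, 'l'): 0.9, (0, 'h'): 0.1,        # Pr(signal | omega = 0)
      (1, 'l'): 0.2, (1, 'h'): 0.8}        # Pr(signal | omega = 1)

# Marginal probability of each signal, and the posterior it induces.
p = {s: (1 - mu0) * pi[(0, s)] + mu0 * pi[(1, s)] for s in ('l', 'h')}
post = {s: mu0 * pi[(1, s)] / p[s] for s in ('l', 'h')}

avg_posterior = sum(p[s] * post[s] for s in ('l', 'h'))
assert abs(avg_posterior - mu0) < 1e-12    # posteriors average to the prior
```

Changing the signal probabilities changes the induced posteriors, but never their mean; with v̂(µ) = E_µ[ω] linear, the expected payoff is therefore unaffected.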


Often, v̂ will be neither convex nor concave. This is true, for example, in our motivating example, as shown in Figure 2. As we discussed earlier, the fact that Sender benefits from persuasion in that example hinges on (i) the fact that Receiver does not take Sender's preferred action by default, and (ii) the fact that Receiver's action is constant in a neighborhood around the prior. We now show that these two conditions, suitably generalized, play a crucial role more broadly. Specifically, the generalization of (i) is necessary, while generalizations of (i) and (ii) are jointly sufficient, for Sender to benefit from persuasion.

To generalize (i), say there is information Sender would share if ∃µ s.t.

  v̂(µ) > Σ_ω v(â(µ0), ω) µ(ω).     (2)

In other words, there must exist a µ such that, if Sender had private information that led him to believe µ, he would prefer to share this information with Receiver rather than have Receiver act based on µ0. Note that when v does not depend on ω, there is information Sender would share as long as the default action is not dominant, i.e., v(â(µ0)) < v(a) for some a ∈ A. This is the sense in which equation (2) generalizes condition (i). When there is no information Sender would share, Sender cannot benefit from persuasion since there is no informative message he would ever wish Receiver to see.

Proposition 3 If there is no information Sender would share, Sender does not benefit from persuasion.

Now, to generalize (ii), we say Receiver's preference is discrete at belief µ if Receiver's expected utility from her preferred action â(µ) is bounded away from her expected utility from any other action, i.e., if there is an ε > 0 s.t. ∀a ≠ â(µ),

  Σ_ω u(â(µ), ω) µ(ω) > Σ_ω u(a, ω) µ(ω) + ε.

The following Proposition is the main result of this section; it demonstrates that the generalizations of (i) and (ii) are sufficient for Sender to benefit from persuasion.
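Both generalized conditions are easy to check computationally in the motivating example. The encoding below (state 1 = guilty, Sender-preferred tie-breaking at indifference) is our own sketch:

```python
# Checking conditions (i) and (ii) in the motivating example.
# States: 0 = innocent, 1 = guilty; actions: 'acquit', 'convict'.
u = {('acquit', 0): 1, ('acquit', 1): 0, ('convict', 0): 0, ('convict', 1): 1}
v = {('acquit', 0): 0, ('acquit', 1): 0, ('convict', 0): 1, ('convict', 1): 1}

def best_action(mu):
    # Judge's preferred action at belief mu = Pr(guilty); ties broken in
    # Sender's favor, as in the paper's Sender-preferred equilibria.
    eu = {a: (1 - mu) * u[(a, 0)] + mu * u[(a, 1)] for a in ('acquit', 'convict')}
    return 'convict' if eu['convict'] >= eu['acquit'] else 'acquit'

def v_hat(mu):
    a = best_action(mu)
    return (1 - mu) * v[(a, 0)] + mu * v[(a, 1)]

mu0 = 0.3
default = best_action(mu0)                       # 'acquit'

# (i) there is information Sender would share:
#     v_hat(mu) > E_mu[v(a_hat(mu0), omega)] for some mu (check on a grid)
shareable = any(
    v_hat(mu) > (1 - mu) * v[(default, 0)] + mu * v[(default, 1)]
    for mu in [i / 100 for i in range(101)]
)
# (ii) discreteness at the prior: the judge strictly prefers acquit at mu0.
assert default == 'acquit' and shareable
```

With both conditions satisfied, Proposition 4 delivers the gain from persuasion that we computed directly above.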


Proposition 4 If there is information Sender would share and Receiver's preference is discrete at the prior, Sender benefits from persuasion.

The intuition for the proof is as follows. First, because there is information that Sender would share, we can find a belief µ_h such that v̂(µ_h) > Σ_ω v(â(µ0), ω) µ_h(ω). Second, the discreteness of Receiver's preference implies that there is a belief near the prior, say µ_l, such that â(µ_l) is equal to Receiver's default action and µ0 is on the segment between µ_l and µ_h. That mixing µ_l and µ_h produces a strictly positive gain is obvious in a case like the motivating example where Sender's payoff v does not depend on the state. The argument is more subtle when v does depend on the state. The key observation is that for any given action by Receiver, Sender's utility is linear in µ. In particular, Σ_ω v(â(µ0), ω) µ(ω) is linear in µ. This implies that mixing µ_l with µ_h yields a strictly positive gain.

Proposition 4 is particularly useful when the action space is finite. In that case, Receiver's preference is generically discrete at the prior, in the sense that the set of beliefs at which Receiver's preference is not discrete has Lebesgue measure zero in ∆(Ω). To see why this is the case, note that with a finite action space, Receiver's preference can be non-discrete at µ only if Receiver is exactly indifferent between two distinct actions at µ. Such indifference is a knife-edge case that generically does not hold at the prior.

Proposition 5 If A is finite, Receiver's preference is discrete at the prior generically.

The key implication of Proposition 5 is that when the action space is finite, we should expect Sender to benefit from persuasion if and only if there is information Sender would share. Note, however, that this result is not meant to suggest that there is some form of discontinuity in Sender's benefit from persuasion as we move from large finite choice sets to infinite ones. As the action space becomes large, the gain from persuasion may become arbitrarily small.

We now turn to the case where payoffs depend only on the expected state. We have shown that when v̂ can be graphed, inspection of the graph can show directly whether Sender benefits from persuasion. Remember, however, that the domain of v̂ is ∆(Ω). This means that it is only possible to easily depict v̂ when there are two or three states. When there are more states, our Propositions still apply, but one cannot approach the problem by simply studying the graph of v̂.

When payoffs depend only on the expected state, however, a natural conjecture is that we could learn about Sender's gain from persuasion by graphing Sender's expected payoff as a function of the expected state E_µ[ω] rather than as a function of µ directly. If so, we would have a simple two-dimensional representation of this subclass of problems regardless of the size of the state space. When payoffs depend only on the expected state, there exists a ṽ : R → R such that ṽ(E_µ[ω]) = v̂(µ). Let Ṽ be the concave closure of ṽ. The following proposition shows that the conjecture above is correct: we can determine whether Sender benefits from persuasion simply by inspecting Ṽ and ṽ. In Section 6 below, we provide an example of how this result can greatly simplify the analysis of problems with a large state space.

Proposition 6 Sender benefits from persuasion if and only if Ṽ(E_µ0[ω]) > ṽ(E_µ0[ω]).

To see that Ṽ(E_µ0[ω]) ≤ ṽ(E_µ0[ω]) implies Sender cannot benefit from persuasion, we need only note that for any Bayes-plausible τ,

  E_τ[v̂(µ)] = E_τ[ṽ(E_µ[ω])] ≤ Ṽ(E_τ[E_µ[ω]]) = Ṽ(E_µ0[ω]).

The proof of the converse is more involved. If Ṽ(E_µ0[ω]) > ṽ(E_µ0[ω]), we know there is a τ s.t. E_τ[E_µ[ω]] = E_µ0[ω] and E_τ[v̂(µ)] > v̂(µ0). If this τ were Bayes-plausible, we could construct an honest mechanism that induces it and we would be done. The trouble is that E_τ[E_µ[ω]] = E_µ0[ω] does not guarantee that τ is Bayes-plausible. To construct a persuasion mechanism with a strictly positive gain, we show that it is always possible to find a belief µ′ such that E_µ′[ω] = E_µ0[ω] and a Bayes-plausible τ′ that is a mixture of τ and µ′. Since E_τ[v̂(µ)] > v̂(µ0) and v̂(µ′) = v̂(µ0), we know that E_τ′[v̂(µ)] > v̂(µ0).^{11}

^{11} As the detailed proof in Appendix A shows, we can also establish a result somewhat stronger than Proposition 6. Suppose there exists a linear T : ∆(Ω) → R^k and a ṽ : R^k → R s.t. v̂(µ) = ṽ(Tµ). Then, Sender benefits from persuasion if and only if ṽ is below its concave closure at Tµ0. We focus on the case where Tµ = E_µ[ω] only so we can simplify the exposition of the result.

5  Optimal mechanisms

Corollary 2 shows that the value of an optimal mechanism is V(µ0). When the problem is simple enough that it is possible to graph v̂ and V, we can read V(µ0) and the gain V(µ0) − v̂(µ0) off the graph directly. Figure 3 illustrates this for the arbitrary v̂ shown earlier in Figure 1. Similarly, in Figure 2, we can easily see that the value of the optimal mechanism in our motivating example must be V(µ0) = .6, and the gain is V(µ0) − v̂(µ0) = .6 − 0 = .6.

The graph of v̂ and V also identifies the beliefs induced by the optimal mechanism: these are the points on v̂ whose convex combination yields value V(µ0). In Figure 3, we see that the optimal mechanism induces beliefs µ_l and µ_h. It is clear from this figure that the optimal mechanism is unique, since there are no beliefs other than µ_l and µ_h that could be combined to produce value V(µ0). Similarly, we can easily see in Figure 2 that the optimal mechanism must induce beliefs µ = 0 and µ = .5.

From here, it is straightforward to construct the optimal mechanism. The condition that ∫ µ dτ(µ) = µ0 allows us to compute the distribution τ. For the motivating example in Figure 2, we must have 0·τ(0) + .5·τ(.5) = .3, so τ(0) = .4 and τ(.5) = .6. Following the proof of Proposition 1, we then set:

  π(s|ω) = µ_s(ω) τ(µ_s) / µ0(ω),

  c(m|s) = 0 if s = m, and c(m|s) = ∞ if s ≠ m.

For the motivating example, this yields the π in Equation 1.

These steps rely on being able to visualize the graphs of v̂ and V, which is only possible when the number of states is small. More generally, v̂ is upper semicontinuous^{12}, which implies that any element of the graph of V can be expressed as a convex combination of elements of the graph of v̂.^{13} In particular, there exists a Bayes-plausible distribution of posteriors τ such that E_τ[v̂(µ)] = V(µ0). From τ and the prior µ0, we can construct π(s|ω) and c(m|s) from the expressions above. Then, (π, c) is an optimal mechanism. Thus:

Proposition 7 An optimal mechanism exists.

We first characterize the optimal mechanism in terms of the convexity or concavity of v̂.

^{12} This is a consequence of our focus on Sender-preferred equilibria. We show this in Lemma 2.
^{13} We establish this implication in Lemma 4.
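As a sketch of this construction (the dictionary encoding of states and signals is our own), the displayed formula recovers the signal for the motivating example:

```python
# Recovering the signal pi(s | omega) from the optimal distribution of
# posteriors, via pi(s | omega) = mu_s(omega) * tau(mu_s) / mu0(omega).

mu0 = {'innocent': 0.7, 'guilty': 0.3}            # prior
tau = {'i': 0.4, 'g': 0.6}                        # probability of each posterior
posteriors = {                                    # mu_s(omega)
    'i': {'innocent': 1.0, 'guilty': 0.0},
    'g': {'innocent': 0.5, 'guilty': 0.5},
}

pi = {(s, w): posteriors[s][w] * tau[s] / mu0[w]
      for s in tau for w in mu0}

# Consistent with the signal derived in the text: guilt is always reported
# as g, while an innocent defendant draws g with probability 3/7.
assert abs(pi[('g', 'guilty')] - 1.0) < 1e-9
assert abs(pi[('g', 'innocent')] - 3 / 7) < 1e-9
assert abs(pi[('i', 'innocent')] - 4 / 7) < 1e-9
assert abs(pi[('i', 'guilty')]) < 1e-9
```

One can also confirm that applying Bayes' rule to this π reproduces the two posteriors, closing the loop on the construction.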


Say that no disclosure is optimal if µ*(·|m) = µ0 for all m sent in equilibrium of an optimal mechanism.^{14} If, at the other extreme, µ*(·|m) is degenerate for all m sent in equilibrium of an optimal mechanism, say that full disclosure is optimal.^{15} Finally, if µ*(·|m) is at the boundary of ∆(Ω) for all m sent in equilibrium of an optimal mechanism, we say that strong disclosure is optimal.

Proposition 8 For any prior,
• if v̂ is (strictly) concave, no disclosure is (uniquely) optimal;
• if v̂ is (strictly) convex, full disclosure is (uniquely) optimal;
• if v̂ is convex and not concave, strong disclosure is uniquely optimal.

The first two parts of Proposition 8 follow directly from the definition of convexity and concavity. To see why the third part holds, first note that for any µ induced by an optimal mechanism, it must be the case that V(µ) = v̂(µ). This observation will also be quite useful for several other Propositions. Now, if v̂ is not concave, there is some belief µ_l s.t. V(µ_l) > v̂(µ_l). If an optimal mechanism induces an interior belief, this belief can be expressed as a convex combination of µ_l and some other belief. But then, V(µ) = v̂(µ) and V(µ_l) > v̂(µ_l), coupled with the concavity of V, imply that v̂ cannot be convex.

Next, we show that in an optimal mechanism, the number of actions Sender needs to induce in equilibrium is limited by the number of states.^{16}

Proposition 9 There exists an optimal straightforward mechanism in which the number of actions Receiver takes with positive probability in equilibrium is at most |Ω|.

Note, first, that this proposition is clearly true for our motivating example, where there are two states and the optimal mechanism induces exactly two actions. It is also true for the example of Figure 3, where there are also two states and the optimal signal induces two beliefs µ_l and µ_h; since we have assumed no belief induces more than one action in equilibrium, the optimal mechanism induces no more than two actions.

^{14} Note that in contrast to Sender, who can be hurt by revelation of information, Receiver is always made weakly better off by any mechanism.
^{15} We say µ is degenerate if there is an ω s.t. µ(ω) = 1.
^{16} In fact, as the proof of the proposition shows, we establish a somewhat stronger result: there exists an optimal straightforward mechanism that induces a τ whose support has cardinality no greater than that of Ω.

Formally, the proof of Proposition 9 hinges on showing that any point on the graph of V can be written as a convex combination of at most |Ω| points on the graph of vˆ. This is closely related to Caratheodory’s theorem, which shows that if a point p is in the convex hull of a set of points P in RN , it is possible to write p as a convex combination of N + 1 or fewer of the points in P . This would be sufficient to prove a version of Proposition 9 where the bound is |Ω| + 1 rather than |Ω|. To prove that only |Ω| points are required, we rely on a related result from convex analysis.17 The intuition for Proposition 9 is best seen graphically. Referring back to Figure 3, it is easy to see that when there are two states, V will always be made up of points on vˆ and line segments whose endpoints are on vˆ. This means we can write any point on the graph of V as a convex combination of at most two points of the graph of vˆ. More generally, V will always be made up of points on vˆ and the (|Ω| − 1)-dimensional faces whose vertices are on vˆ. Any point on such a face can be written as a convex combination of at most |Ω| points of vˆ. The bound provided by Proposition 9 is tight - one can easily construct examples where the optimal mechanism cannot be achieved with a signal that has fewer than |Ω| possible realizations. For instance, whenever full disclosure is uniquely optimal, no mechanism with |S| < |Ω| can be optimal. It is also worthwhile to note that this is a property specifically of optimal mechanisms. One can construct an example in which there exists a mechanism with value v ∗ but there is no mechanism with |Ω| or fewer actions induced in equilibrium that also gives value v ∗ . Proposition 9, however, implies that such a v ∗ must be strictly less than the value of an optimal mechanism. 
We now turn to generalizing three features of the optimal mechanism in our motivating example: (i) whenever the judge chooses the prosecutor’s least-preferred action (acquit), the judge is certain of the state; (ii) whenever the judge chooses an action that is not the prosecutor’s least-preferred (convict), she is indifferent between that action and a worse one; (iii) the prosecutor always induces the worst belief consistent with a given action by the judge. To generalize (i), we show that if there exists an action a which is Sender’s least-preferred regardless of the state, then at any µ induced by an optimal mechanism that leads Receiver to choose a, Receiver is certain that a is her optimal action. 17 Caratheodory’s Theorem guarantees that given any mechanism with value v ∗ (optimal or otherwise), there is a mechanism that induces no more than |Ω| + 1 actions which also yields v ∗ . Fenchel and Bunt’s strengthening of Caratheodory’s theorem implies that given an optimal mechanism, we need no more than |Ω| actions.

22

Proposition 10 Suppose that v(a̲, ω) < v(a, ω) for all ω and all a ≠ a̲. Suppose that µ is induced by an optimal mechanism and â(µ) = a̲. Then, for any ω s.t. {a̲} ≠ arg max_a u(a, ω), we have µ(ω) = 0.

Intuitively, suppose that in a straightforward mechanism there is a belief which induces a̲ but puts positive probability on a state ω where a̲ is not optimal. For this to be true, Sender must report message a̲ with positive probability in state ω. Sender's payoff would be strictly higher in a mechanism that simply revealed that the true state is ω in all such cases, because this would induce Receiver to choose an action Sender likes strictly better than a̲ and would leave Receiver's actions given all other messages unchanged.

To generalize (ii), we establish that at any interior µ induced by an optimal mechanism, Receiver's preference for â(µ) cannot be discrete. To show this result, it is necessary to rule out a pathological case where there can be a default action such that there is no information Sender would share, yet there is some other action which yields exactly the same utility to Sender as the default action.

Assumption 1 There exists no action a s.t. (i) ∀µ, v̂(µ) ≤ Σ_ω v(a, ω) µ(ω), and (ii) ∃µ s.t. a ≠ â(µ) and v̂(µ) = Σ_ω v(a, ω) µ(ω).

Since this assumption rules out only a particular type of indifference, we conjecture that the set of Sender's and Receiver's preferences that violate Assumption 1 has Lebesgue measure zero.

Proposition 11 Suppose Assumption 1 holds.

If Sender benefits from persuasion, Receiver's preference is not discrete at any interior µ induced by an optimal mechanism.

For an intuition for this result, note that if Receiver's preference at µ were discrete, then if µ were the prior, Sender could benefit from persuasion unless there was no information Sender would share (by Proposition 4). So, since we know V(µ) = v̂(µ), it must be the case that if µ were the prior, there would be no information Sender would share. That would mean that â(µ) is an extremely desirable action for Sender, so he would want to maximize the chance of inducing it. But, if Receiver's preference at µ were discrete, there would be another belief, close to µ, which would yield the same action and which Sender could induce more often than µ. Hence the Proposition above must hold.

The fact that Receiver's preference at µ is not discrete means that, in at least one direction, Receiver's action changes at µ. Note that Sender must weakly dislike at least one such change; otherwise he would be better off providing more information. Moreover, when the action space is finite, Proposition 11 implies that at any interior belief induced by an optimal mechanism, Receiver is indifferent between two actions. Hence, in that case, the optimal mechanism necessarily brings Receiver as close as possible to taking some action less desirable than the equilibrium one.

To generalize (iii), say that v̂ is monotonic if for any µ, µ′, v̂(γµ + (1 − γ)µ′) is monotonic in γ. When v̂ is monotonic in µ, it is meaningful to think about beliefs that are better or worse from Sender's perspective. The simplest definition would be that µ is worse than µ′ if v̂(µ) ≤ v̂(µ′). Note, however, that because v(a, ω) depends on ω directly, whether µ is worse in this sense depends both on how Receiver's action changes at µ and on how µ affects Sender's expected utility directly. It turns out that for our result we need a definition of worse that isolates the way beliefs affect Receiver's actions. When v̂ is monotonic, there is a rational relation ≽ on A defined by a ≽ a′ if v̂(µ) ≥ v̂(µ′) whenever a = â(µ) and a′ = â(µ′). This relation on A implies a partial order ▷ on ∆(Ω): say that µ ▷ µ′ if

  E_µ[u(a, ω)] − E_µ[u(a′, ω)] > E_µ′[u(a, ω)] − E_µ′[u(a′, ω)]

for any a ≽ a′. In other words, a belief is higher in this partial order if it makes better actions (from Sender's perspective) more desirable for Receiver. The order is partial since a belief might make both a better and a worse action more desirable for Receiver. We say that µ is a worst belief inducing â(µ) if there is no µ′ ◁ µ s.t. â(µ) = â(µ′). We then have the following:

Proposition 12 Suppose Assumption 1 holds. If v̂ is monotonic, A is finite, and Sender benefits from persuasion, then for any interior belief µ induced by an optimal mechanism either: (i) µ is a worst belief inducing â(µ), or (ii) both Sender and Receiver are indifferent between two actions at µ.
We have already discussed the basic intuition behind this Proposition: the expected posterior must equal the prior; hence more undesirable beliefs that induce a given action increase the probability of beliefs that induce a more desirable action. Proposition 12 is the reason why, in our initial example, when the judge convicts she is barely willing to do so. Proposition 12 shows that the force behind this result applies more broadly. Case (ii) is necessary to deal with the possibility of a belief which could be interpreted both as a worst and a best belief inducing â(µ): if both Sender and Receiver are indifferent between, say, a and a′ at µ, the choice of â(µ) from {a, a′} is entirely arbitrary. However, for generic preferences, there will be no belief at which both Sender and Receiver are indifferent between two actions.

Finally, we consider the case where payoffs depend only on the expected state.

Recall that we established earlier that in this case there is a function ṽ s.t. ṽ(E_µ[ω]) = v̂(µ), and that Sender benefits from persuasion if and only if Ṽ(E_µ0[ω]) > ṽ(E_µ0[ω]). We might conjecture that the value of an optimal mechanism is Ṽ(E_µ0[ω]). This conjecture turns out to be false. Recall from the discussion of Proposition 6 that even though we know there is always a τ s.t. E_τ[E_µ[ω]] = E_µ0[ω] and E_τ[v̂(µ)] = Ṽ(E_µ0[ω]), such a τ need not be Bayes-plausible. In order to show that Sender could benefit from persuasion, we had to mix the beliefs in the support of τ with another belief µ′ such that E_µ′[ω] = E_µ0[ω]. This reduces the value of the mechanism strictly below Ṽ(E_µ0[ω]).

For a concrete example, suppose that A = [0, 1], Ω = {−1, 0, 1}, u(a, ω) = −(a − ω)², and v(a, ω) = a². In this case, v̂(µ) = ṽ(E_µ[ω]) = (E_µ[ω])². Hence, Ṽ is constant at 1 and in particular Ṽ(E_µ0[ω]) = 1. Yet, whenever the prior puts any weight on ω = 0, the value of any mechanism is strictly less than 1. Specifically, suppose that the prior µ0 is (ε/2, 1 − ε, ε/2). In that case, when ε is close to 0, the value of an optimal mechanism is close to 0. Hence, when payoffs depend only on the expected state, we can use ṽ to determine whether Sender benefits from persuasion, but not to determine an optimal mechanism or its value. To do that, we need to analyze v̂ directly or derive the properties of an optimal mechanism from Propositions 10, 11, and 12.
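A short numerical illustration of why Ṽ overstates the attainable value in this example (the encoding is our own; the bound E_τ[(E_µ[ω])²] ≤ E_µ0[ω²] follows from Jensen's inequality and Bayes plausibility):

```python
# Counterexample sketch: Omega = {-1, 0, 1}, u = -(a - omega)^2, v = a^2,
# so v_hat(mu) = (E_mu[omega])^2 and V-tilde is constant at 1, while the
# attainable value shrinks with eps.

eps = 0.01
prior = {-1: eps / 2, 0: 1 - eps, 1: eps / 2}

# For any Bayes-plausible tau:
#   E_tau[(E_mu[omega])^2] <= E_tau[E_mu[omega^2]] = E_mu0[omega^2] = eps,
# and full disclosure attains this bound.
upper_bound = sum(p * w ** 2 for w, p in prior.items())

assert abs(upper_bound - eps) < 1e-12
assert upper_bound < 1        # far below V-tilde(E_mu0[omega]) = 1
```

As ε → 0 the bound, and hence the value of an optimal mechanism, goes to 0, even though Ṽ(E_µ0[ω]) = 1 throughout.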


6  Examples

In this section we develop several examples that are meant to demonstrate the breadth of settings captured by our model and illustrate the ways in which the results developed in the previous two sections can be applied.

6.1  Tenure-track midterm review

We begin with an example that illustrates the applicability of our results to problems from contract theory. We examine what type of information a university should provide to an assistant professor whose willingness to exert effort depends on her likelihood of getting tenure. This setting is one where structuring information provision may be a particularly useful way to induce effort, because other instruments are less potent than usual. Paying the professor based on the quality of her research may be infeasible as this quality is likely to be non-verifiable, while the institutional structure of universities makes it difficult to motivate untenured faculty by suitably allocating property rights.

An assistant professor chooses an effort level a ∈ [0, 1] to exert before coming up for tenure. There are two types of individuals denoted by ω ∈ {1, 2}. The university and the professor share a prior µ0 over the professor's type. The quality of research produced by an individual of type ω who exerts effort a is aω. At the end of the tenure clock, the individual is tenured if the quality of her research is above some cutoff level, say 3/2.^{18} If she is tenured, she receives utility aω − a², receiving the recognition for the quality of her research (aω) but suffering a disutility a² from her effort. If she is not tenured, she leaves academia and receives no recognition for her research but still suffers the sunk cost of effort, i.e., her utility is −a². The university wants to maximize the expected quality of the research produced by the professor. It conducts a midterm review which results in a signal π : {1, 2} → ∆(S).

^{18} Note that such a rule is feasible, even if quality of research is non-verifiable, if the university wants to give tenure when the quality exceeds this threshold.

What type of midterm review process maximizes the university's objective? We begin by computing v̂, denoting Pr(ω = 2) by µ. Simple algebra reveals that the professor's

optimal effort is:

  â(µ) = 0     if µ < 3/8
         3/4   if 3/8 ≤ µ < 3/4
         µ     if µ ≥ 3/4.

Hence, the university's expected utility given the professor's belief is:

  v̂(µ) = 0             if µ < 3/8
         (3/4)(1 + µ)   if 3/8 ≤ µ < 3/4
         µ + µ²         if µ ≥ 3/4.
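The "simple algebra" behind these piecewise expressions can be checked by brute force. The grid search below is our own sketch, not the paper's derivation:

```python
# Brute-force check of the professor's best response and the university's
# payoff v_hat in the midterm-review example.

def professor_utility(a, mu):
    # Type omega in {1, 2} with Pr(omega = 2) = mu; tenure iff a * omega >= 3/2.
    out = 0.0
    for omega, p in ((1, 1 - mu), (2, mu)):
        tenured = a * omega >= 1.5
        out += p * ((a * omega - a ** 2) if tenured else -a ** 2)
    return out

def best_effort(mu):
    # Grid search over effort levels in [0, 1].
    grid = [i / 10000 for i in range(10001)]
    return max(grid, key=lambda a: professor_utility(a, mu))

# Matches the piecewise a_hat: 0 below 3/8, 3/4 on [3/8, 3/4), mu above 3/4.
for mu, expected in ((0.2, 0.0), (0.5, 0.75), (0.9, 0.9)):
    assert abs(best_effort(mu) - expected) < 1e-3

def v_hat(mu):
    a = best_effort(mu)
    return a * ((1 - mu) * 1 + mu * 2)       # expected research quality E[a * omega]

assert abs(v_hat(0.5) - 0.75 * 1.5) < 1e-2   # (3/4)(1 + mu) at mu = 0.5
```

The same grid search also confirms the discontinuity at µ = 3/8, where effort jumps from 0 to 3/4.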

What do our Propositions tell us about this example? Propositions 2 and 8 do not apply since v̂ is neither convex nor concave. Also, since there is information the university would share, Proposition 3 does not rule out the possibility that the university might benefit from persuasion. In fact, Proposition 4 directly tells us that, at least if µ0 < 3/8, the university will benefit from persuasion. Proposition 9 tells us that the optimal midterm review need not induce more than two distinct effort levels. Finally, Proposition 10 implies that whenever the professor exerts no effort, she is completely certain that she is a low type. The reason for this feature of the optimal review is that any posterior weight on the high type is "wasted" when the posterior is below 3/8: that weight is still insufficient to lift effort above zero, and yet it reduces the probability of a more favorable, effort-inducing opinion the professor can have about herself.

Because of the simplicity of the state space in this example, we can also easily depict v̂ and its concave closure (Figure 4). The figure makes it clear that the university generically benefits from persuasion.^{19} Moreover, it shows that the optimal structure of the performance review depends on the prior. When the initial probability that the professor is a high type is below 3/8, the optimal review induces posteriors µ = 0 and µ = 3/8.

When the prior is above 3/8, the optimal review induces µ = 3/8 and µ = 1. Note that regardless of what the prior is, it is never optimal to fully reveal the type. When the prior is below 3/8, revealing that the type is high with certainty is very costly because this realization happens too rarely relative to the effort it induces. When the prior is above 3/8, revealing that the type is low with certainty is very costly because effort drops discontinuously when the prospect of tenure becomes too dim (i.e., when µ falls below 3/8).

^{19} No disclosure is optimal only when the prior is exactly 3/8.

[Figure 4: optimal midterm review. Panels: (a) function v̂, (b) function V, with kinks at µ = 3/8 and µ = 3/4.]

This simple model illustrates the way in which the tools developed in this paper can be used to study optimal feedback when actions are not fully contractible. Moreover, as we mentioned earlier, by redefining v we can use the same approach to find mechanisms that maximize social welfare rather than the university's preferences.

6.2  Preference disagreement

Here we consider the question of how certain types of preference disagreement between Sender and Receiver affect the amount of information transmitted under the optimal persuasion mechanism. Let u = −(a − ω)², v = −(a − (b1 + b2ω))², and A = Ω = [0, 1].^{20} Parameters b1 and b2 capture two types of preference disagreement: values of b1 away from 0 indicate that Sender and Receiver disagree about the best average level of a, while values of b2 away from 1 indicate that they disagree about how the action should vary with the state.

^{20} Recall that Appendix B shows our approach extends to compact metric state spaces.

In order to determine how information transmission depends on b1 and b2, it will suffice to compute v̂. Since u = −(a − ω)², we know that

  â(µ) = E_µ[ω].

Given this â, simple algebra reveals that

  v̂(µ) = −b1² + 2b1(1 − b2) E_µ[ω] − b2² E_µ[ω²] + (2b2 − 1)(E_µ[ω])².

Hence, by Corollary 1, Sender solves

  max_τ E_τ[ −b1² + 2b1(1 − b2) E_µ[ω] − b2² E_µ[ω²] + (2b2 − 1)(E_µ[ω])² ]
  s.t. ∫ µ dτ(µ) = µ0.

Since the −b1² + 2b1(1 − b2) E_µ[ω] − b2² E_µ[ω²] term is constant across all Bayes-plausible τ's, this maximization problem simplifies to

  max_τ E_τ[ (2b2 − 1)(E_µ[ω])² ]
  s.t. ∫ µ dτ(µ) = µ0.

Hence, b1 does not affect Sender's problem. Moreover, v̂ is linear when b2 = 1/2, strictly convex when b2 > 1/2, and strictly concave when b2 < 1/2. Therefore, by Proposition 8, no disclosure is uniquely optimal when b2 < 1/2 and full disclosure is uniquely optimal when b2 > 1/2. When b2 = 1/2 all mechanisms yield the same value.

If we consider a simpler state space, we can develop a geometric intuition for this example. Keeping utility functions and the action space the same, suppose that Ω = {0, 1/2, 1}.^{21} Figure 5 depicts v̂ for a few values of b2. Recall that, for the reasons we just discussed, the shape of v̂ does not depend on b1; this parameter only affects the vertical location of v̂. Figure 5 clearly demonstrates that b2 changes the concavity of v̂.

^{21} Note that the expression for v̂ above still applies with this simplified state space.
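The "simple algebra" above can be verified numerically. The check below, with an arbitrary three-point belief and parameter values of our own choosing, confirms the displayed expression for v̂:

```python
# Numerical check of the v_hat formula for the preference-disagreement example.

def v_hat_direct(mu, b1, b2):
    # Receiver plays a_hat(mu) = E_mu[omega]; evaluate Sender's expected payoff
    # -(a - (b1 + b2 * omega))^2 directly.
    m = sum(p * w for w, p in mu.items())
    return sum(p * -(m - (b1 + b2 * w)) ** 2 for w, p in mu.items())

def v_hat_formula(mu, b1, b2):
    # The closed form derived in the text.
    m = sum(p * w for w, p in mu.items())
    m2 = sum(p * w ** 2 for w, p in mu.items())
    return -b1 ** 2 + 2 * b1 * (1 - b2) * m - b2 ** 2 * m2 + (2 * b2 - 1) * m ** 2

mu = {0.0: 0.2, 0.5: 0.5, 1.0: 0.3}    # an arbitrary belief on {0, 1/2, 1}
for b1, b2 in ((0.0, 1.0), (0.3, 0.5), (-0.2, 2.0)):
    assert abs(v_hat_direct(mu, b1, b2) - v_hat_formula(mu, b1, b2)) < 1e-12
```

Varying b1 in the loop shifts both expressions by the same constant, illustrating why b1 drops out of Sender's problem.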

29

(a) b2 = 0

(b) b2 =

1 2

(c) b2 = 1

Figure 5: shape of vˆ depends on b2 worthwhile to note that the algebra above provides an illustration of the practical usefulness of formulating Sender’s problem in terms of vˆ even when the state space is too rich for this function to be depicted. Now, what do the results above tell us about the impact of preference disagreement on information transmitted in an optimal mechanism? First, note that b1 has no impact on the relative value of any two mechanisms. This stands in sharp contrast to the results from cheap talk games. Crawford and Sobel’s (1982) main example considers preferences of the form u = − (a − ω)2 , v = − (a − ω − b1 )2 , which are equivalent to our example when b2 = 1. They show that as soon as |b1 | > 14 , no communication is possible in any equilibrium of the cheap talk game. By comparison, the optimal persuasion mechanism induces full disclosure for any value of b1 . The reason for this difference is that Sender’s commitment frees him from the temptation to try to increase the average level of Receiver’s action, the temptation which is so costly both for him and for Receiver in the equilibrium of the cheap talk game. Under any mechanism, Receiver’s expected action is a given. Moreover, conditional on Receiver’s expected action, Sender and Receiver completely agree on which actions are more desirable in which state. Hence, providing Receiver with anything short of a fully informative signal only induces an additional cost to Sender. The other parameter, b2 , measures disagreement over how the action should vary with the state.

When $b_2 = 1$, Sender and Receiver completely agree on how the action should vary with the state. Hence, full disclosure is uniquely optimal. When $b_2 > 1$, Receiver's actions are insufficiently sensitive to the state from Sender's perspective. Because Receiver under-reacts to information, however, Sender still strictly prefers to fully disclose everything, even if preference disagreement is very large, because anything short of that would only exacerbate the problem of under-reaction. When $0 < b_2 < 1$, Receiver's actions are overly sensitive to the state, though the action always moves in the right direction. When the level of disagreement is sufficiently small ($b_2 > \frac{1}{2}$), Sender still prefers to reveal all information, but when Receiver's over-reaction becomes too severe ($b_2 < \frac{1}{2}$), Sender cuts off all information flow to Receiver. Finally, when $b_2 < 0$, Receiver reacts to any information in a way opposite to the one Sender wishes, so he reduces her information as much as possible.
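These comparative statics can be checked numerically. The sketch below is our own illustration, not part of the paper: it takes the three-state version $\Omega = \{0, \frac{1}{2}, 1\}$ with a (hypothetical) uniform prior, evaluates $\hat{v}$ from the formula derived above, and compares Sender's value $E_\tau[\hat{v}(\mu)]$ under full disclosure and no disclosure, confirming that full disclosure dominates exactly when $b_2 > \frac{1}{2}$ and that $b_1$ is irrelevant to the comparison.

```python
# Numerical check of the quadratic example: v = -(a - (b1 + b2*w))^2,
# Receiver plays a = E_mu[w], so Sender's payoff at belief mu is
#   vhat(mu) = -b1^2 + 2*b1*(1-b2)*E[w] - b2^2*E[w^2] + (2*b2-1)*E[w]^2.
states = [0.0, 0.5, 1.0]
prior = [1/3, 1/3, 1/3]          # uniform prior over Omega = {0, 1/2, 1}

def vhat(mu, b1, b2):
    Ew = sum(m * w for m, w in zip(mu, states))
    Ew2 = sum(m * w**2 for m, w in zip(mu, states))
    return -b1**2 + 2*b1*(1 - b2)*Ew - b2**2*Ew2 + (2*b2 - 1)*Ew**2

def value(tau, b1, b2):
    # tau: list of (probability, posterior) pairs, a Bayes-plausible
    # distribution of posteriors; returns E_tau[vhat(mu)].
    return sum(p * vhat(mu, b1, b2) for p, mu in tau)

full = [(prior[i], [1.0 if j == i else 0.0 for j in range(3)])
        for i in range(3)]        # fully informative signal
none = [(1.0, prior)]             # uninformative signal

for b2 in [0.0, 0.25, 0.5, 0.75, 1.0, 2.0]:
    for b1 in [0.0, 0.3]:
        gap = value(full, b1, b2) - value(none, b1, b2)
        # Full disclosure beats no disclosure iff b2 > 1/2; b1 is irrelevant.
        assert (gap > 1e-9) == (b2 > 0.5)
        assert (abs(gap) < 1e-9) == (b2 == 0.5)
```

In this three-state instance the gap works out to $(2b_2 - 1)/6$, so its sign, and hence the ranking of full versus no disclosure, is determined entirely by $b_2$.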

6.3

Informative advertisements with unit demand

Now consider an example where a firm faces a continuum of ex ante identical consumers, each of whom decides whether to buy one unit of the firm's product or not. The firm's product has quality $\omega \in [0,1]$ and the consumers' utility from purchasing the product is $\omega - p$, where $p$ is an exogenous price, also in the unit interval. The consumers and the firm share the prior $\mu_0$ on the quality of the product. The assumption of symmetric information is most palatable if we conceptualize quality as driven by the match between consumers' uncertain tastes and the product's uncertain characteristics. Since consumers are risk-neutral, each will buy the product if and only if $E_\mu[\omega] \geq p$, where $\mu$ is her posterior belief about the quality of the product. To make things interesting, we suppose that consumers' default action is not to buy, i.e., $E_{\mu_0}[\omega] < p$. The firm chooses an advertising campaign which provides a signal about quality, $\pi : [0,1] \to \Delta(S)$. All consumers see the advertisement. We assume that the firm is risk-neutral and cares only about expected revenue, so it does not matter whether each consumer gets an independent realization of the signal (in which case consumers end up with heterogeneous posteriors) or all consumers get the same realization (in which case they have identical posteriors). Because consumers are ex ante homogeneous, our results apply directly to this example even though we have multiple Receivers.²²

²² This example is closely related to the analysis in Lewis and Sappington (1994).

We again begin by computing $\hat{v}$. Denoting the decision to buy with 1 and the alternative with

0, we have

$$\hat{a}(\mu) = \begin{cases} 0 & \text{if } E_\mu[\omega] < p \\ 1 & \text{if } E_\mu[\omega] \geq p \end{cases}
\qquad\qquad
\hat{v}(\mu) = \begin{cases} 0 & \text{if } E_\mu[\omega] < p \\ 1 & \text{if } E_\mu[\omega] \geq p. \end{cases}$$

Moreover, since payoffs depend only on the expected state, we can define $\tilde{v}(E_\mu[\omega]) = \hat{v}(\mu)$. What do our Propositions tell us about this example? Propositions 2 and 8 do not apply since $\hat{v}$ is neither convex nor concave. Because $E_{\mu_0}[\omega] < p$, there is information the firm would share. Hence, Proposition 3 does not exclude the possibility that the firm might benefit from persuasion. In fact, since consumers' preference for not buying is discrete at the prior, Proposition 4 implies that the firm does benefit from persuasion. Moreover, we know that any consumer who buys the product will have the posterior $\mu$ s.t. $E_\mu[\omega] = p$ (Proposition 12), while any consumer who does not buy will have a posterior that puts zero weight on the possibility that $\omega \geq p$ (Proposition 10). This tells us that the optimal advertising campaign induces two types of reactions in consumers: some consumers become completely convinced that the product is not worth its price, while others buy the product, albeit reluctantly. The basic intuition for this stems from the fact that worsening the opinion of the product held by those who are already not buying it is costless to the firm, and yet it increases the probability of more favorable reactions by others (because of Bayes-plausibility). Hence, those who do not buy the product are driven to complete certainty that it is not worth buying. On the other hand, improving the opinion of the product held by those who are already willing to buy it does not generate further sales, and yet it decreases the likelihood of such favorable impressions. Hence, the optimal advertising campaign makes all buyers marginal. In this example, $\hat{v}$ is difficult to visualize, but since payoffs depend only on the expected state, we can depict $\tilde{v}$ and $\tilde{V}$ (Figure 6). Since $\tilde{V}(E_{\mu_0}[\omega]) > \tilde{v}(E_{\mu_0}[\omega])$, Proposition 6 tells us there exists an advertising campaign that increases the firm's revenue.
Also, as we discussed at the end of Section 5, while we cannot determine the optimal campaign by examining $\tilde{v}$, we know that $\tilde{V}(E_{\mu_0}[\omega])$ is an upper bound on the market share that can be achieved by any advertising campaign.
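To make the "all buyers are marginal" logic concrete, the following sketch (our own illustration with made-up numbers, not from the paper) works out the binary-quality special case $\omega \in \{0,1\}$ with $\mu_0(1) < p$: the optimal campaign splits consumers between a posterior with $E[\omega] = p$ (they buy, reluctantly) and the posterior $\omega = 0$ (they are certain the product is not worth buying), achieving market share $\mu_0(1)/p$.

```python
# Binary-quality sketch of the advertising example: quality w in {0,1},
# exogenous price p, prior mu0 = Pr(w=1) < p. The optimal campaign induces
# two posteriors: mu = p (marginal buyers) and mu = 0 (certain non-buyers).
def optimal_campaign(mu0, p):
    # Bayes-plausibility pins down the weights: q*p + (1-q)*0 = mu0,
    # so q = mu0/p is the probability of the "buy" posterior.
    q = mu0 / p
    return [(q, p), (1 - q, 0.0)]   # list of (probability, posterior Pr(w=1))

mu0, p = 0.3, 0.6
tau = optimal_campaign(mu0, p)

# The distribution of posteriors averages back to the prior...
assert abs(sum(q * mu for q, mu in tau) - mu0) < 1e-12
# ...the market share (probability a consumer buys, i.e. mu >= p) is mu0/p...
share = sum(q for q, mu in tau if mu >= p)
assert abs(share - mu0 / p) < 1e-12
# ...which strictly exceeds the no-advertising share of 0 (since mu0 < p).
assert share > 0.0
```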

[Figure 6 plots $\tilde{v}(E_\mu[\omega])$ and $\tilde{V}(E_\mu[\omega])$ against $E_\mu[\omega]$, marking $E_{\mu_0}[\omega]$ and $p$ on the horizontal axis: (a) function $\tilde{v}$; (b) function $\tilde{V}$.]

Figure 6: advertising to increase sales

7

Extensions and Limitations

7.1

Dynamic Mechanisms

We have restricted our attention to a class of persuasion mechanisms where there is a single stage of communication. Sender sees only one draw of private information, and sends only one message to Receiver. In reality, however, persuasion often happens over multiple periods. Firms may send multiple ads, a prosecutor may go through successive rounds of gathering and reporting information, and so forth. Our framework can easily accommodate one class of dynamic mechanisms. Consider an extended definition of a persuasion mechanism that has T periods with possibly different signals and messaging technologies (π t , ct ) at each stage. Sender privately observes realization s1 ∈ S1 from π 1 (·|ω) and chooses a message m1 ∈ M1 which Receiver observes. Sender then privately observes realization s2 ∈ S2 , sends message m2 ∈ M2 , observes realization s3 ∈ S3 and so on up to period T . After Receiver observes the final message mT , she chooses her action a ∈ A. We define an equilibrium of such a mechanism exactly as before. Any equilibrium of such a dynamic mechanism must still induce a distribution of posteriors τ


at the point that Receiver chooses her action, and the expected value of $v$ will still be equal across all mechanisms that induce the same distribution $\tau$. Therefore, the argument that was used to prove Proposition 1 implies that if there exists some dynamic mechanism with value $v^*$, there also exists a straightforward static mechanism with value $v^*$. This means that all our core results on optimal mechanisms and on the situations where Sender benefits from persuasion apply directly to this dynamic case as well.

A different class of dynamic mechanisms is one where Receiver chooses an action in each period. Suppose, for example, that we modify the definition of a dynamic mechanism above and assume that after observing $m_1$ Receiver chooses action $a_1 \in A$; after observing $m_2$ Receiver chooses action $a_2 \in A$; and so on. Sender's and Receiver's payoffs are $v(a_1, \ldots, a_T, \omega)$ and $u(a_1, \ldots, a_T, \omega)$, respectively. This case is much more complicated. Sender's expected payoff now depends not only on the distribution of Receiver's beliefs at time $T$, but also on the distribution of her beliefs at each stage along the way. In fact, it depends on the joint distribution of beliefs across periods, $\tau \in \Delta\left((\Delta(\Omega))^T\right)$. We conjecture that we can still express Sender's expected utility as $E_\tau\,\hat{v}(\mu_1, \ldots, \mu_T)$, and moreover, that it is still possible to restrict attention to something analogous to honest mechanisms, provided that Sender's private signals can be conditioned on Receiver's past actions. However, the construction of $\hat{v}$ for any given problem is now more complex, and the Bayes-plausibility constraint for sequences of beliefs is more complicated than for a single distribution of beliefs. We have not attempted to extend either our geometric intuitions or our analytical results to this case.

7.2

Receiver’s Private Information

Extending our analysis to situations where Receiver has private information is straightforward. Given a mechanism $(\pi, c)$, suppose that the timing of the game is as follows. First, nature selects $\omega$ from $\Omega$ according to $\mu_0$. Neither Sender nor Receiver observes nature's move. Then, Sender privately observes a realization $s \in S$ from $\pi(\cdot|\omega)$ and chooses a message $m \in M$. Then, Receiver privately observes a realization $r \in R$ from some signal $\chi(\cdot|\omega)$. Finally, Receiver observes $m$ and chooses an action $a \in A$. Note that if Receiver instead observes her private information before Sender observes his signal or before he sends his message, the game can be re-formulated with the timing above without loss of generality, because Sender's action is independent of Receiver's private information (since he does not observe it) and Receiver always chooses her action after observing $m$.

We still assume that Sender and Receiver share a prior $\mu_0$ at the outset of the game. The only way in which Receiver's private information changes our analysis is that we can no longer construct a deterministic function $\hat{a}(\mu)$ which gives Receiver's action at any belief. Rather, for any $\mu$, Receiver's optimal action $\hat{a}(\mu, r)$ depends on the realization of her private signal and so is stochastic from Sender's perspective.

When his posterior is $\mu$, Sender assigns probability $\chi(r|\omega)\,\mu(\omega)$ to the event that Receiver's signal is $r$ and the state is $\omega$. Hence, Sender's expected utility when he induces belief $\mu$ is:

$$\hat{v}(\mu) = \sum_{\omega\in\Omega}\sum_{r\in R} v\left(\hat{a}(\mu, r), \omega\right)\chi(r|\omega)\,\mu(\omega).$$

Once we reformulate $\hat{v}$ this way, our approach applies directly. In particular, our key simplifying results (Proposition 1, Corollary 1, Corollary 2) still hold. Aside from the fact that constructing $\hat{v}$ is slightly more complicated, the analysis of the problem in terms of the properties of $\hat{v}$ and its concave closure $V$ proceeds exactly as before. That said, some of our characterization results, such as the result that Receiver's preference is never discrete at any interior $\mu$ induced by an optimal mechanism, will no longer hold.

A different type of situation that involves private information is one where Receiver's preferences depend on some parameter $\theta \in \Theta$ which is unrelated to $\omega$. This distinction matters because we still assume that Sender's private signal is informative only about $\omega$. Hence, no mechanism provides any additional information to Sender about $\theta$. For example, in our motivating example, the judge might have an unobservable, idiosyncratic aversion toward convicting an innocent defendant, which would affect the posterior cutoff at which she is willing to convict.²³ The prosecutor's investigation, however, would not provide any information about the judge's preferences. Despite this distinction, we can handle this situation in the same way as the one before. Again, the only impact of private information is that Receiver's optimal action $\hat{a}(\mu, \theta)$ is stochastic from Sender's perspective.

Letting $\phi$ denote the distribution of $\theta$, Sender's expected utility when he induces belief $\mu$ is:

$$\hat{v}(\mu) = \sum_{\omega\in\Omega}\int_\Theta v\left(\hat{a}(\mu, \theta), \omega\right)d\phi(\theta)\,\mu(\omega).$$

²³ Another example of this type of private information is the value of Receiver's outside option in Rayo and Segal (2008).

Here again, once we thus reformulate vˆ, we can analyze the problem in terms of vˆ and V in exactly the same way as before.
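As a concrete illustration of the reformulated $\hat{v}$ (ours, with made-up numbers, not from the paper), one can first derive Receiver's posterior from the pair $(\mu, r)$ and her best response $\hat{a}(\mu, r)$, and then average Sender's payoff over his joint belief $\chi(r|\omega)\,\mu(\omega)$:

```python
# Sketch of vhat with Receiver private information (illustrative numbers).
# States w in {0,1}; Receiver privately sees r from chi(.|w); with
# u = -(a - w)^2 she plays a = E[w | mu, r]; Sender's payoff is v(a,w) = a.
states = [0, 1]
chi = {0: {"lo": 0.8, "hi": 0.2},   # chi(r|w): Receiver's signal given w
       1: {"lo": 0.3, "hi": 0.7}}

def a_hat(mu, r):
    # Receiver combines the induced belief mu = Pr(w=1) with her signal r
    # via Bayes' rule; under quadratic loss the optimal action is E[w|mu,r].
    joint = [chi[w][r] * (mu if w == 1 else 1 - mu) for w in states]
    return joint[1] / sum(joint)

def vhat(mu):
    # vhat(mu) = sum_w sum_r v(a_hat(mu,r), w) * chi(r|w) * mu(w)
    total = 0.0
    for w in states:
        mw = mu if w == 1 else 1 - mu
        for r in ("lo", "hi"):
            total += a_hat(mu, r) * chi[w][r] * mw
    return total

# Sanity check: at degenerate beliefs Receiver's signal adds nothing, so
# vhat(0) = 0 and vhat(1) = 1 (Sender's payoff equals Receiver's action).
assert abs(vhat(0.0) - 0.0) < 1e-12 and abs(vhat(1.0) - 1.0) < 1e-12
```

With $v(a,\omega) = a$, this $\hat{v}$ is linear in $\mu$ (by the law of iterated expectations), which is a useful check on the computation; richer payoff functions generically make it nonlinear.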

7.3

Multiple Receivers

In many settings of interest (politicians persuading voters, firms advertising to consumers, auctions) our assumption that there is a single Receiver is unrealistic. Suppose there are $n$ receivers. For ease of exposition we maintain our common prior assumption, which in this setting means that Sender and all receivers share a prior $\mu_0$ over $\Omega$.²⁴ Sender's utility is now a function of each receiver's action: $v(a_1, \ldots, a_n, \omega)$.

There are two classes of multiple-receiver models where our results can be extended quite easily. The first class is one where Sender sends separate (possibly correlated) messages to each receiver, Sender's utility is separable across receivers' actions,²⁵ and each receiver cares only about her own action. In this case, we can simply apply our approach separately to Sender's problem vis-à-vis each receiver. Since Sender's utility is separable, each receiver sees only her own message, and no receiver cares about what other receivers are doing, we essentially have $n$ copies of our standard problem with a single Receiver. In the special case where all receivers have the same utility function, the optimal mechanism will of course be the same for each receiver, so the analysis collapses to a single problem identical to the one we have analyzed before, as in the example of Section 6.3.

²⁴ There are no additional complications from having both multiple receivers and private information on their part. The approach from the previous Subsection for dealing with private information applies equally well to the case with multiple receivers.
²⁵ When $v$ is not separable across the $a_i$'s, the problem is similar to that of dynamic mechanisms where Receiver chooses an action in each period.

The second class of models is one where Sender can only persuade by revealing public information. That is, any message from Sender is observed by all receivers. In this case, our approach applies no matter whether receivers then choose their individual actions simultaneously, in sequence, or according to some other game. Moreover, Sender's utility need not be separable across receivers' actions, receivers might care about each other's actions, and they might have heterogeneous utility functions. An example of a setting like this is Milgrom and Weber's (1982) model of public information revelation in auctions with a common-value component.²⁶

For simplicity, consider the case where the equilibrium of the post-message game is in pure strategies.²⁷ If the post-message game does not have a unique equilibrium, we focus on an equilibrium which yields the highest payoff to Sender, analogously to our earlier equilibrium selection rule. Let $\hat{a}_i(\mu)$ represent the $i$th receiver's equilibrium action when she has belief $\mu$. We can then define $\hat{v}$ as a function of receivers' shared posterior $\mu$:

$$\hat{v}(\mu) \equiv \sum_{\omega\in\Omega} v\left(\hat{a}_1(\mu), \ldots, \hat{a}_n(\mu), \omega\right)\mu(\omega).$$

With this reformulation of vˆ, our basic approach again applies. Proposition 1, Corollary 1, and Corollary 2 all still hold. Constructing vˆ is potentially much more complicated here since it involves solving for the equilibria of the post-message game, but the analysis of the problem in terms of the properties of vˆ and V is exactly the same as before. Of course, characterization results which are stated in terms of Receiver’s preferences, such as Proposition 4, would have to be reinterpreted. There is an important third class of multiple-receiver models where our results do not extend easily:

those where the receivers care about each other's actions and Sender can send private signals to individual receivers. The crucial problem with this case is that, for a given set of beliefs that receivers hold after observing their messages, the receivers' actions may vary as a function of the mechanism that produced those beliefs. In a common value auction, for example, a bidder with a given belief will behave differently if she believes that other bidders are receiving highly informative signals than if she believes they are receiving uninformative signals. This means that the key simplifying step in our analysis, reducing the problem of finding an optimal mechanism to one of maximizing over distributions of posterior beliefs, does not apply.

²⁶ Milgrom and Weber (1982) allow for bidders to have private information. As we mentioned earlier, the previous Subsection provides a way of incorporating that possibility.
²⁷ If the equilibrium is in mixed strategies, the only additional complication is that actions are stochastic for a given belief, but we have already shown that this poses no problems for our approach.

7.4

Multiple Senders

Our model can also be used to think about settings with multiple senders. Suppose there is a single Receiver who receives messages from multiple senders. Receiver observes all the messages and then takes a single action. All senders and Receiver share a common prior, and each sender's utility depends on Receiver's action and the state of the world.

If we simply wish to know whether there is an informational environment that increases some weighted function of senders' utilities, our previous results apply directly, since they do not depend on any particular interpretation of the objective function $v(a, \omega)$. Hence, we could simply let $v$ be any weighted function of senders' utilities. A more interesting question is what happens if multiple senders play a non-cooperative game. Specifically, consider a game where each sender $i$ simultaneously²⁸ chooses a $\pi_i : \Omega \to \Delta(S_i)$, and Receiver then observes all $s_i$'s and takes her action.²⁹ Taking other senders' choices as given, each sender's problem is identical to the one in Subsection 7.2 where he is the only Sender and Receiver has some private information about $\omega$. The private information here is simply the set of signal realizations from all the other senders. Hence, our tools for finding an optimal mechanism provide a way to compute the best-response functions. Unless one can solve for an optimal mechanism analytically, however, solving for the equilibria of this game might be challenging. Another issue is that we can no longer speak of Receiver taking a "Sender-preferred" action from $a^*(\mu)$. Consequently, an optimal mechanism, and hence a best-response function, might not always exist. To guarantee the existence of an equilibrium of this game we would need to assume there is a uniquely optimal action for Receiver at each belief.

7.5

Limited set of signals

²⁸ The analysis of games where senders move sequentially is very similar.
²⁹ For the same reason that we can restrict our attention to honest mechanisms in our main model, this game is isomorphic to one where each sender chooses a $(\pi, c)$ pair, observes a signal realization $s$ from $\pi$, and then sends a message $m$ to Receiver.

Throughout the paper we have assumed that the mechanism designer can choose any signal $\pi$ whatsoever. In many settings, this might be an unreasonable assumption. What can we say about the case where only some subset of potential signals, say $\Pi$, is feasible? We can still formulate our problem as a search over distributions of posteriors. The value of an optimal mechanism is

$$\max_\tau \; E_\tau\,\hat{v}(\mu) \quad \text{s.t.} \int \mu\, d\tau(\mu) = \mu_0 \;\text{ and }\; \tau \in \Gamma,$$

where $\Gamma$ denotes the set of distributions of posteriors induced by honest mechanisms with a signal in $\Pi$. When this additional constraint binds, however, we will not necessarily be able to use our geometric approach. Knowing that $V(\mu_0) > \hat{v}(\mu_0)$ does not tell us whether there is a Bayes-plausible $\tau \in \Gamma$ s.t. $E_\tau\,\hat{v}(\mu) > \hat{v}(\mu_0)$.

There is one particular type of limitation on $\Pi$, however, which our approach handles easily. Suppose that we can express the state space as $\Omega \times \Theta$, denoting a particular state by $(\omega, \theta)$. Moreover, suppose that the set of potential signals $\Pi$ consists of all signals that provide no information about $\theta$. Then, the situation is closely analogous to the one in Subsection 7.2 where the space of signals is arbitrarily rich, but Receiver's preferences depend on some parameter $\theta \in \Theta$ which is unrelated to $\omega$.

The only difference is that Receiver's action no longer depends on $\theta$, but Sender's utility $v(a, (\omega, \theta))$ now does. This poses no further complications. Sender's utility when he induces a belief $\mu$ on $\Omega$ is simply:

$$\hat{v}(\mu) = \sum_{\omega\in\Omega}\int_\Theta v\left(\hat{a}(\mu), (\omega, \theta)\right)d\phi(\theta)\,\mu(\omega),$$

where $\phi$ denotes his prior on $\Theta$. Again, with this reformulation of $\hat{v}$, Proposition 1, Corollary 1, and Corollary 2 all still hold.

Finally, the possibility that $\Gamma$ might not always include all Bayes-plausible $\tau$'s has important implications. Consider a modification of our initial example. Suppose there are two types of prosecutors: a high-ability prosecutor who can structure his investigation so as to generate any signal $\pi$, and a low-ability prosecutor who has access to a limited set of signals. If this set does not include the optimal signal, the high-ability prosecutor will convict a higher percentage of defendants than the low-ability one, even if the judge is fully aware of the difference in the prosecutors' abilities. The rationality of the judge does not imply that she will somehow compensate for the prosecutor's type. More broadly, the benefit to Sender from expanding $\Gamma$ might partly explain the observed large expenditures on persuasive activities.

7.6

Limited Commitment

What can we say about settings where Sender is unable to commit to an honest mechanism? The first thing to note is that our results provide an upper bound on the gains from communication in any persuasion mechanism. By Proposition 1, we know that the value to Sender from being able to communicate with Receiver can only be weakly lower when he cannot commit to an honest mechanism. This observation has important implications even in models without commitment. Consider, for example, Spence's (1973) signalling model. Since the worker's wage in that model is the expectation of his type, we know that $\hat{v}$ is linear, so the worker (Sender) cannot benefit from persuasion. Hence, even without solving for the equilibria of this signalling game, we know that in any equilibrium the average worker would be weakly better off if a government policy outlawed education.³⁰

More broadly, in any game captured by our definition of a persuasion mechanism, Sender values his ability to communicate with Receiver no more than $V(\mu_0) - \hat{v}(\mu_0)$. However, this provides only an upper bound on the value of communication in games without commitment. In some

settings, $V(\mu_0) - \hat{v}(\mu_0)$ might be large and yet Sender might not benefit at all from an opportunity to communicate with Receiver. In the remainder of this section we analyze more directly what the gains from communication are when Sender cannot commit to an honest messaging technology but retains his choice of what information to gather. Specifically, we analyze the choice of an optimal signal when the message technology $c$ is constant. We call such mechanisms commitment-free. If there is a commitment-free mechanism with a strictly positive gain, we will say that beneficial persuasion without commitment is possible.

The analysis of commitment-free mechanisms is closely related to the questions in Green and Stokey (2007). Assuming cheap talk messaging, they also consider the benefits that might arise from reducing the informativeness of Sender's signal.³¹ In contrast to their paper, which focuses on local improvements in informativeness, we here derive bounds on the benefit from persuasion when Sender has a choice over an arbitrary signal. The analysis here is also related to Ivanov (2008). He, however, focuses on the question of when restricting Sender's information benefits Receiver, while we explore when Sender can be made better off by being less informed.

³⁰ In fact, in any equilibrium with positive signalling costs, he would be strictly better off under such a policy.
³¹ Note that Crawford and Sobel (1982) implicitly assume that Sender receives a fully informative signal. They assume that Sender observes the value of an arbitrary random variable, but since both Sender's and Receiver's utilities are defined solely in terms of this random variable, the state space is effectively the set of realizations of the random variable and Sender is thus perfectly informed of the state.

Throughout the analysis we assume that Receiver observes what information Sender has. In a different game, where Sender chose his signal covertly, the only equilibrium would be for Sender to choose the fully informative signal, and the set of equilibria would be isomorphic to those of Crawford and Sobel (1982).

As in all cheap talk settings, incentive compatibility is the key obstacle to communication in commitment-free mechanisms. Sender, however, can often choose to gather information which mitigates the incentive compatibility problem: he can choose to learn information with the property that, ex post, his utility is maximized by truthfully reporting what he learns. When the chosen signal has this property, Receiver can accept Sender's messages at face value.

Consider the case where Sender's preferences are independent of the state. In this case, Sender can credibly convey messages if and only if he is completely indifferent between the actions they induce. For example, suppose $\Omega = \{0, 1\}$, $A = [0, 1]$, $\mu_0(1) = \frac{1}{4}$, $u = -(a - \omega)^2$, and $v(a, \omega) = \left(a - \frac{1}{4}\right)^2$. Here, Sender's optimal commitment-free mechanism is a signal $\pi$ on $\{l, h\}$ with

$$\pi(l|0) = \tfrac{2}{3} \qquad \pi(l|1) = 0$$
$$\pi(h|0) = \tfrac{1}{3} \qquad \pi(h|1) = 1.$$

This signal induces posteriors $\mu_l(1) = 0$ and $\mu_h(1) = \frac{1}{2}$, which yield exactly the same value of $\hat{v}$. Hence, Sender can truthfully reveal the signal realization. By contrast, if Sender were fully informed, Receiver would not believe his reports that $\mu(1) = 1$ and thus Sender would not benefit from persuasion.

This example is an instance of a more general result: the gain which Sender can obtain when his utility is independent of the state is equal to the distance between $\hat{v}(\mu_0)$ and the greatest value of $\hat{v}$ that is constant across beliefs whose convex hull includes the prior. Let

$$E = \sup\left\{v \,\middle|\, \exists M \subset \Delta(\Omega) \text{ s.t. } \hat{v}(\mu) = v \;\forall \mu \in M \text{ and } \mu_0 \in \mathrm{co}(M)\right\} - \hat{v}(\mu_0).$$

Because $\hat{v}$ is not necessarily continuous, the sup in the expression above may not be attained, and hence an optimal commitment-free mechanism might not always exist. However, we can always find a commitment-free mechanism whose value is arbitrarily close to $E$.

Proposition 13 Suppose $v$ is independent of $\omega$. The gain from any commitment-free mechanism is weakly less than $E$. For any $\varepsilon > 0$, there exists a commitment-free mechanism whose gain exceeds $E - \varepsilon$. Beneficial persuasion without commitment is possible if and only if $E > 0$.

Often, $E$ will be equal to zero and beneficial persuasion without commitment will be possible only if $v$ depends on the state. When $v$ varies with $\omega$, Sender can mitigate the incentive compatibility constraint by choosing a signal that induces posteriors at which his preferences are aligned with those of Receiver. Consider the following example. Let $\Omega = \{\omega_1, \omega_2, \omega_3\}$ and $A = \{a_1, a_2\}$. The beliefs where an agent is indifferent between the two actions form a straight line in the probability triangle. Suppose preferences are such that these lines are as in Figure 7. The line where Receiver is indifferent between the actions is the darker, steeper one; she prefers $a_1$ when her beliefs are in the Southeast portion of the simplex. The line where Sender is indifferent is the lighter, less steep one; he prefers $a_1$ in the Northwest portion of the simplex. For most beliefs, Sender and Receiver disagree on which action is preferable. The shaded areas indicate the small regions of agreement. Let $Z$ denote the union of these two areas.

Because they disagree on the appropriate action at the degenerate beliefs, no communication would be possible if Sender were perfectly informed. If Receiver takes both $a_1$ and $a_2$ in equilibrium, Sender will always send a message that induces his preferred action. But Receiver knows that, for any degenerate belief that Sender has, she prefers the other action. Thus, an equilibrium where both actions are taken cannot exist. If, however, Sender limits his information, both Receiver and Sender can benefit from communication. In particular, whenever $\mu_0$ belongs to the convex hull of $Z$, but not to $Z$ itself, beneficial persuasion without commitment is possible. For instance, Figure 7 illustrates that if Sender selects

[Figure 7 shows the probability simplex with the two indifference lines, the agreement regions $Z$, and the beliefs $\mu_0$, $\mu_1$, $\mu_2$.]

Figure 7: persuasion without commitment

$\pi$ which induces $\mu_1$ and $\mu_2$, he will have the incentive to truthfully reveal the signal realization. Receiver will be aware of this and both will benefit from persuasion without commitment.

The gain that Sender can achieve from persuasion due to such alignment of preferences is bounded by the extent to which his preferences vary with the state. Let

$$D = \max_{a\in A}\left[\max_\omega v(a, \omega) - \min_\omega v(a, \omega)\right]$$

denote a measure of the sensitivity of $v$ to $\omega$. When $\hat{v}(\mu)$ is monotonic,³² Sender can achieve incentive compatibility only by selecting posteriors at which his preferences are aligned with Receiver's, and the gain from any commitment-free mechanism cannot exceed $D$.

Proposition 14 If $\hat{v}$ is monotonic, the gain from any commitment-free mechanism is at most $D$.

Since $E$ is necessarily equal to zero when $\hat{v}$ is monotonic, Proposition 13 and Proposition 14 together yield the following Corollary.

Corollary 3 If $\hat{v}$ is monotonic and $v$ is state-independent, beneficial persuasion without commitment is not possible.
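The two-state example earlier in this subsection is easy to verify numerically. The sketch below is our own check, not the paper's: it computes the posteriors induced by the signal $\pi$ via Bayes' rule and confirms that Sender, whose payoff $(a - \frac{1}{4})^2$ does not depend on $\omega$, is exactly indifferent between the two induced actions, so truthful reporting is incentive compatible.

```python
# Verify the two-state commitment-free example: Omega = {0,1}, mu0(1) = 1/4,
# Receiver plays a = E[w]; Sender's state-independent payoff is (a - 1/4)^2.
from fractions import Fraction as F

mu0 = F(1, 4)
pi = {"l": {0: F(2, 3), 1: F(0)},    # pi(s|w) for s in {l, h}
      "h": {0: F(1, 3), 1: F(1)}}

def posterior(s):
    # Bayes' rule: Pr(w=1 | s)
    num = pi[s][1] * mu0
    den = num + pi[s][0] * (1 - mu0)
    return num / den

mu_l, mu_h = posterior("l"), posterior("h")
assert (mu_l, mu_h) == (F(0), F(1, 2))   # induced posteriors 0 and 1/2

# Receiver's action equals the posterior mean; Sender is exactly indifferent:
v = lambda a: (a - F(1, 4)) ** 2
assert v(mu_l) == v(mu_h) == F(1, 16)    # both messages yield vhat = 1/16
```

Exact rational arithmetic (`fractions`) makes the indifference an equality rather than a floating-point approximation, which is the whole point of the incentive-compatibility check.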

³² A condition weaker than monotonicity will also suffice. Say $\hat{v}$ is without troughs if there do not exist $\mu_a$, $\mu_b$, $\mu_c$ and $\gamma \in (0, 1)$ s.t. $\mu_b = \gamma\mu_a + (1-\gamma)\mu_c$ and $\min\{\hat{v}(\mu_a), \hat{v}(\mu_c)\} > \hat{v}(\mu_b)$. As the proof in Appendix A shows, the subsequent Proposition and Corollary both hold if we assume $\hat{v}$ is without troughs rather than monotonic.


8

Conclusion

There are two ways to induce a person to do something. One is to incentivize her, whether by increasing her reward from taking a particular action or by increasing her punishment for doing something else. Such incentive schemes can be blunt, as when you pay someone for performing a task or coerce her into doing it, or more subtle, as when you increase the supply of goods complementary to an activity. All such schemes, however, rely on changing the individual's marginal preferences.

The other way to induce a person to do something is to change her beliefs. Changes in beliefs can change the expected action for several reasons. One is that the person being persuaded may fail to fully account for the motives of the persuader. Another is that overwhelming an individual with information may make it too difficult to process all of it in an appropriate way. Yet another, which is the focus of this paper, is that even a perfectly rational Bayesian can often be persuaded.

9 Appendix A: Proofs

9.1 Proof of Proposition 1

As we mentioned in the text, (2) immediately implies (1) and (3). We first show that (1) implies (2). Consider an equilibrium ε° = (σ°, ρ°, µ°) of some mechanism (π°, c°) with value v*. For any a, let M^a be the set of messages which induce a in this equilibrium: M^a = {m | â(µ°(·|m)) = a}. Now, consider a straightforward mechanism with S = A and

π(a|ω) = Σ_{m∈M^a} Σ_{s∈S} σ°(m|s) π°(s|ω).

In other words, in the proposed mechanism Sender "replaces" each message with a recommendation of the action that the message induced. Since a was an optimal response to each m ∈ M^a in ε°, it must also be an optimal response to the message a in the proposed straightforward mechanism. Hence, the distribution of Receiver's actions conditional on the state is the same as in ε°. Therefore, if we set k equal to the expected messaging costs in ε°, i.e., k = Σ_m Σ_s Σ_ω c(m|s) σ°(m|s) π°(s|ω) µ0(ω), the value of the straightforward mechanism is exactly v*. Hence, (1) implies (2).

It remains to show that (3) implies (1). In other words, we need to show that given any Bayes-plausible distribution of posteriors, there exists a mechanism that induces it. We will show the stronger claim that there exists an honest mechanism that induces it. An honest mechanism with signal π induces τ if Supp(τ) = {µs}_{s∈S} and

(i) µs(ω) = π(s|ω) µ0(ω) / Σ_{ω′∈Ω} π(s|ω′) µ0(ω′) for all s and ω;
(ii) τ(µs) = Σ_{ω∈Ω} π(s|ω) µ0(ω) for all s.

Given a Bayes-plausible τ, let

π(s|ω) = µs(ω) τ(µs) / µ0(ω).

Now,

π(s|ω) = µs(ω) τ(µs) / µ0(ω)
⇒ π(s|ω) µ0(ω) = µs(ω) τ(µs)
⇒ Σ_ω π(s|ω) µ0(ω) = Σ_ω µs(ω) τ(µs)
⇒ Σ_ω π(s|ω) µ0(ω) = τ(µs),

which establishes (ii). Moreover,

π(s|ω) = µs(ω) τ(µs) / µ0(ω)
⇒ µs(ω) = π(s|ω) µ0(ω) / τ(µs)
⇒ µs(ω) = π(s|ω) µ0(ω) / Σ_{ω′∈Ω} π(s|ω′) µ0(ω′),

which establishes (i). Hence, π induces τ.
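This construction is easy to check numerically. The following sketch (with an illustrative prior and pair of posteriors, not values taken from the paper) verifies that the proposed signal satisfies conditions (i) and (ii):

```python
import numpy as np

# Check of the construction in the proof of Proposition 1: given a
# Bayes-plausible distribution of posteriors tau, the signal
# pi(s|w) = mu_s(w) * tau(mu_s) / mu_0(w) induces tau.
# The prior and posteriors below are illustrative, not from the paper.

mu0 = np.array([0.3, 0.7])                     # prior over two states
posteriors = np.array([[0.6, 0.4],             # mu_1
                       [0.1, 0.9]])            # mu_2
tau = np.array([0.4, 0.6])                     # tau(mu_1), tau(mu_2)

assert np.allclose(tau @ posteriors, mu0)      # Bayes-plausibility: E_tau[mu] = mu0

pi = posteriors * tau[:, None] / mu0[None, :]  # pi[s, w] = mu_s(w) tau(mu_s) / mu0(w)

assert np.allclose(pi.sum(axis=0), 1.0)        # a valid signal: sums to one in each state
assert np.allclose(pi @ mu0, tau)              # (ii): total probability of s is tau(mu_s)
bayes = pi * mu0[None, :]                      # (i): Bayesian updating on s recovers mu_s
bayes /= bayes.sum(axis=1, keepdims=True)
assert np.allclose(bayes, posteriors)
print("pi induces tau")
```

Any Bayes-plausible τ with finite support passes the same check, since the two conditions reduce to the algebra displayed above.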

9.2 Proof of Proposition 2

Suppose that v̂ is concave. That means that for any τ, E_τ[v̂(µ)] ≤ v̂(E_τ[µ]). Hence, for any Bayes-plausible τ, E_τ[v̂(µ)] ≤ v̂(µ0). Hence, by Corollary 1, Sender does not benefit from persuasion.

Now, suppose that v̂ is convex and not concave. The fact that it is not concave means that there exist some µa and µb and λ ∈ (0, 1) s.t. v̂(λµa + (1 − λ)µb) < λv̂(µa) + (1 − λ)v̂(µb). Now, consider the belief µc = λµa + (1 − λ)µb. Since µ0 is not on the boundary of ∆(Ω), there exists a belief µd and a γ ∈ (0, 1) s.t. µ0 = γµc + (1 − γ)µd. Moreover, since v̂ is convex, we know that

v̂(µ0) = v̂(γµc + (1 − γ)µd)
≤ γv̂(µc) + (1 − γ)v̂(µd)
= γv̂(λµa + (1 − λ)µb) + (1 − γ)v̂(µd)
< γ(λv̂(µa) + (1 − λ)v̂(µb)) + (1 − γ)v̂(µd).

Now, let τ be the distribution of posteriors that puts probability γλ on µa, γ(1 − λ) on µb, and (1 − γ) on µd. By construction, τ is Bayes-plausible and E_τ[v̂(µ)] > v̂(µ0).
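For concreteness, the following sketch instantiates this construction for two states, beliefs summarized by p = Pr(ω = 1), a hypothetical convex (non-concave) value function v̂(p) = p², and an interior prior. All numbers are illustrative, not derived in the paper:

```python
import numpy as np

# Numeric instance of the construction in the proof of Proposition 2.
v_hat = lambda p: p ** 2                       # hypothetical convex, non-concave payoff

p_a, p_b, lam = 0.2, 0.8, 0.5
p_c = lam * p_a + (1 - lam) * p_b              # 0.5; strict convexity violates concavity here
assert v_hat(p_c) < lam * v_hat(p_a) + (1 - lam) * v_hat(p_b)

p0 = 0.4                                       # interior prior
gamma, p_d = 0.8, 0.0                          # chosen so p0 = gamma*p_c + (1-gamma)*p_d
assert np.isclose(gamma * p_c + (1 - gamma) * p_d, p0)

# the distribution of posteriors from the proof: weights on p_a, p_b, p_d
weights = np.array([gamma * lam, gamma * (1 - lam), 1 - gamma])
supports = np.array([p_a, p_b, p_d])

assert np.isclose(weights @ supports, p0)      # Bayes-plausible
gain = weights @ v_hat(supports) - v_hat(p0)   # 0.272 - 0.16: Sender benefits
print(gain)
```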

9.3 Proof of Proposition 3

Suppose that ∀µ, v̂(µ) ≤ Σ_ω v(â(µ0), ω) µ(ω). Given an equilibrium (σ*, ρ*, µ*) of some mechanism, let τ_m denote the probability of message m:

τ_m = Σ_ω Σ_s σ*(m|s) π(s|ω) µ0(ω).

The value of the mechanism is at most

Σ_{m∈M} τ_m v̂(µ*(·|m)) ≤ Σ_{m∈M} τ_m Σ_ω v(â(µ0), ω) µ*(ω|m)
= Σ_ω v(â(µ0), ω) Σ_{m∈M} τ_m µ*(ω|m)
= Σ_ω v(â(µ0), ω) µ0(ω)
= v̂(µ0).

9.4 Proof of Proposition 4

Suppose there is information Sender would share and Receiver's preference is discrete at the prior. Since u is continuous in ω, Σ_ω u(â(µ0), ω) µ(ω) is continuous in µ. Therefore, since Receiver's preference is discrete at the prior, ∃δ > 0 s.t. for all µ in a δ-ball around µ0, â(µ) = â(µ0). Denote this ball by Bδ. Since there is information Sender would share, ∃µh s.t. v̂(µh) > Σ_ω v(â(µ0), ω) µh(ω). Consider a ray from µh through µ0. Since µ0 is not on the boundary of ∆(Ω), there exists a belief µl on that ray s.t. µl ∈ Bδ and µ0 = γµl + (1 − γ)µh for some γ ∈ (0, 1). Now, consider the distribution of posteriors τ(µl) = γ, τ(µh) = 1 − γ. Since â(µ0) = â(µl), v̂(µl) = Σ_ω v(â(µ0), ω) µl(ω). Hence,

γv̂(µl) + (1 − γ)v̂(µh) > γ Σ_ω v(â(µ0), ω) µl(ω) + (1 − γ) Σ_ω v(â(µ0), ω) µh(ω)
= Σ_ω v(â(µ0), ω) (γµl(ω) + (1 − γ)µh(ω))
= Σ_ω v(â(µ0), ω) µ0(ω)
= v̂(µ0).

Since τ is Bayes-plausible, Sender benefits from persuasion.


9.5 Proof of Proposition 5

Suppose A is finite. We begin with the following Lemma:

Lemma 1 If Receiver's preference at a belief µ is not discrete, there must be an action a ≠ â(µ) such that

Σ_ω u(â(µ), ω) µ(ω) = Σ_ω u(a, ω) µ(ω).

Proof. Suppose there is no such action. Then, we can define an ε > 0 by

ε = (1/2) min_{a≠â(µ)} { Σ_ω u(â(µ), ω) µ(ω) − Σ_ω u(a, ω) µ(ω) }.

Since A is finite, the minimum is attained. But then, Σ_ω u(â(µ), ω) µ(ω) > Σ_ω u(a, ω) µ(ω) + ε ∀a ≠ â(µ), which means that Receiver's preference is discrete at µ.

Given this Lemma, it will suffice to show that the set

{ µ | ∃a ≠ â(µ) s.t. Σ_ω u(â(µ), ω) µ(ω) = Σ_ω u(a, ω) µ(ω) }

has measure zero. Note that this set is a subset of

{ µ | ∃a, a′ s.t. a ≠ a′ and Σ_ω u(a, ω) µ(ω) = Σ_ω u(a′, ω) µ(ω) }.

Hence, it will suffice to show that the latter set has measure zero. In fact, since there are only finitely many pairs of actions a, a′, and since the union of a finite number of measure-zero sets has measure zero, it will suffice to show that given any distinct a and a′ the set

{ µ | Σ_ω u(a, ω) µ(ω) = Σ_ω u(a′, ω) µ(ω) }

has measure zero. Given any distinct a and a′, index states by i and let β_i = u(a, ω_i) − u(a′, ω_i). Let β = [β_1, ..., β_n] and µ = [µ(ω_1), ..., µ(ω_n)]. We need to show that the set {µ | β′µ = 0} has measure zero. Recall that for any action a there exists a µ s.t. a*(µ) = {a}. That means that there is necessarily an ω s.t. u(a, ω) ≠ u(a′, ω). Hence, there is at least one β_i ≠ 0. Therefore, β is a linear transformation of rank 1. Hence, the kernel of β is a vector space of dimension n − 1. Therefore, {µ | β′µ = 0} has measure zero with respect to the Lebesgue measure on R^n.

9.6 Proof of Proposition 6

As we mentioned in footnote 11, we will establish a somewhat stronger proposition which implies Proposition 6. Suppose that there exists a linear transformation T : ∆(Ω) → R^k s.t. v̂(µ) = ṽ(Tµ). Let Ṽ denote the concave closure of ṽ. Then,

Proposition 15 Sender benefits from persuasion if and only if Ṽ(Tµ0) > ṽ(Tµ0).

Proof. Suppose Ṽ(Tµ0) > ṽ(Tµ0). That implies there exists a z s.t. z > ṽ(Tµ0) and (Tµ0, z) ∈ co(ṽ). Hence, there exist a set (t_i)_{i=1}^{k+1} with t_i ∈ Image(T) and weights γ ∈ ∆^{k+1} s.t. Σ_i γ_i t_i = Tµ0 and Σ_i γ_i ṽ(t_i) > ṽ(Tµ0). For each i, select any µ_i from T^{-1}(t_i). Let µa = Σ_{i=1}^{k+1} γ_i µ_i. Note that since T is linear, Tµa = T Σ_i γ_i µ_i = Σ_i γ_i Tµ_i = Σ_i γ_i t_i = Tµ0. Since µ0 is not on the boundary of ∆(Ω), there exists a belief µb and a λ ∈ (0, 1) s.t. λµa + (1 − λ)µb = µ0. Since T is linear, Tµb = (1/(1 − λ))(Tµ0 − λTµa). Therefore, since Tµa = Tµ0, we have Tµb = Tµ0. Hence, ṽ(Tµ0) = ṽ(Tµb). Now, consider a mechanism that induces the distribution of posteriors

τ(µ_i) = λγ_i for i = 1, ..., k + 1
τ(µb) = 1 − λ.

Since µa = Σ_{i=1}^{k+1} γ_i µ_i and λµa + (1 − λ)µb = µ0, this τ is Bayes-plausible. The value of the mechanism that induces this τ is

Σ_i λγ_i v̂(µ_i) + (1 − λ)v̂(µb)
= λ Σ_i γ_i ṽ(Tµ_i) + (1 − λ) ṽ(Tµb)
= λ Σ_i γ_i ṽ(t_i) + (1 − λ) ṽ(Tµ0)
> λṽ(Tµ0) + (1 − λ)ṽ(Tµ0)
= ṽ(Tµ0) = v̂(µ0).

Hence, Sender benefits from persuasion.

Now suppose Ṽ(Tµ0) ≤ ṽ(Tµ0). For any Bayes-plausible distribution of posteriors τ, E_τ[µ] = µ0 implies E_τ[Tµ] = Tµ0, so E_τ[v̂(µ)] = E_τ[ṽ(Tµ)] ≤ Ṽ(Tµ0) ≤ ṽ(Tµ0) = v̂(µ0). Hence, by Corollary 1, Sender does not benefit from persuasion.

9.7 Proof of Proposition 7

We begin the proof by showing that selection of Sender-preferred equilibria implies that v̂ is upper semicontinuous.

Lemma 2 v̂ is upper semicontinuous.

Proof. Suppose that v̂ is discontinuous at some µ. Since v̂(µ) = Σ_ω v(â(µ), ω) µ(ω) and v is continuous, it must be that â(µ) is discontinuous at µ. Since u is continuous, by Berge's Maximum Theorem this means that Receiver must be indifferent between a set of actions at µ, i.e., a*(µ) is not a singleton. By definition, however, v̂(µ) ≡ max_{a∈a*(µ)} Σ_ω v(a, ω) µ(ω). Hence, v̂ is upper semicontinuous.

Now, let hyp(v̂) denote the hypograph of v̂, i.e., the set of points lying on or below the graph. Because, unlike the graph of v̂, hyp(v̂) is closed and connected, it will be easier to work with.

Lemma 3 Given µ and S ⊂ hyp(v̂), if (µ, V(µ)) is in the convex hull of S, it is also in the convex hull of the intersection of S and the graph of v̂.

Proof. Given µ and S = {(µ_i, z_i)}_{i∈I} ⊂ hyp(v̂), suppose (µ, V(µ)) = Σ_{i∈I} γ_i (µ_i, z_i) with Σ_{i∈I} γ_i = 1 and γ_i ∈ [0, 1] ∀i ∈ I. We need to show that for any γ_i > 0, it must be the case that z_i = v̂(µ_i). Suppose to the contrary that ∃j ∈ I s.t. γ_j > 0 and z_j < v̂(µ_j). Consider the set {(µ_i, z_i)}_{i∈I\{j}} ∪ {(µ_j, v̂(µ_j))} ⊂ hyp(v̂). Since γ_j µ_j + Σ_{i∈I\{j}} γ_i µ_i = µ and γ_j v̂(µ_j) + Σ_{i∈I\{j}} γ_i z_i > Σ_{i∈I} γ_i z_i = V(µ), we have that V(µ) < sup{z | (µ, z) ∈ co(hyp(v̂))} = V(µ). Hence, we've reached a contradiction.

Combining the two Lemmas above, we obtain the following:

Lemma 4 Any element of the graph of V can be expressed as a convex combination of elements of the graph of v̂.

Proof. Since v̂ is upper semicontinuous, hyp(v̂) is closed. Let H = {(µ, z) ∈ hyp(v̂) | z ≥ inf_{µ′} v̂(µ′)}. Since v is continuous and A is compact, v̂ is bounded, so H is bounded. Hence, since hyp(v̂) is closed, H is compact. Hence, co(H) is compact, which implies co(hyp(v̂)) is closed. Hence, hyp(V) = co(hyp(v̂)). Hence, any element of the graph of V can be expressed as a convex combination of elements of hyp(v̂). But then, by Lemma 3, it can also be expressed as a convex combination of elements of the graph of v̂.

The remainder of the proof of Proposition 7 is straightforward. By Corollary 2, there can be no mechanism with value strictly greater than V(µ0). By Lemma 4, (µ0, V(µ0)) ∈ co(v̂). Hence, there exists an optimal mechanism with value V(µ0).

9.8 Proof of Proposition 8

It follows directly from the definitions of convexity and concavity that if v̂ is (strictly) concave, no disclosure is (uniquely) optimal, and if it is (strictly) convex, full disclosure is (uniquely) optimal. To establish the last part of Proposition 8, we begin with a key Lemma which will also be useful for establishing several other Propositions.

Lemma 5 If µ′ is induced by an optimal mechanism, V(µ′) = v̂(µ′).

Proof. Suppose that an optimal mechanism induces τ and there is some µ′ s.t. τ(µ′) > 0 and V(µ′) > v̂(µ′). Since (µ′, V(µ′)) ∈ co(v̂), there exists a distribution of posteriors τ′ such that E_τ′[µ] = µ′ and E_τ′[v̂(µ)] = V(µ′). But then we can take all the weight from µ′ and place it on τ′, which would yield a higher value while preserving Bayes-plausibility. Formally, consider the distribution of posteriors

τ*(µ) = τ(µ′) τ′(µ) if µ ∈ Supp(τ′)\Supp(τ)
τ*(µ) = τ(µ) + τ(µ′) τ′(µ) if µ ∈ Supp(τ′) ∩ Supp(τ)
τ*(µ) = τ(µ) if µ ∈ Supp(τ)\(Supp(τ′) ∪ {µ′}).

By construction, τ* is Bayes-plausible and yields a higher value than τ does.

Suppose v̂ is not concave and an optimal mechanism induces an interior µm. By Lemma 5, we know V(µm) = v̂(µm). Since v̂ is not concave, there is some belief µl s.t. V(µl) > v̂(µl). Because µm is interior, we know there is a µr and γ ∈ (0, 1) s.t. µm = γµl + (1 − γ)µr. Now,

v̂(µm) = V(µm) ≥ γV(µl) + (1 − γ)V(µr) (by concavity of V)
≥ γV(µl) + (1 − γ)v̂(µr)
> γv̂(µl) + (1 − γ)v̂(µr),

which means that v̂ is not convex.
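The role of the concave closure V can be illustrated numerically. The sketch below uses two states, with beliefs summarized by p = Pr(ω = 1), and a hypothetical step-function v̂ in which Receiver takes Sender's preferred action iff p ≥ 1/2. It computes V at the prior by a grid search over binary Bayes-plausible splits, which suffice here since |Ω| = 2:

```python
import numpy as np

def v_hat(p):
    # Hypothetical step payoff: Sender gets 1 iff Receiver's belief p >= 1/2.
    return 1.0 if p >= 0.5 else 0.0

def V(p0, grid=np.linspace(0.0, 1.0, 201)):
    """Concave closure of v_hat at p0, over binary splits pl <= p0 <= pr."""
    best = v_hat(p0)                           # degenerate split: no disclosure
    for pl in grid[grid <= p0]:
        for pr in grid[grid >= p0]:
            if pr > pl:
                gamma = (pr - p0) / (pr - pl)  # weight on pl so the mean is p0
                best = max(best, gamma * v_hat(pl) + (1 - gamma) * v_hat(pr))
    return best

p0 = 0.3
print(v_hat(p0), V(p0))  # v_hat(prior) = 0.0, while V(prior) is about 0.6 = 2*p0
```

The optimal split puts posterior 0 with probability 0.4 and posterior 1/2 with probability 0.6, so Sender benefits from persuasion exactly because V(µ0) > v̂(µ0); since v̂ here is neither concave nor convex, the optimum involves partial disclosure, consistent with Proposition 8.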

9.9 Proof of Proposition 9

Since v̂ is bounded, hyp(v̂) is path-connected. Therefore, it is connected. The Fenchel-Bunt Theorem (Hiriart-Urruty and Lemaréchal 2004, Thm 1.3.7) states that if S ⊂ R^n has no more than n connected components (in particular, if S is connected), then any x ∈ co(S) can be expressed as a convex combination of n elements of S. Hence, since hyp(v̂) ⊂ R^|Ω|, any element of co(hyp(v̂)) can be expressed as a convex combination of |Ω| elements of hyp(v̂). In particular, (µ0, V(µ0)) ∈ co(hyp(v̂)) can be expressed as a convex combination of |Ω| elements of hyp(v̂). But then, by Lemma 3 this further implies that (µ0, V(µ0)) can be expressed as a convex combination of |Ω| elements of the graph of v̂. Hence, there exists an optimal straightforward mechanism which induces a distribution of posteriors whose support has no more than |Ω| elements. Since Receiver takes only a single action at any of her beliefs, this implies there exists an optimal straightforward mechanism in which the number of actions Receiver takes with positive probability is at most |Ω|.

9.10 Proof of Proposition 10

Suppose that v(a̲, ω) < v(a, ω) for all ω and all a ≠ a̲. For any ω, let µω denote the degenerate distribution s.t. µω(ω) = 1. Let Ω− be the set of states where a̲ is the uniquely optimal action for Receiver, i.e., Ω− = {ω | â(µω) = a̲}. Let Ω+ be the complement of Ω−.

Now, suppose contrary to Proposition 10 that an optimal mechanism induces τ and there is a belief µ′ s.t. τ(µ′) > 0, â(µ′) = a̲, and ∃ω ∈ Ω+ s.t. µ′(ω) > 0. Consider the following two beliefs:

µ−(ω) = µ′(ω) / Σ_{ω′∈Ω−} µ′(ω′) if ω ∈ Ω−, and µ−(ω) = 0 if ω ∈ Ω+;
µ+(ω) = 0 if ω ∈ Ω−, and µ+(ω) = µ′(ω) / Σ_{ω′∈Ω+} µ′(ω′) if ω ∈ Ω+.

It is easy to see that µ′ is a convex combination of µ− and µ+. Hence, we can "replace" µ′ with µ− and µ+, and since â(µ−) = â(µ′) while µ+ induces an action Sender prefers over a̲, this will yield a higher value. Formally, consider the following alternative distribution of beliefs:

τ*(µ−) = (Σ_{ω′∈Ω−} µ′(ω′)) τ(µ′) + τ(µ−)
τ*(µ+) = (Σ_{ω′∈Ω+} µ′(ω′)) τ(µ′) + τ(µ+)
τ*(µ) = τ(µ) if µ ∉ {µ′, µ−, µ+}.

Simple algebra reveals that τ* is Bayes-plausible and yields a higher value than τ does.
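The claim that µ′ is a convex combination of µ− and µ+ can be checked directly: the weights are the probabilities µ′ assigns to Ω− and Ω+. The sketch below uses an illustrative three-state belief and partition, not values from the paper:

```python
import numpy as np

# Belief-splitting step from the proof of Proposition 10: decompose a
# belief mu' into its conditionals on Omega- and Omega+.
mu_p = np.array([0.2, 0.3, 0.5])        # mu', an illustrative three-state belief
minus = np.array([True, False, False])  # hypothetical Omega- (abar uniquely optimal)
plus = ~minus                           # Omega+

w_minus = mu_p[minus].sum()             # weight placed on mu-
w_plus = mu_p[plus].sum()               # weight placed on mu+

mu_minus = np.where(minus, mu_p, 0.0) / w_minus
mu_plus = np.where(plus, mu_p, 0.0) / w_plus

# mu' equals the stated convex combination of mu- and mu+
assert np.allclose(w_minus * mu_minus + w_plus * mu_plus, mu_p)
print("decomposition verified")
```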

9.11 Proof of Proposition 11

We first prove a preliminary lemma.

Lemma 6 Suppose µl and µr are induced by an optimal mechanism and µm = γµl + (1 − γ)µr for some γ ∈ [0, 1]. Then, v̂(µm) ≤ γv̂(µl) + (1 − γ)v̂(µr).

Proof. Suppose to the contrary that τ is induced by an optimal mechanism, τ(µl), τ(µr) > 0, and v̂(µm) > γv̂(µl) + (1 − γ)v̂(µr). Then we can take some weight from µl and µr and place it on µm, which would yield a higher value while preserving Bayes-plausibility. Formally, pick any ε ∈ (0, 1). Let ε′ = ετ(µl)/τ(µr). Consider an alternative τ* defined by:

τ*(µl) = (1 − γε) τ(µl)
τ*(µr) = (1 − (1 − γ)ε′) τ(µr)
τ*(µm) = τ(µm) + ετ(µl)
τ*(µ) = τ(µ) if µ ∉ {µl, µm, µr}.

Simple algebra reveals that τ* is Bayes-plausible and yields a higher value than τ does.

Say that action a is induced-dominant if ∀µ, v̂(µ) ≤ Σ_ω v(a, ω) µ(ω). Say that a is strictly induced-dominant if ∀µ s.t. a ≠ â(µ), v̂(µ) < Σ_ω v(a, ω) µ(ω). Say that a is weakly but not strictly dominant (wnsd) if it is induced-dominant and ∃µ s.t. a ≠ â(µ) and v̂(µ) = Σ_ω v(a, ω) µ(ω). Note that there is information Sender would share if and only if â(µ0) is not induced-dominant, and that Assumption 1 states that there are no wnsd actions. We now prove the proposition in two steps.

Lemma 7 Suppose that Assumption 1 holds. Let µ be an interior belief induced by an optimal mechanism. Then, either: (i) Receiver's preference at µ is not discrete, or (ii) â(µ) is strictly induced-dominant.

Proof. Suppose that Assumption 1 holds and µ is an interior belief induced by an optimal mechanism. Now, suppose Receiver's preference at µ is discrete. By Proposition 4, we know that if µ were the prior, either: (i) there would be no information Sender would want to share, i.e., â(µ) is induced-dominant; or (ii) Sender would benefit from persuasion. But Sender would not benefit from persuasion if µ were the prior, because by Lemma 5 we know V(µ) = v̂(µ). Thus, â(µ) is induced-dominant, so by Assumption 1 it is strictly induced-dominant.

Lemma 8 Suppose Sender benefits from persuasion, µ is an interior belief induced by an optimal mechanism, and â(µ) is strictly induced-dominant. Then, Receiver's preference at µ is not discrete.

Proof. Suppose Sender benefits from persuasion, µ is an interior belief induced by an optimal mechanism, and â(µ) is strictly induced-dominant. First note that the set of beliefs that induces any particular action is necessarily convex. Hence, when Sender benefits from persuasion, any optimal mechanism must induce at least two distinct actions. Therefore, there must be a µ′ induced by the mechanism at which â(µ) ≠ â(µ′). Now, suppose contrary to the Lemma that Receiver's preference at µ is discrete. Then, there is an ε > 0 s.t. â(εµ′ + (1 − ε)µ) = â(µ). Let µm = εµ′ + (1 − ε)µ. Since both µ and µ′ are induced by an optimal mechanism, Lemma 6 tells us that

v̂(µm) ≤ εv̂(µ′) + (1 − ε)v̂(µ). (3)

But,

v̂(µm) = Σ_ω v(â(µm), ω) µm(ω)
= Σ_ω v(â(µ), ω) µm(ω)
= ε Σ_ω v(â(µ), ω) µ′(ω) + (1 − ε) Σ_ω v(â(µ), ω) µ(ω)
= ε Σ_ω v(â(µ), ω) µ′(ω) + (1 − ε) v̂(µ).

Hence, Equation (3) is equivalent to

Σ_ω v(â(µ), ω) µ′(ω) ≤ v̂(µ′),

which means â(µ) is not strictly induced-dominant.

Combining these two lemmata, we know that if Assumption 1 holds, Sender benefits from persuasion, and µ is an interior belief induced by an optimal mechanism, then Receiver's preference at µ is not discrete.

9.12 Proof of Proposition 12

Suppose Assumption 1 holds, v̂ is monotonic, A is finite, and Sender benefits from persuasion. Now, suppose µ is an interior belief induced by an optimal mechanism. Since Assumption 1 holds and Sender benefits from persuasion, Receiver's preference at µ is not discrete by Proposition 11. Therefore, Lemma 1 tells us ∃a such that E_µ[u(â(µ), ω)] = E_µ[u(a, ω)]. If (ii) does not hold, we know E_µ[v(â(µ), ω)] > E_µ[v(a, ω)]. Therefore, â(µ) ≻ a. Hence, given any µ′ ◁ µ,

0 = E_µ[u(â(µ), ω)] − E_µ[u(a, ω)] > E_µ′[u(â(µ), ω)] − E_µ′[u(a, ω)].

Since E_µ′[u(a, ω)] > E_µ′[u(â(µ), ω)], we know that â(µ) is not Receiver's optimal action when her beliefs are µ′. Hence, for any µ′ ◁ µ, â(µ′) ≠ â(µ), which means that µ is a worst belief inducing â(µ).

9.13 Proof of Proposition 13

We first show that no commitment-free mechanism can have a gain greater than E. Suppose the contrary. That means that in equilibrium of some mechanism, Sender conveys with positive probability two messages m and m′ s.t. v̂(µ*(·|m)) > v̂(µ*(·|m′)). But then, since v is independent of ω, Sender would profit from a deviation that always sends m instead of m′.

Next, we show that for any ε > 0, there is a commitment-free mechanism whose gain exceeds E − ε. Given ε, choose v > E − ε + v̂(µ0) s.t. ∃M ⊂ ∆(Ω) s.t. v̂(µ) = v ∀µ ∈ M and µ0 ∈ co(M). By definition of E, such a v exists. Now, since µ0 ∈ co(M), there exists a Bayes-plausible τ s.t. Supp(τ) ⊂ M. Consider a commitment-free mechanism with the signal π that induces this τ. Since v̂(µ) = v ∀µ ∈ M and v is independent of ω, incentive compatibility does not bind. Hence, the value of this mechanism is v, so its gain is strictly greater than E − ε.

9.14 Proof of Proposition 14

Even though it is not feasible for Sender to commit to an honest mechanism, he may still tell the truth in equilibrium. We say an equilibrium of a commitment-free mechanism is truthful if σ*(m|s) = 1 if m = s and σ*(m|s) = 0 if m ≠ s. Given any equilibrium of any commitment-free mechanism, there exists a truthful equilibrium of some commitment-free mechanism that generates the same distribution of actions and beliefs. To see this, note that if some commitment-free mechanism π° has an equilibrium with a message strategy σ°, then the truthful σ is an equilibrium strategy of the mechanism with signal π(m|ω) = Σ_s σ°(m|s) π°(s|ω). Moreover, this π and σ generate the same conditional distribution of messages, and thus of actions and beliefs, as π° and σ°. Hence, we can restrict our attention to truthful equilibria without loss of generality.

With that observation, we establish the following Lemma:

Lemma 9 Consider any equilibrium (σ*, ρ*, µ*) of a commitment-free mechanism. For any two messages m and m′ sent with positive probability:

v_m − v_m′ ≤ max_ω ∫_A v(a, ω) dρ*(a|m) − min_ω ∫_A v(a, ω) dρ*(a|m).

Proof. Supposing w.l.o.g. that the equilibrium is truthful, for any two messages m and m′ sent with positive probability it must be the case that

v_m′ ≥ Σ_ω ∫_A v(a, ω) dρ*(a|m) µ*(ω|m′).

This implies

v_m − v_m′ ≤ v_m − Σ_ω ∫_A v(a, ω) dρ*(a|m) µ*(ω|m′)
= Σ_ω ∫_A v(a, ω) dρ*(a|m) µ*(ω|m) − Σ_ω ∫_A v(a, ω) dρ*(a|m) µ*(ω|m′)
= Σ_ω ∫_A v(a, ω) dρ*(a|m) [µ*(ω|m) − µ*(ω|m′)].

Now, note that

Σ_ω [µ*(ω|m) − µ*(ω|m′)] = 0.

Let Ω+ = {ω ∈ Ω | µ*(ω|m) > µ*(ω|m′)} and Ω− = {ω ∈ Ω | µ*(ω|m) < µ*(ω|m′)}. Then

0 < Σ_{ω∈Ω+} [µ*(ω|m) − µ*(ω|m′)] = Σ_{ω∈Ω−} [µ*(ω|m′) − µ*(ω|m)] ≤ 1.

Hence,

v_m − v_m′ ≤ Σ_ω ∫_A v(a, ω) dρ*(a|m) [µ*(ω|m) − µ*(ω|m′)]
= Σ_{ω∈Ω+} ∫_A v(a, ω) dρ*(a|m) [µ*(ω|m) − µ*(ω|m′)] + Σ_{ω∈Ω−} ∫_A v(a, ω) dρ*(a|m) [µ*(ω|m) − µ*(ω|m′)]
= Σ_{ω∈Ω+} ∫_A v(a, ω) dρ*(a|m) [µ*(ω|m) − µ*(ω|m′)] − Σ_{ω∈Ω−} ∫_A v(a, ω) dρ*(a|m) [µ*(ω|m′) − µ*(ω|m)]
≤ [max_{ω′} ∫_A v(a, ω′) dρ*(a|m)] Σ_{ω∈Ω+} [µ*(ω|m) − µ*(ω|m′)] − [min_{ω′} ∫_A v(a, ω′) dρ*(a|m)] Σ_{ω∈Ω−} [µ*(ω|m′) − µ*(ω|m)]
= [max_{ω′} ∫_A v(a, ω′) dρ*(a|m) − min_{ω′} ∫_A v(a, ω′) dρ*(a|m)] × Σ_{ω∈Ω+} [µ*(ω|m) − µ*(ω|m′)]
≤ max_{ω′} ∫_A v(a, ω′) dρ*(a|m) − min_{ω′} ∫_A v(a, ω′) dρ*(a|m).

The key implication of Lemma 9 is the following. Given an equilibrium of a commitment-free mechanism, let M be the set of all messages sent with positive probability in that equilibrium. Then:

Lemma 10 If in equilibrium of a mechanism ∃m ∈ M s.t. v_m ≤ v̂(µ0), then the gain from that mechanism is at most D.

Proof. Consider an equilibrium (σ*, ρ*, µ*) of a commitment-free mechanism. Suppose ∃m ∈ M s.t. v_m ≤ v̂(µ0). Then, by Lemma 9, for any m′ ∈ M,

v_m′ − v̂(µ0) ≤ v_m′ − v_m
≤ max_ω ∫_A v(a, ω) dρ*(a|m) − min_ω ∫_A v(a, ω) dρ*(a|m)
≤ ∫_A max_ω v(a, ω) dρ*(a|m) − ∫_A min_ω v(a, ω) dρ*(a|m)
= ∫_A [max_ω v(a, ω) − min_ω v(a, ω)] dρ*(a|m)
≤ max_{a∈A} [max_ω v(a, ω) − min_ω v(a, ω)]
= D.

Now, say v̂ is without troughs if there do not exist µa, µb, µc and γ ∈ (0, 1) s.t. µb = γµa + (1 − γ)µc and min{v̂(µa), v̂(µc)} > v̂(µb).

Lemma 11 Consider any belief µ0 and any distribution of beliefs τ with finite support s.t. E_τ[µ] = µ0. If v̂ is without troughs, then ∃µ ∈ Supp(τ) s.t. v̂(µ) ≤ v̂(µ0).

Proof. Suppose v̂ is without troughs. Given any belief µ0 and any distribution of beliefs τ with finite support s.t. E_τ[µ] = µ0, let k = |Supp(τ)|. We prove the Lemma by induction on k. If k = 2, the desired conclusion is immediate. Now, suppose the Lemma holds for k = l. We need to show it holds for k = l + 1. Pick any µ̃ in Supp(τ). Then,

µ0 = Σ_{µ∈Supp(τ)} τ(µ) µ = τ(µ̃) µ̃ + (1 − τ(µ̃)) Σ_{µ∈Supp(τ)\{µ̃}} [τ(µ)/(1 − τ(µ̃))] µ.

Now, suppose v̂(µ̃) > v̂(µ0). Then, since v̂ is without troughs, we know

v̂( Σ_{µ∈Supp(τ)\{µ̃}} [τ(µ)/(1 − τ(µ̃))] µ ) ≤ v̂(µ0).

But, since the Lemma holds for k = l = |Supp(τ)\{µ̃}|, we know there exists a µ̃′ ∈ Supp(τ)\{µ̃} s.t. v̂(µ̃′) ≤ v̂( Σ_{µ∈Supp(τ)\{µ̃}} [τ(µ)/(1 − τ(µ̃))] µ ) ≤ v̂(µ0).

Lemma 11 leads to the following result:

Proposition 16 If v̂ is without troughs, the gain from any commitment-free mechanism is at most D.

Proof. Suppose v̂ is without troughs and consider an equilibrium (σ*, ρ*, µ*) of any commitment-free mechanism. Let τ_m denote the equilibrium probability of message m:

τ_m = Σ_ω Σ_s σ*(m|s) π(s|ω) µ0(ω).

We know that µ0(·) = Σ_m τ_m µ*(·|m). Therefore, by Lemma 11, ∃m ∈ M s.t. v_m ≤ v̂(µ0). Hence, by Lemma 10, the gain from the mechanism is at most D.

Since v̂ is necessarily without troughs if it is monotonic, Proposition 14 is a corollary of Proposition 16.

10 Appendix B: Extension to infinite state spaces

In the main body of the paper, we assumed that Ω is finite. We also claimed this assumption was made primarily for expositional convenience. In this appendix, we show that the approach used in the paper extends to the case when Ω is a compact metric space.33

As before, Receiver has a continuous utility function u(a, ω) that depends on her action a ∈ A and the state of the world ω ∈ Ω. Sender has a continuous utility function v(a, ω) that depends on Receiver's action and the state of the world. The action space A is assumed to be compact and the state space Ω is assumed to be a compact metric space. Let ∆(Ω) denote the set of Borel probabilities on Ω, a compact metric space in the weak* topology. Sender and Receiver share a prior µ0 ∈ ∆(Ω).

A persuasion mechanism is a combination of a signal and a message technology. A signal (π, S) consists of a compact metric realization space S and a measurable function π : [0, 1] → Ω × S, x ↦ (π_1(x), π_2(x)). Assume that x is uniformly distributed on [0, 1] and that Sender observes π_2(x). We denote a realization of π_2(x) by s. Note that since S is a compact metric space (hence, complete and separable), there exists a regular conditional probability (i.e., a posterior probability) obtained by conditioning on π_2(x) = s (Shiryaev 1996, p. 230). A message technology c consists of a message space M and a family of functions {c(·|s) : M → R+}_{s∈S}. As before, a mechanism is honest if M = S and c(m|s) = k if s = m, c(m|s) = ∞ if s ≠ m, for some k ∈ R+. A persuasion mechanism defines a game just as before. Perfect Bayesian equilibrium is still the solution concept and we still select Sender-preferred equilibria. Definitions of value and gain are the same as before.

33 We are very grateful to Max Stinchcombe for help with this extension.

Let a*(µ) denote the set of actions optimal for Receiver given that her beliefs are µ ∈ ∆(Ω):

a*(µ) ≡ arg max_a ∫ u(a, ω) dµ(ω).

Note that a*(·) is an upper hemicontinuous, non-empty-valued, compact-valued correspondence from ∆(Ω) to A. Let v̂(µ) denote the maximum expected value of v if Receiver takes an action in a*(µ):

v̂(µ) ≡ max_{a∈a*(µ)} ∫ v(a, ω) dµ(ω).

Since a*(µ) is non-empty and compact and ∫ v(a, ω) dµ(ω) is continuous in a, v̂ is well defined.

We first show that the main ingredient for the existence of an optimal mechanism, namely the upper semicontinuity of v̂, remains true in this setting.

Lemma 12 v̂ is upper semicontinuous.

Proof. Given any a, the random variable v(a, ω) is dominated by the constant random variable max_ω v(a, ω) (since v is continuous in ω and Ω is compact, the maximum is attained). Hence, by Lebesgue's Dominated Convergence Theorem, ∫ v(a, ω) dµ(ω) is continuous in µ for any given a. Now, suppose that v̂ is discontinuous at some µ. Since u is continuous, by Berge's Maximum Theorem this means that Receiver must be indifferent between a set of actions at µ, i.e., a*(µ) is not a singleton. By definition, however, v̂(µ) ≡ max_{a∈a*(µ)} ∫ v(a, ω) dµ(ω). Hence, v̂ is upper semicontinuous.

Now, a distribution of posteriors, denoted by τ, is an element of ∆(∆(Ω)), the set of Borel probabilities on the compact metric space ∆(Ω). We say a distribution of posteriors τ is Bayes-plausible if ∫_{∆(Ω)} µ dτ(µ) = µ0. We say that π induces τ if conditioning on π_2(x) = s gives posterior κ_s and the distribution of κ_{π_2(x)} is τ given that x is uniformly distributed. Since Ω is a compact metric space, for any Bayes-plausible τ there exists a π that induces it.34 Hence, the problem of finding an optimal mechanism is equivalent to solving

max_{τ∈∆(∆(Ω))} ∫_{∆(Ω)} v̂(µ) dτ(µ)
s.t. ∫_{∆(Ω)} µ dτ(µ) = µ0.

34 Personal communication with Max Stinchcombe. Detailed proof available upon request.

Now, let

V(µ) ≡ sup{z | (µ, z) ∈ co(hyp(v̂))},

where co(·) denotes the convex hull and hyp(·) denotes the hypograph. Recall that given a subset K of an arbitrary vector space, co(K) is defined as ∩{C | K ⊂ C, C convex}. Let g(µ0) denote the subset of ∆(∆(Ω)) that generates the point (µ0, V(µ0)), i.e.,

g(µ0) ≡ { τ ∈ ∆(∆(Ω)) | ∫_{∆(Ω)} µ dτ(µ) = µ0 and ∫_{∆(Ω)} v̂(µ) dτ(µ) = V(µ0) }.

Note that we still have not established that g(µ0) is non-empty. That is the primary task of the proof of our main proposition.

Proposition 17 An optimal mechanism exists. The value of an optimal mechanism is V(µ0). Sender benefits from persuasion iff V(µ0) > v̂(µ0). An honest mechanism with a signal that induces an element of g(µ0) is optimal.

Proof. By construction of V, there can be no mechanism with value strictly greater than V(µ0). We need to show there exists a mechanism with value equal to V(µ0), or equivalently, that g(µ0) is not empty. Without loss of generality, suppose the range of v is [0, 1]. Consider the set H = {(µ, z) ∈ hyp(V) | z ≥ 0}. Since v̂ is upper semicontinuous, H is compact. By construction of V, H is convex. Therefore, by Choquet's Theorem (e.g., Phelps 2001), for any (µ′, z′) ∈ H, there exists a probability measure η s.t. (µ′, z′) = ∫_H (µ, z) dη(µ, z) with η supported by the extreme points of H. In particular, there exists an η s.t. (µ0, V(µ0)) = ∫_H (µ, z) dη(µ, z) with η supported by the extreme points of H. Now, note that if (µ, z) is an extreme point of H, then V(µ) = v̂(µ); moreover, if z > 0, z = V(µ) = v̂(µ). Hence, we can find an η s.t. (µ0, V(µ0)) = ∫_H (µ, z) dη(µ, z) with the support of η entirely within {(µ, v̂(µ)) | µ ∈ ∆(Ω)}. Therefore, there exists a τ ∈ g(µ0).

References

Aumann, Robert J., & Maschler, Michael B. 1995. Repeated Games with Incomplete Information. MIT Press.
Benabou, Roland, & Tirole, Jean. 2002. Self-confidence and personal motivation. Quarterly Journal of Economics, 117(3), 871–915.
Benabou, Roland, & Tirole, Jean. 2003. Intrinsic and extrinsic motivation. Review of Economic Studies, 70, 489–520.
Benabou, Roland, & Tirole, Jean. 2004. Willpower and personal rules. Journal of Political Economy, 112(4), 848–885.
Bodner, Ronit, & Prelec, Drazen. 2003. Self-signaling and diagnostic utility in everyday decision making. Pages 105–123 of: Brocas, Isabelle, & Carrillo, Juan D. (eds), The Psychology of Economic Decisions. Oxford: Oxford University Press.
Brocas, Isabelle, & Carrillo, Juan D. 2007. Influence through ignorance. RAND Journal of Economics, 38, 931–947.
Caillaud, Bernard, & Tirole, Jean. 2007. Consensus building: How to persuade a group. American Economic Review, 97(5), 1877–1900.
Cain, Daylian M., Loewenstein, George, & Moore, Don A. 2005. The dirt on coming clean: Perverse effects of disclosing conflicts of interest. Journal of Legal Studies, 34(1), 1–25.
Carrillo, Juan D., & Mariotti, Thomas. 2000. Strategic ignorance as a self-disciplining device. Review of Economic Studies, 67(3), 529–544.
Crawford, Vincent, & Sobel, Joel. 1982. Strategic information transmission. Econometrica, 50(6), 1431–1451.
Ettinger, David, & Jehiel, Philippe. Forthcoming. A theory of deception. American Economic Journal: Microeconomics.
Glazer, Jacob, & Rubinstein, Ariel. 2004. On optimal rules of persuasion. Econometrica, 72, 1715–1736.
Glazer, Jacob, & Rubinstein, Ariel. 2006. A study in the pragmatics of persuasion. Theoretical Economics, 1, 395–410.
Green, Jerry R., & Stokey, Nancy L. 2007. A two-person game of information transmission. Journal of Economic Theory, 135(1), 90–104.
Grossman, Sanford J. 1981. The informational role of warranties and private disclosure about product quality. Journal of Law and Economics, 24(3), 461–483.
Grossman, Sanford J., & Hart, Oliver D. 1983. An analysis of the principal-agent problem. Econometrica, 51(1), 7–45.
Grossman, Sanford J., & Hart, Oliver D. 1986. The costs and benefits of ownership: A theory of vertical and lateral integration. Journal of Political Economy, 94(4), 691–719.
Hart, Oliver, & Moore, John. 1990. Property rights and the nature of the firm. Journal of Political Economy, 98(6), 1119–1158.
Hiriart-Urruty, Jean-Baptiste, & Lemaréchal, Claude. 2004. Fundamentals of Convex Analysis. Springer.
Holmstrom, Bengt. 1979. Moral hazard and observability. Bell Journal of Economics, 10(1), 74–91.
Ivanov, Maxim. 2008. Informational control and organizational design. Working Paper.
Johnson, Justin P., & Myatt, David P. 2006. On the simple economics of advertising, marketing, and product design. American Economic Review, 96(3), 756–784.
Jovanovic, Boyan. 1982. Truthful disclosure of information. Bell Journal of Economics, 13(1), 36–44.
Kartik, Navin. Forthcoming. Strategic communication with lying costs. Review of Economic Studies.
Lazear, Edward P. 2006. Speeding, terrorism, and teaching to the test. Quarterly Journal of Economics, 121(3), 1029–1061.
Lewis, Tracy R., & Sappington, David E. M. 1994. Supplying information to facilitate price discrimination. International Economic Review, 35(2), 309–327.
McCloskey, Donald, & Klamer, Arjo. 1995. One quarter of GDP is persuasion. American Economic Review Papers and Proceedings, 85(2), 191–195.
Milgrom, Paul. 1981. Good news and bad news: Representation theorems and applications. Bell Journal of Economics, 12(2), 380–391.
Milgrom, Paul, & Roberts, John. 1986. Relying on the information of interested parties. RAND Journal of Economics, 17(1), 18–32.
Milgrom, Paul R., & Weber, Robert J. 1982. A theory of auctions and competitive bidding. Econometrica, 50(5), 1089–1122.
Mullainathan, Sendhil, Schwartzstein, Joshua, & Shleifer, Andrei. 2008. Coarse thinking and persuasion. Quarterly Journal of Economics, 123(2), 577–619.
Myerson, Roger B. 1979. Incentive compatibility and the bargaining problem. Econometrica, 47(1), 61–73.
Ostrovsky, Michael, & Schwarz, Michael. 2008. Information disclosure and unraveling in matching markets. Working Paper.
Phelps, Robert R. 2001. Lectures on Choquet's Theorem. Springer.
Prendergast, Canice. 1992. The insurance effect of groups. International Economic Review, 33(3), 567–581.
Rayo, Luis, & Segal, Ilya. 2008. Optimal information disclosure. Working Paper.
Shin, Hyun Song. 2003. Disclosures and asset returns. Econometrica, 71(1), 105–133.
Shiryaev, A. N. 1996. Probability. Springer.
Shmaya, Eran, & Yariv, Leeat. 2009. Foundations for Bayesian updating. Working Paper.
Spence, A. Michael. 1973. Job market signaling. Quarterly Journal of Economics, 87(3), 355–374.
Taub, Bart. 1997. Dynamic agency with feedback. RAND Journal of Economics, 28(3), 515–543.