1 Multi-Item Auctions Defying Intuition? 1 CONSTANTINOS DASKALAKIS Massachusetts Institute of Technology The best way to sell n items to a buyer who values each of them independently and uniformly + 1] is to bundle them together, as long as c is large enough. Still, for any c,c , the randomly in [ c grand bundling mechanism is never optimal for large enough n , despite the sharp concentration of the buyer’s total value for the items as grows. Optimal multi-item mechanisms are rife with n unintuitive properties, making multi-item generalizations of Myerson’s celebrated mechanism a daunting task. We survey recent work on the structure and computational complexity of revenue- optimal multi-item mechanisms, providing structural as well as algorithmic generalizations of Myerson’s result to multi-item settings. Theory of Computation ]: General Categories and Subject Descriptors: F.0 [ General Terms: Algorithms, Economics, Theory Additional Key Words and Phrases: Auctions, Multidimensional Mechanism Design 1. INTRODUCTION Optimal mechanism design is a problem with important applications and deep mathematical structure. In its basic formulation, studied in this survey, a seller has n m interested buyers. Each buyer knows his own values for items to sell to the items, but the seller and the other buyers only know a distribution from which these values are assumed to be drawn. The goal is to design a sales procedure, mechanism , that optimizes the expected revenue of the seller. called a The basic version of the problem and its myriad extensions have familiar appli- cations. Here are a few quick ones: When auction-houses sell items, this is the problem that they face. This is also the problem that governments face when auc- tioning a valuable public resource such as wireless spectrum. Finally, the problem arises every millisecond as auctions are used in sponsored search and the allocation of banner advertisements. When it comes to selling a single item, optimal mechanism design is really well understood. Building on Myerson’s celebrated work [Myerson 1981], it has been studied intensely for decades in both Economics and Computer Science. This re- search has revealed surprisingly elegant structure in the optimal mechanism, as well as robustness to the details of the distributions, and has had a deep impact in the broader field of mechanism design. While all this progress has been taking place on the single-item front, the multi- item version of the problem has remained poorly understood. Despite substantial research effort, it is not even known how to optimally sell two items to one buyer. On the contrary, multi-item auctions appear to have very rich structure, often 1 Supported by a Microsoft Research faculty fellowship, NSF Awards CCF-0953960 (CAREER) and CCF-110149, and ONR grant N00014-12-1-0999. Author’s address: [email protected] ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

2 42 · C. Daskalakis exhibiting unintuitive properties. In this survey, we present recent progress on the structure and computation of optimal multi-item auctions. Our survey will focus on the philosophy and intuition behind the results, but we will also give substantial technical detail. Our intention is not to present a complete account of results on multi-item mechanisms, but to motivate what we view as a fresh perspective on the problem. In Section 2, we take a brief tour of the wondrous land of single-item Structure. problems, where we discuss the surprising simplicity of optimal mechanisms. Then, in Section 3, we present multi-item examples showing various ways in which the simplicity of optimal single-item mechanisms fails to generalize. A few of these examples are particularly striking as they illustrate quite unintuitive properties of optimal multi-item mechanisms. The structure of these mechanisms appears so rich that we go one step back, in Section 4, discussing approaches that may be able to accommodate this richness. Then, in Section 5, we present a duality based approach to the multi-item problem, showing how it can be used to characterize single-bidder mechanisms. We also show how this framework can be used to demys- tify the examples of Section 3, which all pertain to a single bidder. In Section 6, we turn to computation, presenting computationally efficient algorithms for the multi- item multi-bidder problem. As a byproduct, these algorithmic results offer a crisp characterization of the structure of optimal multi-item multi-bidder mechanisms. Finally, in Section 7, we wrap up with a short summary and future directions. THE WONDROUS MYERSON-LAND 2. Consider the task of selling an item to a single buyer with the goal of maximizing the seller’s revenue. The following is a well-known fact. The optimal way Fact 1 [Myerson 1981; Riley and Zeckhauser 1983]. to sell one item to a buyer whose value for the item is drawn from some known distribution F is a take-it-or-leave-it offer of the item at some price in arg max { x · (1 F ( x )) } . − Example 1. The optimal way to sell one item to a buyer whose value for the item is uniformly distributed in [0 , 1] is to price the item at 0 . 5 . The expected revenue is 0 25 . . While it is perhaps intuitive that this should be true, it still surprising that, among all possible communication protocols that the seller and buyer could engage in, the optimal one would require minimal communication and involve no randomization at all. In this light, it is even more surprising that the optimal auction would maintain its simplicity when multiple buyers are involved. When one item is sold to Fact 2 [Myerson 1981]. buyers whose values for m 2 the item are i.i.d. from a known, regular distribution F , the optimal mechanism is a second price auction with reservation price arg max { x · (1 − F ( x )) } . − ) x ( F 1 2 with density iff − regular A distribution is called f x F is monotone increasing. ( x ) f ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

3 Multi-Item Auctions Defying Intuition? · 43 There are several reasons why this fact is surprising, besides the simplicity of the optimal mechanism: , namely the auctioneer does not need access (1) The mechanism is deterministic to a random generator to implement the allocation and pricing. one round of communication (2) The mechanism requires only , namely the bidders submit bids to the auctioneer who then decides the outcome without any further exchanges with them, except to announce the outcome. , but it is optimal among (3) The mechanism is dominant strategy truthful (DST) 3 the larger class of Bayesian Incentive Compatible (BIC) mechanisms. In fact, these properties carry over to settings where bidder values are not nec- essarily i.i.d. and their distributions are not necessarily regular. Fact 3 [Myerson 1981]. When one item is sold to m buyers whose values for the item are independently drawn from known distributions, the optimal mechanism virtual welfare maximizer is a , namely: ( ) Bidders are asked to report bids for the item: b 1 ,...,b . m 1 2 ) Bids are transformed ( what are called “ironed virtual bids,” into h depends on the corresponding bid- ( b ) ) ,...,h · ( b ( ) , where each h i m 1 1 m der’s value distribution (but not on the other bidders’ distributions and not 4 even ). m 3 ) The item is allocated to the bidder with the highest ironed virtual bid, with some ( lexicographic tie-breaking. ( ) The winner of the item is charged his threshold bid, namely the smallest bid he 4 could place and still win the item. Facts 1-3 are both surprising and powerful, providing simple, yet sharp and ver- satile machinery for revenue optimization in single-item and, more broadly, single- 5 dimensional environments. Importantly, they have provided solid foundation for a tremendous literature that has brought tools from approximation algorithms and probability theory into mechanism design. Building on the shoulders of Myerson, this literature strives to understand how to make the theory robust, further improv- ing the simplicity of mechanisms and reducing their dependence on the details of the bidders’ distributions, both at a quantifiable loss in revenue. See, e.g., [Hartline 2013; Chawla and Sivan 2014; Roughgarden 2015] for recent surveys of this work. 3 A mechanism is called iff it is in the best interest of every bidder Dominant Strategy Truthful to truthfully report their value to the mechanism, regardless of what the other bidders report. Bayesian Incentive Compatible mechanisms are a broader class, but we postpone their definition, as it is slightly technical and not important for our discussion right now. See Definition 6. 4 The precise functional form of the ’s is not important for this survey. h i 5 In a single-dimensional environment, the seller can provide service to several buyers, subject to constraints on which buyers can receive service simultaneously. Each buyer has a value for receiving service, which is distributed according to some distribution known to the seller and the other buyers. Single-dimensional environments clearly generalize single-item environments, where only one buyer can be served the item. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

4 44 · C. Daskalakis AUCTIONS DEFYING INTUITION 3. Despite remarkable progress on the single-item front over the past few decades, revenue optimization in multi-item settings has remained poorly understood. We do not even have a sharp characterization of optimal two-item mechanisms, even when there is a single buyer. On the contrary, multi-item mechanisms exhibit such rich structure that it is difficult to imagine what a generalization of Myerson’s results could look like. Or, better said, the generalizations that we can imagine can be shot down via simple examples. To illustrate the richness of multi-item mechanisms, let us consider some simple multi-item settings and their corresponding optimal mechanisms. All our examples items and a single additive buyer will involve a seller with n . Such a buyer is v ,...,v ) of values for the items and derives characterized by a private vector ( 1 n ∑ p and S v ]. If utility p − p to get the items in set S ⊆ [ n , whenever he pays i ∈ S i ∑ [ v ]. p − are random his utility is E i S ∈ i 3.1 Bundling When it comes to multiple items, a natural question to ask is whether we can use Myerson’s technology to design optimal mechanisms, and indeed what exactly it is that we should be selling. The following simple example illustrates that we may need to bundle items, even when there is a priori no interaction between them. Example 2. Suppose n = 2 and the buyer’s values are i.i.d., uniform in { 1 , 2 } . Since the buyer is additive, and his values for the items are independent, getting one of the two items will not affect his marginal value for getting the other item as well. Since there is no interaction between the item values, it is natural to expect that the optimal mechanism should sell the two items separately. By Fact 1, this and letting the buyer decide which of them to would mean pricing each item at 1 2 buy. Simple calculations show that the expected revenue of this mechanism is . Interestingly, there is a flaw in this logic. While it is true that item-values do not “interact with each other,” we may still want to capitalize on the fact that the buyer’s average item-value is better concentrated than his value for a specific item. Indeed, it is better to only offer the bundle 1 , 2 } of both the items at price 3 . Given { ∑ that Pr[ . It can v 2 ≥ 3] = 3 / 4 , the expected revenue of the seller is now 9 / 4 > i i be shown that this is the optimal mechanism [Daskalakis et al. 2014]. Example 2 illustrates the following. Fact 4. Optimal multi-item mechanisms may require bundling, even when there is a single additive buyer with independent values for the items. Our intuition for the effectiveness of bundling in Example 2 appealed to the concentration of the buyer’s surplus (i.e. total value for both items). There was still a flaw in our logic, however, and this time a less tangible one: Why is the surplus the right benchmark to compare against? The following result, discussed in more detail in Section 5.8, illustrates a setting where the structure of the optimal mechanism is different in two asymptotic regimes. Consider selling n items to a buyer Theorem 1 [Daskalakis et al. 2015]. whose values for the items are i.i.d. uniform in [ c,c + 1] . The following are true: ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

5 Multi-Item Auctions Defying Intuition? · 45 1 ) n , there exists c ( such that, for all c > c For all , the optimal mechanism only 0 0 offers the bundle of all the items together at some price. For all c , there exists 2 such that, for all n > n ( , the optimal mechanism does ) n 0 0 not only offer the bundle of all items together at some price. n , the → ∞ Part 2 of the above theorem is especially counter-intuitive. As ∑ v i i , becomes more and more concentrated buyer’s average value for the items, n + 0 . 5. It is clear that the seller cannot hope to extract higher around its mean, c n ( c + 0 . 5), and offering the grand revenue than the buyer’s total expected value, n bundle for c + 0 . 5 − ), for the tiniest discount > 0, would make the buyer ( n → ∞ accept to purchase it with probability arbitrarily close to 1 as . Still, for does this intuition materialize, and it never becomes optimal to only sell the n no grand bundle ... 3.2 Randomization Recall that optimal single-item mechanisms do not require randomization. Our next example illustrates that this is not the case in multi-item settings. Suppose n = 2 , and v Example 3. is distributed uniformly in { 1 , 2 } while 1 v is independently distributed uniformly in { 1 , 3 } . In this example, an optimal 2 deterministic mechanism prices item 1 at 1 and item 2 at 3 . Its expected revenue is 2 5 . . 4 Still, we can do better. We can offer the buyer two options: The first is to pay . 5 to get a “lottery ticket” that allocates and get both items. The second is to pay 2 1 with probability 1 2 with probability 1 / 2 . The expected revenue of item and item 2 this mechanism is 625 , which can be shown to be optimal [Daskalakis et al. 2014]. . So randomization is necessary for revenue maximization. It turns out its effect on the revenue may actually be quite dramatic. Fact 5. Optimal multi-item mechanisms may require randomization. The gap between the revenue of the optimal randomized and the optimal deterministic mecha- nism can be arbitrary large, even when there are two items and a single buyer [Briest et al. 2010; Hart and Nisan 2013]. Menu Size Complexity 3.3 In our previous examples, we described the optimal mechanism as a menu of op- tions for the buyer to choose from. If the optimal mechanism were guaranteed to be deterministic, describing it as a menu would require a bounded number of op- tions, as there is a finite number of possible bundles that the mechanism may offer. Given Fact 5, however, it becomes unclear how to specify the optimal mechanism. The following example illustrates that representing it as an explicit menu may be infeasible. Example 4 [Daskalakis et al. 2013]. Suppose n = 2 , and v is distributed 1 according to the Beta distribution with parameters , 3) while v (3 is independently 2 6 (3 , 4) . Then the distributed according to the Beta distribution with parameters 6 The Beta distribution with parameters ( α,β ) is distributed in [0 , 1] according to the density 1 α − β − 1 x function ( f (1 − x ) x ) ∝ . ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

6 46 · C. Daskalakis optimal mechanism needs to offer uncountably many lotteries. The alarming feature of Example 4 is that it becomes unclear whether the alloca- tion and price rule of the optimal mechanism can be effectively described via a small 7 In Section 5.6.2, number of parameters, even if the buyer’s distribution can be. we will show that the optimal mechanism for Example 4 can actually be described via a small number of parameters. However, this may not be true in general. 3.4 Non-Monotonicity It seems intuitive that a seller with more valuable items should expect a higher revenue from selling them. One way to quantify this intuition is the following. and G such that F first-order stochastically domi- Consider two distributions F G , denoted F nates G . This means that, for all x ∈ R , F ( x ) ≤ G ( x ), i.e. F and 1 G can be coupled so that F always samples a value larger than the value sampled by G . It easily follows from Fact 1 that a seller selling an item to some buyer whose value for the item is distributed according to F makes higher revenue than if the G buyer’s value were distributed according to . Surprisingly this fails to hold in multi-item settings! There exist distributions F and G such that Fact 6 [Hart and Reny 2012]. F G but the optimal revenue from selling two items to a buyer whose values are 1 i.i.d. from F is smaller than if they were i.i.d. from G . 4. CROSSROADS It is clear from our examples in the previous section that, in the close neighborhood of Myerson’s setting, optimal mechanisms exhibit rich structure and may defy our intuition. It is not even clear if they have a finite effective representation. In view of these complications, there are several directions we may want to pursue: (1) Forget about trying to understand optimal mechanisms and pursue approxima- tions directly. Even though optimal mechanisms are complex, there may still be simple mechanisms that can be shown to guarantee some good fraction of the optimal revenue. (2) Forget about characterizing the structure of optimal mechanisms and study instead whether they can be computed efficiently. (3) Develop new machinery to characterize the structure of optimal mechanisms. Despite their apparent complexity and fragility, there may still be a different lens through which they exhibit more structure. It is a priori dubious whether the approximation approach can lead anywhere. Without understanding the optimal mechanism, how can we possibly establish the approximate optimality of some other mechanism? It is quite surprising then that this approach has actually been quite fruitful: 7 Of course, in a somewhat perverse way, the buyer’s distribution itself “indexes” the optimal mechanism for that distribution. But this does not count as it is unclear if there is an efficient procedure that can implement the mechanism given this description. We will touch upon this point a bit later, when we discuss the computation of optimal mechanisms in Section 6. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

7 Multi-Item Auctions Defying Intuition? · 47 —In multi-item settings with additive buyers, [Hart and Nisan 2012; Li and Yao 2013; Babaioff et al. 2014; Yao 2015; Rubinstein and Weinberg 2015] design approximate mechanisms by carefully decomposing the support of the buyers’ distributions into regions and, due to lack of a better benchmark, using expected welfare as an obvious upper bound to the optimal revenue, competing against this stronger benchmark in some of the regions. Here is a very interesting and clean result that they obtain for the setting of the previous section. Theorem 2 [Babaioff et al. 2014]. n items are sold to a buyer When whose values for the items are independent, a mechanism that either only prices individual items or only prices the bundle of all the items obtains at least / 6 -th 1 of the optimal revenue. 8 —In multi-item settings with unit-demand buyers, [Chawla et al. 2007; Chawla et al. 2010] upper bound the optimal revenue by the optimal revenue in a related single-dimensional setting, and define sequential posted price mechanisms for the multi-item setting competing against this stronger upper bound. These approximation results provide simple ways through which a constant frac- tion of the optimal revenue can be attained, bypassing the difficulties coming from our lack of understanding of optimal multi-item mechanisms. Moreover, they help us understand the tradeoffs between simplicity, optimality and generality of mecha- nisms. For example, Theorem 2 tells us that, if we are willing to sacrifice generality in the model and 5 6-ths of the revenue, a very simple mechanism will work for / us. Still such approximation results only apply to restricted settings, e.g. they typically assume independence of the distributions across items, and it is unclear how to extend them to broader settings: correlation among items, more complex buyer valuations, and more complex allocation constraints. Ultimately, it is our belief that a cohesive theory of optimal multi-item mech- anisms cannot be obtained through disparate approximation results applying to different multi-item environments. And even where these approximation results do apply they may still not provide fine-tuned insight into the salient features of the setting responsible for revenue. As a simple example, the non-monotonicity of revenue (Fact 6) is not foreseeable from Theorem 2 alone. On the contrary, the theorem hints towards the incorrect conclusion. In this light, over the past several years we have pursued the challenge of charac- terizing optimal multi-item mechanisms from both an algorithmic and a structural perspective. In the next two sections, we give a flavor of our progress on these fronts. In both sections, we discuss our philosophy as well as give an overview of results and techniques. We note that our goal is not the coverage of all results in the literature, but mostly the philosophy behind them. So we will focus on our philosophy and a biased sample of primarily our own results. 8 These are buyers who only want to purchase one item. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

8 48 · C. Daskalakis THE STRUCTURE OF MULTI-ITEM MECHANISMS 5. 5.1 Philosophy In Sections 3.1–3.4 we gave several multi-item examples, making claims about their optimal mechanisms. Taking a step back, how did we go about proving these claims? More broadly, how is the optimality of some object, such as a mechanism, established? Two principled approaches for doing this are the following. The first entails formulating the problem at hand as a convex minimization/concave maximization problem, and then showing that the object whose optimality we want to establish satisfies first-order conditions. The second is to develop a dual formulation and identify a solution to the dual that matches the value attained by the object whose optimality is to be established in the primal. In single-dimensional environments, Facts 1-3 can be shown using the first ap- proach. Through a chain of deductions, [Myerson 1981] expresses the expected revenue of a mechanism as an expected (virtual) welfare quantity, which can be optimized in a point-wise manner. In multi-item environments, it is not clear how to generalize Myerson’s approach, except under significant restrictions on the value distributions; see [Rochet and Stole 2003; Manelli and Vincent 2007]. This motivates the second approach, which we have pursued in the context of a single, additive buyer in [Daskalakis et al. 2013; 2015]. We give a flavor of our approach with examples in the following sections. 5.2 Setting n items and an addi- In this section, we restrict our attention to a seller with tive buyer whose values for these items are jointly distributed according to some distribution . For simplicity, we assume that F is supported on some box F ∏ n , and is continuously differentiable with bounded partial ⊂ R x , ] = X x [ i + i i derivatives. We call X the typeset of the buyer, and its elements the buyer’s possi- ble types . If a buyer of type x is allocated a subset S of the items and pays price t ∑ for it, his utility is are t x and − t . Our buyers are also risk-neutral, so if S i ∈ i S ∑ random, then their utility is [ E x − t ]. S,t i ∈ S i The seller only knows F , but the buyer knows his realized type, and the seller’s goal is to design a mechanism to optimize her expected revenue. By the revelation principle the optimal mechanism can be described in the form of a menu whose p ,...,p and entries are lotteries. Each lottery specifies a vector of probabilities 1 n t . If purchased, it will allocate each item i independently with probability a price p . Given a menu of lotteries, the buyer will choose the lottery that optimizes his i utility, given his type. If all lotteries give him negative utility, then he will not purchase any of them. 5.3 Convex Optimization Formulation To develop our duality framework we start with expressing our mechanism design problem as a convex optimization problem. Here we have a few options. (1) Represent the mechanism as a menu of lotteries for the buyer to choose from. There are a few reasons why we dislike this approach. First, the menu of the optimal mechanism may be uncountable in size, as Example 4 illustrates. So we would have to represent it as a continuous set. Second, given some representation ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

9 Multi-Item Auctions Defying Intuition? · 49 of the menu, it is cumbersome to express the expected revenue resulting from this menu, as such an expression would have to incorporate the buyer’s optimization over lotteries in the menu. Finally, each lottery in the menu is multi-parametric, allocation probabilities as well as a price. n comprising Represent the mechanism as a menu, but also keep track of which lottery in (2) the menu each possible type of buyer will purchase. In this case, a mechanism can n , allocation function X → [0 : 1] the (i) be represented as a pair of functions: P specifying the allocation probabilities of the lottery that each type will purchase, specifying the price that each type the T : X → R (ii) if any; and price function will pay for the purchased lottery, if any. Now, the representation of the mechanism is simpler, and we can easily express its expected revenue as follows: ∫ T ( (1) ) dF ( x ) . x X Of course, we need to add some consistency constraints to make sure that our modeling is faithful: ′ ′ ′ X : x ·P ( x ) −T ( x,x ∀ ≥ x ·P ( x x ) −T ( x ∈ ); (2) ) ∀ x ∈ X : x ·P ( x ) −T ( x ) ≥ 0 . (3) Constraint (2) expresses that no type prefers a different lottery to the one we maintain for this type, while Constraint (3) expresses that each type will actually buy this lottery. Finding the pair of functions ( T ) optimizing (1) subject to the constraints (2) P , and (3) was the approach taken by [Myerson 1981] and is quite standard. In the multi-item setting, however, this representation is still cumbersome to work with as, besides having an P , T ) also has an -variate input, the compound function ( n + 1)-dimensional output. ( n (3) Given the complexity of the standard representation, we decide to optimize over mechanisms indirectly. Rather than optimizing revenue in terms of the mechanism’s allocation and price functions, we want to explore whether we can optimize revenue in terms of the buyer’s from participating in the mechanism. utility Indeed, facing some mechanism, a buyer of type will decide to buy some lottery x 9 x ,t So, every mechanism ), thereby enjoying utility p ( · p − t from his decision. x x x x induces a function u : X → R , where u ( x ) expresses the utility of the buyer when his realized type is x and he buys his favorite lottery in the mechanism, if any. Our u goal is to optimize over mechanisms indirectly by optimizing over ’s, which raises two questions: u : X (1) Given a function R , can we recognize whether there is a mechanism → inducing this function u as the utility of the buyer? (2) If u is induced by a mechanism, is there enough information in u to uniquely specify the expected revenue of whatever mechanism induces , and can we get u our hands on a mechanism that induces u ? 9 ~ Let us assume that the lottery ( 0 , 0) is always in our menu to account for the possibility that the buyer may derive negative utility from all lotteries in the menu and hence decide to buy none. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

10 50 · C. Daskalakis u is informative enough about the mechanism(s) A priori it is unclear whether that induce(s) it. In fact, it appears that it is losing information about these mechanisms. Nevertheless, the answer to each of the two questions above is actually “yes,” due to the following theorem by Rochet. : X Function R is induced by a mechanism u Theorem 3 [Rochet 1987]. → is 1 -Lipschitz continuous with respect to the iff u -norm, non-decreasing, convex ` 1 u ( x ) ∇ and non-negative. Moreover, if these conditions are met then exists almost X , and wherever it exists: everywhere in ∇ u ( x ) — x , and are the allocation probabilities of the lottery purchased by type — u ( x ) · x − u ( x ) is the price of the lottery purchased by type x ∇ u . in any mechanism inducing The theorem follows by combining Constraints (2) and (3). For a concise deriva- tion of the theorem, please refer to Lecture 21 of [Daskalakis 2015]. In order to ground it to our experience, let us plot the utility induced by the optimal mecha- nism in Example 1. The utility is shown in Figure 1. We can verify that it satisfies the conditions of the theorem, and its derivatives contain information about the allocation function of the mechanism. u 1 / 2 1 v 0 1 / 2 Fig. 1. The utility induced by a take-it-or-leave it offer of an item at 0 . 5. With Theorem 3, we can formulate our mechanism design problem as follows: ∫ ( ∇ u ( x ) · x sup u ( x )) dF ( x ) − X X u ( x ) − u ( y ) |≤| x − y | s.t. , ∀ x,y ∈ | 1 u : non-decreasing : convex u ( x ) ≥ 0 u ∀ x ∈ X. , Note that our formulation aims at optimizing the expected price paid by the buyer, which is given by ∇ u ( x ) · x − u ( x ) when his type is x , subject to the constraints ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

11 Multi-Item Auctions Defying Intuition? · 51 u . Since u ( x ) may only be undefined at a measure zero subset of on the function ∇ F has no atoms), we can take “ u ( x )” to mean anything when it X (given that ∇ is undefined, and this will not affect our revenue. For concreteness, from now on, at x ), we will mean any subgradient of u ∇ x (which will exist whenever we write u ( is convex). as u u is a scalar function The advantage of our new formulation is that the variable of the buyer’s type. The downside is that the objective function is cumbersome and the constraints, especially convexity, are difficult to handle. We can eliminate the first issue with a little massaging of the objective, using the divergence theorem. f denotes the density of Our objective can equivalently be written as follows, where η ( x ) denotes the outer unit normal vector at point F ∈ ∂X of the boundary and ˆ x X —see Lecture 21 of [Daskalakis 2015] for a concise derivation. of ∫ ( ∇ u ( x ) · x (4) u ( x )) dF ( x ) ≡ − X ∫ ∫ f x ) f ( x )( x · ˆ η ( x )) dx − ∇ (5) u ( x )( ( u ( x ) · x + ( n + 1) f ( x )) dx. X ∂X u . Indeed, it can be viewed as Our massaged objective is linear in our variable the “expectation” of u μ with the following with respect to the signed measure 10 density: f ( x )( x · ˆ η ( x )) 1 . )) x − ( ∇ f ( x ) · x + ( n + 1) f ( ∂X ∈ x Equipped with the above definition, we can express our mechanism design problem as the following convex optimization problem. ∫ x u ( sup ) dμ ( x ) X , (6) s.t. | u ( x ) − u ( y ) |≤| x − y | ∈ X ∀ x,y 1 ) : P ( : non-decreasing u (7) u : convex (8) ( x ) ≥ 0 , ∀ x u X. (9) ∈ Balancing . For u ( μ ) = 1, integral 4 becomes − 1. Hence, so does integral 5, x ∫ X balanced, dμ = − 1. So μ ( and therefore ) = − 1. It is convenient to have μ X namely satisfy ( X ) = 0. So we add an atom of +1 at point x = ( x μ ,...,x ) to n 1 make this happen. Accordingly, from now on, μ is in fact the measure with the following density: (10) . )) x + f ( x )( x · ˆ η ( x )) 1 ( f + 1) − ( ∇ f ( x ) · x + ( n 1 ∂X x ∈ x = x 10 μ through its density. It is more accurate to define μ To be formal here, we should not define using Riesz’s representation theorem, as the unique Radon measure such that the bounded linear ∫ u equals functional (5) of bounded continuous functions udμ . Accordingly, all measures in our X duality framework will be Radon measures. Having said that, we invite the reader to forget about Radon measures and Riesz’s theorem for the remainder of this survey. Formally, whenever we talk about measures we mean Radon measures, and μ is as defined by Riesz’s representation theorem. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

12 52 · C. Daskalakis Interpretation of Convex Formulation 5.4 Figure 2 shows for the single-item setting of Example 1, while Figure 3 shows μ μ for the setting of Example 4. As measure μ is now two-dimensional we have ~ , where \{ X } from X color-coded it. The dashed curve separates X X and 0 + + − − X where μ is positive and negative respectively. Notice the atom of are subsets of ~ +1 measure at point 0 in both figures. u +1 v 0 1 − 2 in Example 4. The dashed μ Fig. 3. Measure ~ 0 X X \{ curve separates } from . In par- + − μ in Example 1. Fig. 2. Measure ~ X is an increasing (ie upper- \{ ticular, 0 } + closed) subset of . X ∫ P Formulation ( ) is asking us to maximize . So we would like to make udμ X as large as possible where μ is positive and make it as small as possible where u is negative. However, Constraints 6–9 impose interesting tradeoffs on how we μ u should implement this optimally, as should be continuous, not too steep, and convex-nondecreasing. For instance, the optimal u against the measure of Figure 2 turns out to be the one shown in Figure 1. How do we know this? It is not completely obvious, but we know it should be true from Fact 1/Example 1. In the multi-item setting, what we need is new machinery that will allow us to identify optimal such tradeoffs. As we have already discussed, however, we know of no direct way to find these tradeoffs by studying formulation ( P ) directly. Our approach is instead to develop a dual optimization problem that will hopefully provide insight about the optimal solution to ( ). P Duality: Building Intuition 5.5 It is a priori not clear whether ( ) has a strong dual problem, i.e. a minimization P problem whose optimal value matches that of the optimal value of ( P ). The reason is that it is an infinite-dimensional program, and we are not aware of infinite- dimensional programming tools that can handle our constraints [Luenberger 1968; Anderson and Nash 1987]. We will show that a strong dual actually does exist in Section 5.7. Prior to that let us build some intuition though, grounding it to our experience in the finite-dimensional world. In this section, we will make an analogy to min-cost perfect matching duality that will lead us to formulate a weak dual of ( P ), upper-bounding but not necessarily matching the value of ( P ). But first let us ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

13 Multi-Item Auctions Defying Intuition? · 53 P ) to the following equivalent formulation: massage ( ∫ dμ ) u ( x ) sup ( x X P ( ) : x ) − u ( y ) ≤| ( x − y ) ∀ | (11) , s.t. u x,y ( X ∈ 1 + (12) u : convex ~ ( 0) = 0 (13) u ∑ x − y ) where we denote by | ). Notice that Constraint (11) = ( y max(0 ,x − | i i 1 + i u must be non-decreasing as well as 1-Lipschitz with respect already implies that to the ` -norm. Combined with (13), it also implies non-negativity. So, if any- 1 thing, our new constraints might have restricted the set of functions that we are optimizing over. But, it is also easy to see that Constraints (6) and (7) imply Constraint (11). Moreover, Constraint (13) could have been added to the original formulation without changing the optimal value, given that ( X ) = 0. So the two μ formulations are actually equivalent. as the difference μ − Next, let us write μ μ of two non-negative measures μ + − + and μ , so that − ∫ ∫ ∫ = udμ − udμ udμ + − X X X ). u μ ( X ) = 0, it follows that μ . Given that ( X for all measurable μ X ( ) = − + so that X and X X Moreover, let X is supported on μ be a partition of + + − + and μ . With this notation, let us consider the following is supported on X − − P ): relaxation of ( ∫ ∫ − udμ udμ sup + − X X ′ ) : ( P ( x ) − u ( y ) ≤| ( x − y ) | s.t. u X , ∀ x ∈ X ∈ ,y + + − 1 ~ ( 0) = 0 u where we have dropped the convexity constraint, and only maintain Constraint 11 x ∈ X y and for ∈ X ) is also a feasible . It is clear that any feasible solution to ( P − + ′ ′ ) upper bounds the optimum ) and, therefore, the optimum of ( P solution to ( P of ( P ). We are thus ready to employ our finite-dimensional intuition. Suppose temporar- were uniform over X and X X were finite, and measures μ ily that sets ,μ + − + + − ′ and respectively. In this case ( P ) becomes a problem of assigning potential X − values to the nodes of the complete bipartite graph ( X ) with the ,X X ,X × + − + − X and goal of optimizing the total potential (gaining the potential of nodes in + X X losing the potential of nodes in x ∈ ) subject to the constraint that for every − + and y ∈ X . It the difference in potential between x and y cannot exceed | ( x − y ) | + 1 − is well-known that the dual of this problem is a min-cost perfect matching problem on the same graph, where the weights are as in Figure 4. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

14 54 C. Daskalakis · x | + ) y − x ( | y X X − + Fig. 4. Min-Cost Perfect Matching. Going back to the infinite-dimensional problem, a perfect matching should cor- μ μ , defined formally as follows. respond to a coupling of the measures and + − coupling between two non-negative measures Definition 1. A and μ defined μ 2 1 ⊆ ) = and satisfying μ S ( S over some μ X ( S ) is a non-negative measure γ over 1 2 ′ S such that, for all measurable subsets × ⊆ S , it holds that: S S ∫ ∫ ′ ′ dγ = μ S ( S . ) μ ( ( S ) and ) μ ) S ( dγ = μ 1 2 2 1 ′ ′ S S × S × S μ We denote by ,μ ) the set of all couplings between μ and μ . Γ( 1 2 1 2 Given the above definition and our finite-dimensional intuition, it makes sense to ′ P propose the following as a dual to ( ): ∫ x | dγ ) x,y | ( inf − y ) ( + 1 ′ ) : ( D X × X s.t. γ ∈ Γ( μ ,μ ) + − ′ D ) upper- It is indeed quite straightforward to establish that the optimum of ( ′ bounds the optimum of ( P ). ′ ′ ( D is a weak dual of ( P ) ) . Lemma 1 [Daskalakis et al. 2013]. ′ ′ u of ( P of ( ) and γ Proof (Lemma 1): D For any feasible solution ) we have the following: ∫ ∫ ∫ = y ) ud ( μ x,y − μ ( ) = (14) dγ )) udμ ( u ( x ) − u ( − + X X X X × ∫ ( (15) , ) | ≤ x − y ) x,y | dγ ( + X × X ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

15 Multi-Item Auctions Defying Intuition? · 55 ′ γ in ( ) and the inequality where the second equality follows from the feasibility of D ′ in ( ). follows from the feasibility of P u 5.6 Applications of Weak Duality So far, we have formulated revenue optimization as a convex optimization problem ′ ′ ′ P ) of ( D P ) of ( ) and identified a weak dual ( ), defined a relaxation ( ) and P ( P ). Namely, here is where we stand: therefore of ( P ′ ′ ) P ) ) ≤ OPT( D OPT( ≤ . OPT( P ′ D ) is only a weak dual of ( P ), it is unclear how to use it to certify Given that ( u is an optimal solution to ( P optimality of mechanisms. Indeed, it could be that ), ′ of ( D but there is no solution ) that matches it in value of the objective. So what γ ′ D is the point of ( )? ′ ) and ( ) motivates the development P The mismatch between the values of ( D of a strong dual of ( P ) in the next section. Nevertheless, our experience shows ′ ) often gives a simple heuristic that can be used to certify optimality of D that ( ′ mechanisms. In the remainder of this section, we illustrate how ( D ) can be applied for this purpose. u is the utility function of some mechanism that we want to show is Suppose that F optimal for some distribution u is an optimal solution . So we want to show that P μ is the signed measure derived from F according to (10). Even to ( ), where ′ though ( D P ) we could still be optimistic, trying to find a ) is only a weak dual of ( ′ feasible solution γ of ( D ) that achieves the same objective value as that achieved by u in ( P ). What does γ need to satisfy for this to happen? Inspecting the proof of Lemma 1 ′ D (which also applies to feasible solutions of ( )) we get the following ) and ( P sufficient condition: 11 x,y )-almost everywhere: u ( x ) − u ( y ) = | ( x − y ) γ | . ( + This condition would force the only inequality in the proof to be tight. Given that u , wherever defined, corresponds to allocation probabilities, the the gradient of following condition is also sufficient: ( x,y )-almost everywhere: γ x,y all types on segment ( ) i with proba- receive item ⇒ = x > y i i bility 1 ( ) ∇ exists on all points u i : ∧ ∀ of the segment ( ) x,y ) x,y all types on segment ( with proba- i receive item x = ⇒ < y i i bility 0 (16) Equipped with sufficient condition (16), we exhibit how it can be used to show optimality of mechanisms. We start with a simple example in Section 5.6.1, pro- 11 What this means is that the set of points ( x,y ) ∈ X × X where the equality fails to hold has measure 0 under γ . ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

16 56 · C. Daskalakis ceeding to a more complex example in Section 5.6.2. It will be instructive to think γ as routing infinitesimal amounts of flow between pairs of points of our couplings in X , and Condition (16) as restricting the directions of these flows. 5.6.1 Certifying Optimality of Mechanisms Using Duality Example 5. Suppose 2 items are sold to a buyer whose values for the items are 1] . [0 i.i.d., uniform , The optimal mechanism in Exam- Claim 1 [Manelli and Vincent 2006]. √ 2 4 − . 3 / 2 ple 5 is to price each individual item at , and the bundle of both items at 3 μ First, let us compute the measure Proof (Claim 1): induced by the uniform 2 , 1] distribution over according to (10). The resulting measure comprises a X = [0 2 − 1] 3 surface measure distributed uniformly over [0 , a total of +2 single- total of , 2 1] , dimensional measure distributed uniformly over the top and the right edge of [0 , and an atom of +1 at the origin. 2 , R 1] ,R into four regions ,R corresponding ,R In Figure 5, we partition [0 3 0 2 1 to the subsets of types that will purchase each lottery in the menu, or no lottery at all. (The tie-breaking at the boundaries between regions is unimportant as it μ corresponds to a measure 0 set of types.) We also depict and write down the utility function restricted to each region. +1 uniformly distributed u R ( x 3 ) = : { R x 2 +1 uniformly distributed 1 2 } : − , { u 1 2 3 ( , x 2 3 2 ) = } 2 , 4 − √ x 3 1 3 2 + x 2 − 4 − √ 3 2 R 0 : u ∅ , ( 0 x ) = 0 √ 2 2 − 3 constant surface density 3 2 − : R { } 1 , 2 3 2 ) = x u − x ( 1 3 √ 2 − 2 2 +1 3 3 μ for two i.i.d. uniform [0 , Fig. 5. Measure 1] items, and the partition of the typeset induced by √ 4 − 2 / 3 and the bundle of both items at the mechanism that prices each individual item at 2 . 3 We also depict the utility of the buyer depending on which region his type falls into. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

17 Multi-Item Auctions Defying Intuition? · 57 R In particular, region corresponds to the types that do not want to purchase 0 1 total surface measure uniformly distributed, anything. In this region, there is − R corresponds to the as well as an atom of +1 total measure at the origin. Region 1 √ √ 2 2 4 − 2+2 { 2 1 . There is a total of − at price } , types that will purchase bundle 3 3 √ 2+2 2 surface measure uniformly distributed here, as well as a total of + single- 3 dimensional measure spread uniformly on the top and right edges of this region. R correspond to types that will purchase items 1 and 2 and R Finally, regions 2 3 √ 2 2 − surface measure / respectively at price 2 3. Each of these have a total of − 3 √ − 2 2 uniformly distributed, as well as a + single-dimensional measure uniformly 3 distributed on their right and, respectively, top edges. γ Our goal is to find a coupling μ satisfying Condition (16). and μ between − + ∇ , we should Given that our sufficient condition is sensitive about the existence of u μ consider couplings of and μ in each region separately. This is conceivable as − + μ is balanced in each region, namely μ 3. In fact, ( R , ) = μ 2 ( R , ), for all i = 0 , 1 i − + i u in each region and the form our path is cut out for us given the functional form of μ so of Condition (16). In particular, we are seeking a coupling between and μ + − that —In region we are only allowed to transport measure in north-east directions, R 0 as both items are allocated with probability 0. —In region R we are only allowed to transport measure in south-west directions, 1 as both items are allocated with probability 1. —In region R we are only allowed to transport measure in north-west directions, 2 as item 1 is allocated with probability 1 and item 2 is allocated with probability 0. R we are only allowed to transport measure in south-east directions, as —In region 3 item 2 is allocated with probability 1 and item 1 is allocated with probability 0. In fact, we will be more optimistic, restricting our transports to be westward and R and R respectively. All in all, we are seeking a coupling of southward in regions 2 3 μ that pushes measure in the directions shown in Figure 6 in each region. and μ − + μ are set up, Now it is easy to verify that the way our regions and measure μ where all transports take place according to and μ it is possible to couple + − the figure. So Condition (16) is satisfied, and the resulting coupling certifies the optimality of u . We refer the reader to [Giannakopoulos and Koutsoupias 2014] for an application of the afore-described approach to n = 3 ,..., 6 i.i.d. uniform [0 , 1] items. While written in a slightly different language, their proof establishes the existence of ′ D ) matching the value achieved by the optimal mechanism in ( P ). It solutions to ( still remains an interesting open problem to determine the optimal mechanism for ′ 6. We conjecture that using ( D n > ) remains sufficient, but it becomes analytically challenging to define the transports in high dimensions. 5.6.2 Reverse-Engineering Optimal Mechanisms Using Duality. In Sec- tion 5.6.1, we started with a conjectured optimal mechanism. Given the mech- anism, we partitioned the typeset into regions, depending on what lottery each type will purchase. We then used Condition 16 to guide us with what directions we ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

18 58 · C. Daskalakis γ Fig. 6. The directions of measure transport used in our coupling for Example 5. should use to transport measure in our coupling, in each region separately, as de- termined by the gradient of the utility function induced by the conjectured optimal mechanism. In fact, we can reverse-engineer these steps when we do not have a conjecture about the optimal mechanism. Let us go back to Example 4, where we had two in- , dependent Beta items with parameters (3 , 4). Also, recall Figure 3 where 3) and (3 we show the measure μ induced by the product of these distributions, according to (10). How might the optimal mechanism for this example look like? It is reasonable to try to find a mechanism with the following properties: —For sufficiently small values of v the mechanism and sufficiently large values of v 2 1 offers item 2 with probability 1 and item 1 with probability strictly smaller than 1. Let us denote the unknown subset of the typeset where this may happen. A v —For sufficiently small values of and sufficiently large values of v the mechanism 1 2 offers item 1 with probability 1 and item 2 with probability strictly smaller than 1. Let us denote the unknown subset of the typeset where this may happen. B Now let us revisit Condition (16). If we were to use this condition to show optimality of our yet-unknown mechanism, we would certainly be allowed to: (1) transport measure southward in region A ; and (2) transport measure westward in region B . ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

19 Multi-Item Auctions Defying Intuition? · 59 1 ′ v S right S top s v A . 8 0 A W . 0 6 v 0 . 4 2 0 . Z h B ′ h h B 0 8 1 0 0 . 4 0 . 2 6 0 . 0 . μ Fig. 7. Reverse-engineering the optimal mechanism from . , if item 1 is allocated with A We may also be able to push measure eastward in region probability 0 in this region. Similarly, we may be able to push measure northward B , if item 2 is allocated with probability 0 in this region. As we want to be in region versatile, let us ignore these extra possibilities, insisting on southward transports in region and westward transports in region B . A With these restrictions on our transports let revisit measure , trying to close in μ X on subsets of and B may occupy. In Figure 7, we have drawn a that regions A S that traces, for each x , the top-most point monotone strictly concave curve 1 top x ) and ,x ( ) such that, restricted to the vertical segment between points ( x ,x 1 2 1 2 x denotes first-order stochastic dominance between , 1), μ ( , where μ − + 1 1 1 measures defined as follows. Definition 2. μ X ,μ If are two non-negative measures defined on some S ⊆ 1 2 such that μ , ( S ) = μ μ ( S ) , we say that μ first-order stochastically dominates 1 2 2 1 ,μ μ denoted μ μ , iff there exists a coupling γ ∈ Γ( μ and μ ) between 2 1 1 2 2 1 1 γ ( x,y ) , x is coordinate-wise larger such that, almost everywhere with respect to y than or equal to , . Equivalently, for all non-decreasing measurable functions u ∫ ∫ ≥ udμ . udμ 1 2 S S ~ \{ , if it 0 } is an increasing set, point S ( x ) must belong to X X Given that top − + 1 exists, and it can be identified by finding the largest x such that the total measure 2 x ,x x ) and ( on the segment between points ( is 0. Notice that for , 1) under μ 1 2 1 x some ’s there fails to be a segment with these properties, so S is undefined for top 1 those x ’s. Similarly, the monotone strictly concave curve S traces, for each x , 1 right 2 the rightmost point ( ,x ) such that, restricted on the horizontal segment between x 1 2 points ( x , ,x x ) and (1 ,x . Again, this curve is not defined for all ), μ μ 1 1 − 2 2 2 + and it lies within X . − ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

20 60 · C. Daskalakis To accommodate Constraints (1) and (2) on our transports of measure in the (still unidentified) regions B , it suffices to pick A to be any region defined A and 2 ′ vv and a vertical segment v by = ( 1] , ) , the left and top boundaries of [0 S A top 2 ′ S and on the top edge of [0 , 1] v , as in the figure. Similarly, we v for some ∈ top S can pick , the bottom and right boundaries B to be any region defined by right ′ 2 ′ h of [0 = ( hh , 1] h ∈ S ), for some and h and a horizontal segment on the right B 2 right edge of [0 . For any region A as above, by the definition of S 1] , we can , top μ restricted in A with μ while respecting Constraint (1), restricted in A couple + − as we can do this separately for every “vertical slice” of B as . Similarly for any A restricted in B μ with μ above, we can couple restricted in B while respecting − + Constraint (2). v , and what to do with the and The next question is how to pick segments h B A X \A∪B rest, , of the typeset. A natural approach is to assume that the remainder of and the typeset is partitioned into two regions, Z W , of types that will purchase the , p } of both items at some price 1 bundle { 2 and of types that will not purchase } 2 1 { , anything. Clearly, the utility in u ( x ) = x W + x while − p will be of the form 2 1 { 1 , 2 } Z will be u ( x ) = 0. Hence, the boundary between these regions must the utility in ◦ be a 45 segment. Given this, a natural way to finish would be to try to identify a point S v ∈ top h S such that: and ∈ right ◦ (i) The straight segment, ( and h has a 45 vh angle; more ), between points v h − precisely, we want that ( v 1). ,h − − v , ) ∝ (1 2 1 1 2 Z , under the curve (ii) The region, defined by the initial part of S (between S top points S S (0) and v ), the straight segment ( vh ), and the initial part of top right (between points and (0)) has total measure μ ( Z ) = 0 and is convex; S h right ′ ′ ~ ~ v (iii) The region, W 1), ( , enclosed by the straight segments 1 h v ), h and , ( B A ), satisfies μ ( | denote respectively the | μ μ | and vh μ | , where + W W − − 1 W W + μ restrictions of measures and μ . in region W − + and h satisfying Requirements (i)–(iii), we Note that, if we can identify points v x, : u ` are done. Indeed, let us define the function ( x Z ), mapping each type x 7→ 1 ` . Clearly: distance from set Z to its 1 —From (ii), is convex, hence u ( x ) is also convex. Given that Z is a decreasing Z u is non-decreasing. It is also clearly non-negative and 1-Lipschitz. So set, is feasible for ( P ). If we could also find a coupling γ between μ u and μ + − that satisfies Condition (16), this would establish that u P ). We is optimal for ( μ μ restricted to each region. proceed to do this next separately for and − + x Z u Region x ) = 0, for all : ∈ Z , conforming to our intention that Z is the — ( ∇ set of types that do not purchase anything. Given that ( x ) = 0, for all x ∈Z , u Condition (16) implies that we are allowed to transport measure in north-east ~ Z X only ∪{ directions in this region. Given that lies entirely within } , μ 0 + − resides at (0 , 0). Moreover, μ ( Z ) = 0. Hence, it is possible to couple μ μ and + − in Z with only north-east transports. ∗ ∗ W : u ( x ) = x is the intercept of + x , for all − p — Region x ∈ W , where p 2 1 segment ( vh ) if it were extended to hit the x axis. This conforms to our intention 1 that the types in W purchase the grand bundle. Given that ∇ u ( x ) = 1, for all ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

21 Multi-Item Auctions Defying Intuition? · 61 ∈ W x , Condition (16) implies that we can transport measure in south-west | | directions in this region. Given that μ μ , as per Requirement (iii), W W − 1 + μ and we can indeed couple μ in this region with only transports in south-west − + directions, by Definition 2. A is convex (and therefore the lower boundary of region Given that Z — : Region ◦ as per the location of this boundary with respect to the is less steep that 45 A vh )), for all x ∈A , u ( x ) is the vertical distance between x and S . So segment ( top ∂u = 1, which conforms to our intention that item 2 should be allocated with ∂x 2 probability 1 in this region. Moreover, this means by Condition (16) that we are μ and μ in allowed to transport measure southward in our coupling between − + this region. And, by the definition of S μ μ and , restricted to this region + − top can be coupled with only southward transports. Region B — Similarly, given that Z is convex (and therefore the left boundary : ◦ is steeper that 45 of region ), u ( x ) is the horizontal distance between x B and S for all x ∈ B . Via similar arguments as those employed for region A , μ + right and μ can be coupled in this region with only westward transports, respecting − Condition (16). and h satisfying Requirements By the above discussion, if we can identify points v (i)–(iii), this means that we have identified our optimal mechanism. It turns out that, for our specific distributions, Requirements (i)—(iii) can be satisfied and we can analytically compute the points h , as shown in Figure 7. So we have v and managed to reverse-engineer the optimal mechanism for our setting by exploiting Z . In Condition (16). The mechanism can be described indirectly by specifying Z , the utility function of the optimal mechanism is u terms of x 7→ ` ). ( x, Z : 1 u From we can also find the lotteries offered by the optimal mechanism: The types W receive both items with probability 1. Each type x in receives item 2 ∈ A with probability 1 and item 1 with probability that equals minus the slope of S top at point S receives item 1 with probability 1 ( x ∈ B ). Similarly, each type x 1 top S at point and item 2 with probability that equals minus the inverse slope of right ). ( x S 2 right Some remarks are in order before we conclude this section: —First, as a byproduct of our derivation above we have proven our claim in Exam- ple 4 that the optimal mechanism offers an uncountably large menu of lotteries. Indeed, recall that S is strictly concave. Hence, the allocation probability of top item 1 differs in every vertical strip within this region. Thus, a continuum of lotteries are offered to the types in A . The same is true for the types in B . —On the other hand, as promised in Section 3.3, there does exist a succinct descrip- tion of the optimal mechanism. All we need to maintain is an analytic description of the boundary of region Z . —We emphasize again that the approach followed in this section to reverse-engineer the optimal mechanism is not guaranteed to succeed. Indeed, it is based on a ′ D ) of our optimal mechanism design formulation ( weak dual ( P ). Based on this weak dual, it identifies a complementary slackness condition, (16), which ties ′ solutions to ( P ) and ( D ) in a particular way. It then makes guesses about the ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

22 62 · C. Daskalakis optimal mechanism, and tries to follow through with these guesses using Condi- tion (16). Despite the fact that it is not guaranteed to succeed, the approach is quite successful in identifying optimal mechanisms. In [Tzamos 2015], Christos Tzamos provides an applet where the heuristic approach of this section is applied to reverse-engineer the optimal mechanism for user-specified Beta distributions. —Given that the heuristic method proposed in this section may not succeed, it remains important to develop a technique that is guaranteed to succeed. This is what we do in the next section. 5.7 A Strong Duality Theorem ′ P ), of opti- ) of our formulation, ( In previous sections, we identified a weak dual ( D mal mechanism design as a convex optimization problem. Based on the weak dual, in Section 5.6.1, we gave an example where we were able to certify the optimality of a mechanism. In Section 5.6.2, we also showed how we can use the weak dual ′ ) is a weak dual, so there are to reverse-engineer the optimal mechanism. Still, ( D settings where both certifying and reverse-engineering the optimal mechanism will fail [Daskalakis et al. 2015]. We would thus like to obtain a strong dual of formulation ( ), whose optimum P is guaranteed to match that of ( P ). The advantage of a strong dual would be n,F ), the optimal mechanism u for that, for every mechanism design problem ( P ( ) would have a certificate of optimality in the form of a solution to the dual. Thus, we would know what type of certificate to look for, and we would also be able to reverse-engineer optimal mechanisms by exploiting the structure of those certificates. The challenge we encounter though is that a priori it is not clear if a tight dual to ) should exist. Indeed, ( P ) is an infinite-dimensional optimization problem and ( P , such as convexity, which are non-standard. In u contains constraints on the variable particular, we are not aware of a duality framework, based on infinite-dimensional linear programming, that can accommodate such constraints directly. We prove our own extension of Monge-Kantorovich duality to accommodate the constraints P ), establishing the following. of ( Formulation P ) has a strong dual ( Theorem 4 [Daskalakis et al. 2015]. ( D ) , taking the form of an optimal transport problem, as follows: formulation ∫ ∫ ) : sup ( inf = udμ P | ) D : ( dγ | ) y − x ( 1 + ′ ′ ,μ γ ) ∈ Γ( μ X X X × − + u satisfies (11) , ′ ′ ′ μ μ − μ = μ cvx + − (12) and (13) Moreover, both and inf are attainable. sup We next explain the statement of Theorem 4, comparing ( D ) to the weak-dual ′ D ) of Section 5.5. We then discuss the point of deriving a strong dual, before ( turning to some important applications of our strong duality theorem. 5.7.1 Discussion of Theorem 4. First, to clarify the notation used in Theorem 4, is the standard notion of convex dominance between measures, defined as cvx follows. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

23 Multi-Item Auctions Defying Intuition? · 63 μ μ and over X we say that μ For two measures convexly Definition 3. 1 2 1 dominates μ μ , denoted μ , iff, for all measurable, non-decreasing and 2 cvx 1 2 ∫ ∫ convex functions u , ≥ udμ . udμ 2 1 X X Compared to the first-order stochastic dominance of Definition 2, convex dom- ∫ ∫ udμ ≥ udμ inance is a weaker requirement as it only requires that for 1 2 X X u u ’s. convex, non-decreasing ’s, as opposed to all non-decreasing As non-decreasing, convex functions model utility functions of risk-seeking indi- μ viduals, when ,μ , this means that any are probability measures and μ μ cvx 1 2 1 2 risk-seeking individual prefers a prize distributed according to μ to a prize dis- 1 . More generally, when tributed according to μ μ , this essentially means μ 2 cvx 2 1 μ through a sequence of operations allowed to: can be obtained from μ that 2 1 —send (positive) measure to coordinate-wise larger points—this makes the integral ∫ ∫ since udμ larger than udμ u is non-decreasing; or 1 2 —spread (positive) measure so that the mean is preserved—this makes the integral ∫ ∫ larger than udμ since u is convex. udμ 1 2 Now that our understanding of convex dominance is grounded, let us compare our ′ D ) to the weak-dual ( D strong dual ( ) of Section 5.5. The two problems are very similar. They are both min-cost transportation problems, seeking a coupling γ x,y ) ( between two non-negative measures under which the total cost under the same cost function x − y ) | | ( is minimized. The difference between the two problems is + 1 ′ and μ that ( ) seeks a coupling between μ , the positive and negative parts of D − + μ derived from F according to (10). ( measure ) is also allowed to pre-process D μ without incurring any cost before seeking a coupling between the positive and D ) is allowed to choose any negative parts of the processed measure. In particular, ( ′ ′ measure μ and couple the positive and negative parts of that measure μ μ . cvx This way it may reduce the transportation cost, which Theorem 4 says is guaranteed to match the optimum of ( P ). The Point of Theorem 4. So, what is the point of obtaining a strong dual 5.7.2 of our mechanism design problem? We have already alluded to the benefits of strong duality above. By analogy, these should also be clear to anyone familiar with the uses of strong linear programming duality in combinatorial optimization. Let us ′ expand a bit. In comparison to our weak dual ( ), our strong dual ( D ) is more D any powerful as it allows us to certify the optimality of mechanism. In particular, for any mechanism design problem defined by some n and F , we are guaranteed to ′ ′ ′ whose transportation and μ μ μ and a coupling γ between find a measure μ cvx − + cost equals the optimal revenue. In particular, we can identify conditions tying a ′ u to ( P ) with a solution ( μ solution ,γ ) to ( D ) that, whenever satisfied, establish ′ the joint optimality of both and ( μ ,γ ). Additionally, using these conditions as u a guiding principle, we can reverse-engineer optimal mechanisms in settings where we do not have a conjecture about what the optimal mechanism is, as we did in Section 5.6.2 using weak duality. Except now this approach is guaranteed to work. We refer the interested reader to [Daskalakis et al. 2015] for these conditions, as well as examples where they are used to certify optimality of mechanisms. The approach is similar to Sections 5.6.1 and 5.6.2, so we do not expand more here. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

24 64 · C. Daskalakis We prefer to turn instead to exciting applications of the strong duality theorem to obtain characterization results. 5.8 Characterizing Optimal Mechanisms Using our strong duality theorem (Theorem 4) to certify optimality of mechanisms may be cumbersome. A pertinent question is this: Can we somehow use the the- orem behind the scenes to develop simple characterization results of mechanism optimality? In this section, we obtain such a characterization: given a proposed mechanism for some setting of , we identify a collection of necessary and n and F for the mechanism to be optimal. In particular, these μ conditions on sufficient conditions do not involve the dual problem at all. They are just conditions on derived from the measure and F using (10). We proceed to give a flavor of μ n our characterization, and some applications. We start with a characterization of grand-bundling optimality, proceeding to general mechanisms afterwards. 5.8.1 Characterization of Grand-Bundling Optimality. Let us complement Def- inition 3 with the following definition. For two measures μ second- and μ Definition 4. over X we say that μ 2 1 1 μ μ μ , iff for all measurable, non- order stochastically dominates , denoted 2 2 2 1 ∫ ∫ , . udμ ≥ decreasing and concave functions udμ u 1 2 X X As concave and non-decreasing functions model utility functions of risk-averse in- dividuals, if , this means that any ,μ μ are probability measures and μ μ 1 2 2 1 2 risk-averse individual prefers a prize distributed according to μ to a prize dis- 1 tributed according to μ can . More generally, μ μ essentially means that μ 2 2 1 2 1 μ be obtained from via a sequence of operations that shift positive measure to 2 coordinate-wise smaller points and do mean-preserving merges of positive measure. With the above definition, we can use our duality theorem behind the scenes to obtain the following characterization of grand-bundling optimality. Theorem 5 [Daskalakis et al. 2015]. n and F , the mechanism that For all p is optimal if and only if measure only offers the bundle of all items at some price defined by satisfies μ | is the subset of types that W 0 μ , where μ | (10) Z cvx 2 W can afford the grand bundle at price p , Z the subset of types who cannot, and μ | , W μ | are respectively the restrictions of μ in subsets W and Z . Z Sufficient conditions for grand-bundling optimality have been an active line of inquiry—see e.g. [Manelli and Vincent 2006; Pavlov 2011]. Theorem 5 provides a single condition, in the form of two stochastic dominance relations, that is necessary and sufficient. As a corollary of this theorem, we can show Theorem 1 of Section 3.1, pertaining to the counter-intuitive behavior of the optimal mechanism for n i.i.d., uniform [ c,c + 1] items: —The first part of this theorem is an extension of Pavlov’s result for = 2 [Pavlov n 2011]. Its proof is by a geometric construction showing that the sufficient condi- tion will be met for all n ’s as long as c is large enough. —The second, and more counter-intuitive, part of the theorem is shown by arguing that all prices p will fail to satisfy 0 . Essentially, what happens in this μ | Z cvx ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

25 Multi-Item Auctions Defying Intuition? · 65 x setting is that all the positive measure is surface measure on the facets + 1, = c i . All other facets and the interior of ~c and there is an atom of +1 at the origin n the cube [ n is large enough, + 1] c,c have negative measure. Given this, if | is large exhibits a phase transition in terms of the price p : if p measure μ Z Z x = c + 1, function has a positive-measure intersection with facet enough that 1 x is violated, 1 u ( is an explicit witness that constraint 0 | μ ) = cvx Z = { c +1 } x 1 ∫ udμ | as > 0. On the other hand, if p is small enough that Z has measure 0 Z intersection with this facet, there turns out not to be enough negative mass in Z to balance the positive atom at ~c , as required for 0 to hold. μ region | Z cvx 5.8.2 Characterization of General Mechanisms. Our characterization of grand- bundling optimality from the previous section naturally extends to arbitrary mech- anisms. Let us briefly discuss this generalization, referring the reader to [Daskalakis et al. 2015] for more details. Consider a mechanism M for some setting n and F . M induces a partition of the typeset X into subsets of types that will decide to purchase different lotteries . Assuming that offers a finite number of lotteries, M M in the menu offered by this partition may look like Figure 8, where each cell c corresponds to a subset of c c c ,t ), where p is a vector of allocation types that will purchase the same lottery ( p c t a price. probabilities and Fig. 8. A partition of the typeset induced by some finite menu of lotteries. If the mechanism is a grand-bundling mechanism, then there are only two regions and Theorem 5 defines a pair of stochastic dominance conditions that are necessary and sufficient for its optimality. Naturally, our generalization to general mechanism will require one condition per cell of the partition. To describe them, let us define c c c vector ~v as follows: p in terms of c = 0 1 , if p i c c : v ∀ i = 1 , if p − = 1 i i , otherwise 0 n ~v ∈{− 1 , 1 , 0 } Moreover, for a vector , let us define a stochastic dominance relation with respect to ~v as follows: ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

26 66 C. Daskalakis · n ~v , 1 , 0 } ∈{− and two measures μ 1 and μ For a vector over X , Definition 5. 2 1 we say that μ μ , iff, with respect to ~v , denoted μ μ convexly dominates 2 1 2 1 ) ( ~v cvx for all measurable and convex functions u that are non-decreasing in all coordinates such that v such that = 1 and non-increasing in all coordinates i i v : = − 1 i i ∫ ∫ ≥ udμ . udμ 1 2 X X μ μ ⇔ μ μ μ and it easily follows that ⇔ Clearly, μ cvx 2 2 1 1 2 1 ~ ~ cvx − 1) ( cvx ( 1) μ μ . So Theorem 5 can be restated as specifying the following necessary and 2 1 2 sufficient conditions for grand-bundling optimality: 0 and 0 μ | Z ~ ~ cvx ( ( − 1) 1) cvx | , where Z and W are the subsets of types that purchase nothing and the grand μ W bundle respectively. Our general characterization is the following: Characterization of General Mechanisms c in the partition of X in- for every cell c , where μ | duced by M : 0 | μ c c ~v ) ( cvx . is optimal) M ( ⇐⇒ μ is the restriction of , defined in (10), to c cell The interested reader is referred to [Daskalakis et al. 2015] for more details. 5.9 Concluding Remarks In Sections 5.1–5.8 we took up the challenge of understanding the structure of multi- item mechanisms, trying to demystify their counter-intuitive behavior identified in Section 3. Our contribution was a duality framework based on which we can certify the optimality of mechanisms in every single-buyer setting. In particular, we showed that the optimal mechanism design problem has a tight dual problem, taking the form of an optimal transportation problem. Given a proposed mechanism, we can test if it is optimal by identifying solutions to the dual achieving the same value, as we did in Section 5.6.1, except that more generally we can also pre-process μ with cost-free convex shuffling operations before finding our transports, as allowed by the D ). Given our strong duality, dual certificates are guaranteed to exist. strong dual ( Moreover, if we do not have a conjectured optimal mechanism for some setting of interest, we can exploit the dual to reverse-engineer it, as we did in Section 5.6.2. Finally, we developed “duality-theory oblivious” tools, characterizing the optimality of mechanisms in terms of the buyer’s distribution. These tools use the power of the duality framework behind the scenes, presenting clean necessary and sufficient conditions for a mechanism to be optimal. All in all, we believe that our framework provides a new perspective on multi- item mechanisms, presenting a tool whose applicability is universal, given our strong duality. Our work opens interesting lines for future investigation, and we are par- ticularly interested in: (1) extending the duality framework to accommodate multiple buyers; and (2) developing technical machinery to facilitate testing stochastic dominance rela- tions. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

27 Multi-Item Auctions Defying Intuition? · 67 While we recognize that a lot remains to be done, we believe that the afore-described framework represents the beginning of a principled approach towards a structural understanding of multi-item multi-buyer mechanisms. 6. THE COMPUTATIONAL COMPLEXITY OF MULTI-ITEM MECHANISMS 6.1 Philosophy The duality based framework of the previous section targeted closed-form charac- terizations of optimal mechanisms. A related goal is to study the computational complexity of optimal mechanisms. In particular, we are interested in whether op- timal mechanisms can be computed and implemented computationally efficiently. There are several reasons why this question is important: —First, if optimal mechanisms were computationally intractable, then why should practitioners care about them, especially in settings with a large number of bid- ders or items? Computational intractability would justify using approximate mechanisms in practical applications. —Moreover, as we have seen, getting closed-form descriptions of optimal mecha- nisms is a challenging task. Despite intense work in the literature, including that of the previous section, we are still far from characterizing optimal multi-bidder mechanisms. When closed-form descriptions are unknown, being able to compute optimal mechanisms is an interesting middle ground. —Additionally, computing optimal mechanisms would be a great tool for re- searchers who want to gain familiarity with the structure of these mechanisms and/or test hypotheses about their structure or performance of approximate so- lutions. —Finally, one might expect that studying optimal mechanism design from an algo- rithmic point of view may reveal structure that might be hard to observe using non-algorithmic tools. For all the above reasons, we find it important to study the computational com- plexity of optimal mechanisms. Over the past few years, our and other groups have made tremendous progress in the complexity of optimal mechanism design [Cai and Daskalakis 2011; Cai et al. 2012a; Alaei et al. 2012; Cai et al. 2012b; Daskalakis et al. 2012; Cai and Huang 2013; Cai et al. 2013a; 2013b; Alaei et al. 2013; Bhalgat et al. 2013; Daskalakis et al. 2014; Chen et al. 2014; Daskalakis and Weinberg 2015; Daskalakis et al. 2015]. See also [Hartline 2013; Chawla and Sivan 2014; Cai et al. 2015] for recent surveys, covering some of this work. We proceed to give a flavor of what is known, restricting our attention to multi-item multi-bidder settings with additive bidders. As discussed below, all our results in this section extend to much broader settings. 6.2 Setting n In this section, we restrict our attention to a seller with m additive items and bidders interested in those items. The type of each bidder is a vector t of values i for the items, which are jointly distributed according to some distribution F . In i j t . We assume that the will denote the value of bidder i for item particular, ij distribution F is known to the seller and all the other bidders, but only bidder i i ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

28 68 C. Daskalakis · knows his realized type. We also assume that bidder types are independent and use ~ . Finally, given a type to denote the types of all bidders, also called the t type profile ~ ~ and , we denote, as is customary, by t i vector the vector containing the types t i − ~ ~ . t t ) to denote . Accordingly, we use ( t i of all bidders except the type of bidder ; i − i The goal of the seller is to design a mechanism that optimizes his expected revenue, when the expectation is computed with respect to the types of the bidders, the randomness in the mechanism, if any, and the randomness in the strategies of the bidders, if any. The mechanism is allowed to be any protocol that interacts with the bidders and has the bidders interact with each other in some fashion. mn 0 , 1 } Whatever the protocol is, it is supposed to output an allocation of x ∈ { . As we items to bidders, where j is given to bidder i indicates whether item x ij assume that there is exactly one copy of each item, any allocation ever output by the mechanism must satisfy that ∑ x . ≤ 1 , for all items j ij i mn the subset of , 1 } We call satisfying the above constraints. The mechanism 0 { F can also charge prices, as long as the bidders accept to pay those prices. While it is hard to optimize over protocols, it follows from the revelation principle that it suffices to optimize over a simpler class of mechanisms called “direct mech- ~ t anisms.” These mechanisms are described by an allocation function X ∆( F ), : 7→ mapping type profiles to distributions over feasible allocations, and a price function m ~ 7→ : R P ) mapping type profiles to distributions over price vectors, and are t ∆( implemented as follows: —The bidders are asked to report their types to the mechanism. ~ ~ ~ t ∼ X ( —If t ) and p ∼ P ( t are the reported types, the mechanism samples ), x x p . and charges prices according to implements allocation ~ ( For convenience, we will use t ) to denote a random variable distributed according X ~ ~ ( t to ), and similarly P ( X t ) to denote a random variable distributed according to ~ P ). t ( While, in principle, the bidders need not be truthful about their types, due to the revelation principle we can also assume without loss of generality that it will be in their best interest to do so. We may also assume that it is not hurtful to them to participate in the mechanism. In particular, we may assume that the mechanism and satisfies Individual Rationality according to is Bayesian Incentive Compatible the following definition. We say that a direct mechanism ( X,P ) is Bayesian Incentive Definition 6. ′ i and types t Compatible (BIC) and t iff for all bidders : in the support of F i i i ′ ′ ~ ~ ~ ~ t , ·X ( (17) t ) −P ( . t )] ≥ E )] t ( −P [ t ) ·X t t ( t ; [ E − i i i i − ~ ~ i i t t − − i i ( We say that a direct mechanism ) satisfies Individual Rationality or is IR iff X,P for all bidders i and types t : in the support of F i i ~ ~ t ) [ t ≥ ·X ( )] (18) . −P ( E t 0 i ~ t − i ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

29 Multi-Item Auctions Defying Intuition? · 69 i cannot be improved by mis- (17) expresses that the expected utility of bidder i is reporting his type, while (18) expresses that the expected utility of bidder non-negative if he is truthful. Both expectations are with respect to the types of the other bidders as well as the randomness in the allocation and price functions, if any. Under the assumption that bidders will report their types truthfully to a BIC, IR mechanism, its revenue can be expressed as follows: ] [ ∑ ~ E P ) (19) . t ( i ~ t i F ,...,F With all the context provided above, the seller is given distributions 1 m over bidder types and seeks to compute a BIC, IR mechanism of optimal revenue. 6.3 Computationally Efficient in What Exactly? When it comes to studying multi-item mechanisms from a computational perspec- tive what we need to answer first is what the input is, and how its description complexity is measured. There are several ways one can go about this, including the following. —One approach is to assume that the bidders’ distributions come from a parametric family of distributions, and specify the parameters of each bidder’s distribution. This approach would allow us to accommodate both discrete and continuous distributions. In this case, the description complexity of the distributions is the description complexity of all parameters required to describe them. —Another approach is to assume that the distributions are discrete and provide them explicitly, by listing the elements in their support and the probability that each element is sampled. The explicit description is reasonable for distributions with a small and discrete support. Here the description complexity of the dis- tributions is the description complexity of all the elements in their support and their associated probabilities. F of each —Finally, one can assume to have sample access to the distribution i bidder. Here each distribution can be thought of as a subroutine that a seller can call to get an independent sample from F . The description complexity of the i distributions is harder to define in this model. One way to do this is to assume that the subroutines only output numbers of certain bit complexity, or truncate all numbers output by these subroutines to some accuracy. We do not want to dwell on this point too much though. We will restrict our attention to the simplest model, assuming that all our dis- tributions are explicit. The parametric and sample-access models are important to study as well, but we are not aware of efficiently computable mechanisms that can accommodate these models in reasonable generality without losing revenue. In fact, we should not expect to get general, exactly optimal algorithms in these settings given the following result: Theorem 6 [Daskalakis et al. 2014]. Consider the algorithmic problem of designing an optimal mechanism for selling n items to a single, additive buyer whose values for the items are independently distributed, according to distributions ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

30 70 · C. Daskalakis that are supported on two rational numbers with rational probabilities. In particular, the distribution of each item { a i ,b is specified by a pair of rational numbers } and i i a rational probability . p i P # P , any mechanism (of any type, direct or indirect) that can ZPP Unless ⊇ be computed in expected polynomial time cannot be both optimal and executable in expected polynomial time. Notice that the mechanism design problem in the statement of Theorem 6 con- forms to the parametric model for describing distributions. And the theorem pro- vides a particularly simple setting where, unless we spent super-polynomial time, we will not be able to compute an efficiently executable mechanism that is also optimal. As the sample-access model is not easier than the parametric one, the same result applies to this model as well. Finally, we will only remark here that # P ⊇ ZPP is a standard complexity-theoretic assumption about the assumption P P , ZPP and # P the relations of complexity classes . The interested reader is referred to standard complexity theory textbooks for more discussion. While Theorem 6 is quite discouraging, especially given the simplicity of the mechanism design problem that is shown intractable, it still does not preclude efficiently computable mechanisms that are near-optimal. In particular, it is an interesting open problem to determine whether there exist efficiently computable mechanisms that achieve a (1 − )-fraction of the optimal revenue, for any desired 0, as long as one is willing to invest time polynomial in 1 / and the accuracy > parameters of the distribution for their computation. This problem is open even in the simple setting of Theorem 6. 6.4 Computing Optimal Mechanisms, and a Bonus We turn to the explicit model of representing bidder distributions, and ask whether optimal mechanisms can be computed efficiently. There is still a wrinkle we need ~ : to overcome however. As we said a direct mechanism is a pair of functions t 7→ X m ~ P : ∆( t F ∆( R ) and ). So to describe these functions explicitly, we need to 7→ specify a distribution over allocations and a distribution over price vectors for every possible type profile. This is problematic though as revealed by a little calculation. Suppose that the support of every bidder’s type distribution has size k . Then to specify each distribution we need to give k − 1 probabilities. So numbers and k ( ) numbers. On the other O mk to describe all these distributions only requires m hand, there are X and k possible type profiles. So we cannot hope to compute explicitly. Besides, even the outputs of these functions are distributions over P high-dimensional spaces. We thus need to be smart about how we represent mechanisms. We need to represent them implicitly, whilst still being able to compute on the implicit rep- resentation. In the additive setting that we consider here, it turns out that the so-called “reduced-form” is a good representation: The reduced form (ˆ x, ˆ p ) of a mechanism ( X,P ) is a collection Definition 7. n = 1 : T of single-variate functions → [0 , 1] ˆ ,i x ,...,m , and ˆ p : T → R ,i = i i i i P 1 , where T as follows. is the support of F ,...,m , which are related to X and i i For all i and types t ∈ T : i i ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

31 Multi-Item Auctions Defying Intuition? · 71 ~ ˆ ( t — ) = E is the proba- ) t ( [ X x ( t ˆ ; -th coordinate of t j ; in particular, the )] x i ij i ij i − i i ~ t − i j to bidder i bility that the mechanism allocates item , conditioning on his reported type being t and assuming that the other bidders report their types truthfully. i ~ t is the expected price charged to ( t ) ) = E t ( p ˆ [ P — ( ˆ ; in particular, ; p t )] i i i i i i − i ~ t − i bidder i , conditioning on his reported type being t and assuming that the other i bidders report their types truthfully. The reduced form is called “reduced” because it is losing information about the mechanism. It can be viewed as projecting ( X,P ), which is a high-dimensional object, to a lower-dimensional space. So maintaining mechanisms in their reduced forms, creates two computational challenges: (1) Given a reduced form (ˆ x, ˆ ), is it possible to verify computationally efficiently p whether it is feasible, that is whether there exists an actual mechanism ( X,P ) x, ˆ whose reduced form agrees with (ˆ )? p (2) Given a reduced form (ˆ ˆ p ) that is feasible, is it possible to implement some x, mechanism with this reduced form computationally efficiently? Border and Che et al. have provided a collection of linear constraints that are necessary and sufficient for reduced-form feasibility [Border 1991; 2007; Che et al. 2011]. See also [Hart and Reny 2014]. These constraints have a nice interpretation as max-flow/min-cut constraints in a related flow network. However, they cannot be used towards computationally efficient algorithms for the above problems, as they are exponentially many. We can improve these results as follows: Theorem 7 [Cai et al. 2012a; Alaei et al. 2012]. The answers to both questions above are “yes.” Namely, given a reduced form (ˆ x, ˆ p ) , we can verify in polynomial time whether it is feasible. Moreover, given a feasible reduced form (ˆ ˆ p ) , we can compute in polynomial time a pair of polynomial-time algorithms for x, ( X,P ) whose reduced sampling the allocation and price functions of a mechanism form is (ˆ x, ˆ p ) . Theorem 7 is very handy as it allows us to formulate polynomial-size linear programs for finding optimal mechanisms. In particular, it is not hard to see the following: —The set of all possible reduced forms is convex, as they are projections of alloca- tion and price functions, which themselves are a convex set as distributions over deterministic allocation and price functions. —Given Theorem 7, there exists a polynomial-time algorithm that determines fea- sibility of reduced forms. This gives a computationally efficient separation oracle for the set of feasible reduced forms. —The expected revenue of a mechanism can be expressed as a linear function of its reduced form. —The BIC and IR constraints are also expressible as linear constraints in the reduced form. —Finally, Theorem 7 implies that, given the reduced form of a mechanism, we can efficiently implement it. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

32 72 · C. Daskalakis Using the Ellipsoid algorithm, the above observations culminate to the following result: Theorem 8 [Cai et al. 2012a]. Consider a mechanism design setting with m items. Given an explicit description of the bidders’ type n additive bidders and distributions, we can compute and implement an optimal mechanism in polynomial- time. Our new theorem provides an important counterpart to our structural results from Section 5, encompassing multi-bidder settings as well. In fact, the theo- rem can be generalized to much broader settings involving more complex alloca- tion constraints [Cai et al. 2012b], budgets [Bhalgat et al. 2013; Daskalakis et al. 2015], non-additive bidders [Cai et al. 2013b], and even going beyond revenue and welfare to other interesting objectives, such as maximizing fairness or minimizing makespan [Daskalakis and Weinberg 2015]. Taken together our algorithmic results provide a very crisp understanding of the complexity of Bayesian mechanism design. In a recent article, we provide a concise overview of our work on this front, and refer the interested reader to this overview [Cai et al. 2015], as well as the surveys mentioned earlier [Hartline 2013; Chawla and Sivan 2014]. Let us conclude this section with a small treat. We promised earlier that the algorithmic perspective might actually reveal structure in the optimal mechanism that could be hard to see otherwise. While developing our algorithmic framework, we encountered remarkable structure in the optimal mechanism that is worth shar- ing here. Recall, from Fact 3 that Myerson’s optimal single-item mechanism is a virtual welfare maximizer. It turns out that this holds in arbitrary settings. Here is the structure of the optimal mechanism for the setting of Theorem 8. When n items are sold to m additive bidders, Theorem 9 [Cai et al. 2012b]. virtual welfare maximizer the optimal mechanism is a , namely: ) The bidders are asked to report their types to the mechanism. Say that the ( 1 reported types are ,...,t . t 1 m , where ) The reported types are transformed into virtual types, h ( t ( ) ,...,h ( t ) 2 1 m m 1 n h : T . → R each maps an additive type t ) to another additive type t ( h i i i i i 3 ) Each item is allocated to the bidder with the highest virtual value for this item, ( with some lexicographic tie-breaking. ( 4 ) Finally, prices are charged so that the mechanism is BIC. So the optimal mechanism has the exact same form in multi-item settings as it has in single-item settings! The differences between Theorem 9 and Fact 3 are the following: —In Myerson’s single-item setting, the virtual transformations h are deterministic, i while in the multi-item setting they are actually randomized. Of course, we knew this had to be the case as by Fact 5 randomization is necessary in multi-item settings. —As emphasized in Fact 3, in Myerson’s setting, each h only depends on bidder i i ’s distribution, but not on the other bidders’ distributions and not even on how many other bidders show up. In the multi-item setting, each h may depend on i ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

33 Multi-Item Auctions Defying Intuition? · 73 i the distributions of all bidders, but importantly its argument is only bidder ’s type. Despite these differences, it is remarkable that the same structure applies to both single-item and multi-item settings. Again the above structure generalizes well-beyond the additive setting—see [Cai et al. 2013b; 2015]. 7. CONCLUSIONS In the past few decades, we witnessed the tremendous impact of Myerson’s re- sult [Myerson 1981] in mechanism design. Building on his result, we now understand virtually every aspect of revenue optimization in single-item settings. Moreover, we have seen numerous extensions, robustifying the result with respect to the details of the bidders’ distributions, and expanding its applicability to a myriad of single- dimensional settings, accommodating budgets, online arrivals and departures of bidders, complex allocation constraints, non-linear designer objectives, and many more. Algorithmic and approximation techniques have played an important role in exploring these extensions, and indeed it has been thanks to the sharpness and simplicity of Myerson’s result that this interplay between computation and mech- anism design has been so fruitful. See [Hartline 2013; Chawla and Sivan 2014; Roughgarden 2015] for recent surveys of this literature. Unfortunately, multi-item revenue optimization has not enjoyed the same fate due to our lack of understanding of multi-item mechanisms. Indeed, optimal multi- item mechanisms exhibit such rich structure that it is not clear whether there is a lens through which we can gain a crisp understanding of their properties. In this survey, we provided two approaches, based on duality theory and optimization, through which we obtained a fresh perspective on multi-item mechanism design. We have used these approaches to characterize the structure of multi-item mechanisms and showcased procedures, both analytical and algorithmic, via which the optimal mechanism can be identified. We believe that the results presented in this survey have begun to resemble a cohesive theory of multi-item auctions, opening exciting directions for future investigation. To identify just a few: —It is important to extend the duality-based framework of Section 5 to accommo- date multiple bidders. —What is the sensitivity of the structural and algorithmic results on the details of the bidder type distributions? —In settings where the seller knows the bidder distributions, but the bidders do not, what is the Bayesian-optimal dominant strategy truthful mechanism? Ultimately, we feel that the foundations have been laid for exciting developments in optimal multi-item mechanism design, expecting a lot more progress on this front in the next years. Acknowledgments We thank Yang Cai, Shaddin Dughmi, Christos Tzamos and Matt Weinberg for valuable comments on drafts of this survey. We thank Christos for all the figures. ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

34 74 · C. Daskalakis REFERENCES Alaei, S. , Haghpanah, N. , Hartline, J. , and Malekian, A. 2012. Bayesian Optimal , Fu, H. Auctions via Multi- to Single-agent Reduction. In the 13th ACM Conference on Electronic Commerce (EC) . , Haghpanah, N. Alaei, S. and Hartline, J. D. 2013. The Simple Economics of Approx- Fu, H. , , imately Optimal Auctions. In the 54th Annual IEEE Symposium on Foundations of Computer . Science (FOCS) Anderson, E. J. and Nash, P. Linear Programming in Infinite-Dimensional Spaces: 1987. . John Wiley & Sons. Theory and Applications 2014. A Simple and Ap- , Babaioff, M. , and Weinberg, S. M. , Immorlica, N. Lucier, B. the 55th Annual Symposium on proximately Optimal Mechanism for an Additive Buyer. In Foundations of Computer Science (FOCS) . , Gollapudi, S. , and Munagala, K. Bhalgat, A. 2013. Optimal Auctions via the Multiplicative Weight Method. In . the 14th ACM Conference on Electronic Commerce (EC) 1991. Implementation of Reduced Form Auctions: A Geometric Approach. Econo- Border, K. C. metrica 59, 4, 1175–1187. Border, K. C. Economic Theory 31 , 167–181. 2007. Reduced Form Auctions Revisited. Chawla, S. , Briest, P. , and Weinberg, S. M. 2010. Pricing Randomized Allo- , Kleinberg, R. the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) . cations. In 2011. Extreme-Value Theorems for Optimal Multidimensional Cai, Y. and Daskalakis, C. the 52nd Annual IEEE Symposium on Foundations of Computer Science (FOCS) Pricing. In . , , and Weinberg, S. M. 2012a. An Algorithmic Characterization of Cai, Y. Daskalakis, C. Multi-Dimensional Mechanisms. In the 44th Annual ACM Symposium on Theory of Computing . (STOC) , Daskalakis, C. , and Weinberg, S. M. 2012b. Optimal Multi-Dimensional Mechanism Cai, Y. Design: Reducing Revenue to Welfare Maximization. In the 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS) . Cai, Y. , , and Weinberg, S. M. 2013a. Reducing Revenue to Welfare Maximiza- Daskalakis, C. the 24th Annual ACM-SIAM tion : Approximation Algorithms and other Generalizations. In Symposium on Discrete Algorithms (SODA) . Daskalakis, C. , and Weinberg, S. M. , Cai, Y. 2013b. Understanding Incentives: Mechanism Design Becomes Algorithm Design. In the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS) . , Daskalakis, C. , and Weinberg, S. M. Cai, Y. 2015. Reducing Bayesian Mechanism Design to Algorithm Design. . Encyclopedia of Algorithms 2013. Simple and Nearly Optimal Multi-Item Auctions. In Cai, Y. and Huang, Z. the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) . , Hartline, J. D. Chawla, S. and Kleinberg, R. D. 2007. Algorithmic Pricing via Virtual , Valuations. In the 8th ACM Conference on Electronic Commerce (EC) . Chawla, S. , Hartline, J. D. , Malec, D. L. , and Sivan, B. 2010. Multi-Parameter Mechanism Design and Sequential Posted Pricing. In the 42nd ACM Symposium on Theory of Computing . (STOC) 2014. Bayesian Algorithmic Mechanism Design. ACM SIGecom Chawla, S. and Sivan, B. Exchanges 13, 1, 5–49. , Che, Y.-K. , and Mierendorff, K. 2011. Generalized Reduced-Form Auctions: A Kim, J. Network-Flow Approach. Econometrica 81, 6, 2487–2520. Chen, X. , Diakonikolas, I. , Paparas, D. , Sun, X. , and Yannakakis, M. 2014. The Complexity of Optimal Multidimensional Pricing. In . the 25th Symposium on Discrete Algorithms (SODA) Daskalakis, C. Spring 2015. Lecture 21. In Decision, Games, and Computation . EECS, MIT. https://stellar.mit.edu/S/course/6/sp15/6.891/index.html . Daskalakis, C. , Deckelbaum, A. , and Tzamos, C. 2012. Optimal Pricing is Hard. In the 8th Workshop on Internet & Network Economics (WINE) . ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75

35 Multi-Item Auctions Defying Intuition? · 75 , Deckelbaum, A. and Tzamos, C. 2013. Mechanism Design via Optimal Trans- Daskalakis, C. , . port. In the 14th ACM Conference on Electronic Commerce (EC) , and Tzamos, C. 2014. The Complexity of Optimal Mechanism , Deckelbaum, A. Daskalakis, C. the 25th ACM-SIAM Symposium on Discrete Algorithms (SODA) . Design. In Deckelbaum, A. Daskalakis, C. and Tzamos, C. 2015. Strong Duality for a Multiple-Good , , the 16th ACM Conference on Electronic Commerce (EC) . Monopoly. In Daskalakis, C. , Devanur, N. , and Weinberg, S. M. 2015. Revenue Maximization and Ex-Post Budget Constraints. In . the 16th ACM Conference on Electronic Commerce (EC) Daskalakis, C. and Weinberg, S. M. 2015. Bayesian Truthful Mechanisms for Job Scheduling the 26th Annual ACM-SIAM Symposium on from Bi-criterion Approximation Algorithms. In Discrete Algorithms (SODA) . 2014. Duality and Optimality of Auctions for Uni- Giannakopoulos, Y. and Koutsoupias, E. form Distributions. In the 15th ACM Conference on Electronic Commerce (EC) . 2012. Approximate Revenue Maximization with Multiple Items. In the Hart, S. and Nisan, N. 13th ACM Conference on Electronic Commerce (EC) . Hart, S. and Nisan, N. the 14th ACM 2013. The Menu-Size Complexity of Auctions. In Conference on Electronic Commerce (EC) . 2012. Maximal Revenue with Multiple Goods: Nonmonotonicity and Hart, S. and Reny, P. Other Observations . Center for the Study of Rationality. Hart, S. and Reny, P. J. 2014. Implementation of Reduced Form Mechanisms: A Simple Ap- proach and a New Characterization. Economic Theory Bulletin 3, 1, 1–8. Hartline, J. D. 2013. Bayesian Mechanism Design. Foundations and Trends in Theoretical Computer Science 8, 3. 2013. On Revenue Maximization for Selling Multiple Independently Li, X. and Yao, A. C.-C. 28, 11232–11237. Distributed Items. Proceedings of the National Academy of Sciences 110, Optimization by Vector Space Methods . John Wiley & Sons. 1968. Luenberger, D. G. 2006. Bundling as an Optimal Selling Mechanism for a Multiple- Manelli, A. and Vincent, D. Good Monopolist. Journal of Economic Theory 127, 1, 1–35. 2007. Multidimensional Mechanism Design: Revenue Max- Manelli, A. M. and Vincent, D. R. Journal of Economic Theory 137, imization and the Multiple-Good Monopoly. 1, 153–185. 1981. Optimal Auction Design. 1, 58–73. Myerson, R. B. Mathematics of Operations Research 6, 2011. Optimal Mechanism for Selling Two Goods. Pavlov, G. The B.E. Journal of Theoretical Economics 11, 1 (February), 1–35. Riley, J. and Zeckhauser, R. 1983. Optimal Selling Strategies: When to Haggle, When to Hold Firm. The Quarterly Journal of Economics , 267–289. Rochet, J.-C. 1987. A Necessary and Sufficient Condition for Rationalizability in a Quasi-Linear Journal of Mathematical Economics 16, Context. 2 (April), 191–200. Rochet, J.-C. and Stole, L. A. 2003. The Economics of Multidimensional Screening. Econo- metric Society Monographs 35 , 150–197. Roughgarden, T. 2015. Approximately Optimal Mechanism Design: Motivation, Examples, and Lessons Learned. ACM SIGecom Exchanges 13, 2, 4–20. Rubinstein, A. and Weinberg, S. M. 2015. Simple Mechanisms for a Subadditive Buyer and Applications to Revenue Monotonicity. In the 16th ACM Conference on Electronic Commerce (EC) . Tzamos, C. 2015. Optimal Mechanisms for Pairs of Beta Distributions. http://christos.me/betas.html . Yao, A. C. 2015. An n -to-1 Bidder Reduction for Multi-item Auctions and its Applications. In the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) . ACM SIGecom Exchanges, Vol. 14, No. 1, June 2015, Pages 41–75