All published works (18)

We study the problem of minimizing swap regret in structured normal-form games. Players have a very large (potentially infinite) number of pure actions, but each action has an embedding into $d$-dimensional space and payoffs are given by bilinear functions of these embeddings. We provide an efficient learning algorithm for this setting that incurs at most $\tilde{O}(T^{(d+1)/(d+3)})$ swap regret after $T$ rounds. To achieve this, we introduce a new online learning problem we call \emph{full swap regret minimization}. In this problem, a learner repeatedly takes a (randomized) action in a bounded convex $d$-dimensional action set $\mathcal{K}$ and then receives a loss from the adversary, with the goal of minimizing their regret with respect to the \emph{worst-case} swap function mapping $\mathcal{K}$ to $\mathcal{K}$. For varied assumptions about the convexity and smoothness of the loss functions, we design algorithms with full swap regret bounds ranging from $O(T^{d/(d+2)})$ to $O(T^{(d+1)/(d+2)})$. Finally, we apply these tools to the problem of online forecasting to minimize calibration error, showing that several notions of calibration can be viewed as specific instances of full swap regret. In particular, we design efficient algorithms for online forecasting that guarantee at most $O(T^{1/3})$ $\ell_2$-calibration error and $O(\max(\sqrt{\epsilon T}, T^{1/3}))$ \emph{discretized-calibration} error (when the forecaster is restricted to predicting multiples of $\epsilon$).
We propose efficient no-regret learning dynamics and ellipsoid-based methods for computing linear correlated equilibria (a relaxation of correlated equilibria and a strengthening of coarse correlated equilibria) in general convex games. These are games where the number of pure strategies is potentially exponential in the natural representation of the game, such as extensive-form games. Our work identifies linear correlated equilibria as the tightest known notion of equilibrium that is computable in polynomial time and is efficiently learnable for general convex games. Our results are enabled by a generalization of the seminal framework of Gordon et al. [2008] for $\Phi$-regret minimization, providing extensions to this framework that can be used even when the set of deviations $\Phi$ is intractable to separate/optimize over. Our polynomial-time algorithms are similarly enabled by extending the Ellipsoid-Against-Hope approach of Papadimitriou and Roughgarden [2008] and its generalization to games of non-polynomial type proposed by Farina and Pipis [2024a]. We provide an extension to these approaches when we do not have access to the separation oracles required by these works for the dual player.
A set of probabilistic forecasts is calibrated if each prediction of the forecaster closely approximates the empirical distribution of outcomes on the subset of timesteps where that prediction was made. We study the fundamental problem of online calibrated forecasting of binary sequences, which was initially studied by Foster & Vohra (1998). They derived an algorithm with $O(T^{2/3})$ calibration error after $T$ time steps, and showed a lower bound of $\Omega(T^{1/2})$. These bounds remained stagnant for two decades, until Qiao & Valiant (2021) improved the lower bound to $\Omega(T^{0.528})$ by introducing a combinatorial game called sign preservation and showing that lower bounds for this game imply lower bounds for calibration. We introduce a strengthening of Qiao & Valiant's game that we call sign preservation with reuse (SPR). We prove that the relationship between SPR and calibrated forecasting is bidirectional: not only do lower bounds for SPR translate into lower bounds for calibration, but algorithms for SPR also translate into new algorithms for calibrated forecasting. In particular, any strategy that improves the trivial upper bound for the value of the SPR game would imply a forecasting algorithm with calibration error exponent less than 2/3, improving Foster & Vohra's upper bound for the first time. Using similar ideas, we then prove a slightly stronger lower bound than that of Qiao & Valiant, namely $\Omega(T^{0.54389})$. Our lower bound is obtained by an oblivious adversary, marking the first $\omega(T^{1/2})$ calibration lower bound for oblivious adversaries.
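The definition of calibration in the abstract above can be made concrete with a minimal Python sketch. It computes one common measure for binary outcomes, the $\ell_1$ calibration error: group the time steps by the value predicted, and sum the absolute total residual within each group. The function name `l1_calibration_error` and the choice of the $\ell_1$ measure are assumptions made for illustration; the abstract does not fix a particular formula.

```python
from collections import defaultdict

def l1_calibration_error(predictions, outcomes):
    """Sum over distinct predictions p of |sum_{t : p_t = p} (y_t - p)|."""
    residual = defaultdict(float)
    for p, y in zip(predictions, outcomes):
        # Accumulate the signed gap between outcome and forecast for this p.
        residual[p] += y - p
    return sum(abs(r) for r in residual.values())

# Always predicting 0.5 on an alternating 0/1 sequence is perfectly calibrated.
print(l1_calibration_error([0.5] * 4, [0, 1, 0, 1]))  # 0.0
# Always predicting 0.8 on the same sequence over-forecasts by 0.3 per step.
print(l1_calibration_error([0.8] * 4, [0, 1, 0, 1]))
```

Under this measure, a bound such as $O(T^{2/3})$ says the returned value grows sublinearly in the number of forecasts, so the per-step miscalibration vanishes.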
While ERM suffices to attain near-optimal generalization error in the stochastic learning setting, this is not known to be the case in the online learning setting, where algorithms for general concept classes rely on computationally inefficient oracles such as the Standard Optimal Algorithm (SOA). In this work, we propose an algorithm for the online binary classification setting that relies solely on ERM oracle calls, and show that it has finite regret in the realizable setting and sublinearly growing regret in the agnostic setting. We bound the regret in terms of the Littlestone and threshold dimensions of the underlying concept class. We obtain similar results for nonparametric games, where the ERM oracle can be interpreted as a best response oracle, finding the best response of a player to a given history of play of the other players. In this setting, we provide learning algorithms that only rely on best response oracles and converge to approximate-minimax equilibria in two-player zero-sum games and approximate coarse correlated equilibria in multi-player general-sum games, as long as the game has a bounded fat-threshold dimension. Our algorithms apply to both binary-valued and real-valued games and can be viewed as providing justification for the wide use of double oracle and multiple oracle algorithms in the practice of solving large games.
We provide a novel reduction from swap-regret minimization to external-regret minimization, which improves upon the classical reductions of Blum-Mansour [BM07] and Stoltz-Lugosi [SL05] in that it does not require finiteness of the space of actions. We show that, whenever there exists a no-external-regret algorithm for some hypothesis class, there must also exist a no-swap-regret algorithm for that same class. For the problem of learning with expert advice, our result implies that it is possible to guarantee that the swap regret is bounded by $\epsilon$ after $\log(N)^{O(1/\epsilon)}$ rounds and with $O(N)$ per iteration complexity, where $N$ is the number of experts, while the classical reductions of Blum-Mansour and Stoltz-Lugosi require $O(N/\epsilon^2)$ rounds and at least $\Omega(N^2)$ per iteration complexity. Our result comes with an associated lower bound, which -- in contrast to that in [BM07] -- holds for oblivious and $\ell_1$-constrained adversaries and learners that can employ distributions over experts, showing that the number of rounds must be $\tilde\Omega(N/\epsilon^2)$ or exponential in $1/\epsilon$. Our reduction implies that, if no-regret learning is possible in some game, then this game must have approximate correlated equilibria, of arbitrarily good approximation. This strengthens the folklore implication of no-regret learning that approximate coarse correlated equilibria exist. Importantly, it provides a sufficient condition for the existence of correlated equilibrium which vastly extends the requirement that the action set is finite, thus answering a question left open by [DG22; Ass+23]. Moreover, it answers several outstanding questions about equilibrium computation and learning in games.
We present a general framework for designing approximately revenue-optimal mechanisms for multi-item additive auctions, which applies to both truthful and nontruthful auctions. Given a (not necessarily truthful) single-item auction format satisfying certain technical conditions, we run simultaneous item auctions augmented with a personalized entry fee for each bidder that must be paid before the auction can be accessed. These entry fees depend only on the prior distribution of bidder types and in particular are independent of realized bids. We bound the revenue of the resulting two-part tariff mechanism using a novel geometric technique that enables revenue guarantees for many common nontruthful auctions that previously had none. Our approach adapts and extends the duality framework of Cai, Devanur, and Weinberg [SIAM J. Comput., 50 (2021), pp. STOC16-160–STOC16-200] beyond truthful auctions. Our framework can be used with many common auction formats, such as simultaneous first-price, simultaneous second-price, and simultaneous all-pay auctions. Our results for first-price and all-pay are the first revenue guarantees of nontruthful mechanisms in multidimensional environments, addressing an open question in the literature [T. Roughgarden, V. Syrgkanis, and E. Tardos, J. Artificial Intelligence Res., 59 (2017), pp. 59–101]. If all-pay auctions are used, we prove that the resulting mechanism is also credible in the sense that the auctioneer cannot benefit by deviating from the stated mechanism after observing agent bids. This is the first static credible mechanism for multi-item additive auctions that achieves a constant factor of the optimal revenue. If second-price auctions are used, we obtain a truthful $O(1)$-approximate mechanism with fixed entry fees that are amenable to tuning via online learning techniques.
Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS'21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(\textrm{polylog}(T))$ after $T$ repetitions of the game. We extend their result from external regret to internal regret and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium at the rate of $\tilde{O}(T^{-1})$. This substantially improves over the prior best rate of convergence for correlated equilibria of $O(T^{-3/4})$ due to Chen and Peng (NeurIPS'20), and it is optimal -- within the no-regret framework -- up to polylogarithmic factors in $T$. To obtain these results, we develop new techniques for establishing higher-order smoothness for learning dynamics involving fixed point operations. Specifically, we establish that the no-internal-regret learning dynamics of Stoltz and Lugosi (Mach Learn'05) are equivalently simulated by no-external-regret dynamics on a combinatorial space. This allows us to trade the computation of the stationary distribution on a polynomial-sized Markov chain for a (much more well-behaved) linear transformation on an exponential-sized set, enabling us to leverage techniques similar to those of DFG to near-optimally bound the internal regret. Moreover, we establish an $O(\textrm{polylog}(T))$ no-swap-regret bound for the classic algorithm of Blum and Mansour (BM) (JMLR'07). We do so by introducing a technique based on the Cauchy Integral Formula that circumvents the more limited combinatorial arguments of DFG. In addition to clarifying the near-optimal regret guarantees of BM, our arguments provide insights into the various ways in which the techniques by DFG can be extended and leveraged in the analysis of more involved learning algorithms.
A classic result of Marcus and Tardos (previously known as the Stanley-Wilf conjecture) bounds from above the number of $n$-permutations ($\sigma \in S_n$) that do not contain a specific sub-permutation. In particular, it states that for any fixed permutation $\pi$, the number of $n$-permutations that avoid $\pi$ is at most exponential in $n$. In this paper, we generalize this result. We bound the number of avoidant $n$-permutations even if they only have to avoid $\pi$ at specific indices. We consider a $k$-uniform hypergraph $\Lambda$ on $n$ vertices and count the $n$-permutations that avoid $\pi$ at the indices corresponding to the edges of $\Lambda$. We analyze both the random and deterministic hypergraph cases. This problem was originally proposed by Asaf Ferber.
When $\Lambda$ is a random hypergraph with edge density $\alpha$, we show that the expected number of $\Lambda$-avoiding $n$-permutations is bounded above and below by $\exp(O(n))\alpha^{-\frac{n}{k-1}}$, using a supersaturation version of F\"{u}redi-Hajnal.
In the deterministic case we show that, for $\Lambda$ containing many cliques of size $L$, the number of $\Lambda$-avoiding $n$-permutations is $O\left(\frac{n\log^{2+\epsilon}n}{L}\right)^n$, giving a nontrivial bound with $L$ polynomial in $n$. Our main tool in the analysis of this deterministic case is the recently developed hypergraph containers method, due to Balogh-Morris-Samotij and Saxton-Thomason.
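As a small illustration of the counting problem described above, here is a brute-force Python sketch, usable only for tiny $n$. It counts the $n$-permutations that avoid $\pi$ on every edge of a given $k$-uniform hypergraph. The helper names `contains_at` and `count_avoiders`, and the representation of $\Lambda$ as a list of increasing index tuples, are assumptions for illustration and not the paper's method, which relies on supersaturation and container arguments rather than enumeration.

```python
from itertools import combinations, permutations

def contains_at(sigma, pi, idx):
    """True if sigma restricted to positions idx is order-isomorphic to pi."""
    vals = [sigma[i] for i in idx]
    k = len(pi)
    return all(
        (vals[a] < vals[b]) == (pi[a] < pi[b])
        for a in range(k)
        for b in range(a + 1, k)
    )

def count_avoiders(n, pi, edges=None):
    """Count n-permutations avoiding pi on every edge (increasing index tuple).

    With edges=None, every k-subset of positions is an edge, which recovers
    classical pattern avoidance.
    """
    k = len(pi)
    if edges is None:
        edges = list(combinations(range(n), k))
    return sum(
        1
        for sigma in permutations(range(n))
        if not any(contains_at(sigma, pi, e) for e in edges)
    )

print(count_avoiders(4, (0, 1, 2)))            # 14, the 4th Catalan number
print(count_avoiders(4, (0, 1, 2), edges=[]))  # 24: no edges, nothing to avoid
```

Removing edges from $\Lambda$ can only increase the count, which is why the bounds in the abstract depend on the edge density or on the clique structure of $\Lambda$.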
We show that Optimistic Hedge -- a common variant of multiplicative-weights-updates with recency bias -- attains ${\rm poly}(\log T)$ regret in multi-player general-sum games. In particular, when every player of the game uses Optimistic Hedge to iteratively update her strategy in response to the history of play so far, then after $T$ rounds of interaction, each player experiences total regret that is ${\rm poly}(\log T)$. Our bound improves, exponentially, the $O({T}^{1/2})$ regret attainable by standard no-regret learners in games, the $O(T^{1/4})$ regret attainable by no-regret learners with recency bias (Syrgkanis et al., 2015), and the ${O}(T^{1/6})$ bound that was recently shown for Optimistic Hedge in the special case of two-player games (Chen & Peng, 2020). A corollary of our bound is that Optimistic Hedge converges to coarse correlated equilibrium in general games at a rate of $\tilde{O}\left(\frac 1T\right)$.
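For readers unfamiliar with the update rule, the following is a minimal Python sketch of Optimistic Hedge in its usual textbook form: each round plays weights proportional to the exponential of the negative cumulative loss, with the most recent loss vector counted twice (the "recency bias"). The function name `optimistic_hedge`, the fixed step size `eta`, and the loss-vector interface are assumptions for illustration; the abstract does not specify these details.

```python
import math

def optimistic_hedge(loss_vectors, eta=0.1):
    """Return the strategies Optimistic Hedge plays against the given losses.

    The strategy for each round is proportional to
    exp(-eta * (cumulative loss so far + most recent loss)).
    """
    n = len(loss_vectors[0])
    cumulative = [0.0] * n  # sum of all loss vectors seen so far
    last = [0.0] * n        # most recent loss vector (the optimistic guess)
    strategies = []
    for loss in loss_vectors:
        scores = [-eta * (c + l) for c, l in zip(cumulative, last)]
        m = max(scores)  # subtract the max before exponentiating, for stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        strategies.append([w / total for w in weights])
        # The loss is revealed only after the strategy has been committed.
        cumulative = [c + x for c, x in zip(cumulative, loss)]
        last = list(loss)
    return strategies

plays = optimistic_hedge([[1.0, 0.0], [1.0, 0.0]], eta=0.5)
print(plays[0])  # [0.5, 0.5]: uniform before any loss is observed
print(plays[1])  # more weight on the second action, which had lower loss
```

Setting `last` to zero throughout would recover standard Hedge; the extra term is the only difference between the two updates in this sketch.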
We present a general framework for designing approximately revenue-optimal mechanisms for multi-item additive auctions. Our approach adapts the duality framework of Cai, Devanur and Weinberg (STOC 2016) and applies to both truthful and non-truthful auctions. Given a (not necessarily truthful) single-item auction format $A$ satisfying certain technical conditions, we run simultaneous item auctions augmented with a personalized entry fee for each bidder that must be paid before the auction can be accessed. These entry fees depend only on the prior distribution of bidder types, and in particular are independent of realized bids. We bound the revenue of the resulting two-part tariff mechanism using a novel geometric technique that enables revenue guarantees for many common non-truthful auctions that previously had none. Our framework can be used with many common auction formats, such as simultaneous first-price, simultaneous second-price, and simultaneous all-pay auctions. Our results for first-price and all-pay are the first revenue guarantees of non-truthful mechanisms in multi-dimensional environments, addressing an open question in the literature. If all-pay auctions are used, we prove that the resulting mechanism is also credible in the sense that the auctioneer cannot benefit by deviating from the stated mechanism after observing agent bids. This is the first static credible mechanism for multi-item additive auctions that achieves a constant factor of the optimal revenue. If second-price auctions are used, we obtain a truthful $O(1)$-approximate mechanism with fixed entry fees that are amenable to tuning via online learning techniques.
We identify the first static credible mechanism for multi-item additive auctions that achieves a constant factor of the optimal revenue. This is one instance of a more general framework for designing two-part tariff auctions, adapting the duality framework of Cai et al. [CDW16]. Given a (not necessarily incentive compatible) auction format $A$ satisfying certain technical conditions, our framework augments the auction with a personalized entry fee for each bidder, which must be paid before the auction can be accessed. These entry fees depend only on the prior distribution of bidder types, and in particular are independent of realized bids. Our framework can be used with many common auction formats, such as simultaneous first-price, simultaneous second-price, and simultaneous all-pay auctions. If all-pay auctions are used, we prove that the resulting mechanism is credible in the sense that the auctioneer cannot benefit by deviating from the stated mechanism after observing agent bids. If second-price auctions are used, we obtain a truthful $O(1)$-approximate mechanism with fixed entry fees that are amenable to tuning via online learning techniques. Our results for first-price and all-pay are the first revenue guarantees of non-truthful mechanisms in multi-dimensional environments, addressing an open question in the literature [RST17].
We present a general framework for designing approximately revenue-optimal mechanisms for multi-item additive auctions, which applies to both truthful and non-truthful auctions. Given a (not necessarily truthful) single-item auction format $A$ satisfying certain technical conditions, we run simultaneous item auctions augmented with a personalized entry fee for each bidder that must be paid before the auction can be accessed. These entry fees depend only on the prior distribution of bidder types, and in particular are independent of realized bids. We bound the revenue of the resulting two-part tariff mechanism using a novel geometric technique that enables revenue guarantees for many common non-truthful auctions that previously had none. Our approach adapts and extends the duality framework of Cai et al. [CDW16] beyond truthful auctions. Our framework can be used with many common auction formats, such as simultaneous first-price, simultaneous second-price, and simultaneous all-pay auctions. Our results for first-price and all-pay are the first revenue guarantees of non-truthful mechanisms in multi-dimensional environments, addressing an open question in the literature [RST17]. If all-pay auctions are used, we prove that the resulting mechanism is also credible in the sense that the auctioneer cannot benefit by deviating from the stated mechanism after observing agent bids. This is the first static credible mechanism for multi-item additive auctions that achieves a constant factor of the optimal revenue. If second-price auctions are used, we obtain a truthful $O(1)$-approximate mechanism with fixed entry fees that are amenable to tuning via online learning techniques.
We consider the problem of bounding the number of permutations $\sigma\in S_n$ that avoid a fixed permutation $\pi\in S_k$ in specific indices given by a $k$-uniform hypergraph $\Lambda$. We obtain relatively sharp bounds in the case where $\Lambda$ is a random hypergraph, and find bounds in the case where $\Lambda$ contains many large cliques.
We consider the problem of bounding the number of permutations $\sigma\in S_n$ that avoid a fixed permutation $\pi\in S_k$ in specific indices given by a $k$-uniform hypergraph $\Lambda$. We obtain relatively sharp bounds in the case where $\Lambda$ is a random hypergraph, and find bounds in the case where $\Lambda$ contains many large cliques. Along the way, we prove a supersaturation version of F\"uredi-Hajnal, which may be of independent interest.

Commonly Cited References

Many important theorems and conjectures in combinatorics, such as the theorem of Szemerédi on arithmetic progressions and the Erdős–Stone Theorem in extremal graph theory, can be phrased as statements about families of independent sets in certain uniform hypergraphs. In recent years, an important trend in the area has been to extend such classical results to the so-called ‘sparse random setting’. This line of research has recently culminated in the breakthroughs of Conlon and Gowers and of Schacht, who developed general tools for solving problems of this type. Although these two articles solved very similar sets of longstanding open problems, the methods used are very different from one another and have different strengths and weaknesses. In this article, we provide a third, completely different, approach to proving extremal and structural results in sparse random sets that also yields their natural ‘counting’ counterparts. We give a structural characterization of the independent sets in a large class of uniform hypergraphs by showing that every independent set is almost contained in one of a small number of relatively sparse sets. We then derive many interesting results as fairly straightforward consequences of this abstract theorem. In particular, we prove the well-known conjecture of Kohayakawa, Łuczak, and Rödl, a probabilistic embedding lemma for sparse graphs. We also give alternative proofs of many of the results of Conlon and Gowers and of Schacht, such as sparse random versions of Szemerédi’s theorem, the Erdős–Stone Theorem, and the Erdős–Simonovits Stability Theorem, and obtain their natural ‘counting’ versions, which in some cases are considerably stronger.
For example, we show that for each positive $\beta$ and integer $k$, there are at most $\binom{\beta n}{m}$ sets of size $m$ that contain no $k$-term arithmetic progression, provided that $m \geqslant Cn^{1-1/(k-1)}$, where $C$ is a constant depending only on $\beta$ and $k$.
We also obtain new results, such as a sparse version of the Erdős–Frankl–Rödl Theorem on the number of $H$-free graphs and, as a consequence of the KŁR conjecture, we extend a result of Rödl and Ruciński on Ramsey properties in sparse random graphs to the general, non-symmetric setting.
We initiate the study of efficient mechanism design with guaranteed good properties even when players participate in multiple mechanisms simultaneously or sequentially. We define the class of smooth mechanisms, related to smooth games defined by Roughgarden, that can be thought of as mechanisms that generate approximately market-clearing prices. We show that smooth mechanisms result in high-quality outcomes both in equilibrium and in learning outcomes in the full information setting, as well as in Bayesian equilibrium with uncertainty about participants. Our main result is to show that smooth mechanisms compose well: smoothness locally at each mechanism implies global efficiency.
We prove that Bimatrix, the problem of finding a Nash equilibrium in a two-player game, is complete for the complexity class PPAD (Polynomial Parity Argument, Directed version) introduced by Papadimitriou in 1991. Our result, building upon the work of Daskalakis et al. [2006a] on the complexity of four-player Nash equilibria, settles a long-standing open problem in algorithmic game theory. It also serves as a starting point for a series of results concerning the complexity of two-player Nash equilibria. In particular, we prove the following theorems: (1) Bimatrix does not have a fully polynomial-time approximation scheme unless every problem in PPAD is solvable in polynomial time. (2) The smoothed complexity of the classic Lemke-Howson algorithm and, in fact, of any algorithm for Bimatrix is not polynomial unless every problem in PPAD is solvable in randomized polynomial time. Our results also have a complexity implication in mathematical economics: Arrow-Debreu market equilibria are PPAD-hard to compute.
Fictitious play is a natural dynamic for equilibrium play in zero-sum games, proposed by Brown [6], and shown to converge by Robinson [33]. Samuel Karlin conjectured in 1959 that fictitious play converges at rate $O(t^{-1/2})$ with respect to the number of steps $t$. We disprove this conjecture by showing that, when the payoff matrix of the row player is the $n \times n$ identity matrix, fictitious play may converge (for some tie-breaking) at rate as slow as $\Omega(t^{-1/n})$.
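The dynamic described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the construction from the paper: it assumes simultaneous updates and breaks ties toward the lowest index, whereas the slow rate in the abstract depends on a particular tie-breaking rule that is not reproduced here. The function name `fictitious_play` is hypothetical.

```python
def fictitious_play(n, steps):
    """Simultaneous fictitious play when the row player's payoff matrix
    is the n x n identity (row player wins 1 on a match, 0 otherwise).

    Returns the duality gap of the empirical strategies after `steps` rounds.
    Assumption: ties are broken toward the lowest action index.
    """
    row_counts = [0] * n  # how often the row player has used each action
    col_counts = [0] * n  # how often the column player has used each action
    for _ in range(steps):
        # Row player best-responds to the column player's most-played action.
        i = max(range(n), key=lambda a: col_counts[a])
        # Column player avoids matches: it picks the row player's least-played action.
        j = min(range(n), key=lambda a: row_counts[a])
        row_counts[i] += 1
        col_counts[j] += 1
    # Best-response payoff against the column's empirical play, minus the
    # payoff the column player can hold the row's empirical play down to.
    return (max(col_counts) - min(row_counts)) / steps


if __name__ == "__main__":
    for t in (10, 100, 1000, 10000):
        print(t, fictitious_play(3, t))
```

The returned gap is always nonnegative and is zero exactly when the empirical strategies form an equilibrium, so printing it for growing `steps` gives a rough picture of the convergence speed under this particular tie-breaking choice.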
Algorithmic pricing is the computational problem that sellers (e.g., in supermarkets) face when trying to set prices for their items to maximize their profit in the presence of a known demand. Guruswami et al. (SODA, 2005) proposed this problem and gave logarithmic approximations (in the number of consumers) for the unit-demand and single-parameter cases where there is a specific set of consumers and their valuations for bundles are known precisely. Subsequently several versions of the problem have been shown to have poly-logarithmic inapproximability. This problem has direct ties to the important open question of better understanding the Bayesian optimal mechanism in multi-parameter agent settings; however, for this purpose approximation factors logarithmic in the number of agents are inadequate. It is therefore of vital interest to consider special cases where constant approximations are possible. We consider the unit-demand variant of this pricing problem. Here a consumer has a valuation for each different item and their value for a set of items is simply the maximum value they have for any item in the set. Instead of considering a set of consumers with precisely known preferences, like the prior algorithmic pricing literature, we assume that the preferences of the consumers are drawn from a distribution. This is the standard assumption in economics; furthermore, the setting of a specific set of customers with specific preferences, which is employed in all of the prior work in algorithmic pricing, is a special case of this general Bayesian pricing problem, where there is a discrete Bayesian distribution for preferences specified by picking one consumer uniformly from the given set of consumers.
Notice that the distribution over the valuations for the individual items that this generates is obviously correlated. Our work complements these existing works by considering the case where the consumer's valuations for the different items are independent random variables. Our main result is a constant approximation algorithm for this problem that makes use of an interesting connection between this problem and the concept of virtual valuations from the single-parameter Bayesian optimal mechanism design literature.
We provide several applications of Optimistic Mirror Descent, an online learning algorithm based on the idea of predictable sequences. First, we recover the Mirror Prox algorithm for offline optimization, prove an extension to Hölder-smooth functions, and apply the results to saddle-point type problems. Next, we prove that a version of Optimistic Mirror Descent (which has a close relation to the Exponential Weights algorithm) can be used by two strongly-uncoupled players in a finite zero-sum matrix game to converge to the minimax equilibrium at the rate of $O((\log T)/T)$. This addresses a question of Daskalakis et al. [6]. Further, we consider a partial information version of the problem. We then apply the results to convex programming and exhibit a simple algorithm for the approximate Max Flow problem.
This paper develops tools for welfare and revenue analyses of Bayes-Nash equilibria in asymmetric auctions with single-dimensional agents. We employ these tools to derive price of anarchy results for social welfare and revenue. Our approach separates the standard smoothness framework [e.g., Syrgkanis and Tardos 2013] into two distinct parts. The first part, value covering, employs best-response analysis to individually relate each agent's expected price for allocation and welfare in any Bayes-Nash equilibrium. The second part, revenue covering, uses properties of an auction's rules and feasibility constraints to relate the revenue of the auction to the agents' expected prices for allocation (not necessarily in equilibrium). Because value covering holds for any equilibrium, proving an auction is revenue covered is a sufficient condition for approximating optimal welfare, and under the right conditions, the optimal revenue. In mechanisms with reserve prices, our welfare results show approximation with respect to the optimal mechanism with the same reserves.
Optimal mechanisms have been provided in quite general multi-item settings [Cai et al. 2012b], as long as each bidder's type distribution is given explicitly by listing every type in the support along with its associated probability. In the implicit setting, e.g. when the bidders have additive valuations with independent and/or continuous values for the items, these results do not apply, and it was recently shown that exact revenue optimization is intractable, even when there is only one bidder [Daskalakis et al. 2013]. Even for item distributions with special structure, optimal mechanisms have been surprisingly rare [Manelli and Vincent 2006] and the problem is challenging even in the two-item case [Hart and Nisan 2012]. In this paper, we provide a framework for designing optimal mechanisms using optimal transport theory and duality theory. We instantiate our framework to obtain conditions under which only pricing the grand bundle is optimal in multi-item settings (complementing the work of [Manelli and Vincent 2006]), as well as to characterize optimal two-item mechanisms. We use our results to derive closed-form descriptions of the optimal mechanism in several two-item settings, exhibiting also a setting where a continuum of lotteries is necessary for revenue optimization but a closed-form representation of the mechanism can still be found efficiently using our framework.
We consider the problem of maximizing revenue for a monopolist offering multiple items to multiple heterogeneous buyers. We develop a simple mechanism that obtains a constant factor approximation under the assumption that the buyers' values are additive subject to a matroid feasibility constraint and independent across items. Importantly, different buyers in our setting can have different constraints on the sets of items they desire. Our mechanism is a sequential variant of two-part tariffs. Prior to our work, simple approximation mechanisms for such multi-buyer problems were known only for the special cases of all unit-demand or all additive value buyers.
In this letter we briefly survey our recent work [Cai et al. 2016]. In it, we provide a new duality theory for Bayesian mechanism design which is quite general: it applies to any objective the designer wishes to optimize and to arbitrary agent valuations. We then apply our theory to auction design settings with many independent buyers who have independent values for many items, and are able to provide a unified proof of several recent exciting works on this front [Hart and Nisan 2012; Li and Yao 2013; Babaioff et al. 2014; Yao 2015; Chawla et al. 2007; Chawla et al. 2010; Chawla et al. 2015]. These works all show that simple mechanisms are approximately optimal in various settings. In some cases, our principled approach yields greatly improved approximation ratios as well.
We provide simple and approximately revenue-optimal mechanisms in the multi-item multi-bidder settings. We unify and improve all previous results, as well as generalize the results to broader cases. In particular, we prove that the better of the following two simple, deterministic and Dominant Strategy Incentive Compatible mechanisms, a sequential posted price mechanism or an anonymous sequential posted price mechanism with entry fee, achieves a constant fraction of the optimal revenue among all randomized, Bayesian Incentive Compatible mechanisms, when buyers' valuations are XOS over independent items. If the buyers' valuations are subadditive over independent items, the approximation factor degrades to $O(\log m)$, where $m$ is the number of items. We obtain our results by first extending the Cai-Devanur-Weinberg duality framework to derive an effective benchmark of the optimal revenue for subadditive bidders, and then analyzing this upper bound with new techniques.
Artificial intelligence has seen several breakthroughs in recent years, with games often serving as milestones. A common feature of these games is that players have perfect information. Poker is the quintessential game of imperfect information, and a longstanding challenge problem in artificial intelligence. We introduce DeepStack, an algorithm for imperfect information settings. It combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition that is automatically learned from self-play using deep learning. In a study involving 44,000 hands of poker, DeepStack defeated professional poker players in heads-up no-limit Texas hold'em with statistical significance. The approach is theoretically sound and is shown to produce strategies that are more difficult to exploit than those of prior approaches.
We prove that there exists a constant ε > 0 such that, assuming the Exponential Time Hypothesis for PPAD, computing an ε-approximate Nash equilibrium in a two-player (n × n) game requires quasi-polynomial time, $n^{\log^{1-o(1)} n}$. This matches (up to the o(1) term) the algorithm of Lipton, Markakis, and Mehta [54]. Our proof relies on a variety of techniques from the study of probabilistically checkable proofs (PCP); this is the first time that such ideas are used for a reduction between problems inside PPAD. En route, we also prove new hardness results for computing Nash equilibria in games with many players. In particular, we show that computing an ε-approximate Nash equilibrium in a game with n players requires $2^{\Omega(n)}$ oracle queries to the payoff tensors. This resolves an open problem posed by Hart and Nisan [43], Babichenko [13], and Chen et al. [28]. In fact, our results for n-player games are stronger: they hold with respect to the (ε,δ)-WeakNash relaxation recently introduced by Babichenko et al. [15].
A central problem in extremal graph theory is to estimate, for a given graph $H$, the number of $H$-free graphs on a given set of $n$ vertices. In the case when $H$ is not bipartite, fairly precise estimates on this number are known. In particular, thirty years ago, Erd\H{o}s, Frankl, and R\"{o}dl proved that there are $2^{(1+o(1))\text{ex}(n,H)}$ such graphs. In the bipartite case, however, nontrivial bounds have been proven only for relatively few special graphs $H$. We make a first attempt at addressing this enumeration problem for a general bipartite graph $H$. We show that an upper bound of $2^{O(\text{ex}(n,H))}$ on the number of $H$-free graphs with $n$ vertices follows merely from a rather natural assumption on the growth rate of $n \mapsto \text{ex}(n,H)$; an analogous statement remains true when $H$ is a uniform hypergraph. Subsequently, we derive several new results, along with most previously known estimates, as simple corollaries of our theorem. At the heart of our proof lies a general supersaturation statement that extends the seminal work of Erd\H{o}s and Simonovits. The bounds on the number of $H$-free hypergraphs are derived from it using the method of hypergraph containers.
We address the issue of limit cycling behavior in training Generative Adversarial Networks and propose the use of Optimistic Mirror Descent (OMD) for training Wasserstein GANs. Recent theoretical results have shown that optimistic mirror descent (OMD) can enjoy faster regret rates in the context of zero-sum games. WGAN training is exactly a context of solving a zero-sum game with simultaneous no-regret dynamics. Moreover, we show that optimistic mirror descent addresses the limit cycling problem in training WGANs. We formally show that in the case of bi-linear zero-sum games the last iterate of OMD dynamics converges to an equilibrium, in contrast to GD dynamics which are bound to cycle. We also portray the huge qualitative difference between GD and OMD dynamics with toy examples, even when GD is modified with many adaptations proposed in the recent literature, such as gradient penalty or momentum. We apply OMD WGAN training to a bioinformatics problem of generating DNA sequences. We observe that models trained with OMD achieve consistently smaller KL divergence with respect to the true underlying distribution than models trained with GD variants. Finally, we introduce a new algorithm, Optimistic Adam, which is an optimistic variant of Adam. We apply it to WGAN training on CIFAR10 and observe improved performance in terms of inception score as compared to Adam.
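The contrast between GD and optimistic dynamics on a bilinear zero-sum game can be seen on the smallest possible toy problem. The sketch below is only an illustration under stated assumptions: it uses the scalar objective $f(x, y) = xy$, the unconstrained Euclidean form of the optimistic update (step with twice the current gradient minus the previous one), and a step size of 0.1. None of these choices are taken from the paper, and the function names are hypothetical.

```python
import math


def gda(x, y, lr, steps):
    """Simultaneous gradient descent (on x) / ascent (on y) for f(x, y) = x * y.

    Returns the distance of the last iterate from the equilibrium (0, 0).
    """
    for _ in range(steps):
        gx, gy = y, x  # df/dx = y, df/dy = x
        x, y = x - lr * gx, y + lr * gy
    return math.hypot(x, y)


def ogda(x, y, lr, steps):
    """Optimistic variant: each player steps along 2 * (current gradient)
    minus the previous gradient.

    Assumption: the first step reuses the initial gradient as the 'previous' one.
    """
    prev_gx, prev_gy = y, x
    for _ in range(steps):
        gx, gy = y, x
        x, y = x - lr * (2 * gx - prev_gx), y + lr * (2 * gy - prev_gy)
        prev_gx, prev_gy = gx, gy
    return math.hypot(x, y)


if __name__ == "__main__":
    print("GDA  distance from equilibrium:", gda(1.0, 1.0, 0.1, 2000))
    print("OGDA distance from equilibrium:", ogda(1.0, 1.0, 0.1, 2000))
```

On this toy problem each plain GDA step multiplies the squared distance from the origin by $1 + \eta^2$, so the iterates spiral outward, while the optimistic correction damps the rotation and the last iterate approaches $(0, 0)$.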
Regret minimization is a powerful tool for solving large-scale problems; it was recently used in breakthrough results for large-scale extensive-form game solving. This was achieved by composing simplex regret minimizers into an overall regret-minimization framework for extensive-form game strategy spaces. In this paper we study the general composability of regret minimizers. We derive a calculus for constructing regret minimizers for composite convex sets that are obtained from convexity-preserving operations on simpler convex sets. We show that local regret minimizers for the simpler sets can be combined with additional regret minimizers into an aggregate regret minimizer for the composite set. As one application, we show that the CFR framework can be constructed easily from our framework. We also show ways to include curtailing (constraining) operations into our framework. For one, they enable the construction of a CFR generalization for extensive-form games with general convex strategy constraints that can cut across decision points.
This paper considers the design of non-truthful mechanisms from samples. We identify a parameterized family of mechanisms with strategically simple winner-pays-bid, all-pay, and truthful payment formats. In general (not necessarily downward-closed) single-parameter feasibility environments we prove that the family has low representation and generalization error. Specifically, polynomially many bid samples suffice to identify and run a mechanism that is ε-close in Bayes-Nash equilibrium revenue or welfare to that of the optimal truthful mechanism.
Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off. This is the balance between staying with the option that gave highest payoffs in the past and exploring new options that might give higher payoffs in the future. Although the study of bandit problems dates back to the Thirties, exploration-exploitation trade-offs arise in several modern applications, such as ad placement, website optimization, and packet routing. Mathematically, a multi-armed bandit is defined by the payoff process associated with each option. In this survey, we focus on two extreme cases in which the analysis of regret is particularly simple and elegant: i.i.d. payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, we also analyze some of the most important variants and extensions, such as the contextual bandit model.
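For the i.i.d. case, the exploration-exploitation trade-off can be made concrete with an index policy. The sketch below uses UCB1, a standard choice for this setting; the abstract itself does not name an algorithm, so this is an illustrative pick rather than the survey's presentation. The Bernoulli arm means, the seed, and the function name `ucb1` are all assumptions made for the example.

```python
import math
import random


def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms with the given (hypothetical) means.

    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k     # number of pulls per arm
    totals = [0.0] * k  # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            # Empirical mean (exploitation) plus a confidence bonus that
            # shrinks as an arm is pulled more often (exploration).
            arm = max(
                range(k),
                key=lambda a: totals[a] / pulls[a]
                + math.sqrt(2 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls


if __name__ == "__main__":
    print(ucb1([0.2, 0.8], 2000))
```

With a clear gap between the arms, most pulls should go to the better arm, while the confidence bonus keeps the weaker arm from being abandoned entirely.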
We provide several applications of Optimistic Mirror Descent, an online learning algorithm based on the idea of predictable sequences. First, we recover the Mirror Prox algorithm for offline optimization, prove an extension to Hölder-smooth functions, and apply the results to saddle-point type problems. Next, we prove that a version of Optimistic Mirror Descent (which has a close relation to the Exponential Weights algorithm) can be used by two strongly-uncoupled players in a finite zero-sum matrix game to converge to the minimax equilibrium at the rate of $O((\log T)/T)$. This addresses a question of Daskalakis et al. (2011). Further, we consider a partial information version of the problem. We then apply the results to convex programming and exhibit a simple algorithm for the approximate Max Flow problem.
We study the revenue maximization problem of a seller with n heterogeneous items for sale to a single buyer whose valuation function for sets of items is unknown and drawn from some distribution D. We show that if D is a distribution over subadditive valuations with independent items, then the better of pricing each item separately or pricing only the grand bundle achieves a constant-factor approximation to the revenue of the optimal mechanism. This includes buyers who are k-demand, additive up to a matroid constraint, or additive up to constraints of any downward-closed set system (and whose values for the individual items are sampled independently), as well as buyers who are fractionally subadditive with item multipliers drawn independently. Our proof makes use of the core-tail decomposition framework developed in prior work showing similar results for the significantly simpler class of additive buyers. In the second part of the article, we develop a connection between approximately optimal simple mechanisms and approximate revenue monotonicity with respect to buyers’ valuations. Revenue non-monotonicity is the phenomenon that sometimes strictly increasing buyers’ values for every set can strictly decrease the revenue of the optimal mechanism. Using our main result, we derive a bound on how bad this degradation can be (and dub such a bound a proof of approximate revenue monotonicity); we further show that better bounds on approximate monotonicity imply a better analysis of our simple mechanisms.
For a constant ϵ, we prove a (N) lower bound on the (randomized) communication complexity of ϵ-Nash equilibrium in two-player N × N games. For n-player binary-action games we prove an exp(n) lower bound for the (randomized) communication complexity of (ϵ,ϵ)-weak approximate Nash equilibrium, which is a profile of mixed actions such that at least (1-ϵ)-fraction of the players are ϵ-best replying.
A central problem in extremal graph theory is to estimate, for a given graph $H$, the number of $H$-free graphs on a given set of $n$ vertices. In the case when $H$ is not bipartite, Erdős, Frankl, and Rödl proved that there are $2^{(1+o(1))\text{ex}(n,H)}$ such graphs. In the bipartite case, however, bounds of the form $2^{O(\text{ex}(n,H))}$ have been proven only for relatively few special graphs $H$. As a first attempt at addressing this problem in full generality, we show that such a bound follows merely from a rather natural assumption on the growth rate of $n \mapsto \text{ex}(n,H)$; an analogous statement remains true when $H$ is a uniform hypergraph. Subsequently, we derive several new results, along with most previously known estimates, as simple corollaries of our theorem. At the heart of our proof lies a general supersaturation statement that extends the seminal work of Erdős and Simonovits. The bounds on the number of $H$-free hypergraphs are derived from it using the method of hypergraph containers.
We highlight recent progress in worst-case analysis of welfare in first price auctions. It was shown in [Syrgkanis and Tardos 2013] that in any Bayes-Nash equilibrium of a first-price auction, the expected social welfare is at least a (1 - 1/e) ≈ .63-fraction of optimal. This result uses smoothness, the standard technique for worst-case welfare analysis of games, and is tight if bidders' value distributions are permitted to be correlated. With independent distributions, however, the worst-known example, due to [Hartline et al. 2014], exhibits welfare that is a ≈ .89-fraction of optimal. This gap has persisted in spite of the canonical nature of the first-price auction and the prevalence of the independence assumption. In [Hoy et al. 2018], we improve the worst-case lower bound on first-price auction welfare assuming independently distributed values from (1 - 1/e) to ≈ .743. Notably, the proof of this result eschews smoothness in favor of techniques which exploit independence. This note overviews the new approach, and discusses research directions opened up by the result.
We provide algorithms that learn simple auctions whose revenue is approximately optimal in multi-item multi-bidder settings, for a wide range of bidder valuations including unit-demand, additive, constrained additive, XOS, and subadditive. We obtain our learning results in two settings. The first is the commonly studied setting where sample access to the bidders' distributions over valuations is given, for both regular distributions and arbitrary distributions with bounded support. Here, our algorithms require polynomially many samples in the number of items and bidders. The second is a more general max-min learning setting that we introduce, where we are given "approximate distributions," and we seek to compute a mechanism whose revenue is approximately optimal simultaneously for all "true distributions" that are close to the ones we were given. These results are more general in that they imply the sample-based results, and are also applicable in settings where we have no sample access to the underlying distributions but have estimated them indirectly via market research or by observation of bidder behavior in previously run, potentially non-truthful auctions. All our results hold for valuation distributions satisfying the standard (and necessary) independence-across-items property. They also generalize and improve upon recent works of Goldner and Karlin [25] and Morgenstern and Roughgarden [32], which have provided algorithms that learn approximately optimal multi-item mechanisms in more restricted settings with additive, subadditive and unit-demand valuations using sample access to distributions. We generalize these results to the complete unit-demand, additive, and XOS setting, to i.i.d. subadditive bidders, and to the max-min setting.
Our results are enabled by new uniform convergence bounds for hypothesis classes under product measures. Our bounds result in exponential savings in sample complexity compared to bounds derived by bounding the VC dimension and are of independent interest.
Regularized learning is a fundamental technique in online optimization, machine learning, and many other fields of computer science. A natural question that arises in this context is how regularized learning algorithms behave when pitted against each other. We study a natural formulation of this problem by coupling regularized learning dynamics in zero-sum games. We show that the system's behavior is Poincaré recurrent, implying that almost every trajectory revisits any (arbitrarily small) neighborhood of its starting point infinitely often. This cycling behavior is robust to the agents' choice of regularization mechanism (each agent could be using a different regularizer), to positive-affine transformations of the agents' utilities, and it also persists in the case of networked competition (zero-sum polymatrix games).
Motivated by applications in Optimization, Game Theory, and the training of Generative Adversarial Networks, the convergence properties of first order methods in min-max problems have received extensive study. It has been recognized that they may cycle, and there is no good understanding of their limit points when they do not. When they converge, do they converge to local min-max solutions? We characterize the limit points of two basic first order methods, namely Gradient Descent/Ascent (GDA) and Optimistic Gradient Descent Ascent (OGDA). We show that both dynamics avoid unstable critical points for almost all initializations. Moreover, for small step sizes and under mild assumptions, the set of OGDA-stable critical points is a superset of GDA-stable critical points, which is a superset of local min-max solutions (strict in some cases). The connecting thread is that the behavior of these dynamics can be studied from a dynamical systems perspective.
We establish that algorithmic experiments in zero-sum games fail miserably to confirm the unique, sharp prediction of maxmin equilibration. Contradicting nearly a century of economic thought that treats zero-sum games nearly axiomatically as the exemplar symbol of economic stability, we prove that no meaningful prediction can be made about the day-to-day behavior of online learning dynamics in zero-sum games. Concretely, Multiplicative Weights Updates (MWU) with constant step-size is Lyapunov chaotic in the dual (payoff) space. Simply put, let's assume that an observer asks the agents playing Matching-Pennies whether they prefer Heads or Tails (and by how much in terms of aggregate payoff so far). The range of possible answers consistent with any arbitrarily small set of initial conditions blows up exponentially with time everywhere in the payoff space. This result is robust both algorithmically and game theoretically: 1) Algorithmic robustness: Chaos is robust to agents using any of a general sub-family of Follow-the-Regularized-Leader (FTRL) algorithms, the well-known regret-minimizing dynamics, even when agents mix-and-match dynamics or use different or slowly decreasing step-sizes. 2) Game theoretic robustness: Chaos is robust to all affine variants of zero-sum games (strictly competitive games), network variants with an arbitrarily large number of agents, and even to competitive settings beyond these. Our result is in stark contrast with the time-average convergence of online learning to (approximate) Nash equilibrium, a result widely reported as (weak) convergence to equilibrium.
We provide a unified view of many recent developments in Bayesian mechanism design, including the black-box reductions of Cai, Daskalakis, and Weinberg [in Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science, 2013], simple auctions for additive buyers [S. Hart and N. Nisan, in Proceedings of the 13th ACM Conference on Electronic Commerce, 2012], and posted-price mechanisms for unit-demand buyers [S. Chawla, J. D. Hartline, and R. D. Kleinberg, in Proceedings of the 8th ACM Conference on Electronic Commerce, 2007, pp. 243--251]. Additionally, we show that viewing these three previously disjoint lines of work through the same lens leads to new developments as well. First, we provide a duality framework for Bayesian mechanism design, which naturally accommodates multiple agents and arbitrary objectives/feasibility constraints. Using this, we prove that either a posted-price mechanism or the Vickrey--Clarke--Groves auction with per-bidder entry fees achieves a constant factor of the optimal revenue achievable by a Bayesian Incentive Compatible mechanism whenever buyers are unit-demand or additive, unifying previous breakthroughs of Chawla et al. [in Proceedings of the 42nd ACM Symposium on Theory of Computing, 2010] and Yao [in Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, 2015, pp. 92--109], and improving both approximation ratios (from 30 to 24 and 69 to 8, respectively). Finally, we show that this view also leads to improved structural characterizations in the framework of Cai, Daskalakis, and Weinberg.
In a recent series of papers it has been established that variants of Gradient Descent/Ascent and Mirror Descent exhibit last iterate convergence in convex-concave zero-sum games. Specifically, \cite{DISZ17, LiangS18} show last iterate convergence of the so-called "Optimistic Gradient Descent/Ascent" for the case of \textit{unconstrained} min-max optimization. Moreover, in \cite{Metal} the authors show that Mirror Descent with an extra gradient step displays last iterate convergence for convex-concave problems (both constrained and unconstrained), though their algorithm does not follow the online learning framework; it uses extra information rather than \textit{only} the history to compute the next iteration. In this work, we show that "Optimistic Multiplicative-Weights Update (OMWU)", which follows the no-regret online learning framework, exhibits last iterate convergence locally for convex-concave games, generalizing the results of \cite{DP19} where last iterate convergence of OMWU was shown only for the \textit{bilinear case}. We complement our results with experiments that indicate fast convergence of the method.
We consider the setting where players run the Hedge algorithm or its optimistic variant to play an $n$-action game repeatedly for $T$ rounds. 1) For two-player games, we show that the regret of optimistic Hedge decays at $\tilde{O}(1/T^{5/6})$, improving the previous bound $O(1/T^{3/4})$ by Syrgkanis, Agarwal, Luo and Schapire (NIPS'15). 2) In contrast, we show that the convergence rate of vanilla Hedge is no better than $\tilde{\Omega}(1/\sqrt{T})$, addressing an open question posed in Syrgkanis, Agarwal, Luo and Schapire (NIPS'15). For general $m$-player games, we show that the swap regret of each player decays at rate $\tilde{O}(m^{1/2} (n/T)^{3/4})$ when they combine optimistic Hedge with the classical external-to-internal reduction of Blum and Mansour (JMLR'07). The algorithm can also be modified to achieve the same rate against itself and a rate of $\tilde{O}(\sqrt{n/T})$ against adversaries. Via standard connections, our upper bounds also imply faster convergence to coarse correlated equilibria in two-player games and to correlated equilibria in multiplayer games.
Incentives are key to the success of crowdsourcing, which heavily depends on the level of user participation. This paper designs an incentive mechanism to motivate a heterogeneous crowd of users to actively participate in crowdsourcing campaigns. We cast the problem in a new, asymmetric all-pay contest model with incomplete information, where an arbitrary number n of users exert irrevocable effort to compete for a prize tuple. The prize tuple is an array of prize functions as opposed to a single constant prize typically used by conventional contests. We design an optimal contest that (a) induces the maximum profit (total user effort minus the prize payout) for the crowdsourcer, and (b) ensures that users strictly have an incentive to participate. In stark contrast to intuition and prior related work, our mechanism induces an equilibrium in which heterogeneous users behave independently of one another as if they were in a homogeneous setting. This newly discovered property, which we call strategy autonomy (SA), is of practical significance: it (a) reduces computational and storage complexity by n-fold for each user, (b) increases the crowdsourcer's revenue by counteracting an effort reservation effect existing in asymmetric contests, and (c) neutralizes the (almost universal) law of diminishing marginal returns (DMR). Through an extensive numerical case study, we demonstrate and scrutinize the superior profitability of our mechanism, as well as draw insights into the SA property.
The existence of simple uncoupled no-regret learning dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form (that is, tree-form) games generalize normal-form games by modeling both sequential and simultaneous moves, as well as imperfect information. Because of the sequential nature and presence of private information in the game, correlation in extensive-form games possesses significantly different properties than in normal-form games, many of which are still open research directions. Extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to the classical notion of correlated equilibrium in normal-form games. Compared to the latter, the constraints that define the set of EFCEs are significantly more complex, as the correlation device (a.k.a. mediator) must take into account the evolution of beliefs of each player as they make observations throughout the game. Due to that significant added complexity, the existence of uncoupled learning dynamics leading to an EFCE has remained a challenging open research question for a long time. In this article, we settle that question by giving the first uncoupled no-regret dynamics that converge to the set of EFCEs in $n$-player general-sum extensive-form games with perfect recall. We show that each iterate can be computed in time polynomial in the size of the game tree, and that, when all players play repeatedly according to our learning dynamics, the empirical frequency of play after $T$ game repetitions is an $O(1/\sqrt{T})$-approximate EFCE with high probability, and an EFCE almost surely in the limit.
In this paper we propose a new approach for constructing efficient schemes for non-smooth convex optimization. It is based on a special smoothing technique, which can be applied to functions with explicit max-structure. Our approach can be considered as an alternative to black-box minimization. From the viewpoint of efficiency estimates, we manage to improve the traditional bounds on the number of iterations of the gradient schemes from $O(1/\epsilon^2)$ to $O(1/\epsilon)$, keeping basically the complexity of each iteration unchanged.
In this paper, we analyze the local convergence rate of optimistic mirror descent methods in stochastic variational inequalities, a class of optimization problems with important applications to learning theory and machine learning. Our analysis reveals an intricate relation between the algorithm's rate of convergence and the local geometry induced by the method's underlying Bregman function. We quantify this relation by means of the Legendre exponent, a notion that we introduce to measure the growth rate of the Bregman divergence relative to the ambient norm near a solution. We show that this exponent determines both the optimal step-size policy of the algorithm and the optimal rates attained, explaining in this way the differences observed for some popular Bregman functions (Euclidean projection, negative entropy, fractional power, etc.).
Motivated by applications in Game Theory, Optimization, and Generative Adversarial Networks, recent work of Daskalakis et al. \cite{DISZ17} and follow-up work of Liang and Stokes \cite{LiangS18} have established that a variant of the widely used Gradient Descent/Ascent procedure, called "Optimistic Gradient Descent/Ascent (OGDA)", exhibits last-iterate convergence to saddle points in {\em unconstrained} convex-concave min-max optimization problems. We show that the same holds true in the more general problem of {\em constrained} min-max optimization under a variant of the no-regret Multiplicative-Weights-Update method called "Optimistic Multiplicative-Weights Update (OMWU)". This answers an open question of Syrgkanis et al. \cite{SALS15}. The proof of our result requires fundamentally different techniques from those that exist in the no-regret learning literature and the aforementioned papers. We show that OMWU monotonically improves the Kullback-Leibler divergence of the current iterate to the (appropriately normalized) min-max solution until it enters a neighborhood of the solution. Inside that neighborhood we show that OMWU is locally (asymptotically) stable, converging to the exact solution. We believe that our techniques will be useful in the analysis of the last iterate of other learning algorithms.
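To make the OMWU update concrete, here is a minimal sketch for a bilinear zero-sum game $\min_x \max_y x^\top A y$ over probability vectors. It uses the common form of the update in which each player's weights are multiplied by the exponential of twice the current payoff vector minus the previous one. The function name `omwu`, the Matching Pennies payoff matrix, the step size, and the starting point are assumptions chosen for illustration and do not come from the paper.

```python
import math

def normalize(weights):
    """Rescale positive weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def omwu(A, x, y, eta, steps):
    """Optimistic MWU for min_x max_y x^T A y over probability vectors x, y."""
    n, m = len(A), len(A[0])

    def grad_x(y_vec):  # (A y)_i: loss of the minimizer's action i
        return [sum(A[i][j] * y_vec[j] for j in range(m)) for i in range(n)]

    def grad_y(x_vec):  # (A^T x)_j: payoff of the maximizer's action j
        return [sum(A[i][j] * x_vec[i] for i in range(n)) for j in range(m)]

    # Assumption: the "previous" gradients start equal to the current ones.
    gx_prev, gy_prev = grad_x(y), grad_y(x)
    for _ in range(steps):
        gx, gy = grad_x(y), grad_y(x)  # both computed before either player moves
        x = normalize([x[i] * math.exp(-eta * (2 * gx[i] - gx_prev[i])) for i in range(n)])
        y = normalize([y[j] * math.exp(eta * (2 * gy[j] - gy_prev[j])) for j in range(m)])
        gx_prev, gy_prev = gx, gy
    return x, y

if __name__ == "__main__":
    matching_pennies = [[1.0, -1.0], [-1.0, 1.0]]
    x, y = omwu(matching_pennies, [0.9, 0.1], [0.2, 0.8], eta=0.1, steps=5000)
    print("x =", x, "y =", y)
```

Matching Pennies has a unique equilibrium in which both players mix uniformly, so in this example the last iterate should approach $(1/2, 1/2)$ for each player.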
Motivated by applications in Optimization, Game Theory, and the training of Generative Adversarial Networks, the convergence properties of first order methods in min-max problems have received extensive study. It has been recognized that they may cycle, and there is no good understanding of their limit points when they do not. When they converge, do they converge to local min-max solutions? We characterize the limit points of two basic first order methods, namely Gradient Descent/Ascent (GDA) and Optimistic Gradient Descent/Ascent (OGDA). We show that both dynamics avoid unstable critical points for almost all initializations. Moreover, for small step sizes and under mild assumptions, the set of OGDA-stable critical points is a superset of GDA-stable critical points, which is a superset of local min-max solutions (strict in some cases). The connecting thread is that the behavior of these dynamics can be studied from a dynamical systems perspective.
We show that Optimistic Hedge, a common variant of multiplicative-weights-updates with recency bias, attains ${\rm poly}(\log T)$ regret in multi-player general-sum games. In particular, when every player of the game uses Optimistic Hedge to iteratively update her strategy in response to the history of play so far, then after $T$ rounds of interaction, each player experiences total regret that is ${\rm poly}(\log T)$. Our bound improves, exponentially, the $O(T^{1/2})$ regret attainable by standard no-regret learners in games, the $O(T^{1/4})$ regret attainable by no-regret learners with recency bias (Syrgkanis et al., 2015), and the $O(T^{1/6})$ bound that was recently shown for Optimistic Hedge in the special case of two-player games (Chen & Peng, 2020). A corollary of our bound is that Optimistic Hedge converges to coarse correlated equilibrium in general games at a rate of $\tilde{O}\left(\frac{1}{T}\right)$.
We show that natural classes of regularized learning algorithms with a form of recency bias achieve faster convergence rates to approximate efficiency and to coarse correlated equilibria in multiplayer normal form games. When each player in a game uses an algorithm from our class, their individual regret decays at $O(T^{-3/4})$, while the sum of utilities converges to an approximate optimum at $O(T^{-1})$, an improvement upon the worst case $O(T^{-1/2})$ rates. We show a black-box reduction for any algorithm in the class to achieve $\tilde{O}(T^{-1/2})$ rates against an adversary, while maintaining the faster rates against algorithms in the class. Our results extend those of [Rakhlin and Sridharan 2013] and [Daskalakis et al. 2014], who only analyzed two-player zero-sum games for specific algorithms.
We study the revenue maximization problem of a seller with n heterogeneous items for sale to a single buyer whose valuation function for sets of items is unknown and drawn from some distribution D. We show that if D is a distribution over subadditive valuations with independent items, then the better of pricing each item separately or pricing only the grand bundle achieves a constant-factor approximation to the revenue of the optimal mechanism. This includes buyers who are k-demand, additive up to a matroid constraint, or additive up to constraints of any downwards-closed set system (and whose values for the individual items are sampled independently), as well as buyers who are fractionally subadditive with item multipliers drawn independently. Our proof makes use of the core-tail decomposition framework developed in prior work showing similar results for the significantly simpler class of additive buyers [Li and Yao 2013; Babaioff et al. 2014].
In this letter we briefly survey our main result from [Babaioff et al. 2014]: a simple and approximately revenue-optimal mechanism for a monopolist who wants to sell a variety of items to a single buyer with an additive valuation.
A multi-armed bandit problem - or, simply, a bandit problem - is a sequential allocation problem defined by a set of actions. At each time step, a unit resource is allocated to an action and some observable payoff is obtained. The goal is to maximize the total payoff obtained in a sequence of allocations. The name bandit refers to the colloquial term for a slot machine (a "one-armed bandit" in American slang). In a casino, a sequential allocation problem is obtained when the player is facing many slot machines at once (a "multi-armed bandit"), and must repeatedly choose where to insert the next coin. Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off. This is the balance between staying with the option that gave highest payoffs in the past and exploring new options that might give higher payoffs in the future. Although the study of bandit problems dates back to the 1930s, exploration-exploitation trade-offs arise in several modern applications, such as ad placement, website optimization, and packet routing. Mathematically, a multi-armed bandit is defined by the payoff process associated with each option. In this book, the focus is on two extreme cases in which the analysis of regret is particularly simple and elegant: independent and identically distributed payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, it also analyzes some of the most important variants and extensions, such as the contextual bandit model. This monograph is an ideal reference for students and researchers with an interest in bandit problems.
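The exploration-exploitation trade-off described in this abstract can be sketched with UCB1, a standard index policy for the i.i.d. payoff case. The abstract itself does not name a specific algorithm, so the choice of UCB1, the Bernoulli arms, the arm means, and the function name `ucb1` are all assumptions made for this example.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms and return how often each arm was pulled."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    k = len(arm_means)
    counts = [0] * k    # number of pulls per arm
    totals = [0.0] * k  # summed payoff per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: pull every arm once
        else:
            # Empirical mean (exploitation) plus a confidence bonus (exploration).
            arm = max(
                range(k),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts

if __name__ == "__main__":
    print(ucb1([0.1, 0.5, 0.9], horizon=2000))
```

The confidence bonus shrinks as an arm is pulled more often, so rarely tried arms are still revisited occasionally, while most pulls go to the arm with the highest empirical mean.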