Author Description

Login to generate an author description

Ask a Question About This Mathematician

Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. … Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. In many of these application domains the learner may be constrained by one or more supply (or budget) limits, in addition to the customary limitation on the time horizon. The literature lacks a general model encompassing these sorts of problems. We introduce such a model, called "bandits with knapsacks", that combines aspects of stochastic integer programming with online learning. A distinctive feature of our problem, in comparison to the existing regret-minimization literature, is that the optimal policy for a given latent distribution may significantly outperform the policy that plays the optimal fixed arm. Consequently, achieving sublinear regret in the bandits-with-knapsacks problem is significantly more challenging than in conventional bandit problems. We present two algorithms whose reward is close to the information-theoretic optimum: one is based on a novel "balanced exploration" paradigm, while the other is a primal-dual algorithm that uses multiplicative updates. Further, we prove that the regret achieved by both algorithms is optimal up to polylogarithmic factors. We illustrate the generality of the problem by presenting applications in a number of different domains including electronic commerce, routing, and scheduling. As one example of a concrete application, we consider the problem of dynamic posted pricing with limited supply and obtain the first algorithm whose regret, with respect to the optimal dynamic policy, is sublinear in the supply.
We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for … We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that handles constrained resources other than time, and improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, e.g. Dudik et al. (UAI'11)) and bandits with resource constraints (bandits with knapsacks, Badanidiyuru et al. (FOCS'13)), and prove a regret guarantee with near-optimal statistical properties.
We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for … We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that handles constrained resources other than time, and improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, e.g. Dudik et al. (UAI'11)) and bandits with resource constraints (bandits with knapsacks, Badanidiyuru et al. (FOCS'13)), and prove a regret guarantee with near-optimal statistical properties.
Previous chapter Next chapter Full AccessProceedings Proceedings of the 2016 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)Locally Adaptive Optimization: Adaptive Seeding for Monotone Submodular FunctionsAshwinkumar Badanidiyuru, Christos Papadimitriou, Aviad Rubinstein, … Previous chapter Next chapter Full AccessProceedings Proceedings of the 2016 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)Locally Adaptive Optimization: Adaptive Seeding for Monotone Submodular FunctionsAshwinkumar Badanidiyuru, Christos Papadimitriou, Aviad Rubinstein, Lior Seeman, and Yaron SingerAshwinkumar Badanidiyuru, Christos Papadimitriou, Aviad Rubinstein, Lior Seeman, and Yaron Singerpp.414 - 429Chapter DOI:https://doi.org/10.1137/1.9781611974331.ch31PDFBibTexSections ToolsAdd to favoritesExport CitationTrack CitationsEmail SectionsAboutAbstract The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, among neighbors of those nodes as they become accessible in order to maximize a global objective function. More generally, adaptive seeding is a stochastic optimization framework where the choices in the first stage affect the realizations in the second stage, over which we aim to optimize. Our main result is a (1 – 1/e)2-approximation for the adaptive seeding problem for any monotone submodular function. While adaptive policies are often approximated via non-adaptive policies, our algorithm is based on a novel method we call locally-adaptive policies. These policies combine a non-adaptive global structure, with local adaptive optimizations. This method enables the (1–1/e)2-approximation for general monotone submodular functions and circumvents some of the impossibilities associated with non-adaptive policies. We also introduce a fundamental problem in submodular optimization that may be of independent interest: given a ground set of elements where every element appears with some small probability, find a set of expected size at most k that has the highest expected value over the realization of the elements. We show a surprising result: there are classes of monotone submodular functions (including coverage) that can be approximated almost optimally as the probability vanishes. For general monotone submodular functions we show via a reduction from Planted-Clique that approximations for this problem are not likely to be obtainable. This optimization problem is an important tool for adaptive seeding via non-adaptive policies, and its hardness motivates the introduction of locally-adaptive policies we use in the main result. Previous chapter Next chapter RelatedDetails Published:2016eISBN:978-1-61197-433-1 https://doi.org/10.1137/1.9781611974331Book Series Name:ProceedingsBook Code:PRDA16Book Pages:viii + 2106
Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform – partially hiding … Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform – partially hiding some information could be more beneficial for the platform's revenue. In this paper, we examine the following basic question in the context of second-price ad auctions: how should an ad platform optimally reveal information about the ad opportunity to the advertisers in order to maximize revenue? We consider a model in which bidders' valuations depend on a random state of the ad opportunity. Different from previous work, we focus on a more practical, and challenging, situation where the space of possible realizations of ad opportunities is extremely large. We thus focus on developing algorithms whose running time is polynomial in the number of bidders, but is independent of the number of ad opportunity realizations.We assume that the auctioneer can commit to a signaling scheme to reveal noisy information about the realized state of the ad opportunity, and examine the auctioneer's algorithmic question of designing the optimal signaling scheme. We first consider that the auctioneer is restricted to send a public signal to all bidders. As a warm-up, we start with a basic (though less realistic) setting in which the auctioneer knows the bidders' valuations, and show that an e-optimal scheme can be implemented in time polynomial in the number of bidders and 1/∊. We then move to a well-motivated Bayesian valuation setting in which the auctioneer and bidders both have private information, and present two results. First, we exhibit a characterization result regarding approximately optimal schemes and prove that any constant-approximate public signaling scheme must use exponentially many signals. Second, we present a "simple" public signaling scheme that serves as a constant approximation under mild assumptions.Finally, we initiate an exploration on the power of being able to send different signals privately to different bidders. In the basic setting where the auctioneer knows bidders' valuations, we exhibit a polynomial-time private scheme that extracts almost full surplus even in the worst Bayes Nash equilibrium. This illustrates the surprising power of private signaling schemes in extracting revenue.
We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a … We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a (1-ε)-approximation to the maximum-cardinality union of k sets, in running time $O(f(ε,k,d)⋅ poly(n)) where n is the problem size, d is the VC-dimension of the set system, and f(ε,k,d) is exponential in (kd/ε)c for some constant c. We complement this positive result by showing that the function f(ε,k,d) in the running-time bound cannot be replaced by a function depending only on (ε,d) or on (k,d), under standard complexity assumptions. We also present an improved upper bound on the approximation ratio of the greedy algorithm in special cases of the problem, including when the sets have bounded cardinality and when they are two-dimensional halfspaces. Complementing these positive results, we show that when the sets are four-dimensional halfspaces neither the greedy algorithm nor local search is capable of improving the worst-case approximation ratio of 1-1/e that the greedy algorithm achieves on arbitrary instances of maximum coverage.
We study combinatorial procurement auctions, where a buyer with a valuation function v and budget B wishes to buy a set of items. Each item i has a cost ci … We study combinatorial procurement auctions, where a buyer with a valuation function v and budget B wishes to buy a set of items. Each item i has a cost ci and the buyer is interested in a set S that maximizes v(S) subject to ∑i∈Sci ≤ β. Special cases of combinatorial procurement auctions are well-studied problems from submodular optimization. In particular, when the costs are all equal (cardinality constraint), a classic result by Nemhauser et al shows that the greedy algorithm provides an e/e-1 approximation.
Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to … Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability.
In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular … In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state of the art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a $k$-matchoid and $\ell$-knapsack constraint (for $\ell\leq k$), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an $\epsilon$ error it is guaranteed that we have found a feasible set with a $2(k+1+\epsilon)$-approximation factor which can indeed be further improved to $(k+1+\epsilon)$ by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds and Yelp business locations, and a set cover problem.
Motivated by online decision making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable … Motivated by online decision making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability. We show that the resulting online algorithms have [Formula: see text] (approximate) regret under the full information setting. We further introduce a bandit extension of Blackwell approachability that we call Bandit Blackwell approachability. We leverage this notion to transform greedy robust offline algorithms into a [Formula: see text] (approximate) regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our offline-to-online transformation to several problems at the intersection of revenue management, market design, and online optimization, including product ranking optimization in online platforms, reserve price optimization in auctions, and submodular maximization. We also extend our reduction to greedy-like first-order methods used in continuous optimization, such as those used for maximizing continuous strong DR monotone submodular functions subject to convex constraints. We show that our transformation, when applied to these applications, leads to new regret bounds or improves the current known bounds. We complement our theoretical studies by conducting numerical simulations for two of our applications, in both of which we observe that the numerical performance of our transformations outperforms the theoretical guarantees in practical instances. This paper was accepted by George Shanthikumar, data science. Funding: R. Niazadeh was supported by research funding from University of Chicago Booth School of Business. N. Golrezaei was supported in part by the Young Investigator Program Award from the Office of Naval Research [N00014-21-1-2776] and the MIT Research Support Award. Supplemental Material: The Online Appendix is available at https://doi.org/10.1287/mnsc.2022.4558
Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In this paper, … Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In this paper, we develop the first linear-time algorithm for maximizing a general monotone submodular function subject to a cardinality constraint. We show that our randomized algorithm, STOCHASTIC-GREEDY, can achieve a $(1-1/e-\varepsilon)$ approximation guarantee, in expectation, to the optimum solution in time linear in the size of the data and independent of the cardinality constraint. We empirically demonstrate the effectiveness of our algorithm on submodular functions arising in data summarization, including training large-scale kernel methods, exemplar-based clustering, and sensor placement. We observe that STOCHASTIC-GREEDY practically achieves the same utility value as lazy greedy but runs much faster. More surprisingly, we observe that in many practical scenarios STOCHASTIC-GREEDY does not evaluate the whole fraction of data points even once and still achieves indistinguishable results compared to lazy greedy.
We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ … We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ and the buyer is interested in a set $S$ that maximizes $v(S)$ subject to $\Sigma_{i\in S}c_i\leq B$. Special cases of combinatorial procurement auctions are classical problems from submodular optimization. In particular, when the costs are all equal (\emph{cardinality constraint}), a classic result by Nemhauser et al shows that the greedy algorithm provides an $\frac e {e-1}$ approximation. Motivated by many papers that utilize demand queries to elicit the preferences of agents in economic settings, we develop algorithms that guarantee improved approximation ratios in the presence of demand oracles. We are able to break the $\frac e {e-1}$ barrier: we present algorithms that use only polynomially many demand queries and have approximation ratios of $\frac 9 8+\epsilon$ for the general problem and $\frac 9 8$ for maximization subject to a cardinality constraint. We also consider the more general class of subadditive valuations. We present algorithms that obtain an approximation ratio of $2+\epsilon$ for the general problem and 2 for maximization subject to a cardinality constraint. We guarantee these approximation ratios even when the valuations are non-monotone. We show that these ratios are essentially optimal, in the sense that for any constant $\epsilon>0$, obtaining an approximation ratio of $2-\epsilon$ requires exponentially many demand queries.
In this work, we investigate the problem of how to bid in repeated contextual first price auctions for a single learner (the bidder). Concretely, at each time t, the learner … In this work, we investigate the problem of how to bid in repeated contextual first price auctions for a single learner (the bidder). Concretely, at each time t, the learner receives a context and decides the bid based on historical information and xt. We assume that the maximum bid of all the others follows a linear model, i.e., mt = ⟨α0, xt⟩ + zt, where is unknown to the learner and zt is randomly sampled from a noise distribution with log-concave density. In this work, we consider both binary feedback (the learner can only observe whether she wins or not) and full information feedback (the learner can observe mt) models of the learner. For binary feedback, when the noise distribution is (partially) known, we propose a bidding algorithm that achieves at most regret, where Δ is a constant reserve in first price auctions.1 For the full information feedback with unknown noise distribution, we provide an algorithm that achieves regret at most . Our approach combines an estimator for log-concave density functions and the Maximum Likelihood Estimation (MLE) method to learn the noise distribution and linear weight α0 simultaneously. We complement our results with a lower bound result such that any bidding policy in a broad class must achieve regret at least , even when the learner receives the full information feedback and is known.
In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen … In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction design.
We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a … We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a $(1-\eps)$-approximation to the maximum-cardinality union of $k$ sets, in running time $O(f(\eps,k,d)\cdot poly(n))$ where $n$ is the problem size, $d$ is the VC-dimension of the set system, and $f(\eps,k,d)$ is exponential in $(kd/\eps)^c$ for some constant $c$. We complement this positive result by showing that the function $f(\eps,k,d)$ in the running-time bound cannot be replaced by a function depending only on $(\eps,d)$ or on $(k,d)$, under standard complexity assumptions. We also present an improved upper bound on the approximation ratio of the greedy algorithm in special cases of the problem, including when the sets have bounded cardinality and when they are two-dimensional halfspaces. Complementing these positive results, we show that when the sets are four-dimensional halfspaces neither the greedy algorithm nor local search is capable of improving the worst-case approximation ratio of $1-1/e$ that the greedy algorithm achieves on arbitrary instances of maximum coverage.
Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the … Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the revenue of the company and the welfare of the advertisers and the users. Therefore, any change requires careful evaluation of its potential impacts. Currently, such impacts are often evaluated by running simulations or small controlled experiments. This, however, misses the important factor that the advertisers respond to changes. Our goal is to build a theoretical framework for predicting the actions of an agent (the advertiser) that is optimizing her actions in an uncertain environment. We model this problem using a variant of the multi-armed bandit setting where playing an arm is costly. The cost of each arm changes over time and is publicly observable. The value of playing an arm is drawn stochastically from a static distribution and is observed by the agent and not by us. We, however, observe the actions of the agent. Our main result is that assuming the agent is playing a strategy with a regret of at most $f(T)$ within the first $T$ rounds, we can learn to play the multi-armed bandits game (without observing the rewards) in such a way that the regret of our selected actions is at most $O(k^4(f(T)+1)\log(T))$, where $k$ is the number of arms.
Predicting the expected value or number of post-click conversions (purchases or other events) is a key task in performance-based digital advertising. In training a conversion optimizer model, one of the … Predicting the expected value or number of post-click conversions (purchases or other events) is a key task in performance-based digital advertising. In training a conversion optimizer model, one of the most crucial aspects is handling delayed feedback with respect to conversions, which can happen multiple times with varying delay. This task is difficult, as the delay distribution is different for each advertiser, is long-tailed, often does not follow any particular class of parametric distributions, and can change over time. We tackle these challenges using an unbiased estimation model based on three core ideas. The first idea is to split the label as a sum of labels with different delay buckets, each of which trains only on mature label, the second is to use thermometer encoding to increase accuracy and reduce inference cost, and the third is to use auxiliary information to increase the stability of the model and to handle drift in the distribution.
Incrementality, which is used to measure the causal effect of showing an ad to a potential customer (e.g. a user in an internet platform) versus not, is a central object … Incrementality, which is used to measure the causal effect of showing an ad to a potential customer (e.g. a user in an internet platform) versus not, is a central object for advertisers in online advertising platforms. This paper investigates the problem of how an advertiser can learn to optimize the bidding sequence in an online manner \emph{without} knowing the incrementality parameters in advance. We formulate the offline version of this problem as a specially structured episodic Markov Decision Process (MDP) and then, for its online learning counterpart, propose a novel reinforcement learning (RL) algorithm with regret at most $\widetilde{O}(H^2\sqrt{T})$, which depends on the number of rounds $H$ and number of episodes $T$, but does not depend on the number of actions (i.e., possible bids). A fundamental difference between our learning problem from standard RL problems is that the realized reward feedback from conversion incrementality is \emph{mixed} and \emph{delayed}. To handle this difficulty we propose and analyze a novel pairwise moment-matching algorithm to learn the conversion incrementality, which we believe is of independent of interest.
Standard contextual bandit problem assumes that all the relevant contexts are observed before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short when dealing with problems … Standard contextual bandit problem assumes that all the relevant contexts are observed before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short when dealing with problems in which valuable additional context can be observed after arm selection. For example, content recommendation platforms like Youtube, Instagram, Tiktok also observe valuable follow-up information pertinent to the user's reward after recommendation (e.g., how long the user stayed, what is the user's watch speed, etc.). To improve online learning efficiency in these applications, we study a novel contextual bandit problem with post-serving contexts and design a new algorithm, poLinUCB, that achieves tight regret under standard assumptions. Core to our technical proof is a robustified and generalized version of the well-known Elliptical Potential Lemma (EPL), which can accommodate noise in data. Such robustification is necessary for tackling our problem, and we believe it could also be of general interest. Extensive empirical tests on both synthetic and real-world datasets demonstrate the significant benefit of utilizing post-serving contexts as well as the superior performance of our algorithm over the state-of-the-art approaches.
Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform. In this paper, … Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform. In this paper, we examine the following basic question in the context of second-price ad auctions: how should an ad platform optimally reveal information about the ad opportunity to the advertisers in order to maximize revenue? We consider a model in which bidders' valuations depend on a random state of the ad opportunity. Different from previous work, we focus on a more practical, and challenging, situation where the space of possible realizations of ad opportunities is extremely large. We thus focus on developing algorithms whose running time is independent of the number of ad opportunity realizations. We examine the auctioneer's algorithmic question of designing the optimal signaling scheme. When the auctioneer is restricted to send a public signal to all bidders, we focus on a well-motivated Bayesian valuation setting in which the auctioneer and bidders both have private information, and present two main results: 1. we exhibit a characterization result regarding approximately optimal schemes and prove that any constant-approximate public signaling scheme must use exponentially many signals; 2. we present a "simple" public signaling scheme that serves as a constant approximation under mild assumptions. We then initiate an exploration on the power of being able to send different signals privately to different bidders. Here we examine a basic setting where the auctioneer knows bidders' valuations, and exhibit a polynomial-time private scheme that extracts almost full surplus even in the worst Bayes Nash equilibrium. This illustrates the surprising power of private signaling schemes in extracting revenue.
In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular … In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state of the art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a $k$-matchoid and $\ell$-knapsack constraint (for $\ell\leq k$), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an $\epsilon$ error it is guaranteed that we have found a feasible set with a $2(k+1+\epsilon)$-approximation factor which can indeed be further improved to $(k+1+\epsilon)$ by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds and Yelp business locations, and a set cover problem.
We propose and study a novel mechanism design setup where each bidder holds two kinds of private information: (1) type variable, which can be misreported; (2) information variable, which the … We propose and study a novel mechanism design setup where each bidder holds two kinds of private information: (1) type variable, which can be misreported; (2) information variable, which the bidder may want to conceal or partially reveal, but importantly, not to misreport. We refer to bidders with such behaviors as strategically reticent bidders. Among others, one direct motivation of our model is the ad auction in which many ad platforms today elicit from each bidder not only their private value per conversion but also their private information about Internet users (e.g., user activities on the advertiser's websites) in order to improve the platform's estimation of conversion rates. We show that in this new setup, it is still possible to design mechanisms that are both Incentive and Information Compatible (IIC). We develop two different black-box transformations, which convert any mechanism $\mathcal{M}$ for classic bidders to a mechanism $\bar{\mathcal{M}}$ for strategically reticent bidders, based on either outcome of expectation or expectation of outcome, respectively. We identify properties of the original mechanism $\mathcal{M}$ under which the transformation leads to IIC mechanisms $\bar{\mathcal{M}}$. Interestingly, as corollaries of these results, we show that running VCG with bidders' expected values maximizes welfare, whereas the mechanism using expected outcome of Myerson's auction maximizes revenue. Finally, we study how regulation on the auctioneer's usage of information can lead to more robust mechanisms.
Classic mechanism design often assumes that a bidder's action is restricted to report a type or a signal, possibly untruthfully. In today's digital economy, bidders are holding increasing amount of … Classic mechanism design often assumes that a bidder's action is restricted to report a type or a signal, possibly untruthfully. In today's digital economy, bidders are holding increasing amount of private information about the auctioned items. And due to legal or ethical concerns, they would demand to reveal partial but truthful information, as opposed to report untrue signal or misinformation. To accommodate such bidder behaviors in auction design, we propose and study a novel mechanism design setup where each bidder holds two kinds of information: (1) private \emph{value type}, which can be misreported; (2) private \emph{information variable}, which the bidder may want to conceal or partially reveal, but importantly, \emph{not} to misreport. We show that in this new setup, it is still possible to design mechanisms that are both \emph{Incentive and Information Compatible} (IIC). We develop two different black-box transformations, which convert any mechanism $\mathcal{M}$ for classic bidders to a mechanism $\mathcal{M}'$ for strategically reticent bidders, based on either outcome of expectation or expectation of outcome, respectively. We identify properties of the original mechanism $\mathcal{M}$ under which the transformation leads to IIC mechanisms $\mathcal{M}'$. Interestingly, as corollaries of these results, we show that running VCG with expected bidder values maximizes welfare whereas the mechanism using expected outcome of Myerson's auction maximizes revenue. Finally, we study how regulation on the auctioneer's usage of information may lead to more robust mechanisms.
Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the … Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the revenue of the company and the welfare of the advertisers and the users. Therefore, any change requires careful evaluation of its potential impacts. Currently, such impacts are often evaluated by running simulations or small controlled experiments. This, however, misses the important factor that the advertisers respond to changes. Our goal is to build a theoretical framework for predicting the actions of an agent (the advertiser) that is optimizing her actions in an uncertain environment. We model this problem using a variant of the multi-armed bandit setting where playing an arm is costly. The cost of each arm changes over time and is publicly observable. The value of playing an arm is drawn stochastically from a static distribution and is observed by the agent and not by us. We, however, observe the actions of the agent. Our main result is that assuming the agent is playing a strategy with a regret of at most $f(T)$ within the first $T$ rounds, we can learn to play the multi-armed bandits game (without observing the rewards) in such a way that the regret of our selected actions is at most $O(k^4(f(T)+1)\log(T))$, where $k$ is the number of arms.
In this paper, we investigate the problem about how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in the first price … In this paper, we investigate the problem about how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in the first price auctions: at each time $t$, the learner observes a context $x_t\in \mathbb{R}^d$ and decides the bid based on historical information and $x_t$. We assume a structured linear model of the maximum bid of all the others $m_t = \alpha_0\cdot x_t + z_t$, where $\alpha_0\in \mathbb{R}^d$ is unknown to the learner and $z_t$ is randomly sampled from a noise distribution $\mathcal{F}$ with log-concave density function $f$. We consider both \emph{binary feedback} (the learner can only observe whether she wins or not) and \emph{full information feedback} (the learner can observe $m_t$) at the end of each time $t$. For binary feedback, when the noise distribution $\mathcal{F}$ is known, we propose a bidding algorithm, by using maximum likelihood estimation (MLE) method to achieve at most $\widetilde{O}(\sqrt{\log(d) T})$ regret. Moreover, we generalize this algorithm to the setting with binary feedback and the noise distribution is unknown but belongs to a parametrized family of distributions. For the full information feedback with \emph{unknown} noise distribution, we provide an algorithm that achieves regret at most $\widetilde{O}(\sqrt{dT})$. Our approach combines an estimator for log-concave density functions and then MLE method to learn the noise distribution $\mathcal{F}$ and linear weight $\alpha_0$ simultaneously. We also provide a lower bound result such that any bidding policy in a broad class must achieve regret at least $\Omega(\sqrt{T})$, even when the learner receives the full information feedback and $\mathcal{F}$ is known.
Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to … Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability. We show that the resulting online algorithms have $O(\sqrt{T})$ (approximate) regret under the full information setting. We further introduce a bandit extension of Blackwell approachability that we call Bandit Blackwell approachability. We leverage this notion to transform greedy robust offline algorithms into a $O(T^{2/3})$ (approximate) regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our offline-to-online transformation to several problems at the intersection of revenue management, market design, and online optimization, including product ranking optimization in online platforms, reserve price optimization in auctions, and submodular maximization. We also extend our reduction to greedy-like first order methods used in continuous optimization, such as those used for maximizing continuous strong DR monotone submodular functions subject to convex constraints. We show that our transformation, when applied to these applications, leads to new regret bounds or improves the current known bounds. We complement our theoretical studies by conducting numerical simulations for two of our applications, in both of which we observe that the numerical performance of our transformations outperforms the theoretical guarantees in practical instances.
We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ … We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ and the buyer is interested in a set $S$ that maximizes $v(S)$ subject to $\Sigma_{i\in S}c_i\leq B$. Special cases of combinatorial procurement auctions are classical problems from submodular optimization. In particular, when the costs are all equal (\emph{cardinality constraint}), a classic result by Nemhauser et al shows that the greedy algorithm provides an $\frac e {e-1}$ approximation. Motivated by many papers that utilize demand queries to elicit the preferences of agents in economic settings, we develop algorithms that guarantee improved approximation ratios in the presence of demand oracles. We are able to break the $\frac e {e-1}$ barrier: we present algorithms that use only polynomially many demand queries and have approximation ratios of $\frac 9 8+\epsilon$ for the general problem and $\frac 9 8$ for maximization subject to a cardinality constraint. We also consider the more general class of subadditive valuations. We present algorithms that obtain an approximation ratio of $2+\epsilon$ for the general problem and 2 for maximization subject to a cardinality constraint. We guarantee these approximation ratios even when the valuations are non-monotone. We show that these ratios are essentially optimal, in the sense that for any constant $\epsilon>0$, obtaining an approximation ratio of $2-\epsilon$ requires exponentially many demand queries.
We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a … We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a $(1-\eps)$-approximation to the maximum-cardinality union of $k$ sets, in running time $O(f(\eps,k,d)\cdot poly(n))$ where $n$ is the problem size, $d$ is the VC-dimension of the set system, and $f(\eps,k,d)$ is exponential in $(kd/\eps)^c$ for some constant $c$. We complement this positive result by showing that the function $f(\eps,k,d)$ in the running-time bound cannot be replaced by a function depending only on $(\eps,d)$ or on $(k,d)$, under standard complexity assumptions. We also present an improved upper bound on the approximation ratio of the greedy algorithm in special cases of the problem, including when the sets have bounded cardinality and when they are two-dimensional halfspaces. Complementing these positive results, we show that when the sets are four-dimensional halfspaces neither the greedy algorithm nor local search is capable of improving the worst-case approximation ratio of $1-1/e$ that the greedy algorithm achieves on arbitrary instances of maximum coverage.
The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, … The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, among neighbors of those nodes as they become accessible in order to maximize a global objective function. More generally, adaptive seeding is a stochastic optimization framework where the choices in the first stage affect the realizations in the second stage, over which we aim to optimize. Our main result is a $(1-1/e)^2$-approximation for the adaptive seeding problem for any monotone submodular function. While adaptive policies are often approximated via non-adaptive policies, our algorithm is based on a novel method we call \emph{locally-adaptive} policies. These policies combine a non-adaptive global structure, with local adaptive optimizations. This method enables the $(1-1/e)^2$-approximation for general monotone submodular functions and circumvents some of the impossibilities associated with non-adaptive policies. We also introduce a fundamental problem in submodular optimization that may be of independent interest: given a ground set of elements where every element appears with some small probability, find a set of expected size at most $k$ that has the highest expected value over the realization of the elements. We show a surprising result: there are classes of monotone submodular functions (including coverage) that can be approximated almost optimally as the probability vanishes. For general monotone submodular functions we show via a reduction from \textsc{Planted-Clique} that approximations for this problem are not likely to be obtainable. This optimization problem is an important tool for adaptive seeding via non-adaptive policies, and its hardness motivates the introduction of \emph{locally-adaptive} policies we use in the main result.
We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP). In particular, we leverage the trade-offs between bias and variance … We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP). In particular, we leverage the trade-offs between bias and variance to construct better label randomizers depending on a privately estimated prior distribution over the labels. We demonstrate that these randomizers achieve state-of-the-art privacy-utility trade-offs on several datasets, highlighting the importance of reducing bias when training neural networks with label DP. We also provide theoretical results shedding light on the structural properties of the optimal unbiased randomizers.
Due to the rise of privacy concerns, in many practical applications the training data is aggregated before being shared with the learner, in order to protect privacy of users' sensitive … Due to the rise of privacy concerns, in many practical applications the training data is aggregated before being shared with the learner, in order to protect privacy of users' sensitive responses. In an aggregate learning framework, the dataset is grouped into bags of samples, where each bag is available only with an aggregate response, providing a summary of individuals' responses in that bag. In this paper, we study two natural loss functions for learning from aggregate responses: bag-level loss and the instance-level loss. In the former, the model is learnt by minimizing a loss between aggregate responses and aggregate model predictions, while in the latter the model aims to fit individual predictions to the aggregate responses. In this work, we show that the instance-level loss can be perceived as a regularized form of the bag-level loss. This observation lets us compare the two approaches with respect to bias and variance of the resulting estimators, and introduce a novel interpolating estimator which combines the two approaches. For linear regression tasks, we provide a precise characterization of the risk of the interpolating estimator in an asymptotic regime where the size of the training set grows in proportion to the features dimension. Our analysis allows us to theoretically understand the effect of different factors, such as bag size on the model prediction risk. In addition, we propose a mechanism for differentially private learning from aggregate responses and derive the optimal bag size in terms of prediction risk-privacy trade-off. We also carry out thorough experiments to corroborate our theory and show the efficacy of the interpolating estimator.
In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen … In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction design.
Motivated by online retail, we consider the problem of selling one item (e.g., an ad slot) to two non-excludable buyers (say, a merchant and a brand). This problem captures, for … Motivated by online retail, we consider the problem of selling one item (e.g., an ad slot) to two non-excludable buyers (say, a merchant and a brand). This problem captures, for example, situations where a merchant and a brand cooperatively bid in an auction to advertise a product, and both benefit from the ad being shown. A mechanism collects bids from the two and decides whether to allocate and which payments the two parties should make. This gives rise to intricate incentive compatibility constraints, e.g., on how to split payments between the two parties. We approach the problem of finding a revenue-maximizing incentive-compatible mechanism from an online learning perspective; this poses significant technical challenges. First, the action space (the class of all possible mechanisms) is huge; second, the function that maps mechanisms to revenue is highly irregular, ruling out standard discretization-based approaches. In the stochastic setting, we design an efficient learning algorithm achieving a regret bound of $O(T^{3/4})$. Our approach is based on an adaptive discretization scheme of the space of mechanisms, as any non-adaptive discretization fails to achieve sublinear regret. In the adversarial setting, we exploit the non-Lipschitzness of the problem to prove a strong negative result, namely that no learning algorithm can achieve more than half of the revenue of the best fixed mechanism in hindsight. We then consider the $\sigma$-smooth adversary; we construct an efficient learning algorithm that achieves a regret bound of $O(T^{2/3})$ and builds on a succinct encoding of exponentially many experts. Finally, we prove that no learning algorithm can achieve less than $\Omega(\sqrt T)$ regret in both the stochastic and the smooth setting, thus narrowing the range where the minimax regret rates for these two problems lie.
Motivated by online retail, we consider the problem of selling one item (e.g., an ad slot) to two non-excludable buyers (say, a merchant and a brand). This problem captures, for … Motivated by online retail, we consider the problem of selling one item (e.g., an ad slot) to two non-excludable buyers (say, a merchant and a brand). This problem captures, for example, situations where a merchant and a brand cooperatively bid in an auction to advertise a product, and both benefit from the ad being shown. A mechanism collects bids from the two and decides whether to allocate and which payments the two parties should make. This gives rise to intricate incentive compatibility constraints, e.g., on how to split payments between the two parties. We approach the problem of finding a revenue-maximizing incentive-compatible mechanism from an online learning perspective; this poses significant technical challenges. First, the action space (the class of all possible mechanisms) is huge; second, the function that maps mechanisms to revenue is highly irregular, ruling out standard discretization-based approaches. In the stochastic setting, we design an efficient learning algorithm achieving a regret bound of $O(T^{3/4})$. Our approach is based on an adaptive discretization scheme of the space of mechanisms, as any non-adaptive discretization fails to achieve sublinear regret. In the adversarial setting, we exploit the non-Lipschitzness of the problem to prove a strong negative result, namely that no learning algorithm can achieve more than half of the revenue of the best fixed mechanism in hindsight. We then consider the $\sigma$-smooth adversary; we construct an efficient learning algorithm that achieves a regret bound of $O(T^{2/3})$ and builds on a succinct encoding of exponentially many experts. Finally, we prove that no learning algorithm can achieve less than $\Omega(\sqrt T)$ regret in both the stochastic and the smooth setting, thus narrowing the range where the minimax regret rates for these two problems lie.
In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen … In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction design.
In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen … In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction design.
Due to the rise of privacy concerns, in many practical applications the training data is aggregated before being shared with the learner, in order to protect privacy of users' sensitive … Due to the rise of privacy concerns, in many practical applications the training data is aggregated before being shared with the learner, in order to protect privacy of users' sensitive responses. In an aggregate learning framework, the dataset is grouped into bags of samples, where each bag is available only with an aggregate response, providing a summary of individuals' responses in that bag. In this paper, we study two natural loss functions for learning from aggregate responses: bag-level loss and the instance-level loss. In the former, the model is learnt by minimizing a loss between aggregate responses and aggregate model predictions, while in the latter the model aims to fit individual predictions to the aggregate responses. In this work, we show that the instance-level loss can be perceived as a regularized form of the bag-level loss. This observation lets us compare the two approaches with respect to bias and variance of the resulting estimators, and introduce a novel interpolating estimator which combines the two approaches. For linear regression tasks, we provide a precise characterization of the risk of the interpolating estimator in an asymptotic regime where the size of the training set grows in proportion to the features dimension. Our analysis allows us to theoretically understand the effect of different factors, such as bag size on the model prediction risk. In addition, we propose a mechanism for differentially private learning from aggregate responses and derive the optimal bag size in terms of prediction risk-privacy trade-off. We also carry out thorough experiments to corroborate our theory and show the efficacy of the interpolating estimator.
In this work, we investigate the problem of how to bid in repeated contextual first price auctions for a single learner (the bidder). Concretely, at each time t, the learner … In this work, we investigate the problem of how to bid in repeated contextual first price auctions for a single learner (the bidder). Concretely, at each time t, the learner receives a context and decides the bid based on historical information and xt. We assume that the maximum bid of all the others follows a linear model, i.e., mt = ⟨α0, xt⟩ + zt, where is unknown to the learner and zt is randomly sampled from a noise distribution with log-concave density. In this work, we consider both binary feedback (the learner can only observe whether she wins or not) and full information feedback (the learner can observe mt) models of the learner. For binary feedback, when the noise distribution is (partially) known, we propose a bidding algorithm that achieves at most regret, where Δ is a constant reserve in first price auctions.1 For the full information feedback with unknown noise distribution, we provide an algorithm that achieves regret at most . Our approach combines an estimator for log-concave density functions and the Maximum Likelihood Estimation (MLE) method to learn the noise distribution and linear weight α0 simultaneously. We complement our results with a lower bound result such that any bidding policy in a broad class must achieve regret at least , even when the learner receives the full information feedback and is known.
Standard contextual bandit problem assumes that all the relevant contexts are observed before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short when dealing with problems … Standard contextual bandit problem assumes that all the relevant contexts are observed before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short when dealing with problems in which valuable additional context can be observed after arm selection. For example, content recommendation platforms like Youtube, Instagram, Tiktok also observe valuable follow-up information pertinent to the user's reward after recommendation (e.g., how long the user stayed, what is the user's watch speed, etc.). To improve online learning efficiency in these applications, we study a novel contextual bandit problem with post-serving contexts and design a new algorithm, poLinUCB, that achieves tight regret under standard assumptions. Core to our technical proof is a robustified and generalized version of the well-known Elliptical Potential Lemma (EPL), which can accommodate noise in data. Such robustification is necessary for tackling our problem, and we believe it could also be of general interest. Extensive empirical tests on both synthetic and real-world datasets demonstrate the significant benefit of utilizing post-serving contexts as well as the superior performance of our algorithm over the state-of-the-art approaches.
We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP). In particular, we leverage the trade-offs between bias and variance … We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP). In particular, we leverage the trade-offs between bias and variance to construct better label randomizers depending on a privately estimated prior distribution over the labels. We demonstrate that these randomizers achieve state-of-the-art privacy-utility trade-offs on several datasets, highlighting the importance of reducing bias when training neural networks with label DP. We also provide theoretical results shedding light on the structural properties of the optimal unbiased randomizers.
Motivated by online decision making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable … Motivated by online decision making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability. We show that the resulting online algorithms have [Formula: see text] (approximate) regret under the full information setting. We further introduce a bandit extension of Blackwell approachability that we call Bandit Blackwell approachability. We leverage this notion to transform greedy robust offline algorithms into a [Formula: see text] (approximate) regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our offline-to-online transformation to several problems at the intersection of revenue management, market design, and online optimization, including product ranking optimization in online platforms, reserve price optimization in auctions, and submodular maximization. We also extend our reduction to greedy-like first-order methods used in continuous optimization, such as those used for maximizing continuous strong DR monotone submodular functions subject to convex constraints. We show that our transformation, when applied to these applications, leads to new regret bounds or improves the current known bounds. We complement our theoretical studies by conducting numerical simulations for two of our applications, in both of which we observe that the numerical performance of our transformations outperforms the theoretical guarantees in practical instances. This paper was accepted by George Shanthikumar, data science. Funding: R. Niazadeh was supported by research funding from University of Chicago Booth School of Business. N. Golrezaei was supported in part by the Young Investigator Program Award from the Office of Naval Research [N00014-21-1-2776] and the MIT Research Support Award. Supplemental Material: The Online Appendix is available at https://doi.org/10.1287/mnsc.2022.4558
Incrementality, which is used to measure the causal effect of showing an ad to a potential customer (e.g. a user in an internet platform) versus not, is a central object … Incrementality, which is used to measure the causal effect of showing an ad to a potential customer (e.g. a user in an internet platform) versus not, is a central object for advertisers in online advertising platforms. This paper investigates the problem of how an advertiser can learn to optimize the bidding sequence in an online manner \emph{without} knowing the incrementality parameters in advance. We formulate the offline version of this problem as a specially structured episodic Markov Decision Process (MDP) and then, for its online learning counterpart, propose a novel reinforcement learning (RL) algorithm with regret at most $\widetilde{O}(H^2\sqrt{T})$, which depends on the number of rounds $H$ and number of episodes $T$, but does not depend on the number of actions (i.e., possible bids). A fundamental difference between our learning problem from standard RL problems is that the realized reward feedback from conversion incrementality is \emph{mixed} and \emph{delayed}. To handle this difficulty we propose and analyze a novel pairwise moment-matching algorithm to learn the conversion incrementality, which we believe is of independent of interest.
Classic mechanism design often assumes that a bidder's action is restricted to report a type or a signal, possibly untruthfully. In today's digital economy, bidders are holding increasing amount of … Classic mechanism design often assumes that a bidder's action is restricted to report a type or a signal, possibly untruthfully. In today's digital economy, bidders are holding increasing amount of private information about the auctioned items. And due to legal or ethical concerns, they would demand to reveal partial but truthful information, as opposed to report untrue signal or misinformation. To accommodate such bidder behaviors in auction design, we propose and study a novel mechanism design setup where each bidder holds two kinds of information: (1) private \emph{value type}, which can be misreported; (2) private \emph{information variable}, which the bidder may want to conceal or partially reveal, but importantly, \emph{not} to misreport. We show that in this new setup, it is still possible to design mechanisms that are both \emph{Incentive and Information Compatible} (IIC). We develop two different black-box transformations, which convert any mechanism $\mathcal{M}$ for classic bidders to a mechanism $\mathcal{M}'$ for strategically reticent bidders, based on either outcome of expectation or expectation of outcome, respectively. We identify properties of the original mechanism $\mathcal{M}$ under which the transformation leads to IIC mechanisms $\mathcal{M}'$. Interestingly, as corollaries of these results, we show that running VCG with expected bidder values maximizes welfare whereas the mechanism using expected outcome of Myerson's auction maximizes revenue. Finally, we study how regulation on the auctioneer's usage of information may lead to more robust mechanisms.
Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to … Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability.
Predicting the expected value or number of post-click conversions (purchases or other events) is a key task in performance-based digital advertising. In training a conversion optimizer model, one of the … Predicting the expected value or number of post-click conversions (purchases or other events) is a key task in performance-based digital advertising. In training a conversion optimizer model, one of the most crucial aspects is handling delayed feedback with respect to conversions, which can happen multiple times with varying delay. This task is difficult, as the delay distribution is different for each advertiser, is long-tailed, often does not follow any particular class of parametric distributions, and can change over time. We tackle these challenges using an unbiased estimation model based on three core ideas. The first idea is to split the label as a sum of labels with different delay buckets, each of which trains only on mature label, the second is to use thermometer encoding to increase accuracy and reduce inference cost, and the third is to use auxiliary information to increase the stability of the model and to handle drift in the distribution.
We propose and study a novel mechanism design setup where each bidder holds two kinds of private information: (1) type variable, which can be misreported; (2) information variable, which the … We propose and study a novel mechanism design setup where each bidder holds two kinds of private information: (1) type variable, which can be misreported; (2) information variable, which the bidder may want to conceal or partially reveal, but importantly, not to misreport. We refer to bidders with such behaviors as strategically reticent bidders. Among others, one direct motivation of our model is the ad auction in which many ad platforms today elicit from each bidder not only their private value per conversion but also their private information about Internet users (e.g., user activities on the advertiser's websites) in order to improve the platform's estimation of conversion rates. We show that in this new setup, it is still possible to design mechanisms that are both Incentive and Information Compatible (IIC). We develop two different black-box transformations, which convert any mechanism $\mathcal{M}$ for classic bidders to a mechanism $\bar{\mathcal{M}}$ for strategically reticent bidders, based on either outcome of expectation or expectation of outcome, respectively. We identify properties of the original mechanism $\mathcal{M}$ under which the transformation leads to IIC mechanisms $\bar{\mathcal{M}}$. Interestingly, as corollaries of these results, we show that running VCG with bidders' expected values maximizes welfare, whereas the mechanism using expected outcome of Myerson's auction maximizes revenue. Finally, we study how regulation on the auctioneer's usage of information can lead to more robust mechanisms.
In this paper, we investigate the problem about how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in the first price … In this paper, we investigate the problem about how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in the first price auctions: at each time $t$, the learner observes a context $x_t\in \mathbb{R}^d$ and decides the bid based on historical information and $x_t$. We assume a structured linear model of the maximum bid of all the others $m_t = \alpha_0\cdot x_t + z_t$, where $\alpha_0\in \mathbb{R}^d$ is unknown to the learner and $z_t$ is randomly sampled from a noise distribution $\mathcal{F}$ with log-concave density function $f$. We consider both \emph{binary feedback} (the learner can only observe whether she wins or not) and \emph{full information feedback} (the learner can observe $m_t$) at the end of each time $t$. For binary feedback, when the noise distribution $\mathcal{F}$ is known, we propose a bidding algorithm, by using maximum likelihood estimation (MLE) method to achieve at most $\widetilde{O}(\sqrt{\log(d) T})$ regret. Moreover, we generalize this algorithm to the setting with binary feedback and the noise distribution is unknown but belongs to a parametrized family of distributions. For the full information feedback with \emph{unknown} noise distribution, we provide an algorithm that achieves regret at most $\widetilde{O}(\sqrt{dT})$. Our approach combines an estimator for log-concave density functions and then MLE method to learn the noise distribution $\mathcal{F}$ and linear weight $\alpha_0$ simultaneously. We also provide a lower bound result such that any bidding policy in a broad class must achieve regret at least $\Omega(\sqrt{T})$, even when the learner receives the full information feedback and $\mathcal{F}$ is known.
Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to … Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability. We show that the resulting online algorithms have $O(\sqrt{T})$ (approximate) regret under the full information setting. We further introduce a bandit extension of Blackwell approachability that we call Bandit Blackwell approachability. We leverage this notion to transform greedy robust offline algorithms into a $O(T^{2/3})$ (approximate) regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our offline-to-online transformation to several problems at the intersection of revenue management, market design, and online optimization, including product ranking optimization in online platforms, reserve price optimization in auctions, and submodular maximization. We also extend our reduction to greedy-like first order methods used in continuous optimization, such as those used for maximizing continuous strong DR monotone submodular functions subject to convex constraints. We show that our transformation, when applied to these applications, leads to new regret bounds or improves the current known bounds. We complement our theoretical studies by conducting numerical simulations for two of our applications, in both of which we observe that the numerical performance of our transformations outperforms the theoretical guarantees in practical instances.
In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular … In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state of the art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a $k$-matchoid and $\ell$-knapsack constraint (for $\ell\leq k$), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an $\epsilon$ error it is guaranteed that we have found a feasible set with a $2(k+1+\epsilon)$-approximation factor which can indeed be further improved to $(k+1+\epsilon)$ by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds and Yelp business locations, and a set cover problem.
In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular … In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state of the art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a $k$-matchoid and $\ell$-knapsack constraint (for $\ell\leq k$), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an $\epsilon$ error it is guaranteed that we have found a feasible set with a $2(k+1+\epsilon)$-approximation factor which can indeed be further improved to $(k+1+\epsilon)$ by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds and Yelp business locations, and a set cover problem.
Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the … Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the revenue of the company and the welfare of the advertisers and the users. Therefore, any change requires careful evaluation of its potential impacts. Currently, such impacts are often evaluated by running simulations or small controlled experiments. This, however, misses the important factor that the advertisers respond to changes. Our goal is to build a theoretical framework for predicting the actions of an agent (the advertiser) that is optimizing her actions in an uncertain environment. We model this problem using a variant of the multi-armed bandit setting where playing an arm is costly. The cost of each arm changes over time and is publicly observable. The value of playing an arm is drawn stochastically from a static distribution and is observed by the agent and not by us. We, however, observe the actions of the agent. Our main result is that assuming the agent is playing a strategy with a regret of at most $f(T)$ within the first $T$ rounds, we can learn to play the multi-armed bandits game (without observing the rewards) in such a way that the regret of our selected actions is at most $O(k^4(f(T)+1)\log(T))$, where $k$ is the number of arms.
Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the … Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the revenue of the company and the welfare of the advertisers and the users. Therefore, any change requires careful evaluation of its potential impacts. Currently, such impacts are often evaluated by running simulations or small controlled experiments. This, however, misses the important factor that the advertisers respond to changes. Our goal is to build a theoretical framework for predicting the actions of an agent (the advertiser) that is optimizing her actions in an uncertain environment. We model this problem using a variant of the multi-armed bandit setting where playing an arm is costly. The cost of each arm changes over time and is publicly observable. The value of playing an arm is drawn stochastically from a static distribution and is observed by the agent and not by us. We, however, observe the actions of the agent. Our main result is that assuming the agent is playing a strategy with a regret of at most $f(T)$ within the first $T$ rounds, we can learn to play the multi-armed bandits game (without observing the rewards) in such a way that the regret of our selected actions is at most $O(k^4(f(T)+1)\log(T))$, where $k$ is the number of arms.
Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform – partially hiding … Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform – partially hiding some information could be more beneficial for the platform's revenue. In this paper, we examine the following basic question in the context of second-price ad auctions: how should an ad platform optimally reveal information about the ad opportunity to the advertisers in order to maximize revenue? We consider a model in which bidders' valuations depend on a random state of the ad opportunity. Different from previous work, we focus on a more practical, and challenging, situation where the space of possible realizations of ad opportunities is extremely large. We thus focus on developing algorithms whose running time is polynomial in the number of bidders, but is independent of the number of ad opportunity realizations.We assume that the auctioneer can commit to a signaling scheme to reveal noisy information about the realized state of the ad opportunity, and examine the auctioneer's algorithmic question of designing the optimal signaling scheme. We first consider that the auctioneer is restricted to send a public signal to all bidders. As a warm-up, we start with a basic (though less realistic) setting in which the auctioneer knows the bidders' valuations, and show that an e-optimal scheme can be implemented in time polynomial in the number of bidders and 1/∊. We then move to a well-motivated Bayesian valuation setting in which the auctioneer and bidders both have private information, and present two results. First, we exhibit a characterization result regarding approximately optimal schemes and prove that any constant-approximate public signaling scheme must use exponentially many signals. Second, we present a "simple" public signaling scheme that serves as a constant approximation under mild assumptions.Finally, we initiate an exploration on the power of being able to send different signals privately to different bidders. In the basic setting where the auctioneer knows bidders' valuations, we exhibit a polynomial-time private scheme that extracts almost full surplus even in the worst Bayes Nash equilibrium. This illustrates the surprising power of private signaling schemes in extracting revenue.
Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform. In this paper, … Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform. In this paper, we examine the following basic question in the context of second-price ad auctions: how should an ad platform optimally reveal information about the ad opportunity to the advertisers in order to maximize revenue? We consider a model in which bidders' valuations depend on a random state of the ad opportunity. Different from previous work, we focus on a more practical, and challenging, situation where the space of possible realizations of ad opportunities is extremely large. We thus focus on developing algorithms whose running time is independent of the number of ad opportunity realizations. We examine the auctioneer's algorithmic question of designing the optimal signaling scheme. When the auctioneer is restricted to send a public signal to all bidders, we focus on a well-motivated Bayesian valuation setting in which the auctioneer and bidders both have private information, and present two main results: 1. we exhibit a characterization result regarding approximately optimal schemes and prove that any constant-approximate public signaling scheme must use exponentially many signals; 2. we present a "simple" public signaling scheme that serves as a constant approximation under mild assumptions. We then initiate an exploration on the power of being able to send different signals privately to different bidders. Here we examine a basic setting where the auctioneer knows bidders' valuations, and exhibit a polynomial-time private scheme that extracts almost full surplus even in the worst Bayes Nash equilibrium. This illustrates the surprising power of private signaling schemes in extracting revenue.
Previous chapter Next chapter Full AccessProceedings Proceedings of the 2016 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)Locally Adaptive Optimization: Adaptive Seeding for Monotone Submodular FunctionsAshwinkumar Badanidiyuru, Christos Papadimitriou, Aviad Rubinstein, … Previous chapter Next chapter Full AccessProceedings Proceedings of the 2016 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)Locally Adaptive Optimization: Adaptive Seeding for Monotone Submodular FunctionsAshwinkumar Badanidiyuru, Christos Papadimitriou, Aviad Rubinstein, Lior Seeman, and Yaron SingerAshwinkumar Badanidiyuru, Christos Papadimitriou, Aviad Rubinstein, Lior Seeman, and Yaron Singerpp.414 - 429Chapter DOI:https://doi.org/10.1137/1.9781611974331.ch31PDFBibTexSections ToolsAdd to favoritesExport CitationTrack CitationsEmail SectionsAboutAbstract The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, among neighbors of those nodes as they become accessible in order to maximize a global objective function. More generally, adaptive seeding is a stochastic optimization framework where the choices in the first stage affect the realizations in the second stage, over which we aim to optimize. Our main result is a (1 – 1/e)2-approximation for the adaptive seeding problem for any monotone submodular function. While adaptive policies are often approximated via non-adaptive policies, our algorithm is based on a novel method we call locally-adaptive policies. These policies combine a non-adaptive global structure, with local adaptive optimizations. This method enables the (1–1/e)2-approximation for general monotone submodular functions and circumvents some of the impossibilities associated with non-adaptive policies. We also introduce a fundamental problem in submodular optimization that may be of independent interest: given a ground set of elements where every element appears with some small probability, find a set of expected size at most k that has the highest expected value over the realization of the elements. We show a surprising result: there are classes of monotone submodular functions (including coverage) that can be approximated almost optimally as the probability vanishes. For general monotone submodular functions we show via a reduction from Planted-Clique that approximations for this problem are not likely to be obtainable. This optimization problem is an important tool for adaptive seeding via non-adaptive policies, and its hardness motivates the introduction of locally-adaptive policies we use in the main result. Previous chapter Next chapter RelatedDetails Published:2016eISBN:978-1-61197-433-1 https://doi.org/10.1137/1.9781611974331Book Series Name:ProceedingsBook Code:PRDA16Book Pages:viii + 2106
The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, … The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, among neighbors of those nodes as they become accessible in order to maximize a global objective function. More generally, adaptive seeding is a stochastic optimization framework where the choices in the first stage affect the realizations in the second stage, over which we aim to optimize. Our main result is a $(1-1/e)^2$-approximation for the adaptive seeding problem for any monotone submodular function. While adaptive policies are often approximated via non-adaptive policies, our algorithm is based on a novel method we call \emph{locally-adaptive} policies. These policies combine a non-adaptive global structure, with local adaptive optimizations. This method enables the $(1-1/e)^2$-approximation for general monotone submodular functions and circumvents some of the impossibilities associated with non-adaptive policies. We also introduce a fundamental problem in submodular optimization that may be of independent interest: given a ground set of elements where every element appears with some small probability, find a set of expected size at most $k$ that has the highest expected value over the realization of the elements. We show a surprising result: there are classes of monotone submodular functions (including coverage) that can be approximated almost optimally as the probability vanishes. For general monotone submodular functions we show via a reduction from \textsc{Planted-Clique} that approximations for this problem are not likely to be obtainable. This optimization problem is an important tool for adaptive seeding via non-adaptive policies, and its hardness motivates the introduction of \emph{locally-adaptive} policies we use in the main result.
We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for … We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that handles constrained resources other than time, and improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, e.g. Dudik et al. (UAI'11)) and bandits with resource constraints (bandits with knapsacks, Badanidiyuru et al. (FOCS'13)), and prove a regret guarantee with near-optimal statistical properties.
We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for … We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that handles constrained resources other than time, and improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, e.g. Dudik et al. (UAI'11)) and bandits with resource constraints (bandits with knapsacks, Badanidiyuru et al. (FOCS'13)), and prove a regret guarantee with near-optimal statistical properties.
Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In this paper, … Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In this paper, we develop the first linear-time algorithm for maximizing a general monotone submodular function subject to a cardinality constraint. We show that our randomized algorithm, STOCHASTIC-GREEDY, can achieve a $(1-1/e-\varepsilon)$ approximation guarantee, in expectation, to the optimum solution in time linear in the size of the data and independent of the cardinality constraint. We empirically demonstrate the effectiveness of our algorithm on submodular functions arising in data summarization, including training large-scale kernel methods, exemplar-based clustering, and sensor placement. We observe that STOCHASTIC-GREEDY practically achieves the same utility value as lazy greedy but runs much faster. More surprisingly, we observe that in many practical scenarios STOCHASTIC-GREEDY does not evaluate the whole fraction of data points even once and still achieves indistinguishable results compared to lazy greedy.
Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. … Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. In many of these application domains the learner may be constrained by one or more supply (or budget) limits, in addition to the customary limitation on the time horizon. The literature lacks a general model encompassing these sorts of problems. We introduce such a model, called "bandits with knapsacks", that combines aspects of stochastic integer programming with online learning. A distinctive feature of our problem, in comparison to the existing regret-minimization literature, is that the optimal policy for a given latent distribution may significantly outperform the policy that plays the optimal fixed arm. Consequently, achieving sublinear regret in the bandits-with-knapsacks problem is significantly more challenging than in conventional bandit problems. We present two algorithms whose reward is close to the information-theoretic optimum: one is based on a novel "balanced exploration" paradigm, while the other is a primal-dual algorithm that uses multiplicative updates. Further, we prove that the regret achieved by both algorithms is optimal up to polylogarithmic factors. We illustrate the generality of the problem by presenting applications in a number of different domains including electronic commerce, routing, and scheduling. As one example of a concrete application, we consider the problem of dynamic posted pricing with limited supply and obtain the first algorithm whose regret, with respect to the optimal dynamic policy, is sublinear in the supply.
We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a … We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a (1-ε)-approximation to the maximum-cardinality union of k sets, in running time $O(f(ε,k,d)⋅ poly(n)) where n is the problem size, d is the VC-dimension of the set system, and f(ε,k,d) is exponential in (kd/ε)c for some constant c. We complement this positive result by showing that the function f(ε,k,d) in the running-time bound cannot be replaced by a function depending only on (ε,d) or on (k,d), under standard complexity assumptions. We also present an improved upper bound on the approximation ratio of the greedy algorithm in special cases of the problem, including when the sets have bounded cardinality and when they are two-dimensional halfspaces. Complementing these positive results, we show that when the sets are four-dimensional halfspaces neither the greedy algorithm nor local search is capable of improving the worst-case approximation ratio of 1-1/e that the greedy algorithm achieves on arbitrary instances of maximum coverage.
We study combinatorial procurement auctions, where a buyer with a valuation function v and budget B wishes to buy a set of items. Each item i has a cost ci … We study combinatorial procurement auctions, where a buyer with a valuation function v and budget B wishes to buy a set of items. Each item i has a cost ci and the buyer is interested in a set S that maximizes v(S) subject to ∑i∈Sci ≤ β. Special cases of combinatorial procurement auctions are well-studied problems from submodular optimization. In particular, when the costs are all equal (cardinality constraint), a classic result by Nemhauser et al shows that the greedy algorithm provides an e/e-1 approximation.
We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a … We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a $(1-\eps)$-approximation to the maximum-cardinality union of $k$ sets, in running time $O(f(\eps,k,d)\cdot poly(n))$ where $n$ is the problem size, $d$ is the VC-dimension of the set system, and $f(\eps,k,d)$ is exponential in $(kd/\eps)^c$ for some constant $c$. We complement this positive result by showing that the function $f(\eps,k,d)$ in the running-time bound cannot be replaced by a function depending only on $(\eps,d)$ or on $(k,d)$, under standard complexity assumptions. We also present an improved upper bound on the approximation ratio of the greedy algorithm in special cases of the problem, including when the sets have bounded cardinality and when they are two-dimensional halfspaces. Complementing these positive results, we show that when the sets are four-dimensional halfspaces neither the greedy algorithm nor local search is capable of improving the worst-case approximation ratio of $1-1/e$ that the greedy algorithm achieves on arbitrary instances of maximum coverage.
We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ … We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ and the buyer is interested in a set $S$ that maximizes $v(S)$ subject to $\Sigma_{i\in S}c_i\leq B$. Special cases of combinatorial procurement auctions are classical problems from submodular optimization. In particular, when the costs are all equal (\emph{cardinality constraint}), a classic result by Nemhauser et al shows that the greedy algorithm provides an $\frac e {e-1}$ approximation. Motivated by many papers that utilize demand queries to elicit the preferences of agents in economic settings, we develop algorithms that guarantee improved approximation ratios in the presence of demand oracles. We are able to break the $\frac e {e-1}$ barrier: we present algorithms that use only polynomially many demand queries and have approximation ratios of $\frac 9 8+\epsilon$ for the general problem and $\frac 9 8$ for maximization subject to a cardinality constraint. We also consider the more general class of subadditive valuations. We present algorithms that obtain an approximation ratio of $2+\epsilon$ for the general problem and 2 for maximization subject to a cardinality constraint. We guarantee these approximation ratios even when the valuations are non-monotone. We show that these ratios are essentially optimal, in the sense that for any constant $\epsilon>0$, obtaining an approximation ratio of $2-\epsilon$ requires exponentially many demand queries.
We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ … We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ and the buyer is interested in a set $S$ that maximizes $v(S)$ subject to $\Sigma_{i\in S}c_i\leq B$. Special cases of combinatorial procurement auctions are classical problems from submodular optimization. In particular, when the costs are all equal (\emph{cardinality constraint}), a classic result by Nemhauser et al shows that the greedy algorithm provides an $\frac e {e-1}$ approximation. Motivated by many papers that utilize demand queries to elicit the preferences of agents in economic settings, we develop algorithms that guarantee improved approximation ratios in the presence of demand oracles. We are able to break the $\frac e {e-1}$ barrier: we present algorithms that use only polynomially many demand queries and have approximation ratios of $\frac 9 8+\epsilon$ for the general problem and $\frac 9 8$ for maximization subject to a cardinality constraint. We also consider the more general class of subadditive valuations. We present algorithms that obtain an approximation ratio of $2+\epsilon$ for the general problem and 2 for maximization subject to a cardinality constraint. We guarantee these approximation ratios even when the valuations are non-monotone. We show that these ratios are essentially optimal, in the sense that for any constant $\epsilon>0$, obtaining an approximation ratio of $2-\epsilon$ requires exponentially many demand queries.
We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a … We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a $(1-\eps)$-approximation to the maximum-cardinality union of $k$ sets, in running time $O(f(\eps,k,d)\cdot poly(n))$ where $n$ is the problem size, $d$ is the VC-dimension of the set system, and $f(\eps,k,d)$ is exponential in $(kd/\eps)^c$ for some constant $c$. We complement this positive result by showing that the function $f(\eps,k,d)$ in the running-time bound cannot be replaced by a function depending only on $(\eps,d)$ or on $(k,d)$, under standard complexity assumptions. We also present an improved upper bound on the approximation ratio of the greedy algorithm in special cases of the problem, including when the sets have bounded cardinality and when they are two-dimensional halfspaces. Complementing these positive results, we show that when the sets are four-dimensional halfspaces neither the greedy algorithm nor local search is capable of improving the worst-case approximation ratio of $1-1/e$ that the greedy algorithm achieves on arbitrary instances of maximum coverage.
Submodular function maximization is a central problem in combinatorial optimization, generalizing many important problems including Max Cut in directed/undirected graphs and in hypergraphs, certain constraint satisfaction problems, maximum entropy sampling, … Submodular function maximization is a central problem in combinatorial optimization, generalizing many important problems including Max Cut in directed/undirected graphs and in hypergraphs, certain constraint satisfaction problems, maximum entropy sampling, and maximum facility location problems. Unlike submodular minimization, submodular maximization is NP-hard. In this paper, we give the first constant-factor approximation algorithm for maximizing any non-negative submodular function subject to multiple matroid or knapsack constraints. We emphasize that our results are for non-monotone submodular functions. In particular, for any constant k, we present a (1/k+2+1/k+ε)-approximation for the submodular maximization problem under k matroid constraints, and a (1/5-ε)-approximation algorithm for this problem subject to k knapsack constraints (ε>0 is any constant). We improve the approximation guarantee of our algorithm to 1/k+1+{1/k-1}+ε for k≥2 partition matroid constraints. This idea also gives a ({1/k+ε)-approximation for maximizing a monotone submodular function subject to k≥2 partition matroids, which improves over the previously best known guarantee of 1/k+1.
We study a novel class of mechanism design problems in which the outcomes are constrained by the payments. This basic class of mechanism design problems captures many common economic situations, … We study a novel class of mechanism design problems in which the outcomes are constrained by the payments. This basic class of mechanism design problems captures many common economic situations, and yet it has not been studied, to our knowledge, in the past. We focus on the case of procurement auctions in which sellers have private costs, and the auctioneer aims to maximize a utility function on subsets of items, under the constraint that the sum of the payments provided by the mechanism does not exceed a given budget. Standard mechanism design ideas such as the VCG mechanism and its variants are not applicable here. We show that, for general functions, the budget constraint can render mechanisms arbitrarily bad in terms of the utility of the buyer. However, our main result shows that for the important class of sub modular functions, a bounded approximation ratio is achievable. Better approximation results are obtained for subclasses of the sub modular functions. We explore the space of budget feasible mechanisms in other domains and give a characterization under more restricted conditions.
Budget feasible mechanism design studies procurement combinatorial auctions where the sellers have private costs to produce items, and the buyer(auctioneer) aims to maximize a social valuation function on subsets of … Budget feasible mechanism design studies procurement combinatorial auctions where the sellers have private costs to produce items, and the buyer(auctioneer) aims to maximize a social valuation function on subsets of items, under the budget constraint on the total payment. One of the most important questions in the field is "which valuation domains admit truthful budget feasible mechanisms with `small' approximations (compared to the social optimum)?" Singer showed that additive and submodular functions have such constant approximations. Recently, Dobzinski, Papadimitriou, and Singer gave an O(log^2 n)-approximation mechanism for subadditive functions; they also remarked that: "A fundamental question is whether, regardless of computational constraints, a constant-factor budget feasible mechanism exists for subadditive functions." We address this question from two viewpoints: prior-free worst case analysis and Bayesian analysis. For the prior-free framework, we use an LP that describes the fractional cover of the valuation function; it is also connected to the concept of approximate core in cooperative game theory. We provide an O(I)-approximation mechanism for subadditive functions, via the worst case integrality gap I of LP. This implies an O(log n)-approximation for subadditive valuations, O(1)-approximation for XOS valuations, and for valuations with a constant I. XOS valuations are an important class of functions that lie between submodular and subadditive classes. We give another polynomial time O(log n/loglog n) sub-logarithmic approximation mechanism for subadditive valuations. For the Bayesian framework, we provide a constant approximation mechanism for all subadditive functions, using the above prior-free mechanism for XOS valuations as a subroutine. Our mechanism allows correlations in the distribution of private information and is universally truthful.
We show that every universally truthful randomized mechanism for combinatorial auctions with submodular valuations that provides an approximation ratio of m1/ 2 -ε must use exponentially many value queries, where … We show that every universally truthful randomized mechanism for combinatorial auctions with submodular valuations that provides an approximation ratio of m1/ 2 -ε must use exponentially many value queries, where m is the number of items. In contrast, ignoring incentives there exist constant ratio approximation algorithms for this problem. Our approach is based on a novel direct hardness technique that completely skips the notoriously hard step of characterizing truthful mechanisms. The characterization step was the main obstacle for proving impossibility results in algorithmic mechanism design so far. We demonstrate two additional applications of our new technique: (1) an impossibility result for universally-truthful polynomial time flexible combinatorial public projects and (2) an impossibility result for truthful-in-expectation mechanisms for exact combinatorial public projects. The latter is the first result that bounds the power of polynomial-time truthful in expectation mechanisms in any setting.
We present algorithms for a class of resource allocation problems both in the online setting with stochastic input and in the offline setting. This class of problems contains many interesting … We present algorithms for a class of resource allocation problems both in the online setting with stochastic input and in the offline setting. This class of problems contains many interesting special cases such as the Adwords problem. In the online setting we introduce a new distributional model called the adversarial stochastic input model, which is a generalization of the i.i.d model with unknown distributions, where the distributions can change over time. In this model we give a 1-O(ε) approximation algorithm for the resource allocation problem, with almost the weakest possible assumption: the ratio of the maximum amount of resource consumed by any single request to the total capacity of the resource, and the ratio of the profit contributed by any single request to the optimal profit is at most (ε2/log(1/ε)2)/(log n + log (1/ε)) where n is the number of resources available. There are instances where this ratio is #949;2/log n such that no randomized algorithm can have a competitive ratio of 1-o(ε) even in the i.i.d model. The upper bound on ratio that we require improves on the previous upper-bound for the i.i.d case by a factor of n.
Budget feasible mechanism design studies procurement combinatorial auctions in the sellers have private costs to produce items, and the buyer (auctioneer) aims to maximize a social valuation function on subsets … Budget feasible mechanism design studies procurement combinatorial auctions in the sellers have private costs to produce items, and the buyer (auctioneer) aims to maximize a social valuation function on subsets of items, under the budget constraint on the total payment. One of the most important questions in the field is which valuation domains admit truthful budget feasible mechanisms with 'small' approximations (compared to the social optimum)? Singer [35] showed that additive and submodular functions have a constant approximation mechanism. Recently, Dobzinski, Papadimitriou, and Singer [20] gave an O(log2n) approximation mechanism for subadditive functions; further, they remarked that: A fundamental question is whether, regardless of computational constraints, a constant-factor budget feasible mechanism exists for subadditive In this paper, we address this question from two viewpoints: prior-free worst case analysis and Bayesian analysis, are two standard approaches from computer science and economics, respectively. - For the prior-free framework, we use a linear program (LP) that describes the fractional cover of the valuation function; the LP is also connected to the concept of approximate core in cooperative game theory. We provide a mechanism for subadditive functions whose approximation is O(I), via the worst case integrality gap I of this LP. This implies an O(log n)-approximation for subadditive valuations, O(1)-approximation for XOS valuations, as well as for valuations having a constant integrality gap. XOS valuations are an important class of functions and lie between the submodular and the subadditive classes of valuations. We further give another polynomial time O(log n/(log log n)) sub-logarithmic approximation mechanism for subadditive functions. Both of our mechanisms improve the best known approximation ratio O(log2 n). - For the Bayesian framework, we provide a constant approximation mechanism for all subadditive functions, using the above prior-free mechanism for XOS valuations as a subroutine. Our mechanism allows correlations in the distribution of private information and is universally truthful.
A multi-armed bandit problem - or, simply, a bandit problem - is a sequential allocation problem defined by a set of actions. At each time step, a unit resource is … A multi-armed bandit problem - or, simply, a bandit problem - is a sequential allocation problem defined by a set of actions. At each time step, a unit resource is allocated to an action and some observable payoff is obtained. The goal is to maximize the total payoff obtained in a sequence of allocations. The name bandit refers to the colloquial term for a slot machine (a "one-armed bandit" in American slang). In a casino, a sequential allocation problem is obtained when the player is facing many slot machines at once (a "multi-armed bandit"), and must repeatedly choose where to insert the next coin. Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off. This is the balance between staying with the option that gave highest payoffs in the past and exploring new options that might give higher payoffs in the future. Although the study of bandit problems dates back to the 1930s, exploration-exploitation trade-offs arise in several modern applications, such as ad placement, website optimization, and packet routing. Mathematically, a multi-armed bandit is defined by the payoff process associated with each option. In this book, the focus is on two extreme cases in which the analysis of regret is particularly simple and elegant: independent and identically distributed payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, it also analyzes some of the most important variants and extensions, such as the contextual bandit model. This monograph is an ideal reference for students and researchers with an interest in bandit problems.
Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off. This is the balance between staying with the option that gave highest payoffs in … Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off. This is the balance between staying with the option that gave highest payoffs in the past and exploring new options that might give higher payoffs in the future. Although the study of bandit problems dates back to the Thirties, exploration-exploitation trade-offs arise in several modern applications, such as ad placement, website optimization, and packet routing. Mathematically, a multi-armed bandit is defined by the payoff process associated with each option. In this survey, we focus on two extreme cases in which the analysis of regret is particularly simple and elegant: i.i.d. payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, we also analyze some of the most important variants and extensions, such as the contextual bandit model.
Submodular function maximization is a central problem in combinatorial optimization, generalizing many important problems including Max Cut in directed/undirected graphs and in hypergraphs, certain constraint satisfaction problems, maximum entropy sampling, … Submodular function maximization is a central problem in combinatorial optimization, generalizing many important problems including Max Cut in directed/undirected graphs and in hypergraphs, certain constraint satisfaction problems, maximum entropy sampling, and maximum facility location problems. Unlike submodular minimization, submodular maximization is NP-hard. For the problem of maximizing a non-monotone submodular function, Feige, Mirrokni, and Vondrák recently developed a $2\over 5$-approximation algorithm \cite{FMV07}, however, their algorithms do not handle side constraints.} In this paper, we give the first constant-factor approximation algorithm for maximizing any non-negative submodular function subject to multiple matroid or knapsack constraints. We emphasize that our results are for {\em non-monotone} submodular functions. In particular, for any constant $k$, we present a $({1\over k+2+{1\over k}+ε})$-approximation for the submodular maximization problem under $k$ matroid constraints, and a $({1\over 5}-ε)$-approximation algorithm for this problem subject to $k$ knapsack constraints ($ε>0$ is any constant). We improve the approximation guarantee of our algorithm to ${1\over k+1+{1\over k-1}+ε}$ for $k\ge 2$ partition matroid constraints. This idea also gives a $({1\over k+ε})$-approximation for maximizing a {\em monotone} submodular function subject to $k\ge 2$ partition matroids, which improves over the previously best known guarantee of $\frac{1}{k+1}$.
Previous chapter Next chapter Full AccessProceedings Proceedings of the 2011 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)Submodular Maximization by Simulated AnnealingShayan Oveis Gharan and Jan VondrákShayan Oveis Gharan and Jan … Previous chapter Next chapter Full AccessProceedings Proceedings of the 2011 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)Submodular Maximization by Simulated AnnealingShayan Oveis Gharan and Jan VondrákShayan Oveis Gharan and Jan Vondrákpp.1098 - 1116Chapter DOI:https://doi.org/10.1137/1.9781611973082.83PDFBibTexSections ToolsAdd to favoritesExport CitationTrack CitationsEmail SectionsAboutAbstract We consider the problem of maximizing a non-negative (possibly non-monotone) submodular set function with or without constraints. Feige et al. [9] showed a 2/5-approximation for the unconstrained problem and also proved that no approximation better than 1/2 is possible in the value oracle model. Constant-factor approximation has been also known for submodular maximization subject to a matroid independence constraint (a factor of 0.309 [33]) and for submodular maximization subject to a matroid base constraint, provided that the fractional base packing number v is bounded away from 1 (a 1/4-approximation assuming that v ≥ 2 [33]). In this paper, we propose a new algorithm for submodular maximization which is based on the idea of simulated annealing. We prove that this algorithm achieves improved approximation for two problems: a 0.41-approximation for unconstrained submodular maximization, and a 0.325-approximation for submodular maximization subject to a matroid independence constraint. On the hardness side, we show that in the value oracle model it is impossible to achieve a 0.478-approximation for submodular maximization subject to a matroid independence constraint, or a 0.394-approximation subject to a matroid base constraint in matroids with two disjoint bases. Even for the special case of cardinality constraint, we prove it is impossible to achieve a 0.491-approximation. (Previously it was conceivable that a 1/2-approximation exists for these problems.) It is still an open question whether a 1/2-approximation is possible for unconstrained submodular maximization. Previous chapter Next chapter RelatedDetails Published:2011ISBN:978-0-89871-993-2eISBN:978-1-61197-308-2 https://doi.org/10.1137/1.9781611973082Book Series Name:ProceedingsBook Code:PR138Book Pages:xviii-1788
We address online learning in complex auction settings, such as sponsored search auctions, where the value of the bidder is unknown to her, evolving in an arbitrary manner and observed … We address online learning in complex auction settings, such as sponsored search auctions, where the value of the bidder is unknown to her, evolving in an arbitrary manner and observed only if the bidder wins an allocation. We leverage the structure of the utility of the bidder and the partial feedback that bidders typically receive in auctions, in order to provide algorithms with regret rates against the best fixed bid in hindsight, that are exponentially faster in convergence in terms of dependence on the action space, than what would have been derived by applying a generic bandit algorithm and almost equivalent to what would have been achieved in the full information setting. Our results are enabled by analyzing a new online learning setting with outcome-based feedback, which generalizes learning with feedback graphs. We provide an online learning algorithm for this setting, of independent interest, with regret that grows only logarithmically with the number of actions and linearly only in the number of potential outcomes (the latter being very small in most auction settings). Last but not least, we show that our algorithm outperforms the bandit approach experimentally and that this performance is robust to dropping some of our theoretical assumptions or introducing noise in the feedback that the bidder receives.
We consider the problem of maximizing a non-negative submodular set function f:2N -> RR+ over a ground set N subject to a variety of packing type constraints including (multiple) matroid … We consider the problem of maximizing a non-negative submodular set function f:2N -> RR+ over a ground set N subject to a variety of packing type constraints including (multiple) matroid constraints, knapsack constraints, and their intersections. In this paper we develop a general framework that allows us to derive a number of new results, in particular when f may be a non-monotone function. Our algorithms are based on (approximately) solving the multilinear extension F of f [5] over a polytope P that represents the constraints, and then effectively rounding the fractional solution. Although this approach has been used quite successfully in some settings [6, 22, 24, 13, 3], it has been limited in some important ways. We overcome these limitations as follows.
Signaling is an important topic in the study of asymmetric information in economic settings. In particular, the transparency of information available to a seller in an auction setting is a … Signaling is an important topic in the study of asymmetric information in economic settings. In particular, the transparency of information available to a seller in an auction setting is a question of major interest. We introduce the study of signaling when conducting a second price auction of a probabilistic good whose actual instantiation is known to the auctioneer but not to the bidders. This framework can be used to model impressions selling in display advertising. We establish several results within this framework. First, we study the problem of computing a signaling scheme that maximizes the auctioneer's revenue in a Bayesian setting. We show that this problem is polynomially solvable for some interesting special cases, but computationally hard in general. Second, we establish a tight bound on the minimum number of signals required to implement an optimal signaling scheme. Finally, we show that at least half of the maximum social welfare can be preserved within such a scheme.
We consider the design of computationally efficient online learning algorithms in an adversarial setting in which the learner has access to an offline optimization oracle. We present an algorithm called … We consider the design of computationally efficient online learning algorithms in an adversarial setting in which the learner has access to an offline optimization oracle. We present an algorithm called Generalized Followthe- Perturbed-Leader and provide conditions under which it is oracle-efficient while achieving vanishing regret. Our results make significant progress on an open problem raised by Hazan and Koren [1], who showed that oracle-efficient algorithms do not exist in full generality and asked whether one can identify conditions under which oracle-efficient online learning may be possible. Our auction-design framework considers an auctioneer learning an optimal auction for a sequence of adversarially selected valuations with the goal of achieving revenue that is almost as good as the optimal auction in hindsight, among a class of auctions. We give oracle-efficient learning results for: (1) VCG auctions with bidder-specific reserves in singleparameter settings, (2) envy-free item-pricing auctions in multiitem settings, and (3) the level auctions of Morgenstern and Roughgarden [2] for single-item settings. The last result leads to an approximation of the overall optimal Myerson auction when bidders' valuations are drawn according to a fast-mixing Markov process, extending prior work that only gave such guarantees for the i.i.d. setting.We also derive various extensions, including: (1) oracleefficient algorithms for the contextual learning setting in which the learner has access to side information (such as bidder demographics), (2) learning with approximate oracles such as those based on Maximal-in-Range algorithms, and (3) no-regret bidding algorithms in simultaneous auctions, which resolve an open problem of Daskalakis and Syrgkanis [3].
In this paper we study the fundamental problems of maximizing a continuous non-monotone submodular function over the hypercube, both with and without coordinate-wise concavity. This family of optimization problems has … In this paper we study the fundamental problems of maximizing a continuous non-monotone submodular function over the hypercube, both with and without coordinate-wise concavity. This family of optimization problems has several applications in machine learning, economics, and communication systems. Our main result is the first $\frac{1}{2}$-approximation algorithm for continuous submodular function maximization; this approximation factor of $\frac{1}{2}$ is the best possible for algorithms that only query the objective function at polynomially many points. For the special case of DR-submodular maximization, i.e. when the submodular functions is also coordinate wise concave along all coordinates, we provide a different $\frac{1}{2}$-approximation algorithm that runs in quasilinear time. Both of these results improve upon prior work [Bian et al, 2017, Soma and Yoshida, 2017]. Our first algorithm uses novel ideas such as reducing the guaranteed approximation problem to analyzing a zero-sum game for each coordinate, and incorporates the geometry of this zero-sum game to fix the value at this coordinate. Our second algorithm exploits coordinate-wise concavity to identify a monotone equilibrium condition sufficient for getting the required approximation guarantee, and hunts for the equilibrium point using binary search. We further run experiments to verify the performance of our proposed algorithms in related machine learning applications.
Submodular maximization generalizes many fundamental problems in discrete optimization, including Max-Cut in directed/undirected graphs, maximum coverage, maximum facility location and marketing over social networks. In this paper we consider the … Submodular maximization generalizes many fundamental problems in discrete optimization, including Max-Cut in directed/undirected graphs, maximum coverage, maximum facility location and marketing over social networks. In this paper we consider the problem of maximizing any submodular function subject to $d$ knapsack constraints, where $d$ is a fixed constant. We establish a strong relation between the discrete problem and its continuous relaxation, obtained through {\em extension by expectation} of the submodular function. Formally, we show that, for any non-negative submodular function, an $\alpha$-approximation algorithm for the continuous relaxation implies a randomized $(\alpha - \eps)$-approximation algorithm for the discrete problem. We use this relation to improve the best known approximation ratio for the problem to $1/4- \eps$, for any $\eps > 0$, and to obtain a nearly optimal $(1-e^{-1}-\eps)-$approximation ratio for the monotone case, for any $\eps>0$. We further show that the probabilistic domain defined by a continuous solution can be reduced to yield a polynomial size domain, given an oracle for the extension by expectation. This leads to a deterministic version of our technique.
Introduction, von Neumann's minimax theorem [10] can be stated as follows : if M and N are finite dimensional simplices and / is a bilinear function on MxN, then / … Introduction, von Neumann's minimax theorem [10] can be stated as follows : if M and N are finite dimensional simplices and / is a bilinear function on MxN, then / has a saddle point, i. e.There have been several generalizations of this theorem.J. Ville [9], A. Wald [11], and others [1] variously extended von Neumann's result to cases where M and N were allowed to be subsets of certain infinite dimensional linear spaces.The functions / they considered, however, were still linear.M. Shiffman [8] seems to have been the first to have considered concave-convex functions in a minimax theorem.H. Kneser [6], K. Fan [3], and C. Berge [2] (using induction and the method of separating two disjoint convex sets in Euclidean space by a hyperplane) got minimax theorems for concave-convex functions that are appropriately semi-continuous in one of the two variables.Although these theorems include the previous results as special cases, they can also be shown to be rather direct consequences of von Neumann's theorem.H. Nikaidό [7], on the other hand, using Brouwer's fixed point theorem, proved the existence of a saddle point for functions satisfying the weaker algebraic condition of being quasi-concave-convex, but the stronger topological condition of being continuous in each variable.Thus, there seem to be essentially two types of argument: one uses some form of separation of disjoint convex sets by a hyperplane and yields the theorem of Kneser-Fan (see 4.2), and the other uses a fixed point theorem and yields Nikaidό's result.ΐn this paper, we unify the two streams of thought by proving a minimax theorem for a function that is quasi-concave-convex and appropriately semi-continuous in each variable.The method of proof differs radically from any used previously.The difficulty lies in the fact that we cannot use a fixed point theorem (due to lack of continuity) nor the separation of disjoint convex sets by a hyperplane (due to lack of convexity).The key tool used is a theorem due to Knaster, Kuratowski, Mazurkiewicz based on Sperner's lemma.It may be of some interest to point out that, in all the minimax theorems, the crucial argument is carried out on spaces M and N that
The main goal of this paper is to develop a theory of inference of player valuations from observed data in the generalized second price auction without relying on the Nash … The main goal of this paper is to develop a theory of inference of player valuations from observed data in the generalized second price auction without relying on the Nash equilibrium assumption. Existing work in Economics on inferring agent values from data relies on the assumption that all participant strategies are best responses of the observed play of other players, i.e. they constitute a Nash equilibrium. In this paper, we show how to perform inference relying on a weaker assumption instead: assuming that players are using some form of no-regret learning. Learning outcomes emerged in recent years as an attractive alternative to Nash equilibrium in analyzing game outcomes, modeling players who haven't reached a stable equilibrium, but rather use algorithmic learning, aiming to learn the best way to play from previous observations. In this paper we show how to infer values of players who use algorithmic learning strategies. Such inference is an important first step before we move to testing any learning theoretic behavioral model on auction data. We apply our techniques to a dataset from Microsoft's sponsored search ad auction system.
Approachability theory, introduced by Blackwell (1956), provides fundamental results on repeated games with vector-valued payoffs, and has been usefully applied since in the theory of learning in games and to … Approachability theory, introduced by Blackwell (1956), provides fundamental results on repeated games with vector-valued payoffs, and has been usefully applied since in the theory of learning in games and to learning algorithms in the online adversarial setup. Given a repeated game with vector payoffs, a target set $S$ is approachable by a certain player (the agent) if he can ensure that the average payoff vector converges to that set no matter what his adversary opponent does. Blackwell provided two equivalent sets of conditions for a convex set to be approachable. The first (primary) condition is a geometric separation condition, while the second (dual) condition requires that the set be {\em non-excludable}, namely that for every mixed action of the opponent there exists a mixed action of the agent (a {\em response}) such that the resulting payoff vector belongs to $S$. Existing approachability algorithms rely on the primal condition and essentially require to compute at each stage a projection direction from a given point to $S$. In this paper, we introduce an approachability algorithm that relies on Blackwell's {\em dual} condition. Thus, rather than projection, the algorithm relies on computation of the response to a certain action of the opponent at each stage. The utility of the proposed algorithm is demonstrated by applying it to certain generalizations of the classical regret minimization problem, which include regret minimization with side constraints and regret minimization for global cost functions. In these problems, computation of the required projections is generally complex but a response is readily obtainable.
The friendship paradox is a sociological phenomenon first discovered by Feld which states that individuals are likely to have fewer friends than their friends do, on average. This phenomenon has … The friendship paradox is a sociological phenomenon first discovered by Feld which states that individuals are likely to have fewer friends than their friends do, on average. This phenomenon has become common knowledge, has several interesting applications, and has also been observed in various data sets. In his seminal paper Feld provides an intuitive explanation by showing that in any graph the average degree of edges in the graph is an upper bound on the average degree of nodes. Despite the appeal of this argument, it does not prove the existence of the friendship paradox. In fact, it is easy to construct networks -- even with power law degree distributions -- where the ratio between the average degree of neighbors and the average degree of nodes is high, but all nodes have the exact same degree as their neighbors. Which models, then, explain the friendship paradox? In this paper we give a strong characterization that provides a formal understanding of the friendship paradox. We show that for any power law graph with exponential parameter in (1,3), when every edge is rewired with constant probability, the friendship paradox holds, i.e. there is an asymptotic gap between the average degree of the sample of polylogarithmic size and the average degree of a random set of its neighbors of equal size. To examine this characterization on real data, we performed several experiments on social network data sets that complement our theoretical analysis. We also discuss the applications of our result to influence maximization.
A natural optimization model that formulates many online resource allocation problems is the online linear programming (LP) problem in which the constraint matrix is revealed column by column along with … A natural optimization model that formulates many online resource allocation problems is the online linear programming (LP) problem in which the constraint matrix is revealed column by column along with the corresponding objective coefficient. In such a model, a decision variable has to be set each time a column is revealed without observing the future inputs, and the goal is to maximize the overall objective function. In this paper, we propose a near-optimal algorithm for this general class of online problems under the assumptions of random order of arrival and some mild conditions on the size of the LP right-hand-side input. Specifically, our learning-based algorithm works by dynamically updating a threshold price vector at geometric time intervals, where the dual prices learned from the revealed columns in the previous period are used to determine the sequential decisions in the current period. Through dynamic learning, the competitiveness of our algorithm improves over the past study of the same problem. We also present a worst case example showing that the performance of our algorithm is near optimal.
In recent years, social networking platforms have developed into extraordinary channels for spreading and consuming information. Along with the rise of such infrastructure, there is continuous progress on techniques for … In recent years, social networking platforms have developed into extraordinary channels for spreading and consuming information. Along with the rise of such infrastructure, there is continuous progress on techniques for spreading information effectively through influential users. In many applications, one is restricted to select influencers from a set of users who engaged with the topic being promoted, and due to the structure of social networks, these users often rank low in terms of their influence potential. An alternative approach one can consider is an adaptive method which selects users in a manner which targets their influential neighbors. The advantage of such an approach is that it leverages the friendship paradox in social networks: while users are often not influential, they often know someone who is. Despite the various complexities in such optimization problems, we show that scalable adaptive seeding is achievable. In particular, we develop algorithms for linear influence models with provable approximation guarantees that can be gracefully parallelized. To show the effectiveness of our methods we collected data from various verticals social network users follow. For each vertical, we collected data on the users who responded to a certain post as well as their neighbors, and applied our methods on this data. Our experiments show that adaptive seeding is scalable, and importantly, that it obtains dramatic improvements over standard approaches of information dissemination.
We consider the linear contextual bandit problem with global convex constraints and a concave objective function. In each round, the outcome of pulling an arm is a vector, that depends … We consider the linear contextual bandit problem with global convex constraints and a concave objective function. In each round, the outcome of pulling an arm is a vector, that depends linearly on the context of that arm. The global constraints require the average of these vectors to lie in a certain convex set. The objective is a concave function of this average vector. This problem turns out to be a common generalization of classic linear contextual bandits (linContextual) [Auer 2003], bandits with concave rewards and convex knapsacks (BwCR) [Agrawal, Devanur 2014], and the online stochastic convex programming (OSCP) problem [Agrawal, Devanur 2015]. We present algorithms with near-optimal regret bounds for this problem. Our bounds compare favorably to results on the unstructured version of the problem [Agrawal et al. 2015, Badanidiyuru et al. 2014] where the relation between the contexts and the outcomes could be arbitrary, but the algorithm only competes against a fixed set of policies.
Constraints on agent's ability to pay play a major role in auction design for any setting where the magnitude of financial transactions is sufficiently large. Those constraints have been traditionally … Constraints on agent's ability to pay play a major role in auction design for any setting where the magnitude of financial transactions is sufficiently large. Those constraints have been traditionally modeled in mechanism design as hard budget, i.e., mechanism is not allowed to charge agents more than a certain amount. Yet, real auction systems (such as Google AdWords) allow more sophisticated constraints on agents' ability to pay, such as average budgets. In this work, we investigate the design of Pareto optimal and incentive compatible auctions for agents with constrained quasi-linear utilities, which captures more realistic models of liquidity constraints that the agents may have. Our result applies to a very general class of allocation constraints known as polymatroidal environments, encompassing many settings of interest such as multi-unit auctions, matching markets, video-on demand and advertisement systems.
We study nonparametric maximum likelihood estimation of a log-concave probability density and its distribution and hazard function. Some general properties of these estimators are derived from two characterizations. It is … We study nonparametric maximum likelihood estimation of a log-concave probability density and its distribution and hazard function. Some general properties of these estimators are derived from two characterizations. It is shown that the rate of convergence with respect to supremum norm on a compact interval for the density and hazard rate estimator is at least $(\log(n)/n)^{1/3}$ and typically $(\log(n)/n)^{2/5}$, whereas the difference between the empirical and estimated distribution function vanishes with rate $o_{\mathrm{p}}(n^{-1/2})$ under certain regularity assumptions.
ON THE LIKELIHOOD THAT ONE UNKNOWN PROBABILITY EXCEEDS ANOTHER IN VIEW OF THE EVIDENCE OF TWO SAMPLES Get access WILLIAM R THOMPSON WILLIAM R THOMPSON From the Department of Pathology, … ON THE LIKELIHOOD THAT ONE UNKNOWN PROBABILITY EXCEEDS ANOTHER IN VIEW OF THE EVIDENCE OF TWO SAMPLES Get access WILLIAM R THOMPSON WILLIAM R THOMPSON From the Department of Pathology, Yale University Search for other works by this author on: Oxford Academic Google Scholar Biometrika, Volume 25, Issue 3-4, December 1933, Pages 285–294, https://doi.org/10.1093/biomet/25.3-4.285 Published: 01 December 1933
A number of recent results on optimization problems involving submodular functions have made use of the multilinear relaxation of the problem. These results hold typically in the value oracle model, … A number of recent results on optimization problems involving submodular functions have made use of the multilinear relaxation of the problem. These results hold typically in the value oracle model, where the objective function is accessible via a black box returning $f(S)$ for a given $S$. We present a general approach to deriving inapproximability results in the value oracle model, based on the notion of symmetry gap. Our main result is that for any fixed instance that exhibits a certain symmetry gap in its multilinear relaxation, there is a naturally related class of instances for which a better approximation factor than the symmetry gap would require exponentially many oracle queries. This unifies several known hardness results for submodular maximization, e.g., the optimality of $(1-1/e)$-approximation for monotone submodular maximization under a cardinality constraint and the impossibility of $(\frac12+\epsilon)$-approximation for unconstrained (nonmonotone) submodular maximization. As a new application, we consider the problem of maximizing a nonmonotone submodular function over the bases of a matroid. A $(\frac16-o(1))$-approximation has been developed for this problem, assuming that the matroid contains two disjoint bases. We show that the best approximation one can achieve is indeed related to packings of bases in the matroid. Specifically, for any $k \geq 2$, there is a class of matroids of fractional base packing number $\nu = \frac{k}{k-1}$ such that any algorithm achieving a better than $(1-\frac{1}{\nu}) = \frac{1}{k}$-approximation for this class would require exponentially many value queries. In particular, there is no constant factor approximation for maximizing a nonmonotone submodular function over the bases of a general matroid. On the positive side, we present a $\frac12 (1-\frac{1}{\nu}-o(1))$-approximation algorithm assuming fractional base packing number at least $\nu$, where $\nu \in (1,2]$. We also present an improved $0.309$-approximation for maximization of a nonmonotone submodular function subject to a matroid independence constraint, improving the previously known factor of $\frac14-\epsilon$. For this problem, we obtain a hardness of $(\frac12 + \epsilon)$-approximation for any fixed $\epsilon>0$.
The promise of data science is that if data from a system can be recorded and understood then this understanding can potentially be utilized to improve the system. Behavioral and … The promise of data science is that if data from a system can be recorded and understood then this understanding can potentially be utilized to improve the system. Behavioral and economic data, however, is different from scientific data in that it is subjective to the system. Behavior changes when the system changes, and to predict behavior for any given system change or to optimize over system changes, the behavioral model that generates the data must be inferred from the data. The ease with which this inference can be performed generally also depends on the system. Trivially, a system that ignores behavior does not admit any inference of a behavior generating model that can be used to predict behavior in a system that is responsive to behavior.
Watch a martingale with uniformly bounded increments until it first crosses the horizontal line of height $a$. The sum of the conditional variances of the increments given the past, up … Watch a martingale with uniformly bounded increments until it first crosses the horizontal line of height $a$. The sum of the conditional variances of the increments given the past, up to the crossing, is an intrinsic measure of the crossing time. Simple and fairly sharp upper and lower bounds are given for the Laplace transform of this crossing time, which show that the distribution is virtually the same as that for the crossing time of Brownian motion, even in the tail. The argument can be adapted to extend inequalities of Bernstein and Kolmogorov to the dependent case, proving the law of the iterated logarithm for martingales. The argument can also be adapted to prove Levy's central limit theorem for martingales. The results can be extended to martingales whose increments satisfy a growth condition.
Approachability has become a standard tool in analyzing earning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in the … Approachability has become a standard tool in analyzing earning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in the obtained reward that belongs to a set, rather than being a single vector. Using this variant we tackle the problem of approachability in games with partial monitoring and develop simple and efficient algorithms (i.e., with constant per-step complexity) for this setup. We finally consider external regret and internal regret in repeated games with partial monitoring and derive regret-minimizing strategies based on approachability theory.
The design of computationally efficient and incentive compatible mechanisms that solve or approximate fundamental resource allocation problems is the main goal of algorithmic mechanism design. A central example in both … The design of computationally efficient and incentive compatible mechanisms that solve or approximate fundamental resource allocation problems is the main goal of algorithmic mechanism design. A central example in both theory and practice is welfare-maximization in combinatorial auctions. Recently, a randomized mechanism has been discovered for combinatorial auctions that is truthful in expectation and guarantees a (1-1/e)-approximation to the optimal social welfare when players have coverage valuations [DRY11]. This approximation ratio is the best possible even for non-truthful algorithms, assuming P does not equal NP. Given the recent sequence of negative results for combinatorial auctions under more restrictive notions of incentive compatibility, this development raises a natural question: Are truthful-in-expectation mechanisms compatible with polynomial-time approximation in a way that deterministic or universally truthful mechanisms are not? In particular, can polynomial-time truthful-in-expectation mechanisms guarantee a near-optimal approximation ratio for more general variants of combinatorial auctions? We prove that this is not the case. Specifically, the result of [DRY11] cannot be extended to combinatorial auctions with sub modular valuations in the value oracle model. (Absent strategic considerations, a (1-1/e)-approximation is still achievable in this setting.) More precisely, we prove that there is a constant \gamma>0 such that there is no randomized mechanism that is truthful-in-expectation -- or even approximately truthful-in-expectation -- and guarantees an m^{-\gamma}-approximation to the optimal social welfare for combinatorial auctions with sub modular valuations in the value oracle model. We also prove an analogous result for the flexible combinatorial public projects (CPP) problem, where a truthful-in-expectation $(1-1/e)$-approximation for coverage valuations has been recently developed [Dughmi11]. We show that there is no truthful-in-expectation -- or even approximately truthful-in-expectation -- mechanism that achieves an m^{-\gamma}-approximation to the optimal social welfare for combinatorial public projects with sub modular valuations in the value oracle model. Both our results present an unexpected separation between coverage functions and sub modular functions, which does not occur for these problems without strategic considerations.
In this paper, we consider a very general model for exploration-exploitation tradeoff which allows arbitrary concave rewards and convex constraints on the decisions across time, in addition to the customary … In this paper, we consider a very general model for exploration-exploitation tradeoff which allows arbitrary concave rewards and convex constraints on the decisions across time, in addition to the customary limitation on the time horizon. This model subsumes the classic multi-armed bandit (MAB) model, and the Bandits with Knapsacks (BwK) model of Badanidiyuru et al.[2013]. We also consider an extension of this model to allow linear contexts, similar to the linear contextual extension of the MAB model. We demonstrate that a natural and simple extension of the UCB family of algorithms for MAB provides a polynomial time algorithm that has near-optimal regret guarantees for this substantially more general model, and matches the bounds provided by Badanidiyuru et al.[2013] for the special case of BwK, which is quite surprising. We also provide computationally more efficient algorithms by establishing interesting connections between this problem and other well studied problems/algorithms such as the Blackwell approachability problem, online convex optimization, and the Frank-Wolfe technique for convex optimization.
We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, … We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give a novel online algorithm that attains vanishing regret with respect to N experts in total O(√N) computation time. We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is Θ(N). These results demonstrate an exponential gap between the power of optimization in online learning and its power in statistical learning: in the latter, an optimization oracle—i.e., an efficient empirical risk minimizer—allows to learn a finite hypothesis class of size N in time O(logN).
In this paper, we propose the amphibious influence maximization (AIM) model that combines traditional marketing via content providers and viral marketing to consumers in social networks in a single framework. … In this paper, we propose the amphibious influence maximization (AIM) model that combines traditional marketing via content providers and viral marketing to consumers in social networks in a single framework. In AIM, a set of content providers and consumers form a bipartite network while consumers also form their social network, and influence propagates from the content providers to consumers and among consumers in the social network following the independent cascade model. An advertiser needs to select a subset of seed content providers and a subset of seed consumers, such that the influence from the seed providers passing through the seed consumers could reach a large number of consumers in the social network in expectation.
We consider the problem of designing revenue maximizing online posted-price mechanisms when the seller has limited supply. A seller has k identical items for sale and is facing n potential … We consider the problem of designing revenue maximizing online posted-price mechanisms when the seller has limited supply. A seller has k identical items for sale and is facing n potential buyers ("agents") that are arriving sequentially. Each agent is interested in buying one item. Each agent's value for an item is an independent sample from some fixed (but unknown) distribution with support [0,1]. The seller offers a take-it-or-leave-it price to each arriving agent (possibly different for different agents), and aims to maximize his expected revenue.
Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. … Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. In many of these application domains the learner may be constrained by one or more supply (or budget) limits, in addition to the customary limitation on the time horizon. The literature lacks a general model encompassing these sorts of problems. We introduce such a model, called "bandits with knapsacks", that combines aspects of stochastic integer programming with online learning. A distinctive feature of our problem, in comparison to the existing regret-minimization literature, is that the optimal policy for a given latent distribution may significantly outperform the policy that plays the optimal fixed arm. Consequently, achieving sublinear regret in the bandits-with-knapsacks problem is significantly more challenging than in conventional bandit problems. We present two algorithms whose reward is close to the information-theoretic optimum: one is based on a novel "balanced exploration" paradigm, while the other is a primal-dual algorithm that uses multiplicative updates. Further, we prove that the regret achieved by both algorithms is optimal up to polylogarithmic factors. We illustrate the generality of the problem by presenting applications in a number of different domains including electronic commerce, routing, and scheduling. As one example of a concrete application, we consider the problem of dynamic posted pricing with limited supply and obtain the first algorithm whose regret, with respect to the optimal dynamic policy, is sublinear in the supply.
Submodular maximization generalizes many fundamental problems in discrete optimization, including Max-Cut in directed/undirected graphs, maximum coverage, maximum facility location, and marketing over social networks. In this paper we consider the … Submodular maximization generalizes many fundamental problems in discrete optimization, including Max-Cut in directed/undirected graphs, maximum coverage, maximum facility location, and marketing over social networks. In this paper we consider the problem of maximizing any submodular function subject to d knapsack constraints, where d is a fixed constant. We establish a strong relation between the discrete problem and its continuous relaxation, obtained through extension by expectation of the submodular function. Formally, we show that, for any nonnegative submodular function, an α-approximation algorithm for the continuous relaxation implies a randomized (α − ε)-approximation algorithm for the discrete problem. We use this relation to obtain an (e −1 − ε)-approximation for the problem, and a nearly optimal (1 − e −1 − ε)-approximation ratio for the monotone case, for any ε > 0. We further show that the probabilistic domain defined by a continuous solution can be reduced to yield a polynomial-size domain, given an oracle for the extension by expectation. This leads to a deterministic version of our technique.
In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of $n$ trials so as to maximize the total payoff of the chosen … In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of $n$ trials so as to maximize the total payoff of the chosen strategies. While the performance of bandit algorithms with a small finite strategy set is quite well understood, bandit problems with large strategy sets are still a topic of very active investigation, motivated by practical applications such as online auctions and web advertisement. The goal of such research is to identify broad and natural classes of strategy sets and payoff functions which enable the design of efficient solutions.
We consider a multiround auction setting motivated by pay-per-click auctions for Internet advertising. In each round the auctioneer selects an advertiser and shows her ad, which is then either clicked … We consider a multiround auction setting motivated by pay-per-click auctions for Internet advertising. In each round the auctioneer selects an advertiser and shows her ad, which is then either clicked or not. An advertiser derives value from clicks; the value of a click is her private information. Initially, neither the auctioneer nor the advertisers have any information about the likelihood of clicks on the advertisements. The auctioneer's goal is to design a (dominant strategies) truthful mechanism that (approximately) maximizes the social welfare. If the advertisers bid their true private values, our problem is equivalent to the multi-armed bandit problem, and thus can be viewed as a strategic version of the latter. In particular, for both problems the quality of an algorithm can be characterized by regret, the difference in social welfare between the algorithm and the benchmark which always selects the same "best" advertisement. We investigate how the design of multi-armed bandit algorithms is affected by the restriction that the resulting mechanism must be truthful. We find that deterministic truthful mechanisms have certain strong structural properties---essentially, they must separate exploration from exploitation---and they incur much higher regret than the optimal multi-armed bandit algorithms. Moreover, we provide a truthful mechanism which (essentially) matches our lower bound on regret.
We address the online linear optimization problem with bandit feedback. Our contribution is twofold. First, we provide an algorithm (based on exponential weights) with a regret of order $\sqrt{d n … We address the online linear optimization problem with bandit feedback. Our contribution is twofold. First, we provide an algorithm (based on exponential weights) with a regret of order $\sqrt{d n \log N}$ for any finite action set with $N$ actions, under the assumption that the instantaneous loss is bounded by 1. This shaves off an extraneous $\sqrt{d}$ factor compared to previous works, and gives a regret bound of order $d \sqrt{n \log n}$ for any compact set of actions. Without further assumptions on the action set, this last bound is minimax optimal up to a logarithmic factor. Interestingly, our result also shows that the minimax regret for bandit linear optimization with expert advice in $d$ dimension is the same as for the basic $d$-armed bandit with expert advice. Our second contribution is to show how to use the Mirror Descent algorithm to obtain computationally efficient strategies with minimax optimal regret bounds in specific examples. More precisely we study two canonical action sets: the hypercube and the Euclidean ball. In the former case, we obtain the first computationally efficient algorithm with a $d \sqrt{n}$ regret, thus improving by a factor $\sqrt{d \log n}$ over the best known result for a computationally efficient algorithm. In the latter case, our approach gives the first algorithm with a $\sqrt{d n \log n}$ regret, again shaving off an extraneous $\sqrt{d}$ compared to previous works.
We propose a simple method to extract the community structure of large networks. Our method is a heuristic method that is based on modularity optimization. It is shown to outperform … We propose a simple method to extract the community structure of large networks. Our method is a heuristic method that is based on modularity optimization. It is shown to outperform all other known community detection methods in terms of computation time. Moreover, the quality of the communities detected is very good, as measured by the so-called modularity. This is shown first by identifying language communities in a Belgian mobile phone network of 2 million customers and by analysing a web graph of 118 million nodes and more than one billion links. The accuracy of our algorithm is also verified on ad hoc modular networks.
We study combinatorial procurement auctions, where a buyer with a valuation function v and budget B wishes to buy a set of items. Each item i has a cost ci … We study combinatorial procurement auctions, where a buyer with a valuation function v and budget B wishes to buy a set of items. Each item i has a cost ci and the buyer is interested in a set S that maximizes v(S) subject to ∑i∈Sci ≤ β. Special cases of combinatorial procurement auctions are well-studied problems from submodular optimization. In particular, when the costs are all equal (cardinality constraint), a classic result by Nemhauser et al shows that the greedy algorithm provides an e/e-1 approximation.
Budget feasible mechanism considers algorithmic mechanism design questions where there is a budget constraint on the total payment of the mechanism. An important question in the field is that under … Budget feasible mechanism considers algorithmic mechanism design questions where there is a budget constraint on the total payment of the mechanism. An important question in the field is that under which valuation domains there exist budget feasible mechanisms that admit `small' approximations (compared to a socially optimal solution). Singer \cite{PS10} showed that additive and submodular functions admit a constant approximation mechanism. Recently, Dobzinski, Papadimitriou, and Singer \cite{DPS11} gave an $O(\log^2n)$ approximation mechanism for subadditive functions and remarked that: "A fundamental question is whether, regardless of computational constraints, a constant-factor budget feasible mechanism exists for subadditive function." In this paper, we give the first attempt to this question. We give a polynomial time $O(\frac{\log n}{\log\log n})$ sub-logarithmic approximation ratio mechanism for subadditive functions, improving the best known ratio $O(\log^2 n)$. Further, we connect budget feasible mechanism design to the concept of approximate core in cooperative game theory, and show that there is a mechanism for subadditive functions whose approximation is, via a characterization of the integrality gap of a linear program, linear to the largest value to which an approximate core exists. Our result implies in particular that the class of XOS functions, which is a superclass of submodular functions, admits a constant approximation mechanism. We believe that our work could be a solid step towards solving the above fundamental problem eventually, and possibly, with an affirmative answer.
We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her … We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her realized loss and the minimal loss she would have achieved by picking, in hindsight, the best possible action. Our goal is to understand the magnitude of the best possible (minimax) regret. We study the problem under three different assumptions for the feedback the decision maker receives: full information, and the partial information models of the so-called “semi-bandit” and “bandit” problems. In the full information case we show that the standard exponentially weighted average forecaster is a provably suboptimal strategy. For the semi-bandit model, by combining the Mirror Descent algorithm and the INF (Implicitely Normalized Forecaster) strategy, we are able to prove the first optimal bounds. Finally, in the bandit case we discuss existing results in light of a new lower bound, and suggest a conjecture on the optimal regret in that case.
Budget feasible mechanisms, recently initiated by Singer (FOCS 2010), extend algorithmic mechanism design problems to a realistic setting with a budget constraint. We consider the problem of designing truthful budget … Budget feasible mechanisms, recently initiated by Singer (FOCS 2010), extend algorithmic mechanism design problems to a realistic setting with a budget constraint. We consider the problem of designing truthful budget feasible mechanisms for monotone submodular functions: We give a randomized mechanism with an approximation ratio of 7.91 (improving on the previous best-known result 233.83), and a deterministic mechanism with an approximation ratio of 8.34. We also study the knapsack problem, which is a special submodular function, give a 2 + √2 approximation deterministic mechanism (improving on the previous best-known result 5), and a 3 approximation randomized mechanism. We provide similar results for an extended knapsack problem with heterogeneous items, where items are divided into groups and one can pick at most one item from each group.Finally we show a lower bound of 1 + √2 for the approximation ratio of deterministic mechanisms and 2 for randomized mechanisms for knapsack, as well as the general monotone submodular functions. Our lower bounds are unconditional, and do not rely on any computational or complexity assumptions.
We present an analysis of a person-to-person recommendation network, consisting of 4 million people who made 16 million recommendations on half a million products. We observe the propagation of recommendations … We present an analysis of a person-to-person recommendation network, consisting of 4 million people who made 16 million recommendations on half a million products. We observe the propagation of recommendations and the cascade sizes, which we explain by a simple stochastic model. We then establish how the recommendation network grows over time and how effective it is from the viewpoint of the sender and receiver of the recommendations. While on average recommendations are not very effective at inducing purchases and do not spread very far, we present a model that successfully identifies product and pricing categories for which viral marketing seems to be very effective.