In this article, we will formalize the method of dual fitting and the idea of factor-revealing LP. This combination is used to design and analyze two greedy algorithms for the metric uncapacitated facility location problem. Their approximation factors are 1.861 and 1.61, with running times of O(m log m) and O(n^3), respectively, where n is the total number of vertices and m is the number of edges in the underlying complete bipartite graph between cities and facilities. The algorithms are used to improve recent results for several variants of the problem.
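The kind of greedy that dual fitting analyzes can be sketched as follows. This is a minimal illustration of the classic star-picking greedy (repeatedly open the most cost-effective facility/city star, where opened facilities become free), not the exact 1.861 or 1.61 algorithms of the paper, and the instances in the comments are made up:

```python
def greedy_facility_location(open_cost, dist):
    """Star greedy: repeatedly pick the (facility, set of unconnected cities)
    star with smallest cost per city; for a fixed facility the best star is a
    prefix of cities sorted by distance. dist[i][j] = distance from facility i
    to city j."""
    n_fac = len(open_cost)
    n_city = len(dist[0])
    unconnected = set(range(n_city))
    opened, assign, total = set(), {}, 0.0
    while unconnected:
        best = None  # (ratio, facility, cities, star_cost)
        for i in range(n_fac):
            cities = sorted(unconnected, key=lambda j: dist[i][j])
            run = 0.0 if i in opened else open_cost[i]  # opened facilities are free
            for k, j in enumerate(cities, 1):
                run += dist[i][j]
                if best is None or run / k < best[0]:
                    best = (run / k, i, cities[:k], run)
        _, i, star, star_cost = best
        total += star_cost
        opened.add(i)
        for j in star:
            assign[j] = i
        unconnected -= set(star)
    return opened, assign, total

# One facility (cost 1), two cities at distance 1: total cost 3.
# Two facilities, each next to one city: each city goes to its own facility.
```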
The effective resistance between two nodes of a weighted graph is the electrical resistance seen between the nodes of a resistor network with branch conductances given by the edge weights. The effective resistance comes up in many applications and fields in addition to electrical network analysis, including, for example, Markov chains and continuous-time averaging networks. In this paper we study the problem of allocating edge weights on a given graph in order to minimize the total effective resistance, i.e., the sum of the resistances between all pairs of nodes. We show that this is a convex optimization problem and can be solved efficiently either numerically or, in some cases, analytically. We show that optimal allocation of the edge weights can reduce the total effective resistance of the graph (compared to uniform weights) by a factor that grows unboundedly with the size of the graph. We show that among all graphs with n nodes, the path has the largest value of optimal total effective resistance and the complete graph has the least.
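The quantity being minimized can be computed directly from the graph Laplacian, since the total effective resistance equals n · tr(L⁺). A small sketch (the helper below is ours, not from the paper):

```python
import numpy as np

def total_effective_resistance(n, weighted_edges):
    """Sum of effective resistances over all node pairs, via R_tot = n * tr(pinv(L))."""
    L = np.zeros((n, n))
    for u, v, w in weighted_edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return n * np.trace(np.linalg.pinv(L))

# Path vs. complete graph on 4 nodes, uniform unit weights:
path = total_effective_resistance(4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)])
complete = total_effective_resistance(4, [(u, v, 1.0)
                                          for u in range(4) for v in range(u + 1, 4)])
# path gives 10 (sum of |i - j| over pairs); complete gives 3 (6 pairs, 2/4 each)
```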
We consider the online stochastic matching problem proposed by Feldman et al. [Feldman J, Mehta A, Mirrokni VS, Muthukrishnan S (2009) Online stochastic matching: Beating 1 − 1/e. Annual IEEE Sympos. Foundations Comput. Sci. 117–126] as a model of display ad allocation. We are given a bipartite graph; one side of the graph corresponds to a fixed set of bins, and the other side represents the set of possible ball types. At each time step, a ball is sampled independently from the given distribution and it needs to be matched upon its arrival to an empty bin. The goal is to maximize the number of allocations. We present an online algorithm for this problem with a competitive ratio of 0.702. Before our result, algorithms with a competitive ratio better than 1 − 1/e were known under the assumption that the expected number of arriving balls of each type is integral. A key idea of the algorithm is to collect statistics about the decisions of the optimum offline solution using Monte Carlo sampling and use those statistics to guide the decisions of the online algorithm. We also show that our algorithm achieves a competitive ratio of 0.705 when the rates are integral. On the hardness side, we prove that no online algorithm can have a competitive ratio better than 0.823 under the known distribution model (and hence under the permutation model). This improves upon the 5/6 hardness result proved by Goel and Mehta [Goel G, Mehta A (2008) Online budgeted matching in random input models with applications to adwords. ACM-SIAM Symposium Discrete Algorithms 982–991] for the permutation model.
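The "offline statistics" idea can be illustrated with a small Monte Carlo sketch: sample full arrival sequences, solve each offline optimally, record how often each (type, bin) edge is used, and let the online algorithm prefer high-frequency edges. This is a simplified illustration of the idea, not the paper's exact algorithm or its analysis:

```python
import random
from collections import Counter

def max_matching(adj, n_bins):
    """Kuhn's augmenting-path algorithm for bipartite matching.
    adj[b] lists the bins ball b can go to; returns bin -> ball (or -1)."""
    match = [-1] * n_bins
    def augment(b, seen):
        for v in adj[b]:
            if v not in seen:
                seen.add(v)
                if match[v] == -1 or augment(match[v], seen):
                    match[v] = b
                    return True
        return False
    for b in range(len(adj)):
        augment(b, set())
    return match

def offline_statistics(types, type_adj, horizon, n_bins, rounds, rng):
    """Estimate how often each (ball type, bin) edge appears in offline optima."""
    stats = Counter()
    for _ in range(rounds):
        seq = [rng.choice(types) for _ in range(horizon)]
        match = max_matching([type_adj[t] for t in seq], n_bins)
        for bin_, ball in enumerate(match):
            if ball != -1:
                stats[(seq[ball], bin_)] += 1
    return stats

def online_match(seq, type_adj, n_bins, stats):
    """Match each arriving ball to the free neighboring bin that the sampled
    offline optima used most often."""
    free, matched = set(range(n_bins)), []
    for t in seq:
        cands = [v for v in type_adj[t] if v in free]
        if cands:
            v = max(cands, key=lambda b: stats[(t, b)])
            free.remove(v)
            matched.append((t, v))
    return matched
```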
For some positive constant ε₀, we give a (3/2 − ε₀)-approximation algorithm for the following problem: given a graph G₀ = (V, E₀), find the shortest tour that visits every vertex at least once. This is a special case of the metric traveling salesman problem when the underlying metric is defined by shortest path distances in G₀. The result improves on the 3/2-approximation algorithm due to Christofides [13] for this special case. Similar to Christofides, our algorithm finds a spanning tree whose cost is upper bounded by the optimum, and then it finds the minimum cost Eulerian augmentation (or T-join) of that tree. The main difference is in the selection of the spanning tree. Except in certain cases where the solution of the LP is nearly integral, we select the spanning tree randomly by sampling from a maximum entropy distribution defined by the linear programming relaxation. Despite the simplicity of the algorithm, the analysis builds on a variety of ideas such as properties of strongly Rayleigh measures from probability theory, graph theoretical results on the structure of near minimum cuts, and the integrality of the T-join polytope from polyhedral theory. Also, as a byproduct of our result, we show new properties of the near minimum cuts of any graph, which may be of independent interest.
Network alignment generalizes and unifies several approaches for forming a matching or alignment between the vertices of two graphs. We study a mathematical programming framework for the network alignment problem and a sparse variation of it in which only a small number of matches between the vertices of the two graphs are possible. We propose a new message passing algorithm that allows us to compute, very efficiently, approximate solutions to sparse network alignment problems with graph sizes as large as hundreds of thousands of vertices. We also provide extensive simulations comparing our algorithm with two of the best solvers for network alignment problems on two synthetic matching problems, two bioinformatics problems, and three large ontology alignment problems including a multilingual problem with a known labeled alignment.
We study the problem of computing maximin share guarantees, a recently introduced fairness notion. Given a set of $n$ agents and a set of goods, the maximin share of a single agent is the best that she can guarantee to herself if she were allowed to partition the goods into $n$ bundles in any way she prefers and then receive her least desirable bundle. The objective in our problem is then to find a partition such that each agent is guaranteed her maximin share. In settings with indivisible goods, such allocations are not guaranteed to exist, so we resort to approximation algorithms. Our main result is a $2/3$-approximation that runs in polynomial time for any number of agents. This improves upon the algorithm of Procaccia and Wang, which also produces a $2/3$-approximation but runs in polynomial time only for a constant number of agents. To achieve this, we redesign certain parts of their algorithm. Furthermore, motivated by the apparent difficulty, both theoretically and experimentally, in finding lower bounds on the existence of approximate solutions, we undertake a probabilistic analysis. We prove that in randomly generated instances, with high probability there exists a maximin share allocation. This can be seen as a justification of the experimental evidence reported in relevant works. Finally, we provide further positive results for two special cases that arise from previous works. The first one is the intriguing case of $3$ agents, for which it is already known that exact maximin share allocations do not always exist (contrary to the case of $2$ agents). We provide a $7/8$-approximation algorithm, improving the previously known result of $3/4$. The second case is when all item values belong to $\{0, 1, 2\}$, extending the $\{0, 1\}$ setting studied in Bouveret and Lema\^itre. We obtain an exact algorithm for any number of agents in this case.
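For intuition, the maximin share of one agent can be computed by brute force on tiny instances (exponential in the number of goods, so illustration only; the instance below is made up):

```python
from itertools import product

def maximin_share(values, n):
    """Agent's MMS: over all partitions of the goods into n bundles (encoded
    as one bundle label per good), maximize the value of the worst bundle."""
    best = 0
    for labels in product(range(n), repeat=len(values)):
        bundles = [0] * n
        for good, bundle in enumerate(labels):
            bundles[bundle] += values[good]
        best = max(best, min(bundles))
    return best
```

For values [1, 2, 3, 4] and n = 2, the best split is {1, 4} vs {2, 3}, so the MMS is 5; with a single good and two agents, some bundle is empty and the MMS is 0.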
We give an explicit construction of the weak local limit of a class of preferential attachment graphs. This limit contains all local information and allows several computations that are otherwise hard, for example, joint degree distributions and, more generally, the limiting distribution of subgraphs in balls of any given radius $k$ around a random vertex in the preferential attachment graph. We also establish the finite-volume corrections which give the approach to the limit.
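As a reminder of the model whose local limit is constructed, here is a minimal sketch of one standard preferential attachment variant (vertices arrive one at a time and attach m edges with probability proportional to current degree); the class of models treated in the paper is more general:

```python
import random

def preferential_attachment(n, m, rng):
    """Generate the edge list of a preferential attachment graph on n vertices.
    The 'endpoint list' trick: each vertex appears once per incident edge end,
    so uniform sampling from the list is degree-proportional sampling."""
    endpoints, edges = [], []
    for v in range(1, n):
        targets = set()
        for _ in range(m):
            targets.add(rng.choice(endpoints) if endpoints else 0)
        for u in targets:
            edges.append((u, v))
            endpoints.extend([u, v])
    return edges
```

With m = 1 the result is a random recursive tree with degree-biased attachment: n − 1 edges, each from a new vertex to an earlier one.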
We consider a variation of the spectral sparsification problem where we are required to keep a subgraph of the original graph. Formally, given a union of two weighted graphs G and W and an integer k, we are asked to find a k-edge weighted graph W_k such that G+W_k is a good spectral sparsifier of G+W. We refer to this problem as subgraph (spectral) sparsification. We present a nontrivial condition on G and W under which a good sparsifier exists and give a polynomial-time algorithm to find the sparsifier.
We present a randomized O(log n/log log n)-approximation algorithm for the asymmetric traveling salesman problem (ATSP). This provides the first asymptotic improvement over the long-standing Θ(log n)-approximation bound stemming from the work of Frieze et al. (1982) [Frieze AM, Galbiati G, Maffioli F (1982) On the worst-case performance of some algorithms for the asymmetric traveling salesman problem. Networks 12(1):23–39]. The key ingredient of our approach is a new connection between the approximability of the ATSP and the notion of so-called thin trees. To exploit this connection, we employ maximum entropy rounding, a novel method of randomized rounding of LP relaxations of optimization problems. We believe that this method might be of independent interest.
It has been observed that the degrees of the topologies of several communication networks follow heavy tailed statistics. What is the impact of such heavy tailed statistics on the performance of basic communication tasks that a network is presumed to support? How does performance scale with the size of the network? We study routing in families of sparse random graphs whose degrees follow heavy tailed distributions. Instantiations of such random graphs have been proposed as models for the topology of the Internet at the level of Autonomous Systems as well as at the level of routers. Let n be the number of nodes. Suppose that for each pair of nodes with degrees d_u and d_v we have O(d_u d_v) units of demand. Thus the total demand is O(n^2). We argue analytically and experimentally that in the considered random graph model such demand patterns can be routed so that the flow through each link is at most O(n log^2 n). This is to be compared with a bound of O(n^2) that holds for arbitrary graphs. Similar results were previously known for sparse random regular graphs, a.k.a. "expander graphs." The significance is that Internet-like topologies, which grow in a dynamic, decentralized fashion and appear highly inhomogeneous, can support routing with performance characteristics comparable to those of their regular counterparts, at least under the assumption of uniform demand and capacities. Our proof uses approximation algorithms for multicommodity flow and establishes strong bounds on a generalization of "expansion," namely "conductance." Besides routing, our bounds on conductance have further implications, most notably on the gap between the first and second eigenvalues of the stochastic normalization of the adjacency matrix of the graph.
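Conductance, the quantity the proof bounds, can be computed exactly on small graphs by enumerating all cuts (exponential in n, so illustration only; unit edge weights assumed):

```python
from itertools import combinations

def conductance(n, edges):
    """min over cuts S of cut(S) / min(vol(S), vol(V \\ S)), unit weights,
    where vol(S) is the sum of degrees inside S."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1; deg[v] += 1
    total_vol = sum(deg)
    best = float('inf')
    for k in range(1, n):
        for S in combinations(range(n), k):
            Sset = set(S)
            cut = sum(1 for u, v in edges if (u in Sset) != (v in Sset))
            vol = sum(deg[u] for u in S)
            denom = min(vol, total_vol - vol)
            if denom > 0:
                best = min(best, cut / denom)
    return best

# 4-cycle: best cut is two opposite vertices vs the rest -> 2/4 = 0.5.
# K4: best cut is any balanced split -> 4/6 = 2/3.
```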
Proceedings of the 2011 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). Online Stochastic Matching: Online Actions Based on Offline Statistics. Vahideh H. Manshadi, Shayan Oveis Gharan, and Amin Saberi. pp. 1285–1294. DOI: https://doi.org/10.1137/1.9781611973082.98

Abstract: We consider the online stochastic matching problem proposed by Feldman et al. [4] as a model of display ad allocation. We are given a bipartite graph; one side of the graph corresponds to a fixed set of bins and the other side represents the set of possible ball types. At each time step, a ball is sampled independently from the given distribution and it needs to be matched upon its arrival to an empty bin. The goal is to maximize the size of the matching. We present an online algorithm for this problem with a competitive ratio of 0.702. Before our result, algorithms with a competitive ratio better than 1 − 1/e were known under the assumption that the expected number of arriving balls of each type is integral. A key idea of the algorithm is to collect statistics about the decisions of the optimum offline solution using Monte Carlo sampling and use those statistics to guide the decisions of the online algorithm. We also show that no online algorithm can have a competitive ratio better than 0.823.
We show that in power law random graphs, almost surely, the expected rate at which a random walk with lookahead discovers the nodes of the graph is sublinear.
Proceedings of the 2011 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). The Asymmetric Traveling Salesman Problem on Graphs with Bounded Genus. Shayan Oveis Gharan and Amin Saberi. pp. 967–975. DOI: https://doi.org/10.1137/1.9781611973082.75

Abstract: We give a constant factor approximation algorithm for the asymmetric traveling salesman problem when the support graph of the solution of the Held-Karp linear programming relaxation has bounded orientable genus.
The rich literature on online Bayesian selection problems has long focused on so-called prophet inequalities, which compare the gain of an online algorithm to that of a "prophet" who knows the future. An equally natural, though significantly less well-studied, benchmark is the optimum online algorithm, which may be omnipotent (i.e., computationally unbounded) but not omniscient. What is the computational complexity of the optimum online algorithm? How well can a polynomial-time algorithm approximate it?
We design a deterministic polynomial time c^n approximation algorithm for the permanent of positive semidefinite matrices, where c = e^{γ+1} ≈ 4.84. We write a natural convex relaxation and show that its optimum solution gives a c^n approximation of the permanent. We further show that this factor is asymptotically tight by constructing a family of positive semidefinite matrices. We also show that our result implies an approximate version of the permanent-on-top conjecture, which was recently refuted in its original form; we show that the permanent is within a c^n factor of the top eigenvalue of the Schur power matrix.
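For context, the permanent itself can be computed exactly (in exponential time) with Ryser's inclusion-exclusion formula; the approximation algorithm in the paper is designed precisely to avoid this blow-up:

```python
from itertools import combinations

def permanent(A):
    """Ryser's formula: perm(A) = sum over nonempty column subsets S of
    (-1)^(n-|S|) * prod_i (sum_{j in S} A[i][j]). Exact but O(2^n * n^2)."""
    n = len(A)
    total = 0.0
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            prod = 1.0
            for i in range(n):
                prod *= sum(A[i][j] for j in cols)
            total += (-1) ** (n - k) * prod
    return total
```

For the all-ones 3x3 matrix the permanent is 3! = 6, and for [[1, 2], [3, 4]] it is 1·4 + 2·3 = 10.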
Gamification represents an effective way to incentivize user behavior across a number of computing applications. However, despite the fact that physical activity is essential for a healthy lifestyle, surprisingly little is known about how gamification, and in particular competitions, shapes human physical activity. Here we study how competitions affect physical activity. We focus on walking challenges in a mobile activity tracking application where multiple users compete over who takes the most steps over a predefined number of days. We analyze nearly 2,500 physical activity competitions over a period of one year, capturing more than 800,000 person-days of activity tracking. We observe that during walking competitions, the average user increases physical activity by 23%. Furthermore, there are large increases in activity for both men and women, across all ages and weight statuses, and even for users who were previously fairly inactive. We also find that the composition of participants greatly affects the dynamics of the game. In particular, if highly unequal participants get matched to each other, then competition suffers and the overall effect on physical activity drops significantly. Furthermore, competitions with an equal mix of both men and women are more effective in increasing the level of activity. We leverage these insights to develop a statistical model that predicts with significant accuracy whether or not a competition will be particularly engaging. Our models can serve as a guideline to help design more engaging competitions that lead to the most beneficial behavioral changes, and we synthesize our findings in a series of game and app design implications.
In the Bayesian online selection problem, the goal is to find a pricing algorithm for serving a sequence of arriving buyers that maximizes the expected social welfare (or revenue) subject to different types of structural constraints. The focus of this paper is on the case where the allowable subsets of served customers are characterized by a laminar matroid with constant depth. This problem is a special case of the well-known matroid Bayesian online selection problem studied in [Kleinberg & Weinberg, 2012], when the underlying matroid is laminar. We give the first Polynomial-Time Approximation Scheme (PTAS) for the above problem. Our approach is based on rounding the solution of a hierarchy of linear programming relaxations that can approximate the optimum online solution with any degree of accuracy, as well as a concentration argument showing that our rounding does not incur a considerable loss in the expected social welfare.
We study the problem of allocating $m$ items to $n$ agents subject to maximizing the Nash social welfare (NSW) objective. We write a novel convex programming relaxation for this problem, and we show that a simple randomized rounding algorithm gives a $1/e$-approximation to the objective. Our main technical contribution is an extension of Gurvits's lower bound on the coefficient of the square-free monomial of a degree-$m$ homogeneous stable polynomial in $m$ variables to all homogeneous polynomials. We use this extension to analyze the expected welfare of the allocation returned by our randomized rounding algorithm.
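On tiny instances the NSW objective can be optimized by brute force, which makes the objective concrete (exponential in m, so illustration only; additive valuations assumed):

```python
from itertools import product
from math import prod

def max_nash_welfare(values, n):
    """values[i][g]: agent i's (additive) value for good g.
    Enumerate all allocations (one owner label per good) and maximize the
    geometric mean of the agents' utilities."""
    m = len(values[0])
    best, best_alloc = -1.0, None
    for labels in product(range(n), repeat=m):
        util = [sum(values[i][g] for g in range(m) if labels[g] == i)
                for i in range(n)]
        nsw = prod(util) ** (1.0 / n)
        if nsw > best:
            best, best_alloc = nsw, labels
    return best, best_alloc
```

With two agents valuing disjoint goods ([[2, 0], [0, 3]]), each agent gets her own good and the NSW is sqrt(6).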
We analyze linear independence of rank one tensors produced by tensor powers of randomly perturbed vectors. This enables efficient decomposition of sums of high-order tensors. Our analysis builds upon [BCMV14] but allows for a wider range of perturbation models, including discrete ones. We give an application to recovering assemblies of neurons. Assemblies are large sets of neurons representing specific memories or concepts. The size of the intersection of two assemblies has been shown in experiments to represent the extent to which these memories co-occur or these concepts are related; the phenomenon is called association of assemblies. This suggests that an animal's memory is a complex web of associations, and poses the problem of recovering this representation from cognitive data. Motivated by this problem, we study the following more general question: Can we reconstruct the Venn diagram of a family of sets, given the sizes of their ℓ-wise intersections? We show that as long as the family of sets is randomly perturbed, it is enough for the number of measurements to be polynomially larger than the number of nonempty regions of the Venn diagram to fully reconstruct the diagram.
Proceedings of the 2009 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). Generating Random Graphs with Large Girth. Mohsen Bayati, Andrea Montanari, and Amin Saberi. pp. 566–575. DOI: https://doi.org/10.1137/1.9781611973068.63

Abstract: We present a simple and efficient algorithm for randomly generating simple graphs without small cycles. These graphs can be used to design high performance Low-Density Parity-Check (LDPC) codes. For any constant k, α ≤ 1/(2k(k + 3)) and m = O(n^{1+α}), our algorithm generates an asymptotically uniform random graph with n vertices, m edges, and girth larger than k in polynomial time. To the best of our knowledge this is the first polynomial algorithm for the problem. Our algorithm generates a graph by sequentially adding m edges to an empty graph with n vertices. Recently, this type of sequential process has been very successful for efficiently counting and generating random graphs [35, 18, 11, 7, 5, 6].
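A naive rejection variant of this sequential process is easy to sketch: add random edges one at a time and discard any edge that closes a short cycle. The paper's algorithm is more careful in order to obtain asymptotic uniformity and a polynomial running time, so the code below is only an illustration, and the BFS girth test is our own helper:

```python
import random
from collections import deque

def girth_at_most(adj, k):
    """True iff the graph has a cycle of length <= k. BFS from every vertex;
    a non-tree edge (u, v) certifies a cycle of length <= dist[u]+dist[v]+1."""
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            if 2 * dist[u] >= k:      # deeper vertices cannot close a <=k cycle
                break
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    q.append(v)
                elif parent[u] != v and dist[u] + dist[v] + 1 <= k:
                    return True
    return False

def random_graph_large_girth(n, m, k, rng):
    """Sequentially add uniformly random edges, rejecting any edge that would
    create a cycle of length <= k."""
    adj = {v: set() for v in range(n)}
    edges = set()
    while len(edges) < m:
        u, v = rng.sample(range(n), 2)
        e = (min(u, v), max(u, v))
        if e in edges:
            continue
        adj[u].add(v); adj[v].add(u)
        if girth_at_most(adj, k):
            adj[u].remove(v); adj[v].remove(u)   # reject: edge closes a short cycle
        else:
            edges.add(e)
    return sorted(edges)
```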
We study the envy-free cake-cutting problem for $d+1$ players with $d$ cuts, for both the oracle function model and the polynomial time function model. For the former, we derive a matching bound of $\Theta((1/\varepsilon)^{d-1})$ on the query complexity of $d+1$ player cake cutting with Lipschitz utilities for any $d > 1$. When the utility functions are given by a polynomial time algorithm, we prove the problem to be PPAD-complete. For measurable utility functions, we find a fully polynomial-time algorithm for finding an approximate envy-free allocation of a cake among three people using two cuts.
We study the problem of matching agents who arrive at a marketplace over time and leave after d time periods. Agents can only be matched while they are present in the marketplace. Each pair of agents can yield a different match value, and the planner's goal is to maximize the total value over a finite time horizon. First we study the case in which agents arrive in an adversarial order. We provide a randomized 0.25-competitive algorithm building on results of Feldman et al. (2009) and Lehmann et al. (2006). We extend the model to the case in which departure times are drawn independently from a distribution with non-decreasing hazard rate, for which we establish a 1/8-competitive algorithm. When the arrival order is chosen uniformly at random, we show that a batching algorithm, which computes a maximum-weighted matching every (d+1) periods, is 0.279-competitive.
We study the problem of matching agents who arrive at a marketplace over time and leave after d time periods. Agents can only be matched while they are present in the marketplace. Each pair of agents can yield a different match value, and the planner's goal is to maximize the total value over a finite time horizon. We study matching algorithms that perform well over any sequence of arrivals when there is no a priori information about the match values or arrival times. Our main contribution is a 1/4-competitive algorithm. The algorithm randomly selects a subset of agents who will wait until right before their departure to get matched, and maintains a maximum-weight matching with respect to the other agents. The primal-dual analysis of the algorithm hinges on a careful comparison between the initial dual value associated with an agent when it first arrives, and the final value after d time steps. It is also shown that no algorithm is 1/2-competitive. We extend the model to the case in which departure times are drawn i.i.d. from a distribution with non-decreasing hazard rate, and establish a 1/8-competitive algorithm in this setting. Finally we show on real-world data that a modified version of our algorithm performs well in practice.
We study dynamic matching in a spatial setting. Drivers are distributed at random on some interval. Riders arrive in some (possibly adversarial) order at randomly drawn points. The platform observes the locations of the drivers and can either match a newly arrived rider immediately or wait for more riders to arrive. Unmatched riders incur a waiting cost of c per period. Matches are irrevocable, and the cost of matching a driver to a rider equals the distance between them.
We study the assignment problem of objects to agents with heterogeneous preferences under distributional constraints. Each agent is associated with a publicly known type and has a private ordinal ranking over objects. We are interested in assigning as many agents as possible. Our first contribution is a generalization of the well-known and widely used serial dictatorship. Our mechanism maintains several desirable properties of serial dictatorship, including strategyproofness, Pareto efficiency, and computational tractability, while satisfying the distributional constraints with a small error. We also propose a generalization of the probabilistic serial algorithm, which finds an ordinally efficient and envy-free assignment and also satisfies the distributional constraints with a small error. We show, however, that no ordinally efficient and envy-free mechanism is also weakly strategyproof. Both of our algorithms assign at least the same number of agents as the optimum fractional assignment.
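For reference, plain serial dictatorship is a few lines of code. The per-object type quotas below are our simplified stand-in for the paper's distributional constraints, and the instance in the test is made up:

```python
from collections import Counter

def serial_dictatorship(prefs, types, capacity, type_quota):
    """Agents pick, in a fixed order, their most preferred object that still
    has room both overall and for their type.
    prefs[a]: agent a's ranking of objects; types[a]: agent a's type;
    capacity[obj]: seats at obj; type_quota[obj, t]: seats at obj for type t."""
    assignment = {}
    used, used_type = Counter(), Counter()
    for agent, ranking in enumerate(prefs):
        t = types[agent]
        for obj in ranking:
            if used[obj] < capacity[obj] and used_type[obj, t] < type_quota[obj, t]:
                assignment[agent] = obj
                used[obj] += 1
                used_type[obj, t] += 1
                break
    return assignment
```

Because each agent takes the best object still feasible for her, truthful reporting is a dominant strategy, which is the strategyproofness that the paper's generalization preserves.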
The advent of online retailing and advertising has created new opportunities for online platforms to incorporate algorithmic techniques that improve shoppers' experience and drive user engagement, which, in return, can help with the long-term growth of these platforms, and also to maintain socially aware operations that consider fairness across different demographic groups. Motivated by the product-ranking problem in online shopping, this paper introduces and studies a new class of combinatorial optimization problems over the space of permutations, referred to as "sequential submodular maximization." Using this class of problems, it provides algorithmic solutions for maximizing users' engagement and for balancing engagement across different demographic groups of shoppers to obtain fairness. In particular, we propose an optimal (1 − 1/e)-approximation algorithm for maximizing users' engagement and a bicriteria ((1 − 1/e)^2, (1 − 1/e)^2)-approximation algorithm for maximizing users' engagement subject to group fairness constraints.
In this paper we provide nearly linear time algorithms for several problems closely associated with the classic Perron-Frobenius theorem, including computing Perron vectors, i.e. entrywise non-negative eigenvectors of non-negative matrices, and solving linear systems in asymmetric M-matrices, a generalization of Laplacian systems. The running times of our algorithms depend nearly linearly on the input size and polylogarithmically on the desired accuracy and problem condition number. Leveraging these results we also provide improved running times for a broader range of problems including computing random walk-based graph kernels, computing Katz centrality, and more. The running times of our algorithms improve upon previously known results which either depended polynomially on the condition number of the problem, required quadratic time, or only applied to special cases. We obtain these results by providing new iterative methods for reducing these problems to solving linear systems in Row-Column Diagonally Dominant (RCDD) matrices. Our methods are related to the classic shift-and-invert preconditioning technique for eigenvector computation and constitute the first alternative to the result in Cohen et al. (2016) for reducing stationary distribution computation and solving directed Laplacian systems to solving RCDD systems.
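As a point of reference for what these algorithms compute, here is a minimal power-iteration sketch for the Perron vector of a small non-negative matrix. The matrix and iteration count are illustrative assumptions; this naive method is not the paper's nearly linear time approach and, unlike it, degrades on ill-conditioned inputs.

```python
# Illustrative power iteration for a Perron vector (entrywise non-negative
# eigenvector) of a small non-negative matrix. NOT the paper's method.

def perron_vector(A, iters=1000):
    """Approximate the Perron vector of a non-negative square matrix A,
    given as a list of rows, normalized to sum to 1."""
    n = len(A)
    v = [1.0 / n] * n  # start from the uniform positive vector
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        v = [x / s for x in w]  # renormalize; non-negativity is preserved
    return v

A = [[0.0, 2.0],
     [1.0, 1.0]]  # Perron eigenvalue 2, Perron vector proportional to (1, 1)
v = perron_vector(A)
lam = sum(A[0][j] * v[j] for j in range(len(A))) / v[0]  # eigenvalue estimate
```

Power iteration converges here because the spectral gap is large; the paper instead reduces such computations to RCDD linear-system solves to obtain running times that are robust to the condition number.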
We introduce and study a variation of the submodular maximization problem motivated by applications in online retail. A platform displays a list of products to a user in response to a search query. The user inspects the first k items in the list for a k chosen at random from a given distribution, and decides whether to purchase an item from that set based on a choice model. The goal of the platform is to maximize the engagement of the shopper defined as the probability of purchase. This problem gives rise to a less-studied variation of submodular maximization in which we are asked to choose an ordering of a set of elements to maximize a linear combination of different submodular functions.
Two-Stage Matching and Pricing in Ride-Hailing Platforms Matching and pricing are two critical levers in two-sided marketplaces to connect demand and supply. The platform can produce more efficient matching and pricing decisions by batching the demand requests. We initiate the study of the two-stage stochastic matching problem with or without pricing to enable the platform to make improved decisions in a batch with an eye toward the imminent future demand requests. This problem is motivated in part by applications in online marketplaces, such as ride-hailing platforms. We design online competitive algorithms for vertex-weighted (or unweighted) two-stage stochastic matching for maximizing supply efficiency and two-stage joint matching and pricing for maximizing market efficiency. Using various techniques, such as introducing convex programming-based matching and graph decompositions, submodular maximization, and factor-revealing linear programs, we obtain either optimal competitive or improved approximation algorithms compared with naïve solutions. We enrich our theoretical study by data-driven numerical simulations using DiDi's ride-sharing data sets.
Coordination games describe social or economic interactions in which the adoption of a common strategy has a higher payoff. They are classically used to model the spread of conventions, behaviors, and technologies in societies. Here we consider a two-strategies coordination game played asynchronously between the nodes of a network. Agents behave according to a noisy best-response dynamics. It is known that noise removes the degeneracy among equilibria: In the long run, the ``risk-dominant'' behavior spreads throughout the network. Here we consider the problem of computing the typical time scale for the spread of this behavior. In particular, we study its dependence on the network structure and derive a dichotomy between highly-connected, non-local graphs that show slow convergence, and poorly connected, low dimensional graphs that show fast convergence.
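The noisy best-response dynamics described above can be sketched as a toy simulation. The cycle topology, payoffs, and noise level below are illustrative assumptions, not taken from the paper; with payoffs a > b, strategy 1 is the risk-dominant action.

```python
import math, random

# Toy asynchronous noisy best-response (logit) dynamics for a two-strategy
# coordination game on a cycle. All parameters are illustrative assumptions.

random.seed(0)
n, a, b, beta = 30, 2.0, 1.0, 2.0  # beta = inverse noise level
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
state = [0] * n  # start from the all-0 (risk-dominated) profile

def logit_update(i):
    u1 = a * sum(state[j] for j in neighbors[i])      # payoff of playing 1
    u0 = b * sum(1 - state[j] for j in neighbors[i])  # payoff of playing 0
    p1 = 1.0 / (1.0 + math.exp(beta * (u0 - u1)))     # logit choice rule
    state[i] = 1 if random.random() < p1 else 0

for _ in range(50_000):  # asynchronous updates at uniformly random nodes
    logit_update(random.randrange(n))

frac_risk_dominant = sum(state) / n  # fraction playing the risk-dominant action
```

The paper's question is precisely the time scale of this takeover: on low-dimensional, poorly connected graphs like this cycle the risk-dominant action spreads quickly, while on highly connected, non-local graphs convergence is slow.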
We study a continuous and infinite time horizon counterpart to the classic prophet inequality, which we term the stationary prophet inequality problem. Here, copies of a good arrive and perish according to Poisson point processes. Buyers arrive similarly and make take-it-or-leave-it offers for unsold items. The objective is to maximize the (infinite) time average revenue of the seller. Our main results are pricing-based policies which (i) achieve a 1/2-approximation of the optimal offline policy, which is best possible, and (ii) achieve a better than (1-1/e)-approximation of the optimal online policy. Result (i) improves upon bounds implied by recent work of Collina et al. (WINE'20), and is the first optimal prophet inequality for a stationary problem. Result (ii) improves upon a 1-1/e bound implied by recent work of Aouad and Saritaç (EC'20), and shows that this prevalent bound in online algorithms is not optimal for this problem.
In this paper, we present approximation algorithms for combinatorial optimization problems under probabilistic constraints. Specifically, we focus on stochastic variants of two important combinatorial optimization problems: the k-center problem and the set cover problem, with uncertainty characterized by a probability distribution over the set of points or elements to be covered. We consider these problems under adaptive and non-adaptive settings, and present efficient approximation algorithms for the case when the underlying distribution is a product distribution. In contrast to the expected cost model prevalent in the stochastic optimization literature, our problem definitions support restrictions on the probability distributions of the total costs, by incorporating constraints that bound the probability with which the incurred costs may exceed a given threshold.
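For background on the deterministic problem underlying the stochastic variant, here is the textbook greedy heuristic for set cover on a made-up instance. This is the classic algorithm, not the paper's, and it ignores the probabilistic constraints entirely.

```python
# Textbook greedy for (deterministic) set cover: repeatedly pick the set
# covering the most uncovered elements. Background only, not the paper's
# algorithm for the stochastic variants.

def greedy_set_cover(universe, sets):
    """Return indices of chosen sets; ln(n)-approximate when a cover exists."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            break  # remaining elements cannot be covered by any set
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

U = {1, 2, 3, 4, 5}
S = [{1, 2, 3}, {2, 4}, {4, 5}, {5}]
cover = greedy_set_cover(U, S)  # picks {1,2,3} first, then {4,5}
```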
We study an optimization problem capturing a core operational question for online retailing platforms. Given models for the users' preferences over products as well as the number of items they are willing to observe before clicking on one or abandoning the search, what is the best way to rank the relevant products in response to a search query?
In order to capture both popularity and diversity effects, we model the probability that a user clicks on an element from a subset of products as a monotone submodular function of this set. We also assume that the patience level of the users, or the number of items they are willing to observe before clicking on one or abandoning the search, is a given random variable. Under those assumptions, the objective functions capturing user engagement or platform revenue can be written as a new family of submodular optimization problems over a sequence of elements.
We call this family of natural optimization problems sequential submodular optimization. By building on and extending the literature on submodular maximization subject to matroid constraints, we derive an optimal (1-1/e)-approximation algorithm for maximizing user engagement and a bi-criteria approximation algorithm for maximizing revenue subject to a lower bound on user engagement.
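A toy instance can make the sequential submodular objective concrete. The products, feature weights, and patience distribution below are hypothetical, and brute force over orderings stands in for the paper's approximation algorithms.

```python
from itertools import permutations

# Toy sequential submodular instance (all numbers are made up): a user
# inspects the first k listed products, with k drawn from `patience`, and
# clicks with probability f(S), a monotone submodular coverage function of
# the inspected set S.

features = {'a': {1, 2}, 'b': {2, 3}, 'c': {4}}  # appeal features per product
weight = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}        # feature weights

def f(S):  # weighted coverage: monotone and submodular
    covered = set().union(*(features[x] for x in S)) if S else set()
    return sum(weight[u] for u in covered)

patience = {1: 0.5, 2: 0.3, 3: 0.2}  # P(user inspects exactly the first k)

def engagement(order):  # expected click probability for this ordering
    return sum(p * f(order[:k]) for k, p in patience.items())

best = max(permutations(features), key=engagement)
```

The objective is a patience-weighted combination of one submodular function evaluated on every prefix of the list, which is exactly the structure the abstract calls a linear combination of submodular functions over a sequence of elements.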
We design a deterministic polynomial time $c^n$ approximation algorithm for the permanent of positive semidefinite matrices where $c=e^{\gamma+1}\simeq 4.84$. We write a natural convex relaxation and show that its optimum solution gives a $c^n$ approximation of the permanent. We further show that this factor is asymptotically tight by constructing a family of positive semidefinite matrices.
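For reference, the quantity being approximated can be computed exactly, in exponential time, by Ryser's formula. This brute-force sketch is unrelated to the paper's convex-relaxation algorithm; it only illustrates why polynomial-time approximation is the interesting regime.

```python
from itertools import combinations

# Exact permanent via Ryser's inclusion-exclusion formula, O(2^n * n^2) time.
# Not the paper's algorithm; just a reference point for small matrices.

def permanent(A):
    n = len(A)
    total = 0.0
    for r in range(n + 1):
        for S in combinations(range(n), r):
            prod = 1.0
            for i in range(n):
                prod *= sum(A[i][j] for j in S)  # row sums over columns in S
            total += (-1) ** r * prod
    return (-1) ** n * total

p = permanent([[1.0, 2.0], [3.0, 4.0]])  # 1*4 + 2*3 = 10
```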
We analyze linear independence of rank one tensors produced by tensor powers of randomly perturbed vectors. This enables efficient decomposition of sums of high-order tensors. Our analysis builds upon [BCMV14] but allows for a wider range of perturbation models, including discrete ones. We give an application to recovering assemblies of neurons.
Assemblies are large sets of neurons representing specific memories or concepts. The size of the intersection of two assemblies has been shown in experiments to represent the extent to which these memories co-occur or these concepts are related; the phenomenon is called association of assemblies. This suggests that an animal's memory is a complex web of associations, and poses the problem of recovering this representation from cognitive data. Motivated by this problem, we study the following more general question: Can we reconstruct the Venn diagram of a family of sets, given the sizes of their $\ell$-wise intersections? We show that as long as the family of sets is randomly perturbed, it is enough for the number of measurements to be polynomially larger than the number of nonempty regions of the Venn diagram to fully reconstruct the diagram.
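A noiseless warm-up illustrates the measurement model: if the sizes of all intersections are available (rather than only the $\ell$-wise ones the paper works with, under random perturbation), inclusion-exclusion recovers every Venn region exactly. The three sets below are hypothetical.

```python
from itertools import combinations

# Noiseless warm-up for Venn-diagram reconstruction: recover the size of
# each region of the diagram from the sizes of all set intersections.

sets = {0: {1, 2, 3, 4}, 1: {3, 4, 5}, 2: {4, 5, 6}}
idx = list(sets)

def inter_size(S):  # the given measurements: |intersection of A_i, i in S|
    return len(set.intersection(*(sets[i] for i in S)))

def region_size(T):
    """Number of elements lying in exactly the sets indexed by T (nonempty)."""
    rest = [i for i in idx if i not in T]
    total = 0
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            total += (-1) ** r * inter_size(tuple(T) + extra)
    return total
```

The paper's setting is harder on both counts: only intersections of up to $\ell$ sets are measured, and exact inversion is replaced by an argument that random perturbation makes polynomially many measurements suffice.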
We study random digraphs on sequences of expanders with a bounded average degree which converge locally in probability. We prove that the relative size and the threshold for the existence of a giant strongly connected component as well as the asymptotic fraction of nodes with giant fan-in or nodes with giant fan-out are local, in the sense that they are the same for two sequences with the same local limit. The digraph has a bow-tie structure, with all but a vanishing fraction of nodes lying either in the unique strongly connected giant and its fan-in and fan-out or in sets with small fan-in and small fan-out. All local quantities are expressed in terms of percolation on the limiting rooted graph, without any structural assumptions on the limit, allowing, in particular, for nontree-like graphs. In the course of establishing these results, we generalize previous results on the locality of the size of the giant to expanders of bounded average degree with possibly nontree-like limits. We also show that, regardless of the local convergence of a sequence, the uniqueness of the giant and convergence of its relative size for unoriented percolation imply the bow-tie structure for directed percolation. An application of our methods shows that the critical threshold for bond percolation and random digraphs on preferential attachment graphs is $p_c = 0$ with an infinite-order phase transition at $p_c$.
Several neuroimaging studies have investigated localized aberrations in brain structure, function or connectivity in late-life depression, but the ensuing results are equivocal and often conflicting. Here, we provide a quantitative consolidation of neuroimaging in late-life depression using coordinate-based meta-analysis by searching multiple databases and tracing the relevant references up to March 2020. Our search revealed 3252 unique records, among which we identified 32 eligible whole-brain neuroimaging publications comparing 674 late-life depression patients with 568 healthy controls. The peak coordinates of group comparisons between patients and controls were extracted and then analyzed using activation likelihood estimation method. Our sufficiently powered analysis on all the experiments, and more homogenous subsections of the data (in-/decreases, experiments using functional imaging) revealed no significant convergent regional abnormality in late-life depression. This inconsistency might be due to experimental (e.g., choice of tasks, image modalities) and analytic flexibility (e.g., preprocessing and analytic parameters), distributed patterns of neural abnormalities, and heterogeneity of clinical populations (e.g., severity of late-life depression, age of onset). Our findings highlight the need for more reproducible research by using pre-registered and standardized protocols on more homogenous populations to identify potential consistent brain abnormalities in late-life depression.
The maximum matching problem in dynamic graphs subject to edge updates (insertions and deletions) has received much attention over the last few years; a multitude of approximation/time tradeoffs were obtained, improving upon the folklore algorithm, which maintains a maximal (and hence $2$-approximate) matching in $O(n)$ worst-case update time in $n$-node graphs. We present the first deterministic algorithm which outperforms the folklore algorithm in terms of {\em both} approximation ratio and worst-case update time. Specifically, we give a $(2-\Omega(1))$-approximate algorithm with $O(m^{3/8})=O(n^{3/4})$ worst-case update time in $n$-node, $m$-edge graphs. For sufficiently small constant $\epsilon>0$, no deterministic $(2+\epsilon)$-approximate algorithm with worst-case update time $O(n^{0.99})$ was known. Our second result is the first deterministic $(2+\epsilon)$-approximate weighted matching algorithm with $O_\epsilon(1)\cdot O(\sqrt[4]{m}) = O_\epsilon(1)\cdot O(\sqrt{n})$ worst-case update time. Our main technical contributions are threefold: first, we characterize the tight cases for \emph{kernels}, which are the well-studied matching sparsifiers underlying much of the $(2+\epsilon)$-approximate dynamic matching literature. This characterization, together with multiple ideas -- old and new -- underlies our result for breaking the approximation barrier of $2$. Our second technical contribution is the first example of a dynamic matching algorithm whose running time is improved due to improving the \emph{recourse} of other dynamic matching algorithms. Finally, we show how to use dynamic bipartite matching algorithms as black-box subroutines for dynamic matching in general graphs without incurring the natural $\frac{3}{2}$ factor in the approximation ratio which such approaches naturally incur.
The excitation-inhibition ratio is a key functional property of cortical microcircuits which changes throughout an individual's lifespan. Adolescence is considered a critical period for maturation of excitation-inhibition ratio. This has primarily been observed in animal studies. However, there is limited human in vivo evidence for maturation of excitation-inhibition ratio at the individual level. Here, we developed an individualized in vivo marker of regional excitation-inhibition ratio in human adolescents, estimated using large-scale simulations of biophysical network models fitted to resting-state functional imaging data from both cross-sectional (n = 752) and longitudinal (n = 149) cohorts. In both datasets, we found a widespread decrease in excitation-inhibition ratio in association areas, paralleled by an increase or lack of change in sensorimotor areas. This developmental pattern was aligned with multiscale markers of sensorimotor-association differentiation. Although our main findings were robust across alternative modeling configurations, we observed local variations, highlighting the importance of methodological choices for future studies.
In many two-sided labor markets, interviews are conducted before matches are formed. An increase in the number of interviews in the market for medical residencies raised the demand for signaling mechanisms, in which applicants can send a limited number of signals to communicate interest. We study the role of signaling mechanisms in reducing the number of interviews in centralized random matching markets with post-interview shocks. For the market to clear we focus on interim stability, which extends the notion of stability to ensure that agents do not regret not interviewing with each other. A matching is almost interim stable if it is interim stable after removing a vanishingly small fraction of agents. We first study signaling mechanisms in random matching markets when agents on the short side, long side, or both sides signal their top $d$ preferred partners. Interview graphs are formed by including all pairs where at least one party has signaled the other. We show that when $d = \omega(1)$, short-side signaling leads to almost interim stable matchings. Long-side signaling is only effective when the market is almost balanced. Conversely, when the interview shocks are negligible and $d = o(\log n)$, both-side signaling fails to achieve almost interim stability. For larger $d \ge \Omega(\log^2 n)$, short-side signaling achieves perfect interim stability, while long-side signaling fails in imbalanced markets. We build on our findings to propose a signaling mechanism for multi-tiered random markets. Our analysis identifies conditions under which signaling mechanisms are incentive compatible. A technical contribution is the analysis of a message-passing algorithm that efficiently determines interim stability and matching outcomes by leveraging local neighborhood structures.
In several two-sided markets, including labor and dating, agents typically have limited information about their preferences prior to mutual interactions. This issue can result in matching frictions, as arising in the labor market for medical residencies, where high application rates are followed by a large number of interviews. Yet, the extensive literature on two-sided matching primarily focuses on models where agents know their preferences, leaving the interactions necessary for preference discovery largely overlooked. This paper studies this problem using an algorithmic approach, extending Gale-Shapley's deferred acceptance to this context. Two algorithms are proposed. The first is an adaptive algorithm that expands upon Gale-Shapley's deferred acceptance by incorporating interviews between applicants and positions. Similar to deferred acceptance, one side sequentially proposes to the other. However, the order of proposals is carefully chosen to ensure an interim stable matching is found. Furthermore, with high probability, the number of interviews conducted by each applicant or position is limited to $O(\log^2 n)$. In many seasonal markets, interactions occur largely simultaneously, consisting of an initial interview phase followed by a clearing stage. We present a non-adaptive algorithm for generating a single-stage set of interviews in tiered random markets. The algorithm finds an interim stable matching in such markets while assigning no more than $O(\log^3 n)$ interviews to each applicant or position.
Small-world networks, known for their high local clustering and short average path lengths, are a fundamental structure in many real-world systems, including social, biological, and technological networks. We apply the theory of local convergence (Benjamini-Schramm convergence) to derive the limiting behavior of the local structures for two of the most commonly studied small-world network models: the Watts-Strogatz model and the Kleinberg model. Establishing local convergence enables us to show that key network measures, such as PageRank, clustering coefficients, and maximum matching size, converge as network size increases with their limits determined by the graph's local structure. Additionally, this framework facilitates the estimation of global phenomena, such as information cascades, using local information from small neighborhoods. As an additional outcome of our results, we observe a critical change in the behavior of the limit exactly when the parameter governing long-range connections in the Kleinberg model crosses the threshold where decentralized search remains efficient, offering a new perspective on why decentralized algorithms fail in certain regimes.
We study a continuous-time, infinite-horizon dynamic matching problem. Suppliers arrive according to a Poisson process; while waiting, they may abandon the queue at a uniform rate. Customers on the other side of the network must be matched upon arrival. The objective is to minimize the expected long-term average cost subject to a throughput constraint on the total match rate. Previous literature on dynamic matching focuses on "static" policies, where the matching decisions do not depend explicitly on the state of the supplier queues, achieving constant-factor approximations. By contrast, we design "adaptive" policies, which leverage queue length information, and obtain near-optimal polynomial-time algorithms for several classes of instances. First, we develop a bi-criteria Fully Polynomial-time Approximation Scheme (FPTAS) for dynamic matching on networks with a constant number of queues -- that computes a $(1-\epsilon)$-approximation of the optimal policy in time polynomial in both the input size and $1/\epsilon$. Using this algorithm as a subroutine, we obtain an FPTAS for dynamic matching on Euclidean networks of fixed dimension. A key new technique is a hybrid LP relaxation, which combines static and state-dependent LP approximations of the queue dynamics, after a decomposition of the network. Constant-size networks are motivated by deceased organ donation schemes, where the supply types can be divided according to blood and tissue types. The Euclidean case is of interest in ride-hailing and spatial service platforms, where the goal is to fulfill as many trips as possible while minimizing driving distances.
We study stationary online bipartite matching, where both types of nodes--offline and online--arrive according to Poisson processes. Offline nodes wait to be matched for some random time, determined by an exponential distribution, while online nodes need to be matched immediately. This model captures scenarios such as deceased organ donation and time-sensitive task assignments, where there is an inflow of patients and workers (offline nodes) with limited patience, while organs and tasks (online nodes) must be assigned upon arrival. We present an efficient online algorithm that achieves a $(1-1/e+\delta)$-approximation to the optimal online policy's reward for a constant $\delta > 0$, simplifying and improving previous work by Aouad and Saritaç (2022). Our solution combines recent online matching techniques, particularly pivotal sampling, which enables correlated rounding of tighter linear programming approximations, and a greedy-like algorithm. A key technical component is the analysis of a stochastic process that exploits subtle correlations between offline nodes, using renewal theory. A byproduct of our result is an improvement to the best-known competitive ratio--that compares an algorithm's performance to the optimal offline policy--via a $(1-1/\sqrt{e} + \eta)$-competitive algorithm for a universal constant $\eta > 0$, advancing the results of Patel and Wajc (2024).
In tackling the challenges of large language model (LLM) performance for Text-to-SQL tasks, we introduce CHASE-SQL, a new framework that employs innovative strategies, using test-time compute in multi-agent modeling to improve candidate generation and selection. CHASE-SQL leverages LLMs' intrinsic knowledge to generate diverse and high-quality SQL candidates using different LLM generators with: (1) a divide-and-conquer method that decomposes complex queries into manageable sub-queries in a single LLM call; (2) chain-of-thought reasoning based on query execution plans, reflecting the steps a database engine takes during execution; and (3) a unique instance-aware synthetic example generation technique, which offers specific few-shot demonstrations tailored to test questions. To identify the best candidate, a selection agent is employed to rank the candidates through pairwise comparisons with a fine-tuned binary-candidates selection LLM. This selection approach has been demonstrated to be more robust than alternatives. The proposed generators-selector framework not only enhances the quality and diversity of SQL queries but also outperforms previous methods. Overall, our proposed CHASE-SQL achieves the state-of-the-art execution accuracy of 73.0% and 73.01% on the test set and development set of the notable BIRD Text-to-SQL dataset benchmark, rendering CHASE-SQL the top submission of the leaderboard (at the time of paper submission).
Given a so called ''Sperner coloring'' of a triangulation of the $D$-dimensional simplex, Sperner's lemma guarantees the existence of a rainbow simplex, i.e. a simplex colored by all $D+1$ colors. However, finding a rainbow simplex was the first problem to be proven $\mathsf{PPAD}$-complete in Papadimitriou's classical paper introducing the class $\mathsf{PPAD}$ (1994). In this paper, we prove that the problem does not become easier if we relax ''all $D+1$ colors'' to allow some fraction of missing colors: in fact, for any constant $D$, finding even a simplex with just three colors remains $\mathsf{PPAD}$-complete! Our result has an interesting application for the envy-free cake cutting from fair division. It is known that if agents value pieces of cake using general continuous functions satisfying a simple boundary condition (''a non-empty piece is better than an empty piece of cake''), there exists an envy-free allocation with connected pieces. We show that for any constant number of agents it is $\mathsf{PPAD}$-complete to find an allocation -- even using any constant number of possibly disconnected pieces -- that makes just three agents envy-free. Our results extend to super-constant dimension, number of agents, and number of pieces, as long as they are asymptotically bounded by any $\log^{1-\Omega(1)}(\epsilon)$, where $\epsilon$ is the precision parameter (side length for Sperner and approximate envy-free for cake cutting).
The rich literature on online Bayesian selection problems has long focused on so-called prophet inequalities, which compare the gain of an online algorithm to that of a "prophet" who knows the future. An equally natural, though significantly less well-studied, benchmark is the optimum online algorithm, which may be omnipotent (i.e., computationally unbounded), but not omniscient. What is the computational complexity of the optimum online? How well can a polynomial-time algorithm approximate it? Motivated by applications in ride hailing, we study the above questions for the online stochastic maximum-weight matching problem under vertex arrivals. For this problem, a number of 1/2-competitive algorithms are known. This is the best possible ratio for this problem, as it generalizes the original single-item prophet inequality problem. We present a polynomial-time algorithm, which approximates the optimal online algorithm within a factor of 0.51, strictly more than the best-possible constant against a prophet. In contrast, we show that it is PSPACE-hard to approximate this problem within some universal constant. Funding: Financial support from the National Science Foundation [Grants CCF1763970, CCF1812919, and CCF191070], the Office of Naval Research [Grant N000141912550], and Cisco Research is gratefully acknowledged.
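The single-item prophet inequality invoked above has a short self-contained illustration: set a threshold at half the (estimated) expected maximum and accept the first value above it, which recovers at least half of the prophet's reward in expectation. The two-point value distributions below are illustrative assumptions.

```python
import random

# Monte Carlo illustration of the classic single-item prophet inequality
# with the half-of-E[max] threshold rule. Distributions are made up.

random.seed(1)
DISTS = [(1.0, 1.0), (10.0, 0.1), (50.0, 0.01)]  # (value, probability), else 0

def sample_values():
    return [v if random.random() < p else 0.0 for v, p in DISTS]

TRIALS = 200_000
# estimate E[max] on fresh samples to set the threshold
exp_max = sum(max(sample_values()) for _ in range(TRIALS)) / TRIALS
threshold = exp_max / 2

prophet = alg = 0.0
for _ in range(TRIALS):
    vals = sample_values()
    prophet += max(vals)                                   # prophet's pick
    alg += next((v for v in vals if v >= threshold), 0.0)  # threshold rule
prophet /= TRIALS
alg /= TRIALS  # guarantee: alg >= prophet / 2 in expectation
```

This 1/2 against the prophet is exactly the benchmark the paper moves past: its 0.51-approximation is measured against the stronger optimum online policy rather than the omniscient prophet.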
We study the polynomial-time approximability of the optimal online stochastic bipartite matching algorithm, initiated by Papadimitriou et al. (EC'21). Here, nodes on one side of the graph are given upfront, while at each time $t$, an online node and its edge weights are drawn from a time-dependent distribution. The optimal algorithm is $\textsf{PSPACE}$-hard to approximate within some universal constant. We refer to this optimal algorithm, which requires time to think (compute), as a philosopher, and refer to polynomial-time online approximations of the above as philosopher inequalities. The best known philosopher inequality for online matching yields a $0.652$-approximation. In contrast, the best possible prophet inequality, or approximation of the optimum offline solution, is $0.5$. Our main results are a $0.678$-approximate algorithm and a $0.685$-approximation for a vertex-weighted special case. Notably, both bounds exceed the $0.666$-approximation of the offline optimum obtained by Tang, Wu, and Wu (STOC'22) for the vertex-weighted problem. Building on our algorithms and the recent black-box reduction of Banihashem et al. (SODA'24), we provide polytime (pricing-based) truthful mechanisms which $0.678$-approximate the social welfare of the optimal online allocation for bipartite matching markets. Our online allocation algorithm relies on the classic pivotal sampling algorithm (Srinivasan FOCS'01, Gandhi et al. J.ACM'06), along with careful discarding to obtain negative correlations between offline nodes. Consequently, the analysis boils down to examining the distribution of a weighted sum $X$ of negatively correlated Bernoulli variables, specifically lower bounding its mass below a threshold, $\mathbb{E}[\min(1,X)]$, of possible independent interest. Interestingly, our bound relies on an imaginary invocation of pivotal sampling.
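The threshold quantity $\mathbb{E}[\min(1,X)]$ above can be computed exactly for small instances by enumerating outcomes. A minimal sketch (illustrative only, with made-up weights and probabilities, and assuming independent rather than negatively correlated Bernoullis):

```python
from itertools import product

def expected_min_one(weights, probs):
    """E[min(1, sum_i w_i * B_i)] for independent Bernoulli B_i with
    P(B_i = 1) = probs[i], by exact enumeration over all 2^n outcomes."""
    total = 0.0
    for outcome in product([0, 1], repeat=len(weights)):
        # probability of this realization of the Bernoulli vector
        pr = 1.0
        for b, p in zip(outcome, probs):
            pr *= p if b else (1 - p)
        x = sum(w * b for w, b in zip(weights, outcome))
        total += pr * min(1.0, x)
    return total

# Two items of weight 0.6, each kept with probability 1/2: the four outcomes
# give X in {0, 0.6, 0.6, 1.2}, so E[min(1, X)] = 0.25*(0 + 0.6 + 0.6 + 1.0) = 0.55
print(expected_min_one([0.6, 0.6], [0.5, 0.5]))
```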
We study the stochastic online metric matching problem. In this problem, $m$ servers and $n$ requests are located in a metric space, where all servers are available upfront and requests arrive one at a time. In particular, servers are adversarially chosen, and requests are independently drawn from a known distribution. Upon the arrival of a new request, it needs to be immediately and irrevocably matched to a free server, resulting in a cost of their distance. The objective is to minimize the total matching cost. In this paper, we show that the problem can be reduced to a more accessible setting where both servers and requests are drawn from the same distribution by incurring a moderate cost. Combining our reduction with previous techniques, for $[0, 1]^d$ with various choices of distributions, we achieve improved competitive ratios and nearly optimal regrets in both balanced and unbalanced markets. In particular, we give $O(1)$-competitive algorithms for $d \geq 3$ in both balanced and unbalanced markets with smooth distributions. Our algorithms improve on the $O((\log \log \log n)^2)$ competitive ratio of Gupta, Guruganesh, Peng, and Wajc (ICALP'19) for balanced markets in various regimes, and provide the first positive results for unbalanced markets.
We study online capacitated resource allocation, a natural generalization of online stochastic max-weight bipartite matching. This problem is motivated by ride-sharing and Internet advertising applications, where online arrivals may have the capacity to serve multiple offline users. Our main result is a polynomial-time online algorithm which is $(1/2 + \kappa)$-approximate to the optimal online algorithm for $\kappa = 0.0115$. This can be contrasted to the (tight) $1/2$-competitive algorithms to the optimum offline benchmark from the prophet inequality literature. Optimum online is a recently popular benchmark for online Bayesian problems which can use unbounded computation, but not "prophetic" knowledge of future inputs. Our algorithm (which also works for the case of stochastic rewards) rounds a generalized LP relaxation from the unit-capacity case via a two-proposal algorithm, as in previous works in the online matching literature. A key technical challenge in deriving our guarantee is bounding the positive correlation among users introduced when rounding our LP relaxation online. Unlike in the case of unit capacities, this positive correlation is unavoidable for guarantees beyond $1/2$. Conceptually, our results show that the study of optimum online as a benchmark can reveal problem-specific insights that are irrelevant to competitive analysis.
Online Bayesian bipartite matching is a central problem in digital marketplaces and exchanges, including advertising, crowdsourcing, ridesharing, and kidney exchange. We introduce a graph neural network (GNN) approach that emulates the problem's combinatorially-complex optimal online algorithm, which selects actions (e.g., which nodes to match) by computing each action's value-to-go (VTG) -- the expected weight of the final matching if the algorithm takes that action, then acts optimally in the future. We train a GNN to estimate VTG and show empirically that this GNN returns high-weight matchings across a variety of tasks. Moreover, we identify a common family of graph distributions in spatial crowdsourcing applications, such as rideshare, under which VTG can be efficiently approximated by aggregating information within local neighborhoods in the graphs. This structure matches the local behavior of GNNs, providing theoretical justification for our approach.
Utilizing large language models (LLMs) for transforming natural language questions into SQL queries (text-to-SQL) is a promising yet challenging approach, particularly when applied to real-world databases with complex and extensive schemas. In particular, effectively incorporating data catalogs and database values for SQL generation remains an obstacle, leading to suboptimal solutions. We address this problem by proposing a new pipeline that effectively retrieves relevant data and context, selects an efficient schema, and synthesizes correct and efficient SQL queries. To increase retrieval precision, our pipeline introduces a hierarchical retrieval method leveraging model-generated keywords, locality-sensitive hashing indexing, and vector databases. Additionally, we have developed an adaptive schema pruning technique that adjusts based on the complexity of the problem and the model's context size. Our approach generalizes to both frontier proprietary models like GPT-4 and open-source models such as Llama-3-70B. Through a series of ablation studies, we demonstrate the effectiveness of each component of our pipeline and its impact on the end-to-end performance. Our method achieves new state-of-the-art performance on the cross-domain challenging BIRD dataset.
While macroscale brain asymmetry and its relevance for human cognitive function have been consistently shown, the underlying neurobiological signatures remain an open question. Here, we probe layer-specific microstructural asymmetry of the human cortex using intensity profiles from post-mortem cytoarchitecture. An anterior-posterior cortical pattern of left-right asymmetry was found, varying across cortical layers. A similar anterior-posterior pattern was observed using in vivo microstructural imaging, with in vivo asymmetry showing the strongest similarity with layer III. Microstructural asymmetry varied as a function of age and sex and was found to be heritable. Moreover, asymmetry in microstructure corresponded to asymmetry of intrinsic function, in particular in sensory areas. Lastly, probing the behavioral relevance, we found differential association of language and markers of mental health with asymmetry, illustrating a functional divergence between inferior-superior and anterior-posterior microstructural axes anchored in microstructural development. Our study highlights the layer-based patterning of microstructural asymmetry of the human cortex and its functional relevance.
This paper derives statistical guarantees for the performance of Graph Neural Networks (GNNs) in link prediction tasks on graphs generated by a graphon. We propose a linear GNN architecture (LG-GNN) âŠ
This paper derives statistical guarantees for the performance of Graph Neural Networks (GNNs) in link prediction tasks on graphs generated by a graphon. We propose a linear GNN architecture (LG-GNN) that produces consistent estimators for the underlying edge probabilities. We establish a bound on the mean squared error and give guarantees on the ability of LG-GNN to detect high-probability edges. Our guarantees hold for both sparse and dense graphs. Finally, we demonstrate some of the shortcomings of the classical GCN architecture, as well as verify our results on real and synthetic datasets.
Motivated by applications in the gig economy, we study approximation algorithms for a sequential pricing problem. The input is a bipartite graph G = (I, J, E) between individuals I and jobs J. The platform has a value of v_j for matching job j to an individual worker. In order to find a matching, the platform can consider the edges (i, j) ∈ E in any order and make a one-time take-it-or-leave-it offer of a price π_ij = w of its choosing to i for completing j. The worker accepts the offer with a known probability p_ijw; in this case, the job and the worker are irrevocably matched. What is the best way to make offers to maximize revenue and/or social welfare? The optimal algorithm is known to be NP-hard to compute (even if there is only a single job). With this in mind, we design efficient approximations to the optimal policy via a new random-order online contention resolution scheme (RO-OCRS) for matching. Our main result is a 0.456-balanced RO-OCRS in bipartite graphs and a 0.45-balanced RO-OCRS in general graphs. These algorithms improve on the recent bound of (1/2)(1 − e^{−2}) ≈ 0.432 and improve on the best-known lower bounds for the correlation gap of matching, despite applying to a significantly more restrictive setting. As a consequence of our online contention resolution scheme results, we obtain a 0.456-approximate algorithm for the sequential pricing problem. We further extend our results to settings where workers can only be contacted a limited number of times and show how to achieve improved results for this problem via improved algorithms for the well-studied stochastic probing problem. Funding: This work was supported by the National Science Foundation [Grant CCF2209520] and a gift from Amazon Research.
We study random digraphs on sequences of expanders with a bounded average degree which converge locally in probability. We prove that the relative size and the threshold for the existence of a giant strongly connected component as well as the asymptotic fraction of nodes with giant fan-in or nodes with giant fan-out are local, in the sense that they are the same for two sequences with the same local limit. The digraph has a bow-tie structure, with all but a vanishing fraction of nodes lying either in the unique strongly connected giant and its fan-in and fan-out or in sets with small fan-in and small fan-out. All local quantities are expressed in terms of percolation on the limiting rooted graph, without any structural assumptions on the limit, allowing, in particular, for nontree-like graphs. In the course of establishing these results, we generalize previous results on the locality of the size of the giant to expanders of bounded average degree with possibly nontree-like limits. We also show that, regardless of the local convergence of a sequence, the uniqueness of the giant and convergence of its relative size for unoriented percolation imply the bow-tie structure for directed percolation. An application of our methods shows that the critical threshold for bond percolation and random digraphs on preferential attachment graphs is p_c = 0 with an infinite order phase transition at p_c.
This paper makes three contributions to estimating the number of perfect matchings in bipartite graphs. First, we prove that the popular sequential importance sampling algorithm works in polynomial time for dense bipartite graphs. More carefully, our algorithm gives a (1 ± ε)-approximation for the number of perfect matchings of a λ-dense bipartite graph, using O(n^{(1−2λ)/λ} ε^{−2}) samples. With size n on each side and for 1/2 > λ > 0, a λ-dense bipartite graph has all degrees greater than (λ + 1/2)n. Second, practical applications of the algorithm require many calls to matching algorithms. A novel preprocessing step is provided which makes significant improvements. Third, three applications are provided. The first is for counting Latin squares, the second is a practical way of computing the greedy algorithm for a card guessing game with feedback, and the third is for stochastic block models. In all three examples, sequential importance sampling allows treating practical problems of reasonably large sizes.
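The core sampler admits a compact sketch: match rows one at a time to a uniformly random still-available neighbor, multiplying the number of available choices at each step; dead ends contribute zero. This toy version is a generic unbiased SIS estimator for the permanent of the biadjacency matrix, not the paper's tuned algorithm or preprocessing:

```python
import random

def sis_estimate(A, samples=20000, rng=None):
    """Unbiased sequential-importance-sampling estimate of the number of
    perfect matchings (the permanent) of a 0/1 biadjacency matrix A."""
    rng = rng or random.Random(0)
    n, total = len(A), 0.0
    for _ in range(samples):
        free = set(range(n))  # columns still unmatched
        weight = 1.0
        for i in range(n):
            choices = [j for j in free if A[i][j]]
            if not choices:   # dead end: this sample contributes 0
                weight = 0.0
                break
            weight *= len(choices)  # inverse of the uniform choice probability
            free.remove(rng.choice(choices))
        total += weight
    return total / samples

# Graph with exactly 3 perfect matchings: rows may use columns {0,1}, {0,1,2}, {1,2}.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
print(sis_estimate(A))  # close to 3 (each sample is 4 w.p. 3/4, 0 w.p. 1/4)
```

Each successful trajectory corresponds to one perfect matching and contributes exactly 1 in expectation, which is why the estimator is unbiased.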
Two-Stage Matching and Pricing in Ride-Hailing Platforms. Matching and pricing are two critical levers in two-sided marketplaces to connect demand and supply. The platform can produce more efficient matching and pricing decisions by batching the demand requests. We initiate the study of the two-stage stochastic matching problem with or without pricing to enable the platform to make improved decisions in a batch with an eye toward the imminent future demand requests. This problem is motivated in part by applications in online marketplaces, such as ride-hailing platforms. We design online competitive algorithms for vertex-weighted (or unweighted) two-stage stochastic matching for maximizing supply efficiency and two-stage joint matching and pricing for maximizing market efficiency. Using various techniques, such as introducing convex programming-based matching and graph decompositions, submodular maximization, and factor-revealing linear programs, we obtain either optimal competitive or improved approximation algorithms compared with naïve solutions. We enrich our theoretical study by data-driven numerical simulations using DiDi's ride-sharing data sets.
We study sublinear time algorithms for the traveling salesman problem (TSP). First, we focus on the closely related maximum path cover problem, which asks for a collection of vertex disjoint paths that include the maximum number of edges. We show that for any fixed $\epsilon > 0$, there is an algorithm that $(1/2 - \epsilon)$-approximates the maximum path cover size of an $n$-vertex graph in $\widetilde{O}(n)$ time. This improves upon a $(3/8-\epsilon)$-approximate $\widetilde{O}(n \sqrt{n})$-time algorithm of Chen, Kannan, and Khanna [ICALP'20]. Equipped with our path cover algorithm, we give $\widetilde{O}(n)$ time algorithms that estimate the cost of graphic TSP and $(1, 2)$-TSP up to factors of $1.83$ and $(1.5+\epsilon)$, respectively. Our algorithm for graphic TSP improves over a $1.92$-approximate $\widetilde{O}(n)$ time algorithm due to [CHK ICALP'20, Behnezhad FOCS'21]. Our algorithm for $(1,2)$-TSP improves over a folklore $(1.75 + \epsilon)$-approximate $\widetilde{O}(n)$-time algorithm, as well as a $(1.625+\epsilon)$-approximate $\widetilde{O}(n\sqrt{n})$-time algorithm of [CHK ICALP'20]. Our analysis of the running time uses connections to parallel algorithms and is information-theoretically optimal up to poly log $n$ factors. Additionally, we show that our approximation guarantees for path cover and $(1,2)$-TSP hit a natural barrier: We show better approximations require better sublinear time algorithms for the well-studied maximum matching problem.
Graph Neural Networks (GNNs) are popular models for machine learning on graphs that typically follow the message-passing paradigm, whereby the feature of a node is updated recursively upon aggregating information over its neighbors. While exchanging messages over the input graph endows GNNs with a strong inductive bias, it can also make GNNs susceptible to over-squashing, thereby preventing them from capturing long-range interactions in the given graph. To rectify this issue, graph rewiring techniques have been proposed as a means of improving information flow by altering the graph connectivity. In this work, we identify three desiderata for graph-rewiring: (i) reduce over-squashing, (ii) respect the locality of the graph, and (iii) preserve the sparsity of the graph. We highlight fundamental trade-offs that occur between spatial and spectral rewiring techniques; while the former often satisfy (i) and (ii) but not (iii), the latter generally satisfy (i) and (iii) at the expense of (ii). We propose a novel rewiring framework that satisfies all of (i)--(iii) through a locality-aware sequence of rewiring operations. We then discuss a specific instance of such rewiring framework and validate its effectiveness on several real-world benchmarks, showing that it either matches or significantly outperforms existing rewiring approaches.
We propose a theoretical framework for training Graph Neural Networks (GNNs) on large input graphs via training on small, fixed-size sampled subgraphs. This framework is applicable to a wide range of models, including popular sampling-based GNNs, such as GraphSAGE and FastGCN. Leveraging the theory of graph local limits, we prove that, under mild assumptions, parameters learned from training sampling-based GNNs on small samples of a large input graph are within an $\epsilon$-neighborhood of the outcome of training the same architecture on the whole graph. We derive bounds on the number of samples, the size of the graph, and the training steps required as a function of $\epsilon$. Our results give a novel theoretical understanding for using sampling in training GNNs. They also suggest that by training GNNs on small samples of the input graph, practitioners can identify and select the best models, hyperparameters, and sampling algorithms more efficiently. We empirically illustrate our results on a node classification task on large citation graphs, observing that sampling-based GNNs trained on local subgraphs 12$\times$ smaller than the original graph achieve comparable performance to those trained on the input graph.
The advent of online retailing and advertising has created new opportunities for online platforms to incorporate algorithmic techniques to improve shoppers' experience and drive user engagement, which, in turn, can help with the long-term growth of these platforms, and also to operate in a socially aware manner that considers fairness across different demographic groups. Motivated by the product-ranking problem in online shopping, this paper introduces and studies a new class of combinatorial optimization problems over the space of permutations, referred to as "sequential submodular maximization." Using this class of problems, it provides algorithmic solutions for maximizing users' engagement and also for balancing the users' engagement across different demographic groups of shoppers to obtain fairness. In particular, it proposes an optimal (1 − 1/e)-approximation algorithm for maximizing users' engagement and a bicriteria ((1 − 1/e)^2, (1 − 1/e)^2)-approximation algorithm for maximizing users' engagement subject to group fairness constraints.
We study dynamic matching in a spatial setting. Drivers are distributed at random on some interval. Riders arrive in some (possibly adversarial) order at randomly drawn points. The platform observes the location of the drivers and can match newly arrived riders immediately or can wait for more riders to arrive. Unmatched riders incur a waiting cost of c per period. Furthermore, the platform can match riders and drivers irrevocably, and the cost of matching a driver to a rider is equal to the distance between them.
We study a continuous and infinite time horizon counterpart to the classic prophet inequality, which we term the stationary prophet inequality problem. Here, copies of a good arrive and perish according to Poisson point processes. Buyers arrive similarly and make take-it-or-leave-it offers for unsold items. The objective is to maximize the (infinite) time average revenue of the seller. Our main results are pricing-based policies which (i) achieve a 1/2-approximation of the optimal offline policy, which is best possible, and (ii) achieve a better than (1-1/e)-approximation of the optimal online policy. Result (i) improves upon bounds implied by recent work of Collina et al. (WINE'20), and is the first optimal prophet inequality for a stationary problem. Result (ii) improves upon a 1-1/e bound implied by recent work of Aouad and Saritaç (EC'20), and shows that this prevalent bound in online algorithms is not optimal for this problem.
We introduce and study a variation of the submodular maximization problem motivated by applications in online retail. A platform displays a list of products to a user in response to a search query. The user inspects the first k items in the list for a k chosen at random from a given distribution, and decides whether to purchase an item from that set based on a choice model. The goal of the platform is to maximize the engagement of the shopper defined as the probability of purchase. This problem gives rise to a less-studied variation of submodular maximization in which we are asked to choose an ordering of a set of elements to maximize a linear combination of different submodular functions.
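On tiny instances, the objective described above, a distribution-weighted sum of a submodular function over prefixes of the ordering, can be optimized by brute force over permutations. A minimal sketch with a made-up coverage function and prefix-length distribution (illustrative only; not the paper's algorithm):

```python
from itertools import permutations

def engagement(order, sets, k_dist):
    """sum_k P(K = k) * f(first k items), with f = coverage (union size)."""
    value, covered = 0.0, set()
    for k, item in enumerate(order, start=1):
        covered |= sets[item]
        value += k_dist.get(k, 0.0) * len(covered)
    return value

# Three products, each covering a set of "interests"; the shopper inspects
# the first k items in the list, with k = 1 or k = 2, each w.p. 1/2.
sets = {"a": {1, 2}, "b": {2, 3}, "c": {4}}
k_dist = {1: 0.5, 2: 0.5}
best = max(permutations(sets), key=lambda o: engagement(o, sets, k_dist))
print(engagement(best, sets, k_dist))  # 2.5, e.g. for an ordering starting a, b
```

Note that the objective is a linear combination of submodular functions of the prefixes, which is exactly the structure the abstract highlights.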
Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). Algorithms Using Local Graph Features to Predict Epidemics. Yeganeh Alimohammadi, Christian Borgs, and Amin Saberi. pp. 3430-3451. DOI: https://doi.org/10.1137/1.9781611977073.136. Abstract: We study a simple model of epidemics where an infected node transmits the infection to its neighbors independently with probability p. This is also known as the independent cascade or Susceptible-Infected-Recovered (SIR) model with fixed recovery time. The size of an outbreak in this model is closely related to that of the giant connected component in "edge percolation", where each edge of the graph is kept independently with probability p, studied for a large class of networks including the configuration model [30] and preferential attachment [15, 37]. Even though these models capture the effects of degree inhomogeneity and the role of super-spreaders in the spread of an epidemic, they only consider graphs that are locally tree-like, i.e., have few or no short cycles. Some generalizations of the configuration model were suggested to capture local communities, known as household models [6], or the hierarchical configuration model [48]. Here, we ask a different question: what information is needed for general networks to predict the size of an outbreak? Is it possible to make predictions by accessing the distribution of small subgraphs (or motifs)? We answer the question in the affirmative for large-set expanders with local weak limits (also known as Benjamini-Schramm limits). In particular, we show that there is an algorithm which gives a (1 − ε) approximation of the probability and the final size of an outbreak by accessing a constant-size neighborhood of a constant number of nodes chosen uniformly at random.
We also present corollaries of the theorem for the preferential attachment model, and study generalizations with household (or motif) structure. The latter was only known for the configuration model. (Published 2022; eISBN 978-1-61197-707-3.)
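The connection between outbreak size and edge percolation can be simulated directly: transmit along each edge independently with probability p and measure the set reachable from a random seed. A minimal Monte Carlo sketch (illustrative only; not the paper's constant-query local algorithm):

```python
import random
from collections import deque

def expected_outbreak(adj, p, trials=1000, rng=None):
    """Average outbreak size of the independent cascade / fixed-recovery SIR
    model: BFS from a random seed, transmitting along each edge w.p. p."""
    rng = rng or random.Random(0)
    nodes = list(adj)
    total = 0
    for _ in range(trials):
        seed = rng.choice(nodes)
        infected, queue = {seed}, deque([seed])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                # each infected node attempts to infect each neighbor once
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    queue.append(v)
        total += len(infected)
    return total / trials

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(expected_outbreak(triangle, p=1.0))  # 3.0: everyone is infected
print(expected_outbreak(triangle, p=0.0))  # 1.0: only the seed
```

The final infected set has the same distribution as the percolation cluster of the seed, which is the equivalence the abstract builds on.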
Motivated by applications in the gig economy, we study approximation algorithms for a \emph{sequential pricing problem}. The input is a bipartite graph $G=(I,J,E)$ between individuals $I$ and jobs $J$. The platform has a value of $v_j$ for matching job $j$ to an individual worker. In order to find a matching, the platform can consider the edges $(i j) \in E$ in any order and make a one-time take-it-or-leave-it offer of a price $\pi_{ij} = w$ of its choosing to $i$ for completing $j$. The worker accepts the offer with a known probability $ p_{ijw} $; in this case the job and the worker are irrevocably matched. What is the best way to make offers to maximize revenue and/or social welfare? The optimal algorithm is known to be NP-hard to compute (even if there is only a single job). With this in mind, we design efficient approximations to the optimal policy via a new Random-Order Online Contention Resolution Scheme (RO-OCRS) for matching. Our main result is a 0.456-balanced RO-OCRS in bipartite graphs and a 0.45-balanced RO-OCRS in general graphs. These algorithms improve on the recent bound of $\frac{1}{2}(1-e^{-2})\approx 0.432$ of [BGMS21], and improve on the best known lower bounds for the correlation gap of matching, despite applying to a significantly more restrictive setting. As a consequence of our OCRS results, we obtain a $0.456$-approximate algorithm for the sequential pricing problem. We further extend our results to settings where workers can only be contacted a limited number of times, and show how to achieve improved results for this problem, via improved algorithms for the well-studied stochastic probing problem.
We study sublinear time algorithms for estimating the size of maximum matching in graphs. Our main result is a $(\frac{1}{2}+\Omega(1))$-approximation algorithm which can be implemented in $O(n^{1+\epsilon})$ time, where $n$ is the number of vertices and the constant $\epsilon > 0$ can be made arbitrarily small. The best known lower bound for the problem is $\Omega(n)$, which holds for any constant approximation. Existing algorithms either obtain the greedy bound of $\frac{1}{2}$-approximation [Behnezhad FOCS'21], or require some assumption on the maximum degree to run in $o(n^2)$-time [Yoshida, Yamamoto, and Ito STOC'09]. We improve over these by designing a less "adaptive" augmentation algorithm for maximum matching that might be of independent interest.
Matching and pricing are two critical levers in two-sided marketplaces to connect demand and supply. The platform can produce more efficient matching and pricing decisions by batching the demand requests. We initiate the study of the two-stage stochastic matching problem, with or without pricing, to enable the platform to make improved decisions in a batch with an eye toward the imminent future demand requests. This problem is motivated in part by applications in online marketplaces such as ride-hailing platforms. We design online competitive algorithms for vertex-weighted (or unweighted) two-stage stochastic matching for maximizing supply efficiency, and two-stage joint matching and pricing for maximizing market efficiency. In the former problem, using a randomized primal-dual algorithm applied to a family of ``balancing'' convex programs, we obtain the optimal $3/4$ competitive ratio against the optimum offline benchmark. Using a factor-revealing program and connections to submodular optimization, we improve this ratio against the optimum online benchmark to $(1-1/e+1/e^2)\approx 0.767$ for the unweighted and $0.761$ for the weighted case. In the latter problem, we design an optimal $1/2$-competitive joint pricing and matching algorithm by borrowing ideas from the ex-ante prophet inequality literature. We also show an improved $(1-1/e)$-competitive algorithm for the special case of demand efficiency objective using the correlation gap of submodular functions. Finally, we complement our theoretical study by using DiDi's ride-sharing dataset for Chengdu city and numerically evaluating the performance of our proposed algorithms in practical instances of this problem.
We consider a model for repeated stochastic matching where compatibility is probabilistic, is realized the first time agents are matched, and persists in the future. Such a model has applications in the gig economy, kidney exchange, and mentorship matching. We ask whether a decentralized matching process can approximate the optimal online algorithm. In particular, we consider a decentralized stable matching process where agents match with the most compatible partner who does not prefer matching with someone else, and known compatible pairs continue matching in all future rounds. We demonstrate that the above process provides a 0.316-approximation to the optimal online algorithm for matching on general graphs. We also provide a 1/7-approximation for many-to-one bipartite matching, a 1/11-approximation for capacitated matching on general graphs, and a 1/2k-approximation for forming teams of up to k agents. Our results rely on a novel coupling argument that decomposes the successful edges of the optimal online algorithm in terms of their round-by-round comparison with stable matching.
The rich literature on online Bayesian selection problems has long focused on so-called prophet inequalities, which compare the gain of an online algorithm to that of a "prophet" who knows the future. An equally-natural, though significantly less well-studied benchmark is the optimum online algorithm, which may be omnipotent (i.e., computationally-unbounded), but not omniscient. What is the computational complexity of the optimum online? How well can a polynomial-time algorithm approximate it?
The maximum matching problem in dynamic graphs subject to edge updates (insertions and deletions) has received much attention over the last few years; a multitude of approximation/time tradeoffs were obtained, improving upon the folklore algorithm, which maintains a maximal (and hence $2$-approximate) matching in $O(n)$ worst-case update time in $n$-node graphs.
We present the first deterministic algorithm which outperforms the folklore algorithm in terms of {\em both} approximation ratio and worst-case update time. Specifically, we give a $(2-\Omega(1))$-approximate algorithm with $O(\sqrt{n}\sqrt[8]{m})=O(n^{3/4})$ worst-case update time in $n$-node, $m$-edge graphs. For sufficiently small constant $\epsilon>0$, no deterministic $(2+\epsilon)$-approximate algorithm with worst-case update time $O(n^{0.99})$ was known. Our second result is the first deterministic $(2+\epsilon)$-approximate (weighted) matching algorithm with $O_\epsilon(1)\cdot O(\sqrt[4]{m}) = O_\epsilon(1)\cdot O(\sqrt{n})$ worst-case update time.
Nearly three decades ago, Bar-Noy, Motwani and Naor showed that no online edge-coloring algorithm can edge color a graph optimally. Indeed, their work, titled "the greedy algorithm is optimal for on-line edge coloring", shows that the competitive ratio of $2$ of the naive greedy algorithm is best possible online. However, their lower bound required bounded-degree graphs, of maximum degree $\Delta = O(\log n)$, which prompted them to conjecture that better bounds are possible for higher-degree graphs. While progress has been made towards resolving this conjecture for restricted inputs and arrivals or for random arrival orders, an answer for fully general \emph{adversarial} arrivals remained elusive.
We resolve this thirty-year-old conjecture in the affirmative, presenting a $(1.9+o(1))$-competitive online edge coloring algorithm for general graphs of degree $\Delta = \omega(\log n)$ under vertex arrivals. At the core of our results, and of possible independent interest, is a new online algorithm which rounds a fractional bipartite matching $x$ online under vertex arrivals, guaranteeing that each edge $e$ is matched with probability $(1/2+c)\cdot x_e$, for a constant $c>0.027$.
INTRODUCTION: Numerous studies have reported brain alterations in behavioral variant frontotemporal dementia (bvFTD). However, they pointed to inconsistent findings. METHODS: We used a meta-analytic approach to identify the convergent structural and functional brain abnormalities in bvFTD. Following the best-practice neuroimaging meta-analysis guidelines, we searched PubMed and Embase databases and performed reference tracking. Then, the coordinates of group comparisons between bvFTD and controls from 73 studies were extracted and tested for convergence using activation likelihood estimation. RESULTS: We identified convergent abnormalities in the anterior cingulate cortices, anterior insula, amygdala, paracingulate, striatum, and hippocampus. Task-based and resting-state functional connectivity pointed to the joint networks that are connected to the obtained consistent regions. Functional decoding analyses suggested associated dysfunction of emotional processing, interoception, reward processing, higher-order cognitive functions, and olfactory and gustatory perceptions in bvFTD. DISCUSSION: Our findings highlighted a key role of the salience network and subcortical regions in the pathophysiology of bvFTD.
The rich literature on online Bayesian selection problems has long focused on so-called prophet inequalities, which compare the gain of an online algorithm to that of a "prophet" who knows the future. An equally-natural, though significantly less well-studied benchmark is the optimum online algorithm, which may be omnipotent (i.e., computationally-unbounded), but not omniscient. What is the computational complexity of the optimum online? How well can a polynomial-time algorithm approximate it?
Motivated by applications in ride hailing, we study the above questions for the online stochastic maximum-weight matching problem under vertex arrivals. This problem was recently introduced by Ezra, Feldman, Gravin and Tang (EC'20), who gave a $1/2$-competitive algorithm for it. This is the best possible ratio, as this problem is a generalization of the original single-item prophet inequality.
We present a polynomial-time algorithm which approximates the optimal online algorithm within a factor of $0.51$, beating the best-possible prophet inequality. At the core of our result are a new linear program formulation, an algorithm that tries to match the arriving vertices in two attempts, and an analysis that bounds the correlation resulting from the second attempts. In contrast, we show that it is PSPACE-hard to approximate this problem within some constant $\alpha < 1$.
We study the problem of sampling a uniformly random directed rooted spanning tree, also known as an arborescence, from a possibly weighted directed graph. Classically, this problem has long been known to be polynomial-time solvable; the exact number of arborescences can be computed by a determinant [Tutte, 1948], and sampling can be reduced to counting [Jerrum et al., 1986; Jerrum and Sinclair, 1996]. However, the classic reduction from sampling to counting seems to be inherently sequential. This raises the question of designing efficient parallel algorithms for sampling. We show that sampling arborescences can be done in RNC.
For several well-studied combinatorial structures, counting can be reduced to the computation of a determinant, which is known to be in NC [Csanky, 1975]. These include arborescences, planar graph perfect matchings, Eulerian tours in digraphs, and determinantal point processes. However, not much is known about efficient parallel sampling of these structures. Our work is a step towards resolving this mystery.
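The counting half of the reduction mentioned above is concrete: by the directed matrix-tree theorem [Tutte, 1948], the number of arborescences oriented toward a root equals a determinant. The sketch below (function names are ours; exact rational arithmetic for clarity) illustrates this for small digraphs, not the paper's parallel sampler.

```python
from fractions import Fraction

def det(M):
    """Determinant by exact Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return d

def count_arborescences(n, edges, root):
    """Directed matrix-tree theorem: the number of spanning arborescences
    oriented toward `root` equals the determinant of the out-degree
    Laplacian L = D_out - A with row and column `root` removed."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:      # edge u -> v
        L[u][u] += 1        # out-degree on the diagonal
        L[u][v] -= 1
    idx = [i for i in range(n) if i != root]
    return int(det([[L[i][j] for j in idx] for i in idx]))
```

For the directed triangle 0→1→2→0 there is exactly one arborescence toward 0; adding the edge 1→0 creates a second one.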
We study dynamic matching in a spatial setting. Drivers are distributed at random on some interval. Riders arrive in some (possibly adversarial) order at randomly drawn points. The platform observes the location of the drivers, and can match newly arrived riders immediately, or can wait for more riders to arrive. Unmatched riders incur a waiting cost $c$ per period. The platform can match riders and drivers, irrevocably. The cost of matching a driver to a rider is equal to the distance between them. We quantify the value of slightly increasing supply. We prove that when there are $(1+\epsilon)$ drivers per rider (for any $\epsilon > 0$), the cost of matching returned by a simple greedy algorithm which pairs each arriving rider to the closest available driver is $O(\log^3(n))$, where $n$ is the number of riders. On the other hand, with an equal number of drivers and riders, even the \emph{ex post} optimal matching has a cost of $\Theta(\sqrt{n})$. Our results shed light on the important role of (small) excess supply in spatial matching markets.
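As a toy illustration of the greedy rule analyzed above, the sketch below (function name is ours; a linear scan keeps it simple) pairs each arriving rider irrevocably with the closest still-available driver and tallies the total distance cost.

```python
def greedy_spatial_matching(drivers, riders):
    """Pair each arriving rider with the closest still-available driver,
    irrevocably; return the matching and its total distance cost."""
    available = list(drivers)
    pairs, total = [], 0.0
    for r in riders:
        d = min(available, key=lambda x: abs(x - r))  # closest free driver
        available.remove(d)
        pairs.append((r, d))
        total += abs(d - r)
    return pairs, total
```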
We study a continuous and infinite time horizon counterpart to the classic prophet inequality, which we term the stationary prophet inequality problem. Here, copies of a good arrive and perish according to Poisson point processes. Buyers arrive similarly and make take-it-or-leave-it offers for unsold items. The objective is to maximize the (infinite) time average revenue of the seller. Our main results are pricing-based policies which (i) achieve a $1/2$-approximation of the optimal offline policy, which is best possible, and (ii) achieve a better than $(1-1/e)$-approximation of the optimal online policy. Result (i) improves upon bounds implied by recent work of Collina et al. (WINE'20), and is the first optimal prophet inequality for a stationary problem. Result (ii) improves upon a $1-1/e$ bound implied by recent work of Aouad and Sarita\c{c} (EC'20), and shows that this prevalent bound in online algorithms is not optimal for this problem.
This paper makes three contributions to estimating the number of perfect matchings in bipartite graphs. First, we prove that the popular sequential importance sampling algorithm works in polynomial time for dense bipartite graphs. More carefully, our algorithm gives a $(1\pm\epsilon)$-approximation for the number of perfect matchings of a $\lambda$-dense bipartite graph, using $O(n^{\frac{1-2\lambda}{\lambda}\epsilon^{-2}})$ samples. With size $n$ on each side and for $\frac{1}{2}>\lambda>0$, a $\lambda$-dense bipartite graph has all degrees greater than $(\lambda+\frac{1}{2})n$. Second, practical applications of the algorithm require many calls to matching algorithms. A novel preprocessing step is provided which makes significant improvements. Third, three applications are provided. The first is for counting Latin squares, the second is a practical way of computing the greedy algorithm for a card-guessing game with feedback, and the third is for stochastic block models. In all three examples, sequential importance sampling allows treating practical problems of reasonably large sizes.
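The core sequential importance sampling idea can be sketched as follows. This is a simplified uniform-proposal version (function name is ours), not the paper's exact sampler: each trial matches columns left to right to a uniformly chosen free compatible row, and the product of the number of available choices at each step is an unbiased importance weight for the count.

```python
import random

def sis_perfect_matchings(adj, trials, rng):
    """SIS estimate of the number of perfect matchings of a bipartite
    0/1 matrix `adj` (rows x columns). A trial's weight is the product
    of the number of free compatible rows at each column (0 if stuck);
    the estimator is the average weight over trials."""
    n = len(adj)
    total = 0
    for _ in range(trials):
        free, w = set(range(n)), 1
        for j in range(n):
            choices = [i for i in free if adj[i][j]]
            if not choices:
                w = 0
                break
            free.remove(rng.choice(choices))
            w *= len(choices)
        total += w
    return total / trials
```

On the complete bipartite graph $K_{3,3}$ every trial succeeds with weight $3\cdot 2\cdot 1 = 6 = 3!$, so the estimator is exact there regardless of the randomness.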
The maximum matching problem in dynamic graphs subject to edge updates (insertions and deletions) has received much attention over the last few years; a multitude of approximation/time tradeoffs were obtained, improving upon the folklore algorithm, which maintains a maximal (and hence $2$-approximate) matching in $O(n)$ worst-case update time in $n$-node graphs. We present the first deterministic algorithm which outperforms the folklore algorithm in terms of {\em both} approximation ratio and worst-case update time. Specifically, we give a $(2-\Omega(1))$-approximate algorithm with $O(m^{3/8})=O(n^{3/4})$ worst-case update time in $n$-node, $m$-edge graphs. For sufficiently small constant $\epsilon>0$, no deterministic $(2+\epsilon)$-approximate algorithm with worst-case update time $O(n^{0.99})$ was known. Our second result is the first deterministic $(2+\epsilon)$-approximate weighted matching algorithm with $O_\epsilon(1)\cdot O(\sqrt[4]{m}) = O_\epsilon(1)\cdot O(\sqrt{n})$ worst-case update time. Our main technical contributions are threefold: first, we characterize the tight cases for \emph{kernels}, which are the well-studied matching sparsifiers underlying much of the $(2+\epsilon)$-approximate dynamic matching literature. This characterization, together with multiple ideas -- old and new -- underlies our result for breaking the approximation barrier of $2$. Our second technical contribution is the first example of a dynamic matching algorithm whose running time is improved due to improving the \emph{recourse} of other dynamic matching algorithms. Finally, we show how to use dynamic bipartite matching algorithms as black-box subroutines for dynamic matching in general graphs without incurring the natural $\frac{3}{2}$ factor in the approximation ratio which such approaches naturally incur.
The rich literature on online Bayesian selection problems has long focused on so-called prophet inequalities, which compare the gain of an online algorithm to that of a "prophet" who knows the future. An equally-natural, though significantly less well-studied benchmark is the optimum online algorithm, which may be omnipotent (i.e., computationally-unbounded), but not omniscient. What is the computational complexity of the optimum online? How well can a polynomial-time algorithm approximate it? We study the above questions for the online stochastic maximum-weight matching problem under vertex arrivals. For this problem, a number of $1/2$-competitive algorithms are known against optimum offline. This is the best possible ratio for this problem, as it generalizes the original single-item prophet inequality problem. We present a polynomial-time algorithm which approximates the optimal online algorithm within a factor of $0.51$ -- beating the best-possible prophet inequality. In contrast, we show that it is PSPACE-hard to approximate this problem within some constant $\alpha < 1$.
We study random digraphs on sequences of expanders with bounded average degree which converge locally in probability. We prove that the threshold for the existence of a giant strongly connected component, as well as the asymptotic fraction of nodes with giant fan-in or nodes with giant fan-out are local, in the sense that they are the same for two sequences with the same local limit. The digraph has a bow-tie structure, with all but a vanishing fraction of nodes lying either in the unique strongly connected giant and its fan-in and fan-out, or in sets with small fan-in and small fan-out. All local quantities are expressed in terms of percolation on the limiting rooted graph, without any structural assumptions on the limit, allowing, in particular, for non-tree-like graphs. In the course of establishing these results, we generalize previous results on the locality of the size of the giant to expanders of bounded average degree with possibly non-tree-like limits. We also show that regardless of the local convergence of a sequence, the uniqueness of the giant and convergence of its relative size for unoriented percolation imply the bow-tie structure for directed percolation. An application of our methods shows that the critical threshold for bond percolation and random digraphs on preferential attachment graphs is $p_c=0$, with an infinite order phase transition at $p_c$.
Nearly three decades ago, Bar-Noy, Motwani and Naor showed that no online edge-coloring algorithm can edge color a graph optimally. Indeed, their work, titled "the greedy algorithm is optimal for on-line edge coloring", shows that the competitive ratio of $2$ of the na\"ive greedy algorithm is best possible online. However, their lower bound required bounded-degree graphs, of maximum degree $\Delta = O(\log n)$, which prompted them to conjecture that better bounds are possible for higher-degree graphs. While progress has been made towards resolving this conjecture for restricted inputs and arrivals or for random arrival orders, an answer for fully general \emph{adversarial} arrivals remained elusive. We resolve this thirty-year-old conjecture in the affirmative, presenting a $(1.9+o(1))$-competitive online edge coloring algorithm for general graphs of degree $\Delta = \omega(\log n)$ under vertex arrivals. At the core of our results, and of possible independent interest, is a new online algorithm which rounds a fractional bipartite matching $x$ online under vertex arrivals, guaranteeing that each edge $e$ is matched with probability $(1/2+c)\cdot x_e$, for a constant $c>0.027$.
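For context, the naive greedy algorithm whose competitive ratio of $2$ the result above beats can be sketched in a few lines (an edge-arrival presentation for simplicity; the paper works under vertex arrivals): color each arriving edge with the smallest color unused at either endpoint, which never needs more than $2\Delta - 1$ colors.

```python
def greedy_edge_coloring(edges):
    """Color each arriving edge with the smallest color not yet used at
    either endpoint; with maximum degree Delta this uses at most
    2*Delta - 1 colors, versus the offline optimum of about Delta."""
    incident = {}   # vertex -> set of colors already used at that vertex
    coloring = {}
    for u, v in edges:
        cu = incident.setdefault(u, set())
        cv = incident.setdefault(v, set())
        c = 0
        while c in cu or c in cv:
            c += 1
        coloring[(u, v)] = c
        cu.add(c)
        cv.add(c)
    return coloring
```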
We study a simple model of epidemics where an infected node transmits the infection to its neighbors independently with probability $p$. This is also known as the independent cascade or Susceptible-Infected-Recovered (SIR) model with fixed recovery time. The size of an outbreak in this model is closely related to that of the giant connected component in ``edge percolation'', where each edge of the graph is kept independently with probability $p$, studied for a large class of networks including the configuration model \cite{molloy2011critical} and preferential attachment \cite{bollobas2003,Riordan2005}. Even though these models capture the effects of degree inhomogeneity and the role of super-spreaders in the spread of an epidemic, they only consider graphs that are locally tree-like, i.e., have few or no short cycles. Some generalizations of the configuration model were suggested to capture local communities, known as household models \cite{ball2009threshold}, or the hierarchical configuration model \cite{Hofstad2015hierarchical}. Here, we ask a different question: what information is needed for general networks to predict the size of an outbreak? Is it possible to make predictions by accessing the distribution of small subgraphs (or motifs)? We answer the question in the affirmative for large-set expanders with local weak limits (also known as Benjamini-Schramm limits). In particular, we show that there is an algorithm which gives a $(1-\epsilon)$ approximation of the probability and the final size of an outbreak by accessing a constant-size neighborhood of a constant number of nodes chosen uniformly at random. We also present corollaries of the theorem for the preferential attachment model, and study generalizations with household (or motif) structure. The latter was only known for the configuration model.
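A minimal simulation of the independent cascade / fixed-recovery-time SIR model described above (names are ours): each newly infected node gets a single chance to infect each susceptible neighbor, independently with probability $p$.

```python
import random

def independent_cascade(adj, seeds, p, rng):
    """Simulate one SIR outbreak with fixed recovery time: every newly
    infected node infects each susceptible neighbor independently with
    probability p, then recovers. Returns the set of ever-infected nodes."""
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        new = []
        for u in frontier:
            for v in adj[u]:
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    new.append(v)
        frontier = new
    return infected
```

With $p=1$ the outbreak is the connected component of the seed set; with $p=0$ it is the seed set itself, matching the edge-percolation picture.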
We present a polynomial-time randomized algorithm for estimating the permanent of an arbitrary n × n matrix with nonnegative entries. This algorithm---technically a "fully-polynomial randomized approximation scheme"---computes an approximation that is, with high probability, within arbitrarily small specified relative error of the true value of the permanent.
We consider the online stochastic matching problem proposed by Feldman et al. [Feldman J, Mehta A, Mirrokni VS, Muthukrishnan S (2009) Online stochastic matching: Beating 1 − 1/e. Annual IEEE Sympos. Foundations Comput. Sci. 117–126] as a model of display ad allocation. We are given a bipartite graph; one side of the graph corresponds to a fixed set of bins, and the other side represents the set of possible ball types. At each time step, a ball is sampled independently from the given distribution and it needs to be matched upon its arrival to an empty bin. The goal is to maximize the number of allocations. We present an online algorithm for this problem with a competitive ratio of 0.702. Before our result, algorithms with a competitive ratio better than 1 − 1/e were known under the assumption that the expected number of arriving balls of each type is integral. A key idea of the algorithm is to collect statistics about the decisions of the optimum offline solution using Monte Carlo sampling and use those statistics to guide the decisions of the online algorithm. We also show that our algorithm achieves a competitive ratio of 0.705 when the rates are integral. On the hardness side, we prove that no online algorithm can have a competitive ratio better than 0.823 under the known distribution model (and hence under the permutation model). This improves upon the 5/6 hardness result proved by Goel and Mehta [Goel G, Mehta A (2008) Online budgeted matching in random input models with applications to adwords. ACM-SIAM Symposium Discrete Algorithms 982–991] for the permutation model.
Summary: This is a survey of results on properties of random regular graphs, together with an exposition of some of the main methods of obtaining these results. Related results on asymptotic enumeration are also presented, as well as various generalisations to random graphs with given degree sequence. A major feature in this area is the small subgraph conditioning method. When applicable, this establishes a relationship between random regular graphs with uniform distribution, and non-uniform models of random regular graphs in which the probability of a graph G is weighted according to the number of subgraphs that G has of a certain type. Information can be obtained in this way on the probability of existence of various types of spanning subgraphs, such as Hamilton cycles and decompositions into perfect matchings. Uniformly distributed labelled random regular graphs receive most of the attention, but also included are several non-uniform models which come about in a natural way. Some of these appear as spin-offs from the small subgraph conditioning method, and some arise from algorithms which use simple approaches to generating random regular graphs. A quite separate role played by algorithms is in the derivation of random graph properties by analysing the performance of an appropriate greedy algorithm on a random regular graph. Many open problems and conjectures are given.
We describe a sequential importance sampling (SIS) procedure for analyzing two-way zero–one or contingency tables with fixed marginal sums. An essential feature of the new method is that it samples the columns of the table progressively according to certain special distributions. Our method produces Monte Carlo samples that are remarkably close to the uniform distribution, enabling one to approximate closely the null distributions of various test statistics about these tables. Our method compares favorably with other existing Monte Carlo-based algorithms, and sometimes is a few orders of magnitude more efficient. In particular, compared with Markov chain Monte Carlo (MCMC)-based approaches, our importance sampling method not only is more efficient in terms of absolute running time and frees one from pondering over the mixing issue, but also provides an easy and accurate estimate of the total number of tables with fixed marginal sums, which is far more difficult for an MCMC method to achieve.
We present a general framework for approximately reducing the mechanism design problem for multiple agents to single agent subproblems in the context of Bayesian combinatorial auctions. Our framework can be applied to any setting which roughly satisfies the following assumptions: (i) agents' types are distributed independently (not necessarily identically), (ii) objective function is additively separable over the agents, and (iii) there are no interagent constraints except for the supply constraints (i.e., that the total allocation of each item should not exceed the supply). Our framework is general in the sense that it makes no direct assumption about agents' valuations, type distributions, or single agent constraints (e.g., budget, incentive compatibility, etc.). We present two generic multiagent mechanisms which use single agent mechanisms as black boxes. If an $\alpha$-approximate single agent mechanism is available for each agent, and assuming no agent ever demands more than $\frac{1}{k}$ of all units of each item, our generic multiagent mechanisms are $\gamma_{k}\alpha$-approximations of the optimal multiagent mechanism, where $\gamma_{k}$ is a constant which is at least $1-\frac{1}{\sqrt{k+3}}$. As a byproduct of our construction, we present a generalization of prophet inequalities where both gambler and prophet are allowed to pick $k$ numbers each to receive a reward equal to their sum. Finally, we use our framework to obtain multiagent mechanisms with improved approximation factor for several settings from the literature.
Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
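The two ingredients, growth and preferential attachment, translate directly into a generator. The following is a standard sketch (not tied to any particular paper's implementation), using the repeated-endpoints trick: a list holding every vertex once per incident edge makes a uniform draw from it a degree-proportional draw.

```python
import random

def preferential_attachment(n, m, rng):
    """Grow a graph by the two mechanisms above: vertices are added one
    at a time, and each new vertex attaches m edges to distinct existing
    vertices chosen with probability proportional to their degree."""
    edges = []
    endpoints = list(range(m))  # m seed vertices, drawn uniformly at first
    for v in range(m, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for u in targets:
            edges.append((v, u))
            endpoints.extend((v, u))  # both endpoints gain one degree
    return edges
```

Degree-proportional attachment is what produces the heavy-tailed, power-law-like degree sequence described in the abstract.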
We present a practical algorithm for generating random regular graphs. For all $d$ growing as a small power of $n$, the $d$-regular graphs on $n$ vertices are generated approximately uniformly at random, in the sense that all $d$-regular graphs on $n$ vertices have in the limit the same probability as $n \to \infty$. The expected runtime for these $d$'s is $O(nd^2)$.
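For comparison, the simplest generator is the pairing (configuration) model with rejection, sketched below (function name is ours): pair all $nd$ half-edges uniformly at random and restart on any loop or repeated edge. Rejection becomes hopeless as $d$ grows, which is the regime the algorithm above addresses.

```python
import random

def random_regular_graph(n, d, rng):
    """Pairing model with rejection: shuffle n*d half-edge 'stubs',
    pair them consecutively, and restart whenever the pairing has a
    loop or a repeated edge. For small d this succeeds after O(1)
    attempts in expectation and the output is uniform over simple
    d-regular graphs."""
    assert (n * d) % 2 == 0
    while True:
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        edges = [tuple(sorted(stubs[i:i + 2])) for i in range(0, n * d, 2)]
        if all(u != v for u, v in edges) and len(set(edges)) == len(edges):
            return edges
```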
Importance sampling has been reported to produce algorithms with excellent empirical performance in counting problems. However, the theoretical support for its efficiency in these applications has been very limited. In this paper, we propose a methodology that can be used to design efficient importance sampling algorithms for counting and test their efficiency rigorously. We apply our techniques after transforming the problem into a rare-event simulation problem, thereby connecting complexity analysis of counting problems with efficiency in the context of rare-event simulation. As an illustration of our approach, we consider the problem of counting the number of binary tables with fixed column and row sums, $c_j$'s and $r_i$'s, respectively, and total marginal sum $d=\sum_j c_j$. Assuming that $\max_j c_j = o(d^{1/2})$, $\sum_j c_j^2 = O(d)$ and the $r_i$'s are bounded, we show that a suitable importance sampling algorithm, proposed by Chen et al. [J. Amer. Statist. Assoc. 100 (2005) 109–120], requires $O(d^3 \epsilon^{-2} \delta^{-1})$ operations to produce an estimate that has $\epsilon$-relative error with probability $1-\delta$. In addition, if $\max_j c_j = o(d^{1/4-\delta_0})$ for some $\delta_0 > 0$, the same coverage can be guaranteed with $O(d^3 \epsilon^{-2} \log(\delta^{-1}))$ operations.
Let $X_i \geq 0$ be independent, $i = 1, \cdots, n$, and $X^\ast_n = \max(X_1, \cdots, X_n)$. Let $t(c)$ (resp. $s(c)$) be the threshold stopping rule for $X_1, \cdots, X_n$, defined by $t(c) = $ smallest $i$ for which $X_i \geq c$ (resp. $s(c) = $ smallest $i$ for which $X_i > c$), $= n$ otherwise. Let $m$ be a median of the distribution of $X^\ast_n$. It is shown that for every $n$ and $\underline{X}$ either $EX^\ast_n \leq 2EX_{t(m)}$ or $EX^\ast_n \leq 2EX_{s(m)}$. This improves previously known results, [1], [4]. Some results for i.i.d. $X_i$ are also included.
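The guarantee can be checked numerically. The sketch below (function name and parameters are ours) simulates i.i.d. Uniform(0,1) values, for which the median $m$ of $X^\ast_n$ solves $m^n = 1/2$, i.e. $m = (1/2)^{1/n}$, and compares $E X^\ast_n$ with twice the reward of the threshold rule $t(m)$.

```python
import random

def median_threshold_experiment(n=5, trials=20000, seed=7):
    """Monte Carlo check of E[X*_n] <= 2 E[X_{t(m)}] for i.i.d.
    Uniform(0,1): t(m) stops at the first value >= m, taking the last
    value if none qualifies. Returns (E[X*_n], E[X_{t(m)}]) estimates."""
    rng = random.Random(seed)
    m = 0.5 ** (1.0 / n)
    sum_max = sum_stop = 0.0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        sum_max += max(xs)
        sum_stop += next((x for x in xs if x >= m), xs[-1])
    return sum_max / trials, sum_stop / trials
```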
Given a sequence of nonnegative real numbers $\lambda_0, \lambda_1, \ldots$ that sum to 1, we consider a random graph having approximately $\lambda_i n$ vertices of degree $i$. In [12] the authors essentially show that if $\sum_i i(i-2)\lambda_i > 0$ then the graph a.s. has a giant component, while if $\sum_i i(i-2)\lambda_i < 0$ then a.s. all components in the graph are small. In this paper we analyse the size of the giant component in the former case, and the structure of the graph formed by deleting that component. We determine $\epsilon$, $\lambda'_0, \lambda'_1, \ldots$ such that a.s. the giant component, $C$, has $\epsilon n + o(n)$ vertices, and the structure of the graph remaining after deleting $C$ is basically that of a random graph with $n' = n - |C|$ vertices, and with $\lambda'_i n'$ of them of degree $i$.
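The criterion from [12] quoted above is a one-liner to evaluate (hypothetical function name):

```python
def molloy_reed_Q(lam):
    """Q = sum_i i*(i-2)*lambda_i for a degree distribution lam, where
    lam[i] is the fraction of degree-i vertices: Q > 0 predicts a giant
    component, Q < 0 predicts that a.s. all components are small."""
    return sum(i * (i - 2) * l for i, l in enumerate(lam))
```

For instance, a 3-regular degree sequence gives $Q = 3 > 0$ (giant component), a perfect matching (all degrees 1) gives $Q = -1 < 0$, and a union of cycles (all degrees 2) sits at the critical value $Q = 0$.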
William Aiello (AT&T Labs, Florham Park, New Jersey), Fan Chung (University of California, San Diego), Linyuan Lu (University of California, San Diego). A random graph model for massive graphs. STOC '00: Proceedings of the thirty-second annual ACM symposium on Theory of Computing, May 2000, pages 171–180. https://doi.org/10.1145/335305.335326
We propose a random graph model which is a special case of sparse random graphs with given degree sequences which satisfy a power law. This model involves only a small number of parameters, called logsize and log-log growth rate. These parameters capture some universal characteristics of massive graphs. From these parameters, various properties of the graph can be derived. For example, for certain ranges of the parameters, we will compute the expected distribution of the sizes of the connected components which occur with high probability. We illustrate the consistency of our model with the behavior of some massive graphs derived from data in telecommunications. We also discuss the threshold function, the giant component, and the evolution of random graphs in this model.
The rich literature on online Bayesian selection problems has long focused on so-called prophet inequalities, which compare the gain of an online algorithm to that of a "prophet" who knows the future. An equally natural, though significantly less well-studied benchmark is the optimum online algorithm, which may be omnipotent (i.e., computationally unbounded), but not omniscient. What is the computational complexity of the optimum online algorithm? How well can a polynomial-time algorithm approximate it?
We present algorithms for solving symmetric, diagonally-dominant linear systems to accuracy $\epsilon$ in time linear in their number of non-zeros and $\log(\kappa_f(A)/\epsilon)$, where $\kappa_f(A)$ is the condition number of the matrix defining the linear system. Our algorithm applies the preconditioned Chebyshev iteration with preconditioners designed using nearly-linear time algorithms for graph sparsification and graph partitioning.
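The preconditioned Chebyshev machinery of the paper is involved; as a minimal, unpreconditioned sketch of the kind of system being solved, here is plain conjugate gradient applied to the SDD matrix $A = L + I$, with $L$ the Laplacian of a path (a toy choice for illustration, not the paper's method):

```python
def matvec(x):
    """y = A x for the SDD matrix A = L + I, L the Laplacian of a path."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        deg = (i > 0) + (i < n - 1)          # path degree: 1 at ends, else 2
        y[i] = (deg + 1.0) * x[i]            # +1.0 from the identity shift
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y

def cg(b, tol=1e-10, maxit=500):
    """Plain conjugate gradient for A x = b, A symmetric positive definite."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                                  # residual b - A x for x = 0
    p = r[:]
    rs = sum(ri * ri for ri in r)
    for _ in range(maxit):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol * tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

b = [1.0] * 20
x = cg(b)
resid = max(abs(bi - yi) for bi, yi in zip(b, matvec(x)))
```

Since the path Laplacian has eigenvalues in $[0, 4]$, the shifted matrix has condition number at most 5, and CG converges in a handful of iterations; the point of the paper's preconditioners is to obtain comparable behavior for arbitrary SDD systems.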
Given a sequence of nonnegative real numbers $\lambda_0, \lambda_1, \ldots$ which sum to 1, we consider random graphs having approximately $\lambda_i n$ vertices of degree $i$. Essentially, we show that if $\sum_i i(i-2)\lambda_i > 0$, then such graphs almost surely have a giant component, while if $\sum_i i(i-2)\lambda_i < 0$, then almost surely all components in such graphs are small. We can apply these results to $G_{n,p}$, $G_{n,M}$, and other well-known models of random graphs. There are also applications related to the chromatic number of sparse random graphs.
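As a worked check of the criterion: for a Poisson($c$) degree sequence, $\sum_i i(i-2)\lambda_i = E[D^2] - 2E[D] = (c + c^2) - 2c = c^2 - c$, which is positive exactly when $c > 1$, recovering the classical giant-component threshold for mean degree. A numerical sketch (truncating the sum at a large index):

```python
import math

def molloy_reed_Q(lam, imax):
    """Q = sum_i i*(i-2)*lambda_i, truncated at i = imax."""
    return sum(i * (i - 2) * lam(i) for i in range(imax + 1))

def poisson(c):
    """Poisson(c) degree distribution lambda_i = e^{-c} c^i / i!."""
    return lambda i: math.exp(-c) * c**i / math.factorial(i)

q_sub = molloy_reed_Q(poisson(0.5), 60)    # mean degree below 1: Q = -0.25
q_super = molloy_reed_Q(poisson(2.0), 60)  # mean degree above 1: Q = 2.0
```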
We consider the problem of efficient operation of a barter exchange platform for indivisible goods. We introduce a dynamic model of barter exchange where in each period one agent arrives with a single item she wants to exchange for a different item. We study a homogeneous and stochastic environment: an agent is interested in the item possessed by another agent with probability $p$, independently for all pairs of agents. We consider two settings with respect to the types of allowed exchanges: a) only two-way cycles, in which two agents swap their items; b) two- or three-way cycles. The goal of the platform is to minimize the average waiting time of an agent. Somewhat surprisingly, we find that in each of these settings, a policy that conducts exchanges in a greedy fashion is near optimal, among a large class of policies that includes batching policies. Further, we find that for small $p$, allowing three-way cycles can greatly improve the waiting time over the two-cycles-only setting. Specifically, we find that a greedy policy achieves an average waiting time of $\Theta(1/p^2)$ in setting a), and $\Theta(1/p^{3/2})$ in setting b). Thus, a platform can achieve the smallest waiting times by using a greedy policy, and by facilitating three-way cycles, if possible. Our findings are consistent with empirical and computational observations which compare batching policies in the context of kidney exchange programs. Keywords: barter; random graphs; dynamics; kidney exchange.
We study the online stochastic bipartite matching problem, in a form motivated by display ad allocation on the Internet. In the online, but adversarial case, the celebrated result of Karp, Vazirani and Vazirani gives an approximation ratio of $1-1/e$. In the online, stochastic case when nodes are drawn repeatedly from a known distribution, the greedy algorithm matches this approximation ratio, but still, no algorithm is known that beats the $1-1/e$ bound. Our main result is a 0.67-approximation online algorithm for stochastic bipartite matching, breaking this $1-1/e$ barrier. Furthermore, we show that no online algorithm can produce a $1-\epsilon$ approximation for an arbitrarily small $\epsilon$ for this problem. We employ a novel application of the idea of the power of two choices from load balancing: we compute two disjoint solutions to the expected instance, and use both of them in the online algorithm in a prescribed preference order. To identify these two disjoint solutions, we solve a max flow problem in a boosted flow graph, and then carefully decompose this maximum flow to two edge-disjoint (near-)matchings. These two offline solutions are used to characterize an upper bound for the optimum in any scenario. This is done by identifying a cut whose value we can bound under the arrival distribution.
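The online rule built from the two offline solutions can be sketched on a hypothetical toy instance (the type names, matchings, and capacities below are invented for illustration; in the paper the two edge-disjoint matchings come from the boosted max-flow computation):

```python
import random

def serve_online(arrivals, M1, M2, capacity):
    """Assign each arriving type to its first-choice ad (from matching M1)
    if that ad still has capacity, else to its second choice (M2), else
    drop it. Returns the number of matched arrivals."""
    remaining = dict(capacity)               # work on a copy of the capacities
    matched = 0
    for t in arrivals:
        for ad in (M1.get(t), M2.get(t)):    # prescribed preference order
            if ad is not None and remaining.get(ad, 0) > 0:
                remaining[ad] -= 1
                matched += 1
                break
    return matched

# Hypothetical expected instance: 3 impression types, 3 ads, unit capacities.
M1 = {"t1": "a1", "t2": "a2", "t3": "a3"}    # preferred (near-)matching
M2 = {"t1": "a2", "t2": "a3", "t3": "a1"}    # edge-disjoint fallback matching
cap = {"a1": 1, "a2": 1, "a3": 1}
rng = random.Random(7)
arrivals = [rng.choice(["t1", "t2", "t3"]) for _ in range(3)]
matched = serve_online(arrivals, M1, M2, cap)
```

The fallback matching is what lets the algorithm hedge against repeated arrivals of the same type, which is where plain greedy loses against the $1-1/e$ bound.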
Upper and lower bounds are given for $P(S \leq k)$, $0 \leq k \leq ES$, where $S$ is a sum of indicator variables with a special structure, which appears, for example, in subgraph counts in random graphs. In typical cases, these bounds are close to the corresponding probabilities for a Poisson distribution with the same mean as $S$. There are no corresponding general bounds for $P(S \geq k)$, $k > ES$, but some partial results are given.
In the Bayesian online selection problem, the goal is to find a pricing algorithm for serving a sequence of arriving buyers that maximizes the expected social-welfare (or revenue) subject to different types of structural constraints. The focus of this paper is on the case where the allowable subsets of served customers are characterized by a laminar matroid with constant depth. This problem is a special case of the well-known matroid Bayesian online selection problem studied in [Kleinberg & Weinberg, 2012], when the underlying matroid is laminar. We give the first Polynomial-Time Approximation Scheme (PTAS) for the above problem. Our approach is based on rounding the solution of a hierarchy of linear programming relaxations that can approximate the optimum online solution with any degree of accuracy as well as a concentration argument that shows our rounding does not have a considerable loss in the expected social welfare.
We discuss the following problem: given a random sample $\mathbf{X} = (X_1, X_2, \cdots, X_n)$ from an unknown probability distribution $F$, estimate the sampling distribution of some prespecified random variable $R(\mathbf{X}, F)$, on the basis of the observed data $\mathbf{x}$. (Standard jackknife theory gives an approximate mean and variance in the case $R(\mathbf{X}, F) = \theta(\hat{F}) - \theta(F), \theta$ some parameter of interest.) A general method, called the "bootstrap," is introduced, and shown to work satisfactorily on a variety of estimation problems. The jackknife is shown to be a linear approximation method for the bootstrap. The exposition proceeds by a series of examples: variance of the sample median, error rates in a linear discriminant analysis, ratio estimation, estimating regression parameters, etc.
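A minimal sketch of the bootstrap for the first of the listed examples, the variability (here, standard error) of the sample median; the data values are invented for illustration:

```python
import random
import statistics

def bootstrap_se(data, stat, B=2000, seed=0):
    """Bootstrap estimate of the standard error of stat(data): resample the
    data with replacement B times and take the standard deviation of the
    replicated statistics."""
    rng = random.Random(seed)
    n = len(data)
    reps = [stat([rng.choice(data) for _ in range(n)]) for _ in range(B)]
    return statistics.stdev(reps)

data = [2.1, 3.4, 1.9, 5.6, 4.4, 3.3, 2.8, 4.9, 3.7, 2.5]
se_median = bootstrap_se(data, statistics.median)
```

The same function works for any plug-in statistic, which is the generality the paper emphasizes; the jackknife corresponds to a linear approximation of this resampling scheme.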
Suppose that a process begins with $n$ isolated vertices, to which edges are added randomly one by one so that the maximum degree of the induced graph is always bounded above by $d$. We prove that if $n \to \infty$ with $d$ fixed, then with probability tending to 1, the final result of this process is a graph with $\lfloor nd/2 \rfloor$ edges.
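The process is easy to simulate directly. Note that in any terminal state the unsaturated vertices must form a clique, hence number at most $d$, so the final edge count is deterministically at least $(nd - d^2)/2$; the theorem says it is in fact $\lfloor nd/2 \rfloor$ with high probability. A naive sketch:

```python
import random

def bounded_degree_process(n=100, d=4, seed=3):
    """Add uniformly random allowed edges (no repeated edges, all degrees
    kept at most d) until no allowed edge remains; return the number of
    edges in the final graph."""
    rng = random.Random(seed)
    deg = [0] * n
    edges = set()
    while True:
        # Recompute the allowed edges from scratch (O(n^2) per step; fine
        # for this small illustration).
        allowed = [(i, j) for i in range(n) for j in range(i + 1, n)
                   if deg[i] < d and deg[j] < d and (i, j) not in edges]
        if not allowed:
            return len(edges)
        i, j = rng.choice(allowed)
        edges.add((i, j))
        deg[i] += 1
        deg[j] += 1

m = bounded_degree_process()
```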
We consider the problem of maximizing a nonnegative submodular set function $f:2^N \rightarrow {\mathbb R}_+$ over a ground set $N$ subject to a variety of packing-type constraints including (multiple) matroid constraints, knapsack constraints, and their intersections. In this paper we develop a general framework that allows us to derive a number of new results, in particular, when $f$ may be a nonmonotone function. Our algorithms are based on (approximately) maximizing the multilinear extension $F$ of $f$ over a polytope $P$ that represents the constraints, and then effectively rounding the fractional solution. Although this approach has been used quite successfully, it has been limited in some important ways. We overcome these limitations as follows. First, we give constant factor approximation algorithms to maximize $F$ over a downward-closed polytope $P$ described by an efficient separation oracle. Previously this was known only for monotone functions. For nonmonotone functions, a constant factor was known only when the polytope was either the intersection of a fixed number of knapsack constraints or a matroid polytope. Second, we show that contention resolution schemes are an effective way to round a fractional solution, even when $f$ is nonmonotone. In particular, contention resolution schemes for different polytopes can be combined to handle the intersection of different constraints. Via linear programming duality we show that a contention resolution scheme for a constraint is related to the correlation gap of weighted rank functions of the constraint. This leads to an optimal contention resolution scheme for the matroid polytope. Our results provide a broadly applicable framework for maximizing linear and submodular functions subject to independence constraints. We give several illustrative examples. Contention resolution schemes may find other applications.
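The multilinear extension $F(x) = \mathbb{E}[f(R)]$, with $R$ including each element $e$ independently with probability $x_e$, is easy to evaluate by exact enumeration on tiny ground sets and to estimate by sampling; the toy coverage function below is invented for illustration:

```python
import random
from itertools import combinations

GROUND = ["a", "b", "c"]

def f(S):
    """Toy coverage-style submodular function: number of distinct regions
    covered by the chosen elements."""
    regions = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}}
    covered = set()
    for e in S:
        covered |= regions[e]
    return len(covered)

def F_exact(x):
    """Exact multilinear extension by enumerating all subsets."""
    total = 0.0
    for k in range(len(GROUND) + 1):
        for S in combinations(GROUND, k):
            p = 1.0
            for e in GROUND:
                p *= x[e] if e in S else 1 - x[e]
            total += p * f(S)
    return total

def F_mc(x, samples=20000, seed=5):
    """Monte Carlo estimate of the multilinear extension."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        S = [e for e in GROUND if rng.random() < x[e]]
        acc += f(S)
    return acc / samples

x = {"a": 0.5, "b": 0.5, "c": 0.5}
exact, approx = F_exact(x), F_mc(x)
```

At $x = (1/2, 1/2, 1/2)$, $F(x)$ is the average of $f$ over all eight subsets, which is $20/8 = 2.5$ for this toy function; sampling is what makes the approach practical when the ground set is large.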
Random variables, $X_1, \cdots, X_k$ are said to be negatively associated (NA) if for every pair of disjoint subsets $A_1, A_2$ of $\{1, 2, \cdots, k\}, \operatorname{Cov}\lbrack f(X_i, i \in A_1), g(X_j, j \in A_2) \rbrack \leq 0$, for all nondecreasing functions $f, g$. The basic properties of negative association are derived. Especially useful is the property that nondecreasing functions of mutually exclusive subsets of NA random variables are NA. This property is shown not to hold for several other types of negative dependence recently proposed. One consequence is the inequality $P(X_i \leq x_i, i = 1, \cdots, k) \leq \prod^k_1P(X_i \leq x_i)$ for NA random variables $X_1, \cdots, X_k$, and the dual inequality resulting from reversing the inequalities inside the square brackets. In another application it is shown that negatively correlated normal random variables are NA. Other NA distributions are the (a) multinomial, (b) convolution of unlike multinomials, (c) multivariate hypergeometric, (d) Dirichlet, and (e) Dirichlet compound multinomial. Negative association is shown to arise in situations where the probability measure is permutation invariant. Applications of this are considered for sampling without replacement as well as for certain multiple ranking and selection procedures. In a somewhat striking example, NA and positive association representing quite strong opposing types of dependence, are shown to exist side by side in models of categorical data analysis.
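The product inequality $P(X_i \leq x_i, i = 1, \cdots, k) \leq \prod P(X_i \leq x_i)$ can be verified exactly for a small instance of case (a), the multinomial; the parameters below are chosen for illustration:

```python
from math import comb

def multinomial_joint_vs_product(n=20, k=8):
    """For (X1, X2, X3) ~ Multinomial(n; 1/3, 1/3, 1/3), compare the joint
    probability P(X1<=k, X2<=k, X3<=k) with the product of the marginal
    Binomial(n, 1/3) probabilities P(Xi<=k)^3, by exact enumeration."""
    p_outcome = (1.0 / 3) ** n
    joint = 0.0
    for x1 in range(n + 1):
        for x2 in range(n + 1 - x1):
            x3 = n - x1 - x2
            if x1 <= k and x2 <= k and x3 <= k:
                # multinomial coefficient n! / (x1! x2! x3!)
                joint += comb(n, x1) * comb(n - x1, x2) * p_outcome
    marg = sum(comb(n, i) * (1 / 3) ** i * (2 / 3) ** (n - i)
               for i in range(k + 1))
    return joint, marg ** 3

joint, product = multinomial_joint_vs_product()
```

The fixed total $n$ forces the counts downward jointly, which is exactly the negative dependence the NA framework captures.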
Suppose that $G_j$ is a sequence of finite connected planar graphs, and in each $G_j$ a special vertex, called the root, is chosen randomly-uniformly. We introduce the notion of a distributional limit $G$ of such graphs. Assume that the vertex degrees of the vertices in $G_j$ are bounded, and the bound does not depend on $j$. Then after passing to a subsequence, the limit exists, and is a random rooted graph $G$. We prove that with probability one $G$ is recurrent. The proof involves the Circle Packing Theorem. The motivation for this work comes from the theory of random spherical triangulations.
Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.
Random graph theory is used to examine the "small-world phenomenon": any two strangers are connected through a short chain of mutual acquaintances. We will show that for certain families of random graphs with given expected degrees, the average distance is almost surely of order $\log n/\log \tilde{d}$, where $\tilde{d}$ is the weighted average of the sum of squares of the expected degrees. Of particular interest are power law random graphs in which the number of vertices of degree $k$ is proportional to $1/k^\beta$ for some fixed exponent $\beta$. For the case of $\beta > 3$, we prove that the average distance of the power law graphs is almost surely of order $\log n/\log \tilde{d}$. However, many Internet, social, and citation networks are power law graphs with exponents in the range $2 < \beta < 3$, for which the power law random graphs have average distance almost surely of order $\log\log n$, but have diameter of order $\log n$ (provided some mild constraints on the average distance and maximum degree hold). In particular, these graphs contain a dense subgraph, which we call the core, having $n^{c/\log\log n}$ vertices. Almost all vertices are within distance $\log\log n$ of the core although there are vertices at distance $\log n$ from the core.
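An illustrative simulation: for $G(n,p)$ with uniform expected degrees, the second-order average degree $\tilde{d}$ reduces to $np$ (an assumption of this demo, not the general setting of the theorem), so the empirical average distance on the largest component should track $\log n/\log(np)$.

```python
import math
import random
from collections import deque

def avg_distance_gnp(n=400, p=0.025, seed=11):
    """Average shortest-path distance over pairs in the largest component
    of a sampled G(n, p), via BFS from every vertex of that component."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    # Find the largest connected component.
    seen, comps = [False] * n, []
    for s in range(n):
        if not seen[s]:
            comp, q = [], deque([s])
            seen[s] = True
            while q:
                u = q.popleft()
                comp.append(u)
                for v in adj[u]:
                    if not seen[v]:
                        seen[v] = True
                        q.append(v)
            comps.append(comp)
    comp = max(comps, key=len)
    # BFS from each vertex of the component, averaging pairwise distances.
    total, pairs = 0, 0
    for s in comp:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

d_avg = avg_distance_gnp()
prediction = math.log(400) / math.log(400 * 0.025)  # log n / log(np)
```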
Let $\mathcal{Q}$ be a monotone decreasing property of graphs $G$ on $n$ vertices. Erdős, Suen and Winkler [5] introduced the following natural way of choosing a random maximal graph in $\mathcal{Q}$: start with $G$ the empty graph on $n$ vertices. Add edges to $G$ one at a time, each time choosing uniformly from all $e\in G^c$ such that $G+e\in \mathcal{Q}$. Stop when there are no such edges, so the graph $G_\infty$ reached is maximal in $\mathcal{Q}$. Erdős, Suen and Winkler asked how many edges the resulting graph typically has, giving good bounds for $\mathcal{Q}=\{$bipartite graphs$\}$ and $\mathcal{Q}=\{$triangle free graphs$\}$. We answer this question for $C_4$-free graphs and for $K_4$-free graphs, by considering a related question about standard random graphs $G_p\in \mathcal{G}(n,p)$. The main technique we use is the 'step by step' approach of [3]. We wish to show that $G_p$ has a certain property with high probability. For example, for $K_4$-free graphs the property is that every 'large' set $V$ of vertices contains a triangle not sharing an edge with any $K_4$ in $G_p$. We would like to apply a standard Martingale inequality, but the complicated dependence involved is not of the right form. Instead we examine $G_p$ one step at a time in such a way that the dependence on what has gone before can be split into 'positive' and 'negative' parts, using the notions of up-sets and down-sets. The relatively simple positive part is then estimated directly. The much more complicated negative part can simply be ignored, as shown in [3].
We present a randomized algorithm that on input a symmetric, weakly diagonally dominant $n$-by-$n$ matrix $A$ with $m$ nonzero entries and an $n$-vector $b$ produces an ${\tilde{x}} $ such that $\|{\tilde{x}} - A^{\dagger} {b} \|_{A} \leq \epsilon \|A^{\dagger} {b}\|_{A}$ in expected time $O (m \log^{c}n \log (1/\epsilon))$ for some constant $c$. By applying this algorithm inside the inverse power method, we compute approximate Fiedler vectors in a similar amount of time. The algorithm applies subgraph preconditioners in a recursive fashion. These preconditioners improve upon the subgraph preconditioners first introduced by Vaidya in 1990. For any symmetric, weakly diagonally dominant matrix $A$ with nonpositive off-diagonal entries and $k \geq 1$, we construct in time $O (m \log^{c} n)$ a preconditioner $B$ of $A$ with at most $2 (n - 1) + O ((m/k) \log^{39} n)$ nonzero off-diagonal entries such that the finite generalized condition number $\kappa_{f} (A,B)$ is at most $k$, for some other constant $c$. In the special case when the nonzero structure of the matrix is planar the corresponding linear system solver runs in expected time $O (n \log^{2} n + n \log n \ \log \log n \ \log (1/\epsilon))$. We hope that our introduction of algorithms of low asymptotic complexity will lead to the development of algorithms that are also fast in practice.