Author Description

Nima Anari is a computer scientist and mathematician known for his work in algorithmic combinatorics and probability. He is an assistant professor of computer science at Stanford University, where his research focuses on the design and analysis of algorithms, often drawing on tools from combinatorics, probability, and geometry. His contributions include advances in understanding the geometry of polynomials in algorithmic contexts, as well as methods for counting and sampling combinatorial structures. Before joining Stanford, he completed his PhD at the University of Washington and conducted postdoctoral research at institutions including Microsoft Research.

We design an FPRAS to count the number of bases of any matroid given by an independent set oracle, and to estimate the partition function of the random cluster model of any matroid in the regime where 0 < q < 1. Consequently, we can sample random spanning forests in a graph and estimate the reliability polynomial of any matroid. We also prove the thirty-year-old conjecture of Mihail and Vazirani that the bases exchange graph of any matroid has edge expansion at least 1.
In this paper we consider a mechanism design problem in the context of large-scale crowdsourcing markets such as Amazon's Mechanical Turk, ClickWorker, and CrowdFlower. In these markets, there is a requester who wants to hire workers to accomplish some tasks. Each worker is assumed to give some utility to the requester on getting hired. Moreover, each worker has a minimum cost that he wants to get paid for getting hired. This minimum cost is assumed to be private information of the workers. The question then is: if the requester has a limited budget, how should one design a direct revelation mechanism that picks the right set of workers to hire in order to maximize the requester's utility? We note that although previous work (Singer (2010); Chen et al. (2011)) has studied this problem, a crucial way in which we deviate from earlier work is the notion of large-scale markets that we introduce in our model. Without the large market assumption, it is known that no mechanism can achieve a competitive ratio better than 0.414 and 0.5 for deterministic and randomized mechanisms respectively (while the best known deterministic and randomized mechanisms achieve an approximation ratio of 0.292 and 0.33 respectively). In this paper, we design a budget-feasible mechanism for large markets that achieves a competitive ratio of 1 - 1/e ≃ 0.63. Our mechanism can be seen as a generalization of an alternate way to look at the proportional share mechanism, which is used in all the previous works so far on this problem. Interestingly, we can also show that our mechanism is optimal by showing that no truthful mechanism can achieve a factor better than 1 - 1/e, thus fully resolving this setting.
Finally, we consider the more general case of submodular utility functions and give new and improved mechanisms for the case when the market is large.
Strongly Rayleigh distributions are natural generalizations of product and determinantal probability distributions and satisfy the strongest form of negative dependence properties. We show that the “natural” Monte Carlo Markov Chain (MCMC) algorithm mixes rapidly in the support of a homogeneous strongly Rayleigh distribution. As a byproduct, our proof implies Markov chains can be used to efficiently generate approximate samples of a k-determinantal point process. This answers an open question raised by Deshpande and Rademacher (2010) which was studied recently by Kang (2013); Li et al. (2015); Rebeschini and Karbasi (2015).
We say a discrete probability distribution over subsets of a finite ground set is spectrally independent if an associated pairwise influence matrix has a bounded largest eigenvalue for the distribution and all of its conditional distributions. We prove that if a distribution is spectrally independent, then the corresponding high dimensional simplicial complex is a local spectral expander. Using a line of recent works on mixing time of high dimensional walks on simplicial complexes [KM17]; [DK17]; [KO18]; [AL20], this implies that the corresponding Glauber dynamics mixes rapidly and generates (approximate) samples from the given distribution. As an application, we show that natural Glauber dynamics mixes rapidly (in polynomial time) to generate a random independent set from the hardcore model up to the uniqueness threshold. This improves the quasi-polynomial running time of Weitz's deterministic correlation decay algorithm [Wei06] for estimating the hardcore partition function, also answering a long-standing open problem of mixing time of Glauber dynamics [LV97]; [LV99]; [DG00]; [Vig01]; [Eft+16].
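The Glauber dynamics for the hardcore model mentioned above can be sketched as follows. This is a minimal illustration, not code from the paper: the function names, the adjacency-list input format, and the fixed step count are assumptions made for the example, and the number of steps needed for mixing depends on the graph and the fugacity.

```python
import random


def glauber_hardcore(adj, fugacity, steps, seed=0):
    """Run single-site Glauber dynamics for the hardcore model.

    adj is a list of neighbor lists (hypothetical input format),
    fugacity is the activity parameter lambda > 0.
    Returns the set of occupied vertices after `steps` updates.
    """
    rng = random.Random(seed)
    n = len(adj)
    occupied = [False] * n  # start from the empty independent set
    for _ in range(steps):
        v = rng.randrange(n)  # pick a uniformly random vertex
        if any(occupied[u] for u in adj[v]):
            # A neighbor is occupied, so v must be unoccupied.
            occupied[v] = False
        else:
            # Resample v from its conditional distribution given the rest.
            occupied[v] = rng.random() < fugacity / (1.0 + fugacity)
    return {v for v in range(n) if occupied[v]}


def is_independent_set(adj, vertices):
    """Check that no two chosen vertices are adjacent."""
    return all(u not in vertices for v in vertices for u in adj[v])


# Usage: a 6-cycle with fugacity 1 (illustrative values).
cycle = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
sample = glauber_hardcore(cycle, fugacity=1.0, steps=200, seed=1)
print(sorted(sample), is_independent_set(cycle, sample))
```

Each update only touches one vertex and its neighbors, which is why the chain is cheap to simulate; the hard part, addressed by the abstract, is bounding how many updates are needed.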
A polynomial p ∈ ℝ[z_1,…,z_n] is real stable if it has no roots in the upper-half complex plane. Gurvits's permanent inequality gives a lower bound on the coefficient of the z_1 z_2 ⋯ z_n monomial of a real stable polynomial p with nonnegative coefficients. This fundamental inequality has been used to attack several counting and optimization problems. Here, we study a more general question: Given a stable multilinear polynomial p with nonnegative coefficients and a set of monomials S, we show that if the polynomial obtained by summing up all monomials in S is real stable, then we can lower bound the sum of coefficients of monomials of p that are in S. We also prove generalizations of this theorem to (real stable) polynomials that are not multilinear. We use our theorem to give a new proof of Schrijver's inequality on the number of perfect matchings of a regular bipartite graph, generalize a recent result of Nikolov and Singh, and give deterministic polynomial time approximation algorithms for several counting problems.
We show that the integrality gap of the natural LP relaxation of the Asymmetric Traveling Salesman Problem is polyloglog(n). In other words, there is a polynomial time algorithm that approximates the value of the optimum tour within a factor of polyloglog(n), where polyloglog(n) is a bounded degree polynomial of loglog(n). We prove this by showing that any k-edge-connected unweighted graph has a polyloglog(n)/k-thin spanning tree. Our main new ingredient is a procedure, albeit an exponentially sized convex program, that "transforms" graphs that do not admit any spectrally thin trees into those that provably have spectrally thin trees. More precisely, given a k-edge-connected graph G = (V, E) where k ≥ 7 log(n), we show that there is a matrix D that "preserves" the structure of all cuts of G such that for a set F ⊆ E that induces an Ω(k)-edge-connected graph, the effective resistance of every edge in F w.r.t. D is at most polylog(k)/k. Then, we use our extension of the seminal work of Marcus, Spielman, and Srivastava [1], fully explained in [2], to prove the existence of a polylog(k)/k-spectrally thin tree with respect to D. Such a tree is polylog(k)/k-combinatorially thin with respect to G as D preserves the structure of cuts of G.
We give a deterministic polynomial time $2^{O(r)}$-approximation algorithm for the number of bases of a given matroid of rank $r$ and the number of common bases of any two matroids of rank $r$. To the best of our knowledge, this is the first nontrivial deterministic approximation algorithm that works for arbitrary matroids. Based on a lower bound of Azar, Broder, and Frieze [ABF94] this is almost the best possible result assuming oracle access to independent sets of the matroid. There are two main ingredients in our result: For the first, we build upon recent results of Adiprasito, Huh, and Katz [AHK15] and Huh and Wang [HW17] on combinatorial Hodge theory to derive a connection between matroids and log-concave polynomials. We expect that several new applications in approximation algorithms will be derived from this connection in the future. Formally, we prove that the multivariate generating polynomial of the bases of any matroid is log-concave as a function over the positive orthant. For the second ingredient, we develop a general framework for approximate counting in discrete problems, based on convex optimization. The connection goes through subadditivity of the entropy. For matroids, we prove that an approximate superadditivity of the entropy holds by relying on the log-concavity of the corresponding polynomials.
We show fully polynomial time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum (J Stat Phys 1987) to be #P-hard, who also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but unlike them robustly degrade under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari et al. (FOCS 2020), providing a new tool for establishing spectral independence based on geometry of polynomials.
As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding (J Math Mech 1959) who showed homogeneous polynomials that have no roots in a half-plane must be log-concave over the positive orthant.
We give a self-contained proof of the strongest version of Mason’s conjecture, namely that for any matroid the sequence of the number of independent sets of given sizes is ultra log-concave. To do this, we introduce a class of polynomials, called completely log-concave polynomials, whose bivariate restrictions have ultra log-concave coefficients. At the heart of our proof we show that for any matroid, the homogenization of the generating polynomial of its independent sets is completely log-concave.
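To make the statement concrete, the following sketch checks ultra log-concavity by brute force on one small example, the graphic matroid of the complete graph K4, whose independent sets are the forests. It is only a numerical illustration of the inequality, not part of the proof; the function names and the choice of K4 are assumptions made for the example. A sequence I_0, …, I_n is ultra log-concave when (I_k / C(n,k))^2 ≥ (I_{k-1} / C(n,k-1)) · (I_{k+1} / C(n,k+1)) for all 0 < k < n.

```python
from itertools import combinations
from math import comb


def is_forest(num_vertices, edges):
    """Return True if the given edges contain no cycle (union-find check)."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # u and v already connected: adding (u, v) closes a cycle
        parent[ru] = rv
    return True


def independent_set_counts(num_vertices, edges):
    """Count forests of each size k = 0..len(edges) by enumeration."""
    return [
        sum(1 for subset in combinations(edges, k) if is_forest(num_vertices, subset))
        for k in range(len(edges) + 1)
    ]


def is_ultra_log_concave(counts):
    """Check the ultra log-concavity inequality using exact integer arithmetic."""
    n = len(counts) - 1
    for k in range(1, n):
        lhs = counts[k] ** 2 * comb(n, k - 1) * comb(n, k + 1)
        rhs = counts[k - 1] * counts[k + 1] * comb(n, k) ** 2
        if lhs < rhs:
            return False
    return True


# Usage: the graphic matroid of K4 (4 vertices, 6 edges).
k4_edges = list(combinations(range(4), 2))
counts = independent_set_counts(4, k4_edges)
print(counts, is_ultra_log_concave(counts))
```

The enumeration is exponential in the number of edges, so this is only suitable for tiny examples.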
We design an FPRAS to count the number of bases of any matroid given by an independent set oracle, and to estimate the partition function of the random cluster model of any matroid in the regime where $0 < q < 1$. Consequently, we can sample random spanning forests in a graph and estimate the reliability polynomial of any matroid. We also prove the thirty-year-old conjecture of Mihail and Vazirani that the bases exchange graph of any matroid has edge expansion at least $1$. Our algorithm and proof build on the recent results of Dinur, Kaufman, Mass and Oppenheim who show that a high-dimensional walk on a weighted simplicial complex mixes rapidly if for every link of the complex, the corresponding localized random walk on the $1$-skeleton is a strong spectral expander. One of our key observations is that a weighted simplicial complex $X$ is a $0$-local spectral expander if and only if a naturally associated generating polynomial $p_X$ is strongly log-concave. More generally, to every pure simplicial complex $X$ with positive weights on its maximal faces, we can associate a multiaffine homogeneous polynomial $p_X$ such that the eigenvalues of the localized random walks on $X$ correspond to the eigenvalues of the Hessian of derivatives of $p_X$.
Marcus, Spielman, and Srivastava in their seminal work [MSS13] resolved the Kadison-Singer conjecture by proving that for any set of finitely supported independently distributed random vectors $v_1,\dots, v_n$ which have "small" expected squared norm and are in isotropic position (in expectation), there is a positive probability that the sum $\sum v_i v_i^\intercal$ has small spectral norm. Their proof crucially employs real stability of polynomials which is the natural generalization of real-rootedness to multivariate polynomials. Strongly Rayleigh distributions are families of probability distributions whose generating polynomials are real stable [BBL09]. As independent distributions are just special cases of strongly Rayleigh measures, it is a natural question to see if the main theorem of [MSS13] can be extended to families of vectors assigned to the elements of a strongly Rayleigh distribution. In this paper we answer this question affirmatively; we show that for any homogeneous strongly Rayleigh distribution where the marginal probabilities are upper bounded by $\epsilon_1$ and any isotropic set of vectors assigned to the underlying elements whose norms are at most $\sqrt{\epsilon_2}$, there is a set in the support of the distribution such that the spectral norm of the sum of the natural quadratic forms of the vectors assigned to the elements of the set is at most $O(\epsilon_1+\epsilon_2)$. We employ our theorem to provide a sufficient condition for the existence of spectrally thin trees. This, together with a recent work of the authors [AO14], provides an improved upper bound on the integrality gap of the natural LP relaxation of the Asymmetric Traveling Salesman Problem.
Data collection and labeling is one of the main challenges in employing machine learning algorithms in a variety of real-world applications with limited data. While active learning methods attempt to tackle this issue by labeling only the data samples that give high information, they generally suffer from large computational costs and are impractical in settings where data can be collected in parallel. Batch active learning methods attempt to overcome this computational burden by querying batches of samples at a time. To avoid redundancy between samples, previous works rely on some ad hoc combination of sample quality and diversity. In this paper, we present a new principled batch active learning method using Determinantal Point Processes, a repulsive point process that enables generating diverse batches of samples. We develop tractable algorithms to approximate the mode of a DPP distribution, and provide theoretical guarantees on the degree of approximation. We further demonstrate that an iterative greedy method for DPP maximization, which has lower computational costs but worse theoretical guarantees, still gives competitive results for batch active learning. Our experiments show the value of our methods on several datasets against state-of-the-art baselines.
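A generic greedy heuristic for DPP maximization, of the kind the abstract alludes to, can be sketched as below. This is an illustrative assumption, not the authors' implementation: the kernel matrix `L`, the function names, and the tiny three-item example are all hypothetical, and the determinant is computed naively in pure Python so the example has no dependencies.

```python
def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for j in range(col, n):
                a[r][j] -= factor * a[col][j]
    return det


def greedy_dpp_batch(L, batch_size):
    """Greedily add the item that maximizes det(L restricted to the batch).

    L is a symmetric positive semidefinite kernel given as a list of lists
    (hypothetical input format). Stops early if no item gives a positive
    determinant.
    """
    chosen = []
    for _ in range(batch_size):
        best_item, best_value = None, 0.0
        for i in range(len(L)):
            if i in chosen:
                continue
            idx = chosen + [i]
            value = determinant([[L[r][c] for c in idx] for r in idx])
            if value > best_value:
                best_item, best_value = i, value
        if best_item is None:
            break
        chosen.append(best_item)
    return chosen


# Usage: items 0 and 1 are near-duplicates, item 2 is dissimilar to both.
kernel = [
    [1.0, 0.99, 0.0],
    [0.99, 1.0, 0.0],
    [0.0, 0.0, 0.9],
]
print(greedy_dpp_batch(kernel, 2))
```

Because the determinant shrinks when two selected items are similar, the greedy step tends to skip near-duplicates, which is the diversity effect the abstract describes.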
We give a deterministic polynomial-time $2^{O(r)}$-approximation algorithm for the number of bases of a given matroid of rank r and the number of common bases of any two matroids of rank r. To the best of our knowledge, this is the first nontrivial deterministic approximation algorithm that works for arbitrary matroids. Based on a lower bound of Azar, Broder, and Frieze, this is almost the best possible result assuming oracle access to independent sets of the matroid. There are two main ingredients in our result. For the first, we build upon recent results of Adiprasito, Huh, Katz, and Wang on combinatorial Hodge theory to show that the basis generating polynomial of any matroid is a (completely) log-concave polynomial. Formally, we prove that the multivariate generating polynomial of the bases of any matroid is (and all of its directional derivatives along the positive orthant are) log-concave as functions over the positive orthant. For the second ingredient, we develop a general framework for approximate counting in discrete problems, based on convex optimization. The connection goes through subadditivity of the entropy. For matroids, we prove that an approximate superadditivity of the entropy holds by relying on the log-concavity of the basis generating polynomial.
We design a deterministic polynomial time $c^n$ approximation algorithm for the permanent of positive semidefinite matrices where $c = e^{\gamma+1} \simeq 4.84$. We write a natural convex relaxation and show that its optimum solution gives a $c^n$ approximation of the permanent. We further show that this factor is asymptotically tight by constructing a family of positive semidefinite matrices. We also show that our result implies an approximate version of the permanent-on-top conjecture, which was recently refuted in its original form; we show that the permanent is within a $c^n$ factor of the top eigenvalue of the Schur power matrix.
In the Bayesian online selection problem, the goal is to find a pricing algorithm for serving a sequence of arriving buyers that maximizes the expected social-welfare (or revenue) subject to different types of structural constraints. The focus of this paper is on the case where the allowable subsets of served customers are characterized by a laminar matroid with constant depth. This problem is a special case of the well-known matroid Bayesian online selection problem studied in [Kleinberg & Weinberg, 2012], when the underlying matroid is laminar. We give the first Polynomial-Time Approximation Scheme (PTAS) for the above problem. Our approach is based on rounding the solution of a hierarchy of linear programming relaxations that can approximate the optimum online solution with any degree of accuracy as well as a concentration argument that shows our rounding does not have a considerable loss in the expected social welfare.
Strongly Rayleigh distributions are natural generalizations of product and determinantal probability distributions and satisfy the strongest form of negative dependence properties. We show that the "natural" Monte Carlo Markov Chain (MCMC) is rapidly mixing in the support of a homogeneous strongly Rayleigh distribution. As a byproduct, our proof implies Markov chains can be used to efficiently generate approximate samples of a $k$-determinantal point process. This answers an open question raised by Deshpande and Rademacher.
We study the problem of allocating $m$ items to $n$ agents subject to maximizing the Nash social welfare (NSW) objective. We write a novel convex programming relaxation for this problem, and we show that a simple randomized rounding algorithm gives a $1/e$ approximation factor of the objective. Our main technical contribution is an extension of Gurvits's lower bound on the coefficient of the square-free monomial of a degree $m$-homogeneous stable polynomial on $m$ variables to all homogeneous polynomials. We use this extension to analyze the expected welfare of the allocation returned by our randomized rounding algorithm.
Is perfect matching in NC? That is, is there a deterministic fast parallel algorithm for it? This has been an outstanding open question in theoretical computer science for over three decades, ever since the discovery of RNC matching algorithms. Within this question, the case of planar graphs has remained an enigma: On the one hand, counting the number of perfect matchings is far harder than finding one (the former is #P-complete and the latter is in P), and on the other, for planar graphs, counting has long been known to be in NC whereas finding one has resisted a solution. In this paper, we give an NC algorithm for finding a perfect matching in a planar graph. Our algorithm uses the above-stated fact about counting matchings in a crucial way. Our main new idea is an NC algorithm for finding a face of the perfect matching polytope at which many new conditions, involving constraints of the polytope, are simultaneously satisfied. Several other ideas are also needed, such as finding a point in the interior of the minimum weight face of this polytope and finding a balanced tight odd set in NC.
We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank k on a ground set of n elements, or more generally distributions associated with log-concave polynomials of homogeneous degree k on n variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time O(k log k). Our bound has no dependence on n or the starting point, unlike the previous analyses of Anari et al. (STOC 2019), Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a distribution µ over size-k subsets of [n], our approximate exchange property implies that a simple local search algorithm gives a k^{O(k)}-approximation of max_S µ(S) when µ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when µ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with n edges in time O(n log^2 n). The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by Anari et al. (STOC 2019), Cryan et al. (FOCS 2019). For spanning trees, we improve on the almost-linear time algorithm by Schild (STOC 2018). Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time for these problems.
Our algorithms can be naturally extended to support approximately sampling from random forests of size between k_1 and k_2 in time O(n log^2 n), for fixed parameters k_1, k_2.
Article data. Submitted: 18 September 2020. Accepted: 26 March 2021. Published online: 12 July 2021. Keywords: approximate counting, Markov chain Monte Carlo, Glauber dynamics, spectral independence, high-dimensional expanders, correlation decay. AMS Subject Headings: 60J10, 68Q87, 68W20. ISSN (print): 0097-5397. ISSN (online): 1095-7111. Publisher: Society for Industrial and Applied Mathematics. CODEN: smjcat.
We analyze linear independence of rank one tensors produced by tensor powers of randomly perturbed vectors. This enables efficient decomposition of sums of high-order tensors. Our analysis builds upon [BCMV14] but allows for a wider range of perturbation models, including discrete ones. We give an application to recovering assemblies of neurons. Assemblies are large sets of neurons representing specific memories or concepts. The size of the intersection of two assemblies has been shown in experiments to represent the extent to which these memories co-occur or these concepts are related; the phenomenon is called association of assemblies. This suggests that an animal's memory is a complex web of associations, and poses the problem of recovering this representation from cognitive data. Motivated by this problem, we study the following more general question: Can we reconstruct the Venn diagram of a family of sets, given the sizes of their l-wise intersections? We show that as long as the family of sets is randomly perturbed, it is enough for the number of measurements to be polynomially larger than the number of nonempty regions of the Venn diagram to fully reconstruct the diagram.
We prove that the permanent of nonnegative matrices can be deterministically approximated within a factor of $\sqrt{2}^n$ in polynomial time, improving upon the previous deterministic approximations. We show this by proving that the Bethe approximation of the permanent, a quantity computable in polynomial time, is at least as large as the permanent divided by $\sqrt{2}^n$. This resolves a conjecture of Gurvits [21]. Our bound is tight, and when combined with previously known inequalities lower bounding the permanent, fully resolves the quality of Bethe approximation for permanent.
We introduce a notion called entropic independence for distributions $\mu$ defined on pure simplicial complexes, i.e., subsets of size $k$ of a ground set of elements. Informally, we call a background measure $\mu$ entropically independent if for any (possibly randomly chosen) set $S$, the relative entropy of an element of $S$ drawn uniformly at random carries at most $O(1/k)$ fraction of the relative entropy of $S$, a constant multiple of its "share of entropy." Entropic independence is the natural analog of spectral independence, another recently established notion, if one replaces variance by entropy. In our main result, we show that $\mu$ is entropically independent exactly when a transformed version of the generating polynomial of $\mu$ can be upper bounded by its linear tangent, a property implied by concavity of the said transformation. We further show that this concavity is equivalent to spectral independence under arbitrary external fields, an assumption that also goes by the name of fractional log-concavity. Our result can be seen as a new tool to establish entropy contraction from the much simpler variance contraction inequalities. A key differentiating feature of our result is that we make no assumptions on marginals of $\mu$ or the degrees of the underlying graphical model when $\mu$ is based on one. We leverage our results to derive tight modified log-Sobolev inequalities for multi-step down-up walks on fractionally log-concave distributions. As our main application, we establish the tight mixing time of $O(n\log n)$ for Glauber dynamics on Ising models with interaction matrix of operator norm smaller than $1$, improving upon the prior quadratic dependence on $n$.
We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank $k$ on a ground set of $n$ elements, or more generally distributions associated with log-concave polynomials of homogeneous degree $k$ on $n$ variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time $O(k\log k)$. Our bound has no dependence on $n$ or the starting point, unlike the previous analyses [ALOV19, CGM19], and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. Additionally, we show how to leverage down-up random walks to approximately sample spanning trees in a graph with $n$ edges in time $O(n\log^2 n)$, improving on the almost-linear time algorithm by Schild [Sch18]. Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time.
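As an illustration only, the down-up walk on spanning trees (the bases of a graphic matroid) can be sketched as follows for the uniform distribution. The function names and the naive scan over candidate edges are assumptions of this sketch; it does not implement the data structures needed for the $O(n\log^2 n)$ running time stated above.

```python
import random


def is_spanning_tree(n_vertices, edges):
    """Return True if `edges` has n_vertices - 1 edges and contains no cycle."""
    if len(edges) != n_vertices - 1:
        return False
    parent = list(range(n_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # adding (u, v) would close a cycle
            return False
        parent[ru] = rv
    return True


def down_up_step(n_vertices, all_edges, tree, rng):
    """One down-up step: drop a uniformly random edge of the tree (down),
    then add a uniformly random edge that restores a spanning tree (up)."""
    tree = list(tree)
    tree.pop(rng.randrange(len(tree)))
    # The dropped edge is always a valid candidate, so the list is never empty.
    candidates = [
        e for e in all_edges
        if e not in tree and is_spanning_tree(n_vertices, tree + [e])
    ]
    tree.append(rng.choice(candidates))
    return tree
```

Each step here costs time polynomial in the graph size because every candidate edge is re-checked from scratch; the sketch is meant to show the transition rule, not an efficient sampler.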
Is perfect matching in NC? That is, is there a deterministic fast parallel algorithm for it? This has been an outstanding open question in theoretical computer science for over three decades, ever since the discovery of RNC perfect matching algorithms. Within this question, the case of planar graphs has remained an enigma: On the one hand, counting the number of perfect matchings is far harder than finding one (the former is #P-complete and the latter is in P), and on the other, for planar graphs, counting has long been known to be in NC whereas finding one has resisted a solution. In this article, we give an NC algorithm for finding a perfect matching in a planar graph. Our algorithm uses the above-stated fact about counting perfect matchings in a crucial way. Our main new idea is an NC algorithm for finding a face of the perfect matching polytope at which a set (which could be polynomially large) of conditions, involving constraints of the polytope, are simultaneously satisfied. Several other ideas are also needed, such as finding, in NC, a point in the interior of the minimum-weight face of this polytope and finding a balanced tight odd set.
We design a polynomial time algorithm that for any weighted undirected graph $G = (V, E, \boldsymbol{w})$ and sufficiently large $\delta > 1$, partitions $V$ into subsets $V_1, \ldots, V_h$ for some $h\geq 1$, such that $\bullet$ at most $\delta^{-1}$ fraction of the weights are between clusters, i.e. \[ w(E - \cup_{i = 1}^h E(V_i)) \lesssim \frac{w(E)}{\delta};\] $\bullet$ the effective resistance diameter of each of the induced subgraphs $G[V_i]$ is at most $\delta^3$ times the inverse of the average weighted degree, i.e. \[ \max_{u, v \in V_i} \mathsf{Reff}_{G[V_i]}(u, v) \lesssim \delta^3 \cdot \frac{|V|}{w(E)} \quad \text{ for all } i=1, \ldots, h.\] In particular, it is possible to remove one percent of weight of edges of any given graph such that each of the resulting connected components has effective resistance diameter at most the inverse of the average weighted degree. Our proof is based on a new connection between effective resistance and low conductance sets. We show that if the effective resistance between two vertices $u$ and $v$ is large, then there must be a low conductance cut separating $u$ from $v$. This implies that very mildly expanding graphs have constant effective resistance diameter. We believe that this connection could be of independent interest in algorithm design.
We define a notion of isotropy for discrete set distributions. If $\mu$ is a distribution over subsets $S$ of a ground set $[n]$, we say that $\mu$ is in isotropic position if $\mathbb{P}_{S \sim \mu}[e \in S]$ is the same for all $e \in [n]$. We design a new approximate sampling algorithm that leverages isotropy for the class of distributions $\mu$ that have a log-concave generating polynomial; this class includes determinantal point processes, strongly Rayleigh distributions, and uniform distributions over matroid bases. We show that when $\mu$ is in approximately isotropic position, the running time of our algorithm depends polynomially on the size of the set $S$, and only logarithmically on $n$. When $n$ is much larger than the size of $S$, this is significantly faster than prior algorithms, and can even be sublinear in $n$. We then show how to transform a non-isotropic $\mu$ into an equivalent approximately isotropic form with a polynomial-time pre-processing step, accelerating subsequent sampling times. The main new ingredient enabling our algorithms is a class of negative dependence inequalities that may be of independent interest. As an application of our results, we show how to approximately count bases of a matroid of rank $k$ over a ground set of $n$ elements to within a factor of $1+\epsilon$ in time $O((n + 1/\epsilon^2) \cdot \mathrm{poly}(k, \log n))$. This is the first algorithm that runs in nearly linear time for fixed rank $k$, and achieves an inverse polynomially low approximation error. The full version of this paper is available at: https://arxiv.org/abs/2004.09079.
Learning from human feedback has been shown to be a useful approach in acquiring robot reward functions. However, expert feedback is often assumed to be drawn from an underlying unimodal reward function. This assumption does not always hold, including in settings where multiple experts provide data or when a single expert provides data for different tasks -- we thus go beyond learning a unimodal reward and focus on learning a multimodal reward function. We formulate the multimodal reward learning as a mixture learning problem and develop a novel ranking-based learning approach, where the experts are only required to rank a given set of trajectories. Furthermore, as access to interaction data is often expensive in robotics, we develop an active querying approach to accelerate the learning process. We conduct experiments and user studies using a multi-task variant of OpenAI's LunarLander and a real Fetch robot, where we collect data from multiple users with different preferences. The results suggest that our approach can efficiently learn multimodal reward functions, and improve data-efficiency over benchmark methods that we adapt to our learning problem.
In this paper we consider a mechanism design problem in the context of large-scale crowdsourcing markets such as Amazon's Mechanical Turk, ClickWorker, and CrowdFlower. In these markets, there is a requester who wants to hire workers to accomplish some tasks. Each worker is assumed to give some utility to the requester. Moreover, each worker has a minimum cost that he wants to get paid for getting hired. This minimum cost is assumed to be private information of the workers. The question then is: if the requester has a limited budget, how to design a direct revelation mechanism that picks the right set of workers to hire in order to maximize the requester's utility. We note that although previous work has studied this problem, a crucial difference in which we deviate from earlier work is the notion of large-scale markets that we introduce in our model. Without the large market assumption, it is known that no mechanism can achieve an approximation factor better than 0.414 and 0.5 for deterministic and randomized mechanisms respectively (while the best known deterministic and randomized mechanisms achieve an approximation ratio of 0.292 and 0.33 respectively). In this paper, we design a budget-feasible mechanism for large markets that achieves an approximation factor of 1-1/e (i.e. almost 0.63). Our mechanism can be seen as a generalization of an alternate way to look at the proportional share mechanism, which is used in all the previous works so far on this problem. Interestingly, we also show that our mechanism is optimal by showing that no truthful mechanism can achieve a factor better than 1-1/e, thus fully resolving this setting.
Finally we consider the more general case of submodular utility functions and give new and improved mechanisms for the case when the markets are large.
We design a polynomial time algorithm that for any weighted undirected graph $G = (V, E, w)$ and sufficiently large $\delta > 1$, partitions $V$ into subsets $V_1, \ldots, V_h$ for some $h \geq 1$, such that at most $\delta^{-1}$ fraction of the weights are between clusters, i.e. $\sum_{i < j} |E(V_i, V_j)| < w(E)/\delta$, and the effective resistance diameter of each of the induced subgraphs $G[V_i]$ is at most $\delta^3$ times the inverse of the average weighted degree, i.e. $\max\{ \mathrm{Reff}(u, v) : u, v \in V_i \} < \delta^3 \cdot |V|/w(E)$ for all $i = 1, \ldots, h$. In particular, it is possible to remove one percent of weight of edges of any given graph such that each of the resulting connected components has effective resistance diameter at most the inverse of the average weighted degree. Our proof is based on a new connection between effective resistance and low conductance sets. We show that if the effective resistance between two vertices $u$ and $v$ is large, then there must be a low conductance cut separating $u$ from $v$. This implies that very mildly expanding graphs have constant effective resistance diameter. We believe that this connection could be of independent interest in algorithm design.
We say a probability distribution $\mu$ is spectrally independent if an associated correlation matrix has a bounded largest eigenvalue for the distribution and all of its conditional distributions. We prove that if $\mu$ is spectrally independent, then the corresponding high dimensional simplicial complex is a local spectral expander. Using a line of recent works on mixing time of high dimensional walks on simplicial complexes \cite{KM17,DK17,KO18,AL19}, this implies that the corresponding Glauber dynamics mixes rapidly and generates (approximate) samples from $\mu$. As an application, we show that natural Glauber dynamics mixes rapidly (in polynomial time) to generate a random independent set from the hardcore model up to the uniqueness threshold. This improves the quasi-polynomial running time of Weitz's deterministic correlation decay algorithm \cite{Wei06} for estimating the hardcore partition function, also answering a long-standing open problem of mixing time of Glauber dynamics \cite{LV97,LV99,DG00,Vig01,EHSVY16}.
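For readers unfamiliar with the chain in question, a minimal sketch of one Glauber dynamics update for the hardcore model is given below, assuming the standard definition in which an independent set $I$ has weight $\lambda^{|I|}$. The function name, the adjacency-list representation, and the parameter name `lam` are assumptions of this sketch and do not come from the paper.

```python
import random


def glauber_step(adj, occupied, lam, rng):
    """One Glauber update for the hardcore model with fugacity `lam`.

    `adj[v]` lists the neighbors of vertex v and `occupied` is the current
    independent set. A uniformly random vertex is resampled from its
    conditional distribution given the rest of the configuration.
    """
    v = rng.randrange(len(adj))
    occupied.discard(v)
    # v may be occupied only if no neighbor is; in that case it is occupied
    # with conditional probability lam / (1 + lam).
    if all(u not in occupied for u in adj[v]) and rng.random() < lam / (1.0 + lam):
        occupied.add(v)
    return occupied
```

The update never creates an edge inside the occupied set, so the chain stays on independent sets; how many steps are needed for the samples to be close to the hardcore distribution is exactly what the mixing time result addresses.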
We give a self-contained proof of the strongest version of Mason's conjecture, namely that for any matroid the sequence of the number of independent sets of given sizes is ultra log-concave. To do this, we introduce a class of polynomials, called completely log-concave polynomials, whose bivariate restrictions have ultra log-concave coefficients. At the heart of our proof we show that for any matroid, the homogenization of the generating polynomial of its independent sets is completely log-concave.
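To make the statement concrete: for a matroid on $n$ elements with $I_k$ independent sets of size $k$, ultra log-concavity means $I_k^2 \ge (1 + 1/k)(1 + 1/(n-k))\, I_{k-1} I_{k+1}$ for $0 < k < n$. The following sketch checks this inequality numerically; the function names are illustrative assumptions, and the uniform matroid is used only as a simple test case.

```python
from math import comb


def is_ultra_log_concave(counts, n):
    """Check I_k^2 >= (1 + 1/k)(1 + 1/(n - k)) I_{k-1} I_{k+1} for 0 < k < n.

    `counts[k]` is the number of independent sets of size k, for k = 0..n.
    Both sides are multiplied by k * (n - k) so the test uses exact integers.
    """
    return all(
        counts[k] ** 2 * k * (n - k)
        >= (k + 1) * (n - k + 1) * counts[k - 1] * counts[k + 1]
        for k in range(1, n)
    )


def uniform_matroid_counts(n, r):
    """Independent-set counts of the uniform matroid U_{r,n}: all sets of size <= r."""
    return [comb(n, k) if k <= r else 0 for k in range(n + 1)]
```

For example, `is_ultra_log_concave(uniform_matroid_counts(4, 2), 4)` evaluates the inequality on the sequence 1, 4, 6, 0, 0, where it holds with equality at $k = 1$.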
In this paper we consider the problem of computing the likelihood of the profile of a discrete distribution, i.e., the probability of observing the multiset of element frequencies, and computing a profile maximum likelihood (PML) distribution, i.e., a distribution with the maximum profile likelihood. For each problem we provide polynomial time algorithms that given $n$ i.i.d. samples from a discrete distribution, achieve an approximation factor of $\exp\left(-O(\sqrt{n} \log n) \right)$, improving upon the previous best-known bound achievable in polynomial time of $\exp(-O(n^{2/3} \log n))$ (Charikar, Shiragur and Sidford, 2019). Through the work of Acharya, Das, Orlitsky and Suresh (2016), this implies a polynomial time universal estimator for symmetric properties of discrete distributions in a broader range of error parameter. We achieve these results by providing new bounds on the quality of approximation of the Bethe and Sinkhorn permanents (Vontobel, 2012 and 2014). We show that each of these is an $\exp(O(k \log(N/k)))$ approximation to the permanent of $N \times N$ matrices with non-negative rank at most $k$, improving upon the previous known bounds of $\exp(O(N))$. To obtain our results on PML, we exploit the fact that the PML objective is proportional to the permanent of a certain Vandermonde matrix with $\sqrt{n}$ distinct columns, i.e. with non-negative rank at most $\sqrt{n}$. As a by-product of our work we establish a surprising connection between the convex relaxation in prior work (CSS19) and the well-studied Bethe and Sinkhorn approximations.
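As a small illustration of the object being studied, the profile of a sample is obtained by counting how often each element occurs and then forgetting the labels. The sketch below uses a hypothetical helper name, `profile`, and represents the multiset as a sorted list.

```python
from collections import Counter


def profile(samples):
    """Return the profile of a sample: the multiset of element frequencies,
    represented as a list sorted in decreasing order (labels are discarded)."""
    return sorted(Counter(samples).values(), reverse=True)
```

Two samples that differ only by relabeling the elements have the same profile, which is why the profile is the natural statistic for estimating symmetric properties.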
A polynomial $p\in\mathbb{R}[z_1,\dots,z_n]$ is real stable if it has no roots in the upper-half complex plane. Gurvits's permanent inequality gives a lower bound on the coefficient of the $z_1z_2\dots z_n$ monomial of a real stable polynomial $p$ with nonnegative coefficients. This fundamental inequality has been used to attack several counting and optimization problems. Here, we study a more general question: Given a stable multilinear polynomial $p$ with nonnegative coefficients and a set of monomials $S$, we show that if the polynomial obtained by summing up all monomials in $S$ is real stable, then we can lower bound the sum of coefficients of monomials of $p$ that are in $S$. We also prove generalizations of this theorem to (real stable) polynomials that are not multilinear. We use our theorem to give a new proof of Schrijver's inequality on the number of perfect matchings of a regular bipartite graph, generalize a recent result of Nikolov and Singh [23], and give deterministic polynomial time approximation algorithms for several counting problems.
We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank $k$ on a ground set of $n$ elements, or more generally distributions associated with log-concave polynomials of homogeneous degree $k$ on $n$ variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time $O(k\log k)$. Our bound has no dependence on $n$ or the starting point, unlike the previous analyses [ALOV19, CGM19], and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a function $\mu: {[n] \choose k} \to \mathbb{R}_{\geq 0}$, our approximate exchange property implies that a simple local search algorithm gives a $k^{O(k)}$-approximation of $\max_{S} \mu(S)$ when $\mu$ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when $\mu$ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with $n$ edges in time $O(n\log^2 n)$. The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by [ALOV19, CGM19]. For spanning trees, we improve on the almost-linear time algorithm by [Sch18]. Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time for these problems.
We introduce a notion called entropic independence that is an entropic analog of spectral notions of high-dimensional expansion. Informally, entropic independence of a background distribution $\mu$ on $k$-sized subsets of a ground set of elements says that for any (possibly randomly chosen) set $S$, the relative entropy of a single element of $S$ drawn uniformly at random carries at most $O(1/k)$ fraction of the relative entropy of $S$. Entropic independence is the analog of the notion of spectral independence, if one replaces variance by entropy. We use entropic independence to derive tight mixing time bounds, overcoming the lossy nature of spectral analysis of Markov chains on exponential-sized state spaces. In our main technical result, we show a general way of deriving entropy contraction, a.k.a. modified log-Sobolev inequalities, for down-up random walks from spectral notions. We show that spectral independence of a distribution under arbitrary external fields automatically implies entropic independence. To derive our results, we relate entropic independence to properties of polynomials: $\mu$ is entropically independent exactly when a transformed version of the generating polynomial of $\mu$ is upper bounded by its linear tangent; this property is implied by concavity of the said transformation, which was shown by prior work to be locally equivalent to spectral independence. We apply our results to obtain tight modified log-Sobolev inequalities and mixing times for multi-step down-up walks on fractionally log-concave distributions.
As our flagship application, we establish the tight mixing time of $O(n\log n)$ for Glauber dynamics on Ising models whose interaction matrix has eigenspectrum lying within an interval of length smaller than $1$, improving upon the prior quadratic dependence on $n$.
Is matching in NC, i.e., is there a deterministic fast parallel algorithm for it? This has been an outstanding open question in TCS for over three decades, ever since the discovery of randomized NC matching algorithms [KUW85, MVV87]. Over the last five years, the theoretical computer science community has launched a relentless attack on this question, leading to the discovery of several powerful ideas. We give what appears to be the culmination of this line of work: An NC algorithm for finding a minimum-weight perfect matching in a general graph with polynomially bounded edge weights, provided it is given an oracle for the decision problem. Consequently, for settling the main open problem, it suffices to obtain an NC algorithm for the decision problem. We believe this new fact has qualitatively changed the nature of this open problem. All known efficient matching algorithms for general graphs follow one of two approaches: given by Edmonds [Edm65] and Lovász [Lov79]. Our oracle-based algorithm follows a new approach and uses many of the ideas discovered in the last five years. The difficulty of obtaining an NC perfect matching algorithm led researchers to study matching vis-a-vis clever relaxations of the class NC. In this vein, recently Goldwasser and Grossman [GG15] gave a pseudo-deterministic RNC algorithm for finding a perfect matching in a bipartite graph, i.e., an RNC algorithm with the additional requirement that on the same graph, it should return the same (i.e., unique) perfect matching for almost all choices of random bits. A corollary of our reduction is an analogous algorithm for general graphs.
Constrained submodular function maximization has been used in subset selection problems such as selection of most informative sensor locations. While these models have been quite popular, the solutions obtained via this approach are unstable to perturbations in data defining the submodular functions. Robust submodular maximization has been proposed as a richer model that aims to overcome this discrepancy as well as increase the modeling scope of submodular optimization. In this work, we consider robust submodular maximization with structured combinatorial constraints and give efficient algorithms with provable guarantees. Our approach is applicable to constraints defined by single or multiple matroids, knapsack as well as distributionally robust criteria. We consider both the offline setting where the data defining the problem is known in advance as well as the online setting where the input data is revealed over time. For the offline setting, we give a general (nearly) optimal bi-criteria approximation algorithm that relies on new extensions of classical algorithms for submodular maximization. For the online version of the problem, we give an algorithm that returns a bi-criteria solution with sub-linear regret.
A polynomial $p\in\mathbb{R}[z_1,\dots,z_n]$ is real stable if it has no roots in the upper-half complex plane. Gurvits's permanent inequality gives a lower bound on the coefficient of the $z_1z_2\dots z_n$ monomial of a real stable polynomial $p$ with nonnegative coefficients. This fundamental inequality has been used to attack several counting and optimization problems. Here, we study a more general question: Given a stable multilinear polynomial $p$ with nonnegative coefficients and a set of monomials $S$, we show that if the polynomial obtained by summing up all monomials in $S$ is real stable, then we can lower bound the sum of coefficients of monomials of $p$ that are in $S$. We also prove generalizations of this theorem to (real stable) polynomials that are not multilinear. We use our theorem to give a new proof of Schrijver's inequality on the number of perfect matchings of a regular bipartite graph, generalize a recent result of Nikolov and Singh, and give deterministic polynomial time approximation algorithms for several counting problems.
We introduce a framework for obtaining tight mixing times for Markov chains based on what we call restricted modified log-Sobolev inequalities. Modified log-Sobolev inequalities (MLSI) quantify the rate of relative entropy contraction for the Markov operator, and are notoriously difficult to establish. However, infinitesimally close to stationarity, entropy contraction becomes equivalent to variance contraction, a.k.a. a Poincaré inequality, which is significantly easier to establish through, e.g., spectral analysis. Motivated by this observation, we study restricted modified log-Sobolev inequalities that guarantee entropy contraction not for all starting distributions, but for those in a large neighborhood of the stationary distribution. We show how to sample from the hardcore and Ising models on $n$-node graphs that have a constant $\delta$ relative gap to the tree-uniqueness threshold, in nearly-linear time $\widetilde O_{\delta}(n)$. Notably, our bound does not depend on the maximum degree $\Delta$, and is therefore optimal even for high-degree graphs. This improves on prior mixing time bounds of $\widetilde O_{\delta, \Delta}(n)$ and $\widetilde O_{\delta}(n^2)$, established via (non-restricted) modified log-Sobolev and Poincaré inequalities respectively. We further show that optimal concentration inequalities can still be achieved from the restricted form of modified log-Sobolev inequalities. To establish restricted entropy contraction, we extend the entropic independence framework of Anari, Jain, Koehler, Pham, and Vuong to the setting of distributions that are spectrally independent under a restricted set of external fields.
We also develop an orthogonal trick that might be of independent interest: utilizing Bernoulli factories we show how to implement Glauber dynamics updates on high-degree graphs in $O(1)$ time, assuming standard adjacency array representation of the graph.
Diffusion models are powerful generative models but suffer from slow sampling, often taking 1000 sequential denoising steps for one sample. As a result, considerable efforts have been directed toward reducing the number of denoising steps, but these methods hurt sample quality. Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)? In spite of the sequential nature of the denoising steps, we show that surprisingly it is possible to parallelize sampling via Picard iterations, by guessing the solution of future denoising steps and iteratively refining until convergence. With this insight, we present ParaDiGMS, a novel method to accelerate the sampling of pretrained diffusion models by denoising multiple steps in parallel. ParaDiGMS is the first diffusion sampling method that enables trading compute for speed and is even compatible with existing fast sampling techniques such as DDIM and DPMSolver. Using ParaDiGMS, we improve sampling speed by 2-4x across a range of robotics and image generation models, giving state-of-the-art sampling speeds of 0.2s on 100-step DiffusionPolicy and 14.6s on 1000-step StableDiffusion-v2 with no measurable degradation of task reward, FID score, or CLIP score.
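The Picard-iteration idea can be illustrated on a toy scalar ODE, under the assumption of a plain Euler discretization $x_{t+1} = x_t + h\, f(x_t, t)$. This is a sketch of the general technique rather than of ParaDiGMS itself: the function names are hypothetical, the drift evaluations are done in an ordinary loop rather than on parallel hardware, and no diffusion model is involved.

```python
def sequential_solve(drift, x0, steps, h):
    """Reference Euler solve: each step waits for the previous one."""
    xs = [x0]
    for t in range(steps):
        xs.append(xs[-1] + h * drift(xs[-1], t))
    return xs


def picard_solve(drift, x0, steps, h, tol=1e-12):
    """Picard iteration on the whole trajectory.

    Start from a guessed trajectory and repeatedly set
        x_t <- x_0 + h * sum_{i < t} drift(x_i, i),
    using the previous guess on the right-hand side. The drift evaluations
    within one sweep do not depend on each other, so they could be run in
    parallel. Returns the trajectory and the number of sweeps used.
    """
    xs = [x0] * (steps + 1)  # initial guess: constant trajectory
    for sweep in range(1, steps + 1):
        drifts = [drift(xs[t], t) for t in range(steps)]  # independent evaluations
        new_xs = [x0]
        for d in drifts:  # cheap cumulative sum
            new_xs.append(new_xs[-1] + h * d)
        if max(abs(a - b) for a, b in zip(new_xs, xs)) <= tol:
            return new_xs, sweep
        xs = new_xs
    # After `steps` sweeps the trajectory agrees with the sequential solve.
    return xs, steps
```

After $k$ sweeps the first $k$ steps of the trajectory already agree with the sequential solution, so the iteration needs at most as many sweeps as there are steps and, for well-behaved drifts, typically far fewer.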
Strongly Rayleigh distributions are generalizations of product and determinantal probability distributions and satisfy the strongest form of negative dependence properties. We show that the natural Monte Carlo Markov Chain (MCMC) is rapidly mixing in the support of a homogeneous strongly Rayleigh distribution. As a byproduct, our proof implies Markov chains can be used to efficiently generate approximate samples of a $k$-determinantal point process. This answers an open question raised by Deshpande and Rademacher.
We design a deterministic polynomial time $c^n$ approximation algorithm for the permanent of positive semidefinite matrices where $c=e^{\gamma+1}\simeq 4.84$. We write a natural convex relaxation and show that its optimum solution gives a $c^n$ approximation of the permanent. We further show that this factor is asymptotically tight by constructing a family of positive semidefinite matrices.
Is perfect matching in NC? That is, is there a deterministic fast parallel algorithm for it? This has been an outstanding open question in theoretical computer science for over three decades, ever since the discovery of RNC matching algorithms. Within this question, the case of planar graphs has remained an enigma: On the one hand, counting the number of perfect matchings is far harder than finding one (the former is #P-complete and the latter is in P), and on the other, for planar graphs, counting has long been known to be in NC whereas finding one has resisted a solution. In this paper, we give an NC algorithm for finding a perfect matching in a planar graph. Our algorithm uses the above-stated fact about counting matchings in a crucial way. Our main new idea is an NC algorithm for finding a face of the perfect matching polytope at which $\Omega(n)$ new conditions, involving constraints of the polytope, are simultaneously satisfied. Several other ideas are also needed, such as finding a point in the interior of the minimum weight face of this polytope and finding a balanced tight odd set in NC.
We analyze linear independence of rank one tensors produced by tensor powers of randomly perturbed vectors. This enables efficient decomposition of sums of high-order tensors. Our analysis builds upon [BCMV14] but allows for a wider range of perturbation models, including discrete ones. We give an application to recovering assemblies of neurons. Assemblies are large sets of neurons representing specific memories or concepts. The size of the intersection of two assemblies has been shown in experiments to represent the extent to which these memories co-occur or these concepts are related; the phenomenon is called association of assemblies. This suggests that an animal's memory is a complex web of associations, and poses the problem of recovering this representation from cognitive data. Motivated by this problem, we study the following more general question: Can we reconstruct the Venn diagram of a family of sets, given the sizes of their $\ell$-wise intersections? We show that as long as the family of sets is randomly perturbed, it is enough for the number of measurements to be polynomially larger than the number of nonempty regions of the Venn diagram to fully reconstruct the diagram.
Constrained submodular function maximization has been used in subset selection problems such as selection of most informative sensor locations. Although these models have been quite popular, the solutions obtained via this approach are unstable to perturbations in data defining the submodular functions. Robust submodular maximization has been proposed as a richer model that aims to overcome this discrepancy as well as increase the modeling scope of submodular optimization. In this work, we consider robust submodular maximization with structured combinatorial constraints and give efficient algorithms with provable guarantees. Our approach is applicable to constraints defined by single or multiple matroids and knapsack as well as distributionally robust criteria. We consider both the offline setting where the data defining the problem are known in advance and the online setting where the input data are revealed over time. For the offline setting, we give a general (nearly) optimal bicriteria approximation algorithm that relies on new extensions of classical algorithms for submodular maximization. For the online version of the problem, we give an algorithm that returns a bicriteria solution with sublinear regret. Summary of Contribution: Constrained submodular maximization is one of the core areas in combinatorial optimization with a wide variety of applications in operations research and computer science. Over the last decades, both communities have been interested in the design and analysis of new algorithms with provable guarantees. Sensor location, influence maximization and data summarization are some of the applications of submodular optimization that lie at the intersection of the aforementioned communities.
Particularly, our work focuses on optimizing several submodular functions simultaneously. We provide new insights and algorithms to the offline and online variants of the problem which significantly expand the related literature. At the same time, we provide a computational study that supports our theoretical results.
We study the problem of approximating the largest root of a real-rooted polynomial of degree $n$ using its top $k$ coefficients and give nearly matching upper and lower bounds. We present algorithms with running time polynomial in $k$ that use the top $k$ coefficients to approximate the maximum root, within a factor of $n^{1/k}$ when $k \le \log n$ and with a separate guarantee when $k > \log n$. We also prove corresponding information-theoretic lower bounds, $n^{\Omega(1/k)}$ in the first regime, and show strong lower bounds for the noisy version of the problem in which one is given access to approximate coefficients. This problem has applications in the context of the method of interlacing families of polynomials, which was used for proving the existence of Ramanujan graphs of all degrees, the solution of the Kadison-Singer problem, and bounding the integrality gap of the asymmetric traveling salesman problem. All of these involve computing the maximum root of certain real-rooted polynomials for which the top few coefficients are accessible in subexponential time. Our results yield an algorithm for all of them.
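A minimal sketch of the $n^{1/k}$ guarantee, assuming a monic polynomial whose roots $\lambda_i$ are all nonnegative: Newton's identities turn the top $k$ coefficients into the power sum $p_k=\sum_i \lambda_i^k$, and then $\lambda_{\max} \le p_k^{1/k} \le n^{1/k}\lambda_{\max}$. The function names are illustrative, and this is a textbook estimator that is not necessarily the algorithm of the paper.

```python
def power_sums(top_coeffs):
    """Power sums p_1..p_k of the roots from the top k coefficients.

    top_coeffs = [a_1, ..., a_k] for the monic polynomial
    x^n + a_1 x^(n-1) + ... + a_n; the remaining coefficients are not needed.
    """
    k = len(top_coeffs)
    # Elementary symmetric polynomials of the roots: e_j = (-1)^j * a_j, e_0 = 1.
    e = [1] + [(-1) ** j * top_coeffs[j - 1] for j in range(1, k + 1)]
    p = [0] * (k + 1)
    for m in range(1, k + 1):
        # Newton's identity:
        # p_m = sum_{i=1}^{m-1} (-1)^(i-1) e_i p_(m-i) + (-1)^(m-1) m e_m
        total = (-1) ** (m - 1) * m * e[m]
        for i in range(1, m):
            total += (-1) ** (i - 1) * e[i] * p[m - i]
        p[m] = total
    return p[1:]


def max_root_upper_estimate(top_coeffs):
    """p_k^(1/k): at least the largest root and at most n^(1/k) times it,
    under the assumption that all roots are nonnegative."""
    k = len(top_coeffs)
    return power_sums(top_coeffs)[-1] ** (1.0 / k)
```

For example, $(x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6$ has top two coefficients `[-6, 11]`, and the estimate $\sqrt{14}$ lies between the true maximum root $3$ and $3\sqrt{3}$.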
We show fully polynomial time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by [Jer87] to be #P-hard, who also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but unlike them robustly degrade under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by [ALO20], providing a new tool for establishing spectral independence based on geometry of polynomials.
As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this extends a classic result established by [Gar59] who showed homogeneous polynomials that have no roots in a half-plane must be log-concave over the positive orthant.
We show how to use parallelization to speed up sampling from an arbitrary distribution $\mu$ on a product space $[q]^n$, given oracle access to counting queries: $\mathbb{P}_{X\sim \mu}[X_S=\sigma_S]$ for any $S\subseteq [n]$ and $\sigma_S \in [q]^S$. Our algorithm takes $O({n^{2/3}\cdot \operatorname{polylog}(n,q)})$ parallel time, to the best of our knowledge, the first sublinear in $n$ runtime for arbitrary distributions. Our results have implications for sampling in autoregressive models. Our algorithm directly works with an equivalent oracle that answers conditional marginal queries $\mathbb{P}_{X\sim \mu}[X_i=\sigma_i\;\vert\; X_S=\sigma_S]$, whose role is played by a trained neural network in autoregressive models. This suggests a roughly $n^{1/3}$-factor speedup is possible for sampling in any-order autoregressive models. We complement our positive result by showing a lower bound of $\widetilde{\Omega}(n^{1/3})$ for the runtime of any parallel sampling algorithm making at most $\operatorname{poly}(n)$ queries to the counting oracle, even for $q=2$.
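For context, the sequential baseline that such a parallel sampler improves on can be sketched as follows: fix one coordinate at a time, using ratios of counting queries as conditional marginals, for $n$ adaptive rounds in total. The names `sample_sequentially` and `make_oracle` are hypothetical, values in $[q]$ are encoded as `0..q-1`, and the brute-force oracle is only there to make the example self-contained.

```python
import random


def sample_sequentially(n, q, counting_oracle, rng=random):
    """Draw one sample from mu on [q]^n with n adaptive rounds of counting queries.

    counting_oracle(assignment) must return P[X_S = sigma_S], where
    assignment is a dict {coordinate index: value}.
    """
    partial = {}        # coordinates fixed so far
    prob_partial = 1.0  # P[X_S = sigma_S]; the empty assignment has probability 1
    for i in range(n):
        # Conditional marginal of coordinate i given the coordinates fixed so far.
        weights = [counting_oracle({**partial, i: v}) / prob_partial
                   for v in range(q)]
        value = rng.choices(range(q), weights=weights)[0]
        partial[i] = value
        prob_partial *= weights[value]
    return [partial[i] for i in range(n)]


def make_oracle(pmf):
    """Brute-force counting oracle for an explicit pmf {tuple: probability}."""
    def oracle(assignment):
        return sum(p for x, p in pmf.items()
                   if all(x[i] == v for i, v in assignment.items()))
    return oracle
```

Each round depends on the outcome of the previous one, which is exactly the sequential bottleneck the abstract refers to.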
Trickle-down is a phenomenon in high-dimensional expanders with many important applications; for example, it is a key ingredient in various constructions of high-dimensional expanders or the proof of rapid mixing for the basis exchange walk on matroids and in the analysis of log-concave polynomials. We formulate a generalized trickle-down equation in the abstract context of linear-tilt localization schemes. Building on this generalization, we improve the best-known results for several Markov chain mixing or sampling problems; for example, we improve the threshold up to which Glauber dynamics is known to mix rapidly in the Sherrington-Kirkpatrick spin glass model. Other applications of our framework include near-linear time sampling algorithms from the antiferromagnetic Ising model and the fixed magnetization (antiferromagnetic or ferromagnetic) Ising model on expanders. For this application, we use a new dynamics inspired by polarization, a technique from the theory of stable polynomials.
We give a self-contained proof of the strongest version of Mason’s conjecture, namely that for any matroid the sequence of the number of independent sets of given sizes is ultra log-concave. To do this, we introduce a class of polynomials, called completely log-concave polynomials, whose bivariate restrictions have ultra log-concave coefficients. At the heart of our proof we show that for any matroid, the homogenization of the generating polynomial of its independent sets is completely log-concave.
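As a small sanity check of the statement (not of the proof), the sketch below counts the independent sets of a graphic matroid by brute force and tests ultra log-concavity, i.e. $\left(a_k/\binom{n}{k}\right)^2 \ge \left(a_{k-1}/\binom{n}{k-1}\right)\left(a_{k+1}/\binom{n}{k+1}\right)$ with $n$ the size of the ground set. The helper names are illustrative, and the enumeration is exponential, so it is only meant for tiny graphs.

```python
from itertools import combinations
from math import comb


def is_forest(num_vertices, edges):
    """True if the edge set is acyclic, i.e. independent in the graphic matroid."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True


def independent_set_counts(num_vertices, edges):
    """counts[k] = number of acyclic edge subsets of size k."""
    return [sum(1 for s in combinations(edges, k) if is_forest(num_vertices, s))
            for k in range(len(edges) + 1)]


def is_ultra_log_concave(counts):
    """Cross-multiplied form of the inequality, in exact integer arithmetic."""
    n = len(counts) - 1
    return all(
        counts[k] ** 2 * comb(n, k - 1) * comb(n, k + 1)
        >= counts[k - 1] * counts[k + 1] * comb(n, k) ** 2
        for k in range(1, n)
    )
```

For the complete graph $K_4$ the counts are $1, 6, 15, 16, 0, 0, 0$ (the $16$ being its spanning trees), and the check passes.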
Data generation and labeling are often expensive in robot learning. Preference-based learning is a concept that enables reliable labeling by querying users with preference questions. Active querying methods are commonly employed in preference-based learning to generate more informative data at the expense of parallelization and computation time. In this article, we develop a set of novel algorithms, batch active preference-based learning methods, that enable efficient learning of reward functions using as few data samples as possible while still having short query generation times and also retaining parallelizability. We introduce a method based on determinantal point processes for active batch generation and several heuristic-based alternatives. Finally, we present our experimental results for a variety of robotics tasks in simulation. Our results suggest that our batch active learning algorithm requires only a few queries that are computed in a short amount of time. We showcase one of our algorithms in a study to learn human users’ preferences.
Data generation and labeling are often expensive in robot learning. Preference-based learning is a concept that enables reliable labeling by querying users with preference questions. Active querying methods are commonly employed in preference-based learning to generate more informative data at the expense of parallelization and computation time. In this paper, we develop a set of novel algorithms, batch active preference-based learning methods, that enable efficient learning of reward functions using as few data samples as possible while still having short query generation times and also retaining parallelizability. We introduce a method based on determinantal point processes (DPP) for active batch generation and several heuristic-based alternatives. Finally, we present our experimental results for a variety of robotics tasks in simulation. Our results suggest that our batch active learning algorithm requires only a few queries that are computed in a short amount of time. We showcase one of our algorithms in a study to learn human users' preferences.
We study Glauber dynamics for sampling from discrete distributions $\mu$ on the hypercube $\{\pm 1\}^n$. Recently, techniques based on spectral independence have successfully yielded optimal $O(n)$ relaxation times for a host of different distributions $\mu$. We show that spectral independence is universal: a relaxation time of $O(n)$ implies spectral independence.
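For readers unfamiliar with the chain, a single Glauber update on $\{\pm 1\}^n$ can be sketched as below: pick a uniformly random coordinate and resample it from its conditional distribution given the rest. The function `glauber_step` and the unnormalized log-density argument `log_weight` are illustrative names, not taken from the paper.

```python
import math
import random


def glauber_step(x, log_weight, rng=random):
    """One Glauber update for a distribution mu on {+1, -1}^n.

    x is a list of +1/-1 values and log_weight(x) returns log mu(x) up to an
    additive constant. A fresh list is returned; x is not modified.
    """
    i = rng.randrange(len(x))
    x_plus = x[:i] + [1] + x[i + 1:]
    x_minus = x[:i] + [-1] + x[i + 1:]
    # P[x_i = +1 | all other coordinates] = mu(x_plus) / (mu(x_plus) + mu(x_minus)).
    p_plus = 1.0 / (1.0 + math.exp(log_weight(x_minus) - log_weight(x_plus)))
    return x_plus if rng.random() < p_plus else x_minus
```

A usage example would be a ferromagnetic Ising model on a cycle, with `log_weight = lambda x: beta * sum(x[j] * x[(j + 1) % len(x)] for j in range(len(x)))` for some assumed inverse temperature `beta`.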
We show how to sample in parallel from a distribution $\pi$ over $\mathbb R^d$ that satisfies a log-Sobolev inequality and has a smooth log-density, by parallelizing the Langevin (resp. underdamped Langevin) algorithms. We show that our algorithm outputs samples from a distribution $\hat\pi$ that is close to $\pi$ in Kullback--Leibler (KL) divergence (resp. total variation (TV) distance), while using only $\log(d)^{O(1)}$ parallel rounds and $\widetilde{O}(d)$ (resp. $\widetilde O(\sqrt d)$) gradient evaluations in total. This constitutes the first parallel sampling algorithms with TV distance guarantees. For our main application, we show how to combine the TV distance guarantees of our algorithms with prior works and obtain RNC sampling-to-counting reductions for families of discrete distributions on the hypercube $\{\pm 1\}^n$ that are closed under exponential tilts and have bounded covariance. Consequently, we obtain an RNC sampler for directed Eulerian tours and asymmetric determinantal point processes, resolving open questions raised in prior works.
We design an FPRAS to count the number of bases of any matroid given by an independent set oracle, and to estimate the partition function of the random cluster model of any matroid in the regime where $0<q<1$. Consequently, we can sample random spanning forests in a graph and estimate the reliability polynomial of any matroid. We also prove the thirty-year-old conjecture of Mihail and Vazirani that the bases exchange graph of any matroid has edge expansion at least $1$. Our algorithm and proof build on the recent results of Dinur, Kaufman, Mass and Oppenheim who show that a high-dimensional walk on a weighted simplicial complex mixes rapidly if for every link of the complex, the corresponding localized random walk on the $1$-skeleton is a strong spectral expander. One of our key observations is that a weighted simplicial complex $X$ is a $0$-local spectral expander if and only if a naturally associated generating polynomial $p_X$ is strongly log-concave. More generally, to every pure simplicial complex $X$ with positive weights on its maximal faces, we can associate a multiaffine homogeneous polynomial $p_X$ such that the eigenvalues of the localized random walks on $X$ correspond to the eigenvalues of the Hessian of derivatives of $p_X$.
Suppose that we have $n$ agents and $n$ items which lie in a shared metric space. We would like to match the agents to items such that the total distance from agents to their matched items is as small as possible. However, instead of having direct access to distances in the metric, we only have each agent's ranking of the items in order of distance. Given this limited information, what is the minimum possible worst-case approximation ratio (known as the distortion) that a matching mechanism can guarantee? Previous work by Caragiannis et al. proved that the (deterministic) Serial Dictatorship mechanism has distortion at most $2^n - 1$. We improve this by providing a simple deterministic mechanism that has distortion $O(n^2)$. We also provide the first nontrivial lower bound on this problem, showing that any matching mechanism (deterministic or randomized) must have worst-case distortion $\Omega(\log n)$. In addition to these new bounds, we show that a large class of truthful mechanisms derived from Deferred Acceptance all have worst-case distortion at least $2^n - 1$, and we find an intriguing connection between thin matchings (analogous to the well-known thin trees conjecture) and the distortion gap between deterministic and randomized mechanisms.
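The Serial Dictatorship baseline mentioned above is simple enough to sketch directly: agents are processed in a fixed order and each takes its closest item that is still unmatched. The function name and the input encoding (one ranking list per agent, closest item first) are assumptions made for illustration; the improved $O(n^2)$ mechanism of the paper is not reproduced here.

```python
def serial_dictatorship(rankings):
    """Match agents to items using only ordinal information.

    rankings[a] lists all item indices from closest to farthest for agent a.
    Agents pick in index order; each takes its closest unmatched item.
    Returns a dict {agent: item}.
    """
    taken = set()
    matching = {}
    for agent, ranking in enumerate(rankings):
        choice = next(item for item in ranking if item not in taken)
        matching[agent] = choice
        taken.add(choice)
    return matching
```

Because the mechanism never sees distances, an early agent can grab an item that is only marginally closer for it but far better for a later agent, which is the source of the large worst-case distortion.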
We study the problem of parallelizing sampling from distributions related to determinants: symmetric, nonsymmetric, and partition-constrained determinantal point processes, as well as planar perfect matchings. For these distributions, the partition function, a.k.a. the count, can be obtained via matrix determinants, a highly parallelizable computation; Csanky proved it is in NC. However, parallel counting does not automatically translate to parallel sampling, as classic reductions between the two are inherently sequential. We show that a nearly quadratic parallel speedup over sequential sampling can be achieved for all the aforementioned distributions. If the distribution is supported on subsets of size $k$ of a ground set, we show how to approximately produce a sample in $\widetilde{O}(k^{\frac{1}{2} + c})$ time with polynomially many processors for any constant $c>0$. In the two special cases of symmetric determinantal point processes and planar perfect matchings, our bound improves to $\widetilde{O}(\sqrt k)$ and we show how to sample exactly in these cases.
Diffusion models are powerful generative models but suffer from slow sampling, often taking 1000 sequential denoising steps for one sample. As a result, considerable efforts have been directed toward reducing the number of denoising steps, but these methods hurt sample quality. Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)? In spite of the sequential nature of the denoising steps, we show that surprisingly it is possible to parallelize sampling via Picard iterations, by guessing the solution of future denoising steps and iteratively refining until convergence. With this insight, we present ParaDiGMS, a novel method to accelerate the sampling of pretrained diffusion models by denoising multiple steps in parallel. ParaDiGMS is the first diffusion sampling method that enables trading compute for speed and is even compatible with existing fast sampling techniques such as DDIM and DPMSolver. Using ParaDiGMS, we improve sampling speed by 2-4x across a range of robotics and image generation models, giving state-of-the-art sampling speeds of 0.2s on 100-step DiffusionPolicy and 14.6s on 1000-step StableDiffusion-v2 with no measurable degradation of task reward, FID score, or CLIP score.
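The Picard-iteration idea can be illustrated on a toy scalar recursion $x_{t+1} = x_t + h\,f(x_t, t)$, used here as a stand-in for the denoising steps. This is only a sketch under that assumption: the function names are hypothetical, and it omits whatever windowing and stopping rules the actual ParaDiGMS method uses. Each iteration guesses the entire trajectory, evaluates every drift term from the previous guess (these evaluations are independent, so they could run in parallel), and rebuilds the trajectory with a prefix sum.

```python
def sequential_solve(x0, drift, num_steps, h):
    """Reference solution: num_steps strictly sequential Euler steps."""
    xs = [x0]
    for t in range(num_steps):
        xs.append(xs[-1] + h * drift(xs[-1], t))
    return xs


def picard_solve(x0, drift, num_steps, h, num_iters):
    """Picard iteration on the same recursion."""
    xs = [x0] * (num_steps + 1)  # initial guess: the trajectory stays at x0
    for _ in range(num_iters):
        # Every drift evaluation uses only the previous iterate, so this list
        # comprehension is the part that could be batched across time steps.
        drifts = [drift(xs[t], t) for t in range(num_steps)]
        new_xs = [x0]
        acc = x0
        for d in drifts:  # prefix sum rebuilding the trajectory
            acc = acc + h * d
            new_xs.append(acc)
        xs = new_xs
    return xs
```

After $j$ iterations the first $j$ steps agree with the sequential solution, so `num_steps` iterations always suffice; the practical benefit comes when far fewer iterations already give an accurate trajectory.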
We study Glauber dynamics for sampling from discrete distributions $\mu$ on the hypercube $\{\pm 1\}^n$. Recently, techniques based on spectral independence have successfully yielded optimal $O(n)$ relaxation times for a host of different distributions $\mu$. We show that spectral independence is universal: a relaxation time of $O(n)$ implies spectral independence. We then study a notion of tractability for $\mu$, defined in terms of smoothness of the multilinear extension of its Hamiltonian -- $\log \mu$ -- over $[-1,+1]^n$. We show that Glauber dynamics has relaxation time $O(n)$ for such $\mu$, and using the universality of spectral independence, we conclude that these distributions are also fractionally log-concave and consequently satisfy modified log-Sobolev inequalities. We sharpen our estimates and obtain approximate tensorization of entropy and the optimal $\widetilde{O}(n)$ mixing time for random Hamiltonians, i.e. the classically studied mixed $p$-spin model at sufficiently high temperature. These results have significant downstream consequences for concentration of measure, statistical testing, and learning.
We design fast algorithms for repeatedly sampling from strongly Rayleigh distributions, which include as special cases random spanning tree distributions and determinantal point processes. For a graph $G=(V, E)$, we show how to approximately sample uniformly random spanning trees from $G$ in $\widetilde{O}(|V|)$ time per sample after an initial $\widetilde{O}(|E|)$ time preprocessing. This is the first nearly-linear runtime in the output size, which is clearly optimal. For a determinantal point process on $k$-sized subsets of a ground set of $n$ elements, defined via an $n\times n$ kernel matrix, we show how to approximately sample in $\widetilde{O}(k^{\omega})$ time after an initial $\widetilde{O}(nk^{\omega-1})$ time preprocessing, where $\omega<2.372864$ is the matrix multiplication exponent. The time to compute just the weight of the output set is simply $\simeq k^{\omega}$, a natural barrier that suggests our runtime might be optimal for determinantal point processes as well. As a corollary, we even improve the state of the art for obtaining a single sample from a determinantal point process, from the prior runtime of $\widetilde{O}(\min\{nk^{2}, n^{\omega}\})$ to $\widetilde{O}(nk^{\omega-1})$. In our main technical result, we achieve the optimal limit on domain sparsification for strongly Rayleigh distributions. In domain sparsification, sampling from a distribution $\mu$ on $\binom{[n]}{k}$ is reduced to sampling from related distributions on $\binom{[t]}{k}$ for $t\ll n$.
We show that for strongly Rayleigh distributions, the domain size can be reduced to nearly linear in the output size $t=\widetilde{O}(k)$, improving the state of the art from $t=\widetilde{O}(k^{2})$ for general strongly Rayleigh distributions and the more specialized $t=\widetilde{O}(k^{15})$ for spanning tree distributions. Our reduction involves sampling from $\widetilde{O}(1)$ domain-sparsified distributions, all of which can be produced efficiently assuming approximate overestimates for marginals of $\mu$ are known and stored in a convenient data structure. Having access to marginals is the discrete analog of having access to the mean and covariance of a continuous distribution, or equivalently knowing "isotropy" for the distribution, the key behind optimal samplers in the continuous setting based on the famous Kannan-Lovász-Simonovits (KLS) conjecture. We view our result as analogous in spirit to the KLS conjecture and its consequences for sampling, but rather for discrete strongly Rayleigh measures. Throughout, $\widetilde{O}(\cdot)$ hides polylogarithmic factors in $n$.
We study the problem of parallelizing sampling from distributions related to determinants: symmetric, nonsymmetric, and partition-constrained determinantal point processes, as well as planar perfect matchings. For these distributions, the partition function, a.k.a. the count, can be obtained via matrix determinants, a highly parallelizable computation; Csanky proved it is in NC. However, parallel counting does not automatically translate to parallel sampling, as classic reductions between the two are inherently sequential. We show that a nearly quadratic parallel speedup over sequential sampling can be achieved for all the aforementioned distributions. If the distribution is supported on subsets of size $k$ of a ground set, we show how to approximately produce a sample in $\widetilde{O}(k^{\frac{1}{2} + c})$ time with polynomially many processors for any constant $c>0$. In the two special cases of symmetric determinantal point processes and planar perfect matchings, our bound improves to $\widetilde{O}(\sqrt k)$ and we show how to sample exactly in these cases. As our main technical contribution, we fully characterize the limits of batching for the steps of sampling-to-counting reductions. We observe that only $O(1)$ steps can be batched together if we strive for exact sampling, even in the case of nonsymmetric determinantal point processes. However, we show that for approximate sampling, $\widetilde{\Omega}(k^{\frac{1}{2}-c})$ steps can be batched together, for any entropically independent distribution, which includes all mentioned classes of determinantal point processes.
Entropic independence and related notions have been the source of breakthroughs in Markov chain analysis in recent years, so we expect our framework to prove useful for distributions beyond those studied in this work.
We design fast algorithms for repeatedly sampling from strongly Rayleigh distributions, which include random spanning tree distributions and determinantal point processes. For a graph $G=(V, E)$, we show how to approximately sample uniformly random spanning trees from $G$ in $\widetilde{O}(\lvert V\rvert)$ time per sample after an initial $\widetilde{O}(\lvert E\rvert)$ time preprocessing. For a determinantal point process on subsets of size $k$ of a ground set of $n$ elements, we show how to approximately sample in $\widetilde{O}(k^\omega)$ time after an initial $\widetilde{O}(nk^{\omega-1})$ time preprocessing, where $\omega<2.372864$ is the matrix multiplication exponent. We even improve the state of the art for obtaining a single sample from determinantal point processes, from the prior runtime of $\widetilde{O}(\min\{nk^2, n^\omega\})$ to $\widetilde{O}(nk^{\omega-1})$. In our main technical result, we achieve the optimal limit on domain sparsification for strongly Rayleigh distributions. In domain sparsification, sampling from a distribution $\mu$ on $\binom{[n]}{k}$ is reduced to sampling from related distributions on $\binom{[t]}{k}$ for $t\ll n$. We show that for strongly Rayleigh distributions, we can achieve the optimal $t=\widetilde{O}(k)$. Our reduction involves sampling from $\widetilde{O}(1)$ domain-sparsified distributions, all of which can be produced efficiently assuming convenient access to approximate overestimates for marginals of $\mu$. Having access to marginals is analogous to having access to the mean and covariance of a continuous distribution, or knowing "isotropy" for the distribution, the key assumption behind the Kannan-Lovász-Simonovits (KLS) conjecture and optimal samplers based on it.
We view our result as a moral analog of the KLS conjecture and its consequences for sampling, for discrete strongly Rayleigh measures.
Article history: Submitted 11 December 2019; accepted 27 August 2021; published online 18 November 2021. Keywords: Bethe, permanent, matching, approximation algorithms, deterministic algorithms. AMS Subject Headings: 68W25, 15A15, 05C70, 60C05. ISSN (print): 0097-5397; ISSN (online): 1095-7111. Publisher: Society for Industrial and Applied Mathematics. CODEN: smjcat.
We introduce a framework for obtaining tight mixing times for Markov chains based on what we call restricted modified log-Sobolev inequalities. Modified log-Sobolev inequalities (MLSI) quantify the rate of relative entropy contraction for the Markov operator, and are notoriously difficult to establish. However, infinitesimally close to stationarity, entropy contraction becomes equivalent to variance contraction, a.k.a. a Poincaré inequality, which is significantly easier to establish through, e.g., spectral analysis. Motivated by this observation, we study restricted modified log-Sobolev inequalities that guarantee entropy contraction not for all starting distributions, but for those in a large neighborhood of the stationary distribution. We show how to sample from the hardcore and Ising models on $n$-node graphs that have a constant $\delta$ relative gap to the tree-uniqueness threshold, in nearly-linear time $\widetilde O_{\delta}(n)$. Notably, our bound does not depend on the maximum degree $\Delta$, and is therefore optimal even for high-degree graphs. This improves on prior mixing time bounds of $\widetilde O_{\delta, \Delta}(n)$ and $\widetilde O_{\delta}(n^2)$, established via (non-restricted) modified log-Sobolev and Poincaré inequalities respectively. We further show that optimal concentration inequalities can still be achieved from the restricted form of modified log-Sobolev inequalities. To establish restricted entropy contraction, we extend the entropic independence framework of Anari, Jain, Koehler, Pham, and Vuong to the setting of distributions that are spectrally independent under a restricted set of external fields.
We also develop an orthogonal trick that might be of independent interest: utilizing Bernoulli factories we show how to implement Glauber dynamics updates on high-degree graphs in $O(1)$ time, assuming standard adjacency array representation of the graph.
We give a deterministic polynomial-time $2^{O(r)}$-approximation algorithm for the number of bases of a given matroid of rank $r$ and the number of common bases of any two matroids of rank $r$. To the best of our knowledge, this is the first nontrivial deterministic approximation algorithm that works for arbitrary matroids. Based on a lower bound of Azar, Broder, and Frieze, this is almost the best possible result assuming oracle access to independent sets of the matroid. There are two main ingredients in our result. For the first, we build upon recent results of Adiprasito, Huh, Katz, and Wang on combinatorial Hodge theory to show that the basis generating polynomial of any matroid is a (completely) log-concave polynomial. Formally, we prove that the multivariate generating polynomial of the bases of any matroid is (and all of its directional derivatives along the positive orthant are) log-concave as functions over the positive orthant. For the second ingredient, we develop a general framework for approximate counting in discrete problems, based on convex optimization. The connection goes through subadditivity of the entropy. For matroids, we prove that an approximate superadditivity of the entropy holds by relying on the log-concavity of the basis generating polynomial.
Learning from human feedback has been shown to be a useful approach in acquiring robot reward functions. However, expert feedback is often assumed to be drawn from an underlying unimodal reward function. This assumption does not always hold, including in settings where multiple experts provide data or when a single expert provides data for different tasks -- we thus go beyond learning a unimodal reward and focus on learning a multimodal reward function. We formulate the multimodal reward learning as a mixture learning problem and develop a novel ranking-based learning approach, where the experts are only required to rank a given set of trajectories. Furthermore, as access to interaction data is often expensive in robotics, we develop an active querying approach to accelerate the learning process. We conduct experiments and user studies using a multi-task variant of OpenAI's LunarLander and a real Fetch robot, where we collect data from multiple users with different preferences. The results suggest that our approach can efficiently learn multimodal reward functions, and improve data-efficiency over benchmark methods that we adapt to our learning problem.
Article data. History: submitted 18 September 2020; accepted 26 March 2021; published online 12 July 2021. Keywords: approximate counting, Markov chain Monte Carlo, Glauber dynamics, spectral independence, high-dimensional expanders, correlation decay. AMS Subject Headings: 60J10, 68Q87, 68W20. Publication data: ISSN (print) 0097-5397; ISSN (online) 1095-7111; Publisher: Society for Industrial and Applied Mathematics; CODEN: smjcat.
In this paper we consider the problem of computing the likelihood of the profile of a discrete distribution, i.e., the probability of observing the multiset of element frequencies, and computing a profile maximum likelihood (PML) distribution, i.e., a distribution with the maximum profile likelihood. For each problem we provide polynomial time algorithms that given $n$ i.i.d. samples from a discrete distribution, achieve an approximation factor of $\exp\left(-O(\sqrt{n} \log n) \right)$, improving upon the previous best-known bound achievable in polynomial time of $\exp(-O(n^{2/3} \log n))$ (Charikar, Shiragur and Sidford, 2019). Through the work of Acharya, Das, Orlitsky and Suresh (2016), this implies a polynomial time universal estimator for symmetric properties of discrete distributions in a broader range of error parameter. To obtain our results on PML we establish new connections between PML and the well-studied Bethe and Sinkhorn approximations to the permanent (Vontobel, 2012 and 2014). It is known that the PML objective is proportional to the permanent of a certain Vandermonde matrix (Vontobel, 2012) with $\sqrt{n}$ distinct columns, i.e. with non-negative rank at most $\sqrt{n}$. This allows us to show that the convex approximation to computing PML distributions studied in (Charikar, Shiragur and Sidford, 2019) is governed, in part, by the quality of Sinkhorn approximations to the permanent. We show that both Bethe and Sinkhorn permanents are $\exp(O(k \log(N/k)))$ approximations to the permanent of $N \times N$ matrices with non-negative rank at most $k$. This improves upon the previous known bounds of $\exp(O(N))$ and combining these insights with careful rounding of the convex relaxation yields our results.
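To make the notion of a profile concrete, here is a minimal Python sketch that computes the multiset of element frequencies of a sample. The function names `profile` and `profile_counts` are illustrative and not taken from the paper; the sketch only shows the object whose likelihood is being approximated, not the PML algorithm itself.

```python
from collections import Counter

def profile(sample):
    """Return the profile of a sample: the multiset of element frequencies,
    represented here as a list sorted in decreasing order."""
    frequencies = Counter(sample)  # element -> number of occurrences
    return sorted(frequencies.values(), reverse=True)

def profile_counts(sample):
    """Equivalent encoding: map j -> number of distinct elements seen exactly j times."""
    return dict(Counter(Counter(sample).values()))

if __name__ == "__main__":
    print(profile("abracadabra"))         # frequencies of a, b, r, c, d
    print(profile_counts("abracadabra"))
```

The profile discards element labels, so relabeling the sample leaves it unchanged; this is why it is the natural statistic for estimating symmetric properties.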
We show fully polynomial time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum (J Stat Phys 1987) to be #P-hard, who also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but unlike them robustly degrade under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari et al. (FOCS 2020), providing a new tool for establishing spectral independence based on geometry of polynomials.
As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding (J Math Mech 1959) who showed homogeneous polynomials that have no roots in a half-plane must be log-concave over the positive orthant.
We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank $k$ on a ground set of $n$ elements, or more generally distributions associated with log-concave polynomials of homogeneous degree $k$ on $n$ variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time $O(k \log k)$. Our bound has no dependence on $n$ or the starting point, unlike the previous analyses of Anari et al. (STOC 2019), Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a distribution $\mu$ over size-$k$ subsets of $[n]$, our approximate exchange property implies that a simple local search algorithm gives a $k^{O(k)}$-approximation of $\max_S \mu(S)$ when $\mu$ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when $\mu$ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with $n$ edges in time $O(n \log^2 n)$. The best known result for sampling random forest was a FPAUS with high polynomial runtime recently found by Anari et al. (STOC 2019), Cryan et al. (FOCS 2019). For spanning tree, we improve on the almost-linear time algorithm by Schild (STOC 2018). Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time for these problems.
Our algorithms can be naturally extended to support approximately sampling from random forests of size between $k_1$ and $k_2$ in time $O(n \log^2 n)$, for fixed parameters $k_1, k_2$.
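The down-up random walk can be illustrated on the bases of a graphic matroid, that is, on spanning trees of a graph. The Python sketch below is a naive version for the uniform distribution on an unweighted simple graph: it drops a uniformly random tree edge and then adds back a uniformly random edge that restores a spanning tree. The helper names are hypothetical, and each step rescans all edges, so this does not reflect the nearly-linear-time implementation claimed in the abstract.

```python
import random

def is_spanning_tree(n, edges):
    """Check that `edges` (pairs of vertices in range(n)) form a spanning tree."""
    if len(edges) != n - 1:
        return False
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # this edge would close a cycle
            return False
        parent[ru] = rv
    return True

def down_up_step(n, graph_edges, tree, rng):
    """One down-up step for the uniform distribution on spanning trees."""
    tree = list(tree)
    # Down step: drop a uniformly random edge of the current tree.
    tree.pop(rng.randrange(len(tree)))
    # Up step: pick uniformly among edges that complete the forest to a spanning tree.
    # The dropped edge is always a candidate, so the list is never empty.
    candidates = [e for e in graph_edges if is_spanning_tree(n, tree + [e])]
    tree.append(rng.choice(candidates))
    return tree

def run_down_up_walk(n, graph_edges, start_tree, steps, seed=0):
    """Run `steps` down-up steps from `start_tree` and return the final tree."""
    rng = random.Random(seed)
    tree = list(start_tree)
    for _ in range(steps):
        tree = down_up_step(n, graph_edges, tree, rng)
    return tree

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # 4-cycle plus a chord
    print(run_down_up_walk(4, edges, [(0, 1), (1, 2), (2, 3)], steps=50))
```

For a weighted graph, the up step would instead pick each candidate edge with probability proportional to its weight.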
We introduce a notion called entropic independence for distributions $\mu$ defined on pure simplicial complexes, i.e., subsets of size $k$ of a ground set of elements. Informally, we call a background measure $\mu$ entropically independent if for any (possibly randomly chosen) set $S$, the relative entropy of an element of $S$ drawn uniformly at random carries at most $O(1/k)$ fraction of the relative entropy of $S$, a constant multiple of its ``share of entropy.'' Entropic independence is the natural analog of spectral independence, another recently established notion, if one replaces variance by entropy. In our main result, we show that $\mu$ is entropically independent exactly when a transformed version of the generating polynomial of $\mu$ can be upper bounded by its linear tangent, a property implied by concavity of the said transformation. We further show that this concavity is equivalent to spectral independence under arbitrary external fields, an assumption that also goes by the name of fractional log-concavity. Our result can be seen as a new tool to establish entropy contraction from the much simpler variance contraction inequalities. A key differentiating feature of our result is that we make no assumptions on marginals of $\mu$ or the degrees of the underlying graphical model when $\mu$ is based on one. We leverage our results to derive tight modified log-Sobolev inequalities for multi-step down-up walks on fractionally log-concave distributions. As our main application, we establish the tight mixing time of $O(n\log n)$ for Glauber dynamics on Ising models with interaction matrix of operator norm smaller than $1$, improving upon the prior quadratic dependence on $n$.
We introduce a notion called entropic independence that is an entropic analog of spectral notions of high-dimensional expansion. Informally, entropic independence of a background distribution $\mu$ on $k$-sized subsets of a ground set of elements says that for any (possibly randomly chosen) set $S$, the relative entropy of a single element of $S$ drawn uniformly at random carries at most $O(1/k)$ fraction of the relative entropy of $S$. Entropic independence is the analog of the notion of spectral independence, if one replaces variance by entropy. We use entropic independence to derive tight mixing time bounds, overcoming the lossy nature of spectral analysis of Markov chains on exponential-sized state spaces. In our main technical result, we show a general way of deriving entropy contraction, a.k.a. modified log-Sobolev inequalities, for down-up random walks from spectral notions. We show that spectral independence of a distribution under arbitrary external fields automatically implies entropic independence. To derive our results, we relate entropic independence to properties of polynomials: $\mu$ is entropically independent exactly when a transformed version of the generating polynomial of $\mu$ is upper bounded by its linear tangent; this property is implied by concavity of the said transformation, which was shown by prior work to be locally equivalent to spectral independence. We apply our results to obtain tight modified log-Sobolev inequalities and mixing times for multi-step down-up walks on fractionally log-concave distributions.
As our flagship application, we establish the tight mixing time of $O(n\log n)$ for Glauber dynamics on Ising models whose interaction matrix has eigenspectrum lying within an interval of length smaller than $1$, improving upon the prior quadratic dependence on $n$.
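For readers unfamiliar with Glauber dynamics on Ising models, the following Python sketch shows a single-site update. It assumes the convention $\mu(\sigma) \propto \exp(\tfrac12 \sigma^\top J \sigma + h^\top \sigma)$ on $\sigma \in \{\pm 1\}^n$ with symmetric $J$; this normalization and the names `glauber_step` and `run_glauber` are assumptions made for illustration, not details taken from the paper.

```python
import math
import random

def glauber_step(J, h, sigma, rng):
    """Resample one uniformly random spin from its conditional distribution.

    Assumed model: mu(sigma) proportional to exp(0.5 * sigma^T J sigma + h^T sigma),
    with J symmetric, so P(sigma_i = +1 | rest) = 1 / (1 + exp(-2 * field_i)).
    """
    n = len(sigma)
    i = rng.randrange(n)
    # Local field at site i from the other spins and the external field.
    field = h[i] + sum(J[i][j] * sigma[j] for j in range(n) if j != i)
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))
    sigma[i] = 1 if rng.random() < p_plus else -1

def run_glauber(J, h, steps, seed=0):
    """Run Glauber dynamics from the all-minus configuration."""
    rng = random.Random(seed)
    sigma = [-1] * len(h)
    for _ in range(steps):
        glauber_step(J, h, sigma, rng)
    return sigma

if __name__ == "__main__":
    J = [[0.0, 0.2, 0.0], [0.2, 0.0, 0.2], [0.0, 0.2, 0.0]]
    print(run_glauber(J, [0.0, 0.0, 0.0], steps=100))
```

Each step costs time linear in the number of sites here; the $O(n \log n)$ bound in the abstract counts the number of such steps needed to mix, not their implementation cost.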
A polynomial $p \in \mathbb{R}[z_1,\dots,z_n]$ is real stable if it has no roots in the upper-half complex plane. Gurvits's permanent inequality gives a lower bound on the coefficient of the $z_1 z_2 \cdots z_n$ monomial of a real stable polynomial $p$ with nonnegative coefficients. This fundamental inequality has been used to attack several counting and optimization problems. Here, we study a more general question: Given a stable multilinear polynomial $p$ with nonnegative coefficients and a set of monomials $S$, we show that if the polynomial obtained by summing up all monomials in $S$ is real stable, then we can lower bound the sum of coefficients of monomials of $p$ that are in $S$. We also prove generalizations of this theorem to (real stable) polynomials that are not multilinear. We use our theorem to give a new proof of Schrijver's inequality on the number of perfect matchings of a regular bipartite graph, generalize a recent result of Nikolov and Singh [23], and give deterministic polynomial time approximation algorithms for several counting problems.
Constrained submodular function maximization has been used in subset selection problems such as selection of most informative sensor locations. Although these models have been quite popular, the solutions obtained via this approach are unstable to perturbations in data defining the submodular functions. Robust submodular maximization has been proposed as a richer model that aims to overcome this discrepancy as well as increase the modeling scope of submodular optimization. In this work, we consider robust submodular maximization with structured combinatorial constraints and give efficient algorithms with provable guarantees. Our approach is applicable to constraints defined by single or multiple matroids and knapsack as well as distributionally robust criteria. We consider both the offline setting where the data defining the problem are known in advance and the online setting where the input data are revealed over time. For the offline setting, we give a general (nearly) optimal bicriteria approximation algorithm that relies on new extensions of classical algorithms for submodular maximization. For the online version of the problem, we give an algorithm that returns a bicriteria solution with sublinear regret. Summary of Contribution: Constrained submodular maximization is one of the core areas in combinatorial optimization with a wide variety of applications in operations research and computer science. Over the last decades, both communities have been interested in the design and analysis of new algorithms with provable guarantees. Sensor location, influence maximization and data summarization are some of the applications of submodular optimization that lie at the intersection of the aforementioned communities.
Particularly, our work focuses on optimizing several submodular functions simultaneously. We provide new insights and algorithms to the offline and online variants of the problem which significantly expand the related literature. At the same time, we provide a computational study that supports our theoretical results.
Determinantal point processes (DPPs) are widely popular probabilistic models used in machine learning to capture diversity in random subsets of items. While traditional DPPs are defined by a symmetric kernel matrix, recent work has shown a significant increase in the modeling power and applicability of models defined by nonsymmetric kernels, where the model can capture interactions that go beyond diversity. We study the problem of maximum a posteriori (MAP) inference for determinantal point processes defined by a nonsymmetric positive semidefinite matrix (NDPPs), where the goal is to find the maximum $k\times k$ principal minor of the kernel matrix $L$. We obtain the first multiplicative approximation guarantee for this problem using local search, a method that has been previously applied to symmetric DPPs. Our approximation factor of $k^{O(k)}$ is nearly tight, and we show theoretically and experimentally that it compares favorably to the state-of-the-art methods for this problem that are based on greedy maximization. The main new insight enabling our improved approximation factor is that we allow local search to update up to two elements of the solution in each iteration, and we show this is necessary to have any multiplicative approximation guarantee.
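The local search idea, where each iteration may exchange up to two elements of the current subset, can be sketched in Python as follows. This is an illustrative toy version under stated assumptions: the function names, the starting subset, the improvement tolerance `tol`, and the float-based determinant are all choices made here, and the paper's actual acceptance rule and analysis are not reproduced.

```python
from itertools import combinations

def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting (floats)."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for cc in range(c, n):
                a[r][cc] -= factor * a[c][cc]
    return result

def principal_minor(L, subset):
    """Determinant of the submatrix of L indexed by `subset` (rows and columns)."""
    idx = sorted(subset)
    return det([[L[i][j] for j in idx] for i in idx])

def find_improvement(L, current, best, tol):
    """Look for a swap of one or two elements that increases the principal minor."""
    outside = sorted(set(range(len(L))) - current)
    for t in (1, 2):  # exchange one element, then two elements
        for out in combinations(sorted(current), t):
            for inn in combinations(outside, t):
                candidate = (current - set(out)) | set(inn)
                value = principal_minor(L, candidate)
                if value > best + tol * max(1.0, abs(best)):
                    return candidate, value
    return None

def local_search(L, k, start=None, tol=1e-9):
    """Greedy-improvement local search for a large k x k principal minor."""
    current = set(start) if start is not None else set(range(k))
    best = principal_minor(L, current)
    while True:
        step = find_improvement(L, current, best, tol)
        if step is None:
            return current, best
        current, best = step

if __name__ == "__main__":
    L = [[1, 2, 0], [-2, 1, 0], [0, 0, 1]]  # nonsymmetric, L + L^T = 2I is PSD
    print(local_search(L, 2, start={1, 2}))
```

Every accepted move strictly increases the minor, so the search terminates; the returned subset is a local optimum with respect to one- and two-element exchanges, not necessarily the global maximum.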
We show a connection between sampling and optimization on discrete domains. For a family of distributions $\mu$ defined on size $k$ subsets of a ground set of elements that is closed under external fields, we show that rapid mixing of natural local random walks implies the existence of simple approximation algorithms to find $\max \mu(\cdot)$. More precisely we show that if (multi-step) down-up random walks have spectral gap at least inverse polynomially large in $k$, then (multi-step) local search can find $\max \mu(\cdot)$ within a factor of $k^{O(k)}$. As the main application of our result, we show a simple nearly-optimal $k^{O(k)}$-factor approximation algorithm for MAP inference on nonsymmetric DPPs. This is the first nontrivial multiplicative approximation for finding the largest size $k$ principal minor of a square (not-necessarily-symmetric) matrix $L$ with $L+L^\intercal\succeq 0$. We establish the connection between sampling and optimization by showing that an exchange inequality, a concept rooted in discrete convex analysis, can be derived from fast mixing of local random walks. We further connect exchange inequalities with composable core-sets for optimization, generalizing recent results on composable core-sets for DPP maximization to arbitrary distributions that satisfy either the strongly Rayleigh property or that have a log-concave generating polynomial.
We show fully polynomial time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by [Jer87] to be #P-hard, who also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but unlike them robustly degrade under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by [ALO20], providing a new tool for establishing spectral independence based on geometry of polynomials.
As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this extends a classic result established by [Gar59] who showed homogeneous polynomials that have no roots in a half-plane must be log-concave over the positive orthant.
We study the problem of sampling a uniformly random directed rooted spanning tree, also known as an arborescence, from a possibly weighted directed graph. Classically, this problem has long been known to be polynomial-time solvable; the exact number of arborescences can be computed by a determinant [Tutte, 1948], and sampling can be reduced to counting [Jerrum et al., 1986; Jerrum and Sinclair, 1996]. However, the classic reduction from sampling to counting seems to be inherently sequential. This raises the question of designing efficient parallel algorithms for sampling. We show that sampling arborescences can be done in RNC. For several well-studied combinatorial structures, counting can be reduced to the computation of a determinant, which is known to be in NC [Csanky, 1975]. These include arborescences, planar graph perfect matchings, Eulerian tours in digraphs, and determinantal point processes. However, not much is known about efficient parallel sampling of these structures. Our work is a step towards resolving this mystery.
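The determinant-based count mentioned above can be sketched in a few lines of Python. The code below counts arborescences of an unweighted directed graph whose edges point away from a chosen root, using the in-degree Laplacian with the root's row and column removed. The function names are illustrative, and this sequential sketch only shows the counting step, not the parallel sampling algorithm of the paper.

```python
from fractions import Fraction

def exact_det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for cc in range(c, n):
                a[r][cc] -= factor * a[c][cc]
    return result

def count_arborescences(n, edges, root):
    """Count spanning arborescences rooted at `root`, edges directed away from it.

    `edges` is a list of directed pairs (u, v) meaning u -> v; self-loops are ignored.
    """
    # In-degree Laplacian: lap[v][v] = in-degree of v, lap[u][v] = -(number of edges u -> v).
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        if u == v:
            continue
        lap[v][v] += 1
        lap[u][v] -= 1
    keep = [v for v in range(n) if v != root]
    minor = [[lap[i][j] for j in keep] for i in keep]
    return int(exact_det(minor))

if __name__ == "__main__":
    complete = [(u, v) for u in range(3) for v in range(3) if u != v]
    print(count_arborescences(3, complete, root=0))
```

For a weighted graph, replacing each unit increment by the edge weight would give the total weight of arborescences instead of their number.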
We present a framework for speeding up the time it takes to sample from discrete distributions $μ$ defined over subsets of size $k$ of a ground set of $n$ elements, in the regime $k\ll n$. We show that having estimates of marginals $\mathbb{P}_{S\sim μ}[i\in S]$, the task of sampling from $μ$ can be reduced to sampling from distributions $ν$ supported on size $k$ subsets of a ground set of only $n^{1-α}\cdot \operatorname{poly}(k)$ elements. Here, $1/α\in [1, k]$ is the parameter of entropic independence for $μ$. Further, the sparsified distributions $ν$ are obtained by applying a sparse (mostly $0$) external field to $μ$, an operation that often retains algorithmic tractability of sampling from $ν$. This phenomenon, which we dub domain sparsification, allows us to pay a one-time cost of estimating the marginals of $μ$, and in return reduce the amortized cost needed to produce many samples from the distribution $μ$, as is often needed in upstream tasks such as counting and inference. For a wide range of distributions where $α=Ω(1)$, our result reduces the domain size, and as a corollary, the cost-per-sample, by a $\operatorname{poly}(n)$ factor. Examples include monomers in a monomer-dimer system, non-symmetric determinantal point processes, and partition-constrained Strongly Rayleigh measures. Our work significantly extends the reach of prior work of Anari and Dereziński who obtained domain sparsification for distributions with a log-concave generating polynomial (corresponding to $α=1$).
As a corollary of our new analysis techniques, we also obtain a less stringent requirement on the accuracy of marginal estimates even for the case of log-concave polynomials; roughly speaking, we show that constant-factor approximation is enough for domain sparsification, improving over $O(1/k)$ relative error established in prior work.
We study the problem of sampling a uniformly random directed rooted spanning tree, also known as an arborescence, from a possibly weighted directed graph. Classically, this problem has long been known to be polynomial-time solvable; the exact number of arborescences can be computed by a determinant [Tut48], and sampling can be reduced to counting [JVV86, JS96]. However, the classic reduction from sampling to counting seems to be inherently sequential. This raises the question of designing efficient parallel algorithms for sampling. We show that sampling arborescences can be done in RNC. For several well-studied combinatorial structures, counting can be reduced to the computation of a determinant, which is known to be in NC [Csa75]. These include arborescences, planar graph perfect matchings, Eulerian tours in digraphs, and determinantal point processes. However, not much is known about efficient parallel sampling of these structures. Our work is a step towards resolving this mystery.
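The determinant-based count mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard directed matrix-tree setup (Laplacian built from in-degrees, with the root's row and column deleted) and unweighted edges; the function names `determinant` and `count_arborescences` are hypothetical, and this is the classical sequential count, not the parallel sampler the abstract describes.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def count_arborescences(n, edges, root):
    """Count spanning arborescences rooted at `root`, edges pointing away from it.

    `edges` is a list of directed pairs (u, v) on vertices 0..n-1.
    """
    lap = [[0] * n for _ in range(n)]
    for u, v in edges:
        if u == v:
            continue  # self-loops never appear in an arborescence
        lap[v][v] += 1  # in-degree of v on the diagonal
        lap[u][v] -= 1  # minus the adjacency entry
    keep = [i for i in range(n) if i != root]
    minor = [[lap[i][j] for j in keep] for i in keep]
    return determinant(minor)
```

For example, `count_arborescences(3, [(0, 1), (1, 2), (2, 0)], 0)` counts the single path 0 → 1 → 2 in a directed triangle.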
We say a discrete probability distribution over subsets of a finite ground set is spectrally independent if an associated pairwise influence matrix has a bounded largest eigenvalue for the distribution and all of its conditional distributions. We prove that if a distribution is spectrally independent, then the corresponding high dimensional simplicial complex is a local spectral expander. Using a line of recent works on mixing time of high dimensional walks on simplicial complexes [KM17]; [DK17]; [KO18]; [AL20], this implies that the corresponding Glauber dynamics mixes rapidly and generates (approximate) samples from the given distribution. As an application, we show that natural Glauber dynamics mixes rapidly (in polynomial time) to generate a random independent set from the hardcore model up to the uniqueness threshold. This improves the quasi-polynomial running time of Weitz's deterministic correlation decay algorithm [Wei06] for estimating the hardcore partition function, also answering a long-standing open problem of mixing time of Glauber dynamics [LV97]; [LV99]; [DG00]; [Vig01]; [Eft+16].
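As a rough illustration of the Glauber dynamics for the hardcore model referred to above, here is a minimal sketch. It assumes the usual single-site update (pick a uniformly random vertex, then resample its state conditioned on its neighbors) with fugacity `lam`; the function name, the adjacency-list input, and the fixed step count are illustrative assumptions, and nothing here certifies how many steps are needed for mixing.

```python
import random


def glauber_hardcore(adj, lam, steps, seed=0):
    """Run single-site Glauber dynamics for the hardcore model.

    adj   -- adjacency lists, adj[v] is the list of neighbors of vertex v
    lam   -- fugacity; an independent set I has weight lam ** len(I)
    steps -- number of single-site updates to perform
    """
    rng = random.Random(seed)
    n = len(adj)
    occupied = [False] * n  # start from the empty independent set
    for _ in range(steps):
        v = rng.randrange(n)
        if any(occupied[u] for u in adj[v]):
            # A neighbor is occupied, so v must stay out of the set.
            occupied[v] = False
        else:
            # Conditional law of v given its neighbors: in with prob lam / (1 + lam).
            occupied[v] = rng.random() < lam / (1 + lam)
    return {v for v in range(n) if occupied[v]}
```

Every state visited is an independent set, because a vertex is only ever added when none of its neighbors is currently occupied.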
We define a notion of isotropy for discrete set distributions. If μ is a distribution over subsets S of a ground set [n], we say that μ is in isotropic position if ℙ_{S∼μ}[e ∈ S] is the same for all e ∈ [n]. We design a new approximate sampling algorithm that leverages isotropy for the class of distributions μ that have a log-concave generating polynomial; this class includes determinantal point processes, strongly Rayleigh distributions, and uniform distributions over matroid bases. We show that when μ is in approximately isotropic position, the running time of our algorithm depends polynomially on the size of the set S, and only logarithmically on n. When n is much larger than the size of S, this is significantly faster than prior algorithms, and can even be sublinear in n. We then show how to transform a non-isotropic μ into an equivalent approximately isotropic form with a polynomial-time pre-processing step, accelerating subsequent sampling times. The main new ingredient enabling our algorithms is a class of negative dependence inequalities that may be of independent interest. As an application of our results, we show how to approximately count bases of a matroid of rank k over a ground set of n elements to within a factor of 1+ε in time O((n+1/ε²)·poly(k, log n)). This is the first algorithm that runs in nearly linear time for fixed rank k, and achieves an inverse polynomially low approximation error. The full version of this paper is available at: https://arxiv.org/abs/2004.09079.
Is perfect matching in NC? That is, is there a deterministic fast parallel algorithm for it? This has been an outstanding open question in theoretical computer science for over three decades, ever since the discovery of RNC perfect matching algorithms. Within this question, the case of planar graphs has remained an enigma: On the one hand, counting the number of perfect matchings is far harder than finding one (the former is #P-complete and the latter is in P), and on the other, for planar graphs, counting has long been known to be in NC whereas finding one has resisted a solution. In this article, we give an NC algorithm for finding a perfect matching in a planar graph. Our algorithm uses the above-stated fact about counting perfect matchings in a crucial way. Our main new idea is an NC algorithm for finding a face of the perfect matching polytope at which a set (which could be polynomially large) of conditions, involving constraints of the polytope, are simultaneously satisfied. Several other ideas are also needed, such as finding, in NC, a point in the interior of the minimum-weight face of this polytope and finding a balanced tight odd set.
We define a notion of isotropy for discrete set distributions. If $\mu$ is a distribution over subsets $S$ of a ground set $[n]$, we say that $\mu$ is in isotropic position if $P[e \in S]$ is the same for all $e\in [n]$. We design a new approximate sampling algorithm that leverages isotropy for the class of distributions $\mu$ that have a log-concave generating polynomial; this class includes determinantal point processes, strongly Rayleigh distributions, and uniform distributions over matroid bases. We show that when $\mu$ is in approximately isotropic position, the running time of our algorithm depends polynomially on the size of the set $S$, and only logarithmically on $n$. When $n$ is much larger than the size of $S$, this is significantly faster than prior algorithms, and can even be sublinear in $n$. We then show how to transform a non-isotropic $\mu$ into an equivalent approximately isotropic form with a polynomial-time preprocessing step, accelerating subsequent sampling times. The main new ingredient enabling our algorithms is a class of negative dependence inequalities that may be of independent interest. As an application of our results, we show how to approximately count bases of a matroid of rank $k$ over a ground set of $n$ elements to within a factor of $1+\epsilon$ in time $ O((n+1/\epsilon^2)\cdot poly(k, \log n))$. This is the first algorithm that runs in nearly linear time for fixed rank $k$, and achieves an inverse polynomially low approximation error.
We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank $k$ on a ground set of $n$ elements, or more generally distributions associated with log-concave polynomials of homogeneous degree $k$ on $n$ variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time $O(k\log k)$. Our bound has no dependence on $n$ or the starting point, unlike the previous analyses [ALOV19, CGM19], and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. Additionally, we show how to leverage down-up random walks to approximately sample spanning trees in a graph with $n$ edges in time $O(n\log^2 n)$, improving on the almost-linear time algorithm by Schild [Sch18]. Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time.
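To make the down-up random walk concrete, here is a minimal sketch for the special case of spanning trees of an unweighted graph (bases of a graphic matroid). One step drops a uniformly random tree edge and then adds a uniformly random edge that reconnects the two pieces. The helper names are hypothetical, and this naive version rescans all edges at every step, so it does not reflect the nearly-linear total running time claimed in the abstract.

```python
import random


def component_of(start, tree_edges, n):
    """Return the set of vertices reachable from `start` using `tree_edges`."""
    adj = {v: [] for v in range(n)}
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def down_up_step(n, edges, tree, rng):
    """One down-up step on spanning trees of the graph (range(n), edges)."""
    tree = list(tree)
    # Down step: remove a uniformly random edge of the current tree.
    dropped = tree.pop(rng.randrange(len(tree)))
    # The remaining forest has two components; find the one containing dropped[0].
    side = component_of(dropped[0], tree, n)
    # Up step: add a uniformly random edge crossing the cut between the components.
    crossing = [e for e in edges if (e[0] in side) != (e[1] in side)]
    tree.append(rng.choice(crossing))
    return tree
```

The dropped edge always crosses the cut, so the up step never fails and the walk may stay at the same tree. For a weighted graph, the up step would instead pick a crossing edge with probability proportional to its weight.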
We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank $k$ on a ground set of $n$ elements, or more generally distributions associated with log-concave polynomials of homogeneous degree $k$ on $n$ variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time $O(k\log k)$. Our bound has no dependence on $n$ or the starting point, unlike the previous analyses [ALOV19, CGM19], and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a function $\mu: {[n] \choose k} \to \mathbb{R}_{\geq 0}$, our approximate exchange property implies that a simple local search algorithm gives a $k^{O(k)}$-approximation of $\max_{S} \mu(S)$ when $\mu$ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when $\mu$ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with $n$ edges in time $O(n\log^2 n)$. The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by [ALOV19, CGM19]. For spanning trees, we improve on the almost-linear time algorithm by [Sch18]. Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time for these problems.
Given a matrix A and k ≥ 0, we study the problem of finding the k × k submatrix of A with the maximum determinant in absolute value. This problem is motivated by the question of computing the determinant-based lower bound of [LSV86] on hereditary discrepancy, which was later shown to be an approximate upper bound as well [Matousek, 2013]. The special case where k coincides with one of the dimensions of A has been extensively studied. Nikolov gave a 2^{O(k)}-approximation algorithm for this special case, matching known lower bounds; he also raised as an open problem the question of designing approximation algorithms for the general case. We make progress towards answering this question by giving the first efficient approximation algorithm for general k × k subdeterminant maximization with an approximation ratio that depends only on k. Our algorithm finds a k^{O(k)}-approximate solution by performing a simple local search. Our main technical contribution, enabling the analysis of the approximation ratio, is an extension of Plücker relations for the Grassmannian, which may be of independent interest; Plücker relations are quadratic polynomial equations involving the set of k × k subdeterminants of a k × n matrix. We find an extension of these relations to k × k subdeterminants of general m × n matrices.
We say a probability distribution $\mu$ is spectrally independent if an associated correlation matrix has a bounded largest eigenvalue for the distribution and all of its conditional distributions. We prove that if $\mu$ is spectrally independent, then the corresponding high dimensional simplicial complex is a local spectral expander. Using a line of recent works on mixing time of high dimensional walks on simplicial complexes [KM17, DK17, KO18, AL19], this implies that the corresponding Glauber dynamics mixes rapidly and generates (approximate) samples from $\mu$. As an application, we show that natural Glauber dynamics mixes rapidly (in polynomial time) to generate a random independent set from the hardcore model up to the uniqueness threshold. This improves the quasi-polynomial running time of Weitz's deterministic correlation decay algorithm [Wei06] for estimating the hardcore partition function, also answering a long-standing open problem of mixing time of Glauber dynamics [LV97, LV99, DG00, Vig01, EHSVY16].
In this paper we consider the problem of computing the likelihood of the profile of a discrete distribution, i.e., the probability of observing the multiset of element frequencies, and computing a profile maximum likelihood (PML) distribution, i.e., a distribution with the maximum profile likelihood. For each problem we provide polynomial time algorithms that, given $n$ i.i.d. samples from a discrete distribution, achieve an approximation factor of $\exp\left(-O(\sqrt{n} \log n) \right)$, improving upon the previous best-known bound achievable in polynomial time of $\exp(-O(n^{2/3} \log n))$ (Charikar, Shiragur and Sidford, 2019). Through the work of Acharya, Das, Orlitsky and Suresh (2016), this implies a polynomial time universal estimator for symmetric properties of discrete distributions in a broader range of error parameter. We achieve these results by providing new bounds on the quality of approximation of the Bethe and Sinkhorn permanents (Vontobel, 2012 and 2014). We show that each of these is an $\exp(O(k \log(N/k)))$ approximation to the permanent of $N \times N$ matrices with non-negative rank at most $k$, improving upon the previous known bounds of $\exp(O(N))$. To obtain our results on PML, we exploit the fact that the PML objective is proportional to the permanent of a certain Vandermonde matrix with $\sqrt{n}$ distinct columns, i.e. with non-negative rank at most $\sqrt{n}$. As a by-product of our work we establish a surprising connection between the convex relaxation in prior work (CSS19) and the well-studied Bethe and Sinkhorn approximations.
We introduce the class of strongly Rayleigh probability measures by means of geometric properties of their generating polynomials that amount to the stability of the latter. This class covers important models such as determinantal measures (e.g. product measures and uniform random spanning tree measures) and distributions for symmetric exclusion processes. We show that strongly Rayleigh measures enjoy all virtues of negative dependence, and we also prove a series of conjectures due to Liggett, Pemantle, and Wagner, respectively. Moreover, we extend Lyons’ recent results on determinantal measures, and we construct counterexamples to several conjectures of Pemantle and Wagner on negative dependence and ultra log-concave rank sequences.
We design an FPRAS to count the number of bases of any matroid given by an independent set oracle, and to estimate the partition function of the random cluster model of any matroid in the regime where 0<q<1. Consequently, we can sample random spanning forests in a graph and estimate the reliability polynomial of any matroid. We also prove the thirty year old conjecture of Mihail and Vazirani that the bases exchange graph of any matroid has edge expansion at least 1.
We present a polynomial-time randomized algorithm for estimating the permanent of an arbitrary n × n matrix with nonnegative entries. This algorithm (technically a "fully-polynomial randomized approximation scheme") computes an approximation that is, with high probability, within arbitrarily small specified relative error of the true value of the permanent.
Random walks on bounded degree expander graphs have numerous applications, both in theoretical and practical computational problems. A key property of these walks is that they converge rapidly to their stationary distribution. In this work we define high order random walks: These are generalizations of random walks on graphs to high dimensional simplicial complexes, which are the high dimensional analogues of graphs. A simplicial complex of dimension $d$ has vertices, edges, triangles, pyramids, up to $d$-dimensional cells. For any $0 \leq i < d$, a high order random walk on dimension $i$ moves between neighboring $i$-faces (e.g., edges) of the complex, where two $i$-faces are considered neighbors if they share a common $(i+1)$-face (e.g., a triangle). The case of $i=0$ recovers the well studied random walk on graphs. We provide a local-to-global criterion on a complex which implies rapid convergence of all high order random walks on it. Specifically, we prove that if the $1$-dimensional skeletons of all the links of a complex are spectral expanders, then for all $0 \le i < d$ the high order random walk on dimension $i$ converges rapidly to its stationary distribution. We derive our result through a new notion of high dimensional combinatorial expansion of complexes which we term colorful expansion. This notion is a natural generalization of combinatorial expansion of graphs and is strongly related to the convergence rate of the high order random walks. We further show an explicit family of bounded degree complexes which satisfy this criterion.
Specifically, we show that Ramanujan complexes meet this criterion, and thus form an explicit family of bounded degree high dimensional simplicial complexes in which all of the high order random walks converge rapidly to their stationary distribution.
Determinantal point processes (DPPs) are elegant probabilistic models of repulsion that arise in quantum physics and random matrix theory. In contrast to traditional structured models like Markov random fields, which become intractable and hard to approximate in the presence of negative correlations, DPPs offer efficient and exact algorithms for sampling, marginalization, conditioning, and other inference tasks. We provide a gentle introduction to DPPs, focusing on the intuitions, algorithms, and extensions that are most relevant to the machine learning community, and show how DPPs can be applied to real-world applications like finding diverse sets of high-quality search results, building informative summaries by selecting diverse sentences from documents, modeling non-overlapping human poses in images or video, and automatically building timelines of important news stories.
We give a deterministic polynomial time $2^{O(r)}$-approximation algorithm for the number of bases of a given matroid of rank $r$ and the number of common bases of any two matroids of rank $r$. To the best of our knowledge, this is the first nontrivial deterministic approximation algorithm that works for arbitrary matroids. Based on a lower bound of Azar, Broder, and Frieze [ABF94] this is almost the best possible result assuming oracle access to independent sets of the matroid. There are two main ingredients in our result: For the first, we build upon recent results of Adiprasito, Huh, and Katz [AHK15] and Huh and Wang [HW17] on combinatorial Hodge theory to derive a connection between matroids and log-concave polynomials. We expect that several new applications in approximation algorithms will be derived from this connection in the future. Formally, we prove that the multivariate generating polynomial of the bases of any matroid is log-concave as a function over the positive orthant. For the second ingredient, we develop a general framework for approximate counting in discrete problems, based on convex optimization. The connection goes through subadditivity of the entropy. For matroids, we prove that an approximate superadditivity of the entropy holds by relying on the log-concavity of the corresponding polynomials.
We study high order random walks on high dimensional expanders on simplicial complexes (i.e., hypergraphs). These walks walk from a k-face (i.e., a k-hyperedge) to a k-face if they are both contained in a (k+1)-face (i.e., a (k+1)-hyperedge). This naturally generalizes the random walks on graphs that walk from a vertex (0-face) to a vertex if they are both contained in an edge (1-face). Recent works have studied the spectrum of high order walk operators and deduced fast mixing. However, the spectral gap of high order walk operators is inherently small, due to natural obstructions (called coboundaries) that do not happen for walks on expander graphs. In this work we go beyond spectral gap, and relate the expansion of a function on k-faces (called a k-cochain; for k=0, this is a function on vertices) to its structure. We show a Decomposition Theorem: For every k-cochain defined on a high dimensional expander, there exists a decomposition of the cochain into i-cochains such that the square norm of the k-cochain is a sum of the square norms of the i-cochains and such that the more weight the k-cochain has on higher levels of the decomposition the better is its expansion, or equivalently, the better is its shrinkage by the high order random walk operator. The following corollaries are implied by the Decomposition Theorem. First, we characterize highly expanding k-cochains as those whose mass is concentrated on the highest levels of the decomposition that we construct. For example, a function on edges (i.e. a 1-cochain) which is locally thin (i.e. it contains few edges through every vertex) is highly expanding, while a function on edges that contains all edges through a single vertex is not highly expanding. Second, we get optimal mixing for high order random walks on Ramanujan complexes.
Ramanujan complexes are recently discovered bounded degree high dimensional expanders. The optimality in their mixing that we prove here enables us to obtain from them more efficient Two-Layer-Samplers than those presented by the previous work of Dinur and Kaufman.
A random walk on a finite graph can be used to construct a uniform random spanning tree. It is shown how random walk techniques can be applied to the study of several properties of the uniform random spanning tree: the proportion of leaves, the distribution of degrees, and the diameter.
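One well-known random-walk construction of a uniform spanning tree (commonly called the Aldous-Broder algorithm) runs a simple random walk until every vertex has been visited and keeps, for each vertex other than the start, the edge along which it was first entered. The sketch below assumes this is the kind of construction meant above, since the abstract does not spell it out; it also assumes a connected, unweighted, undirected graph given as adjacency lists, and the function name is hypothetical.

```python
import random


def random_walk_spanning_tree(adj, start=0, seed=None):
    """Build a spanning tree from the first-entrance edges of a random walk.

    adj -- mapping (dict or list) from each vertex to a list of its neighbors;
           the graph is assumed to be connected and undirected.
    """
    rng = random.Random(seed)
    visited = {start}
    tree = []
    current = start
    while len(visited) < len(adj):
        nxt = rng.choice(adj[current])  # one step of the simple random walk
        if nxt not in visited:
            # First time the walk enters `nxt`: record the edge used.
            visited.add(nxt)
            tree.append((current, nxt))
        current = nxt
    return tree
```

The running time is governed by the cover time of the walk, which can be much larger than the number of edges on some graphs.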
The motivation of this work is to extend the techniques of higher order random walks on simplicial complexes to analyze mixing times of Markov chains for combinatorial problems. Our main result is a sharp upper bound on the second eigenvalue of the down-up walk on a pure simplicial complex, in terms of the second eigenvalues of its links. We show some applications of this result in analyzing mixing times of Markov chains, including sampling independent sets of a graph and sampling common independent sets of two partition matroids.
The maximum volume $j$-simplex problem asks to compute the $j$-dimensional simplex of maximum volume inside the convex hull of a given set of $n$ points in $\mathbb{Q}^d$. We give a deterministic approximation algorithm for this problem which achieves an approximation ratio of $e^{j/2 + o(j)}$. The problem is known to be NP-hard to approximate within a factor of $c^j$ for some constant $c > 1$. Our algorithm also gives a factor $e^{j + o(j)}$ approximation for the problem of finding the principal $j \times j$ submatrix of a rank $d$ positive semidefinite matrix with the largest determinant. We achieve our approximation by rounding solutions to a generalization of the D-optimal design problem, or, equivalently, the dual of an appropriate smallest enclosing ellipsoid problem. Our arguments give a short and simple proof of a restricted invertibility principle for determinants.
Hyperbolic polynomials have their origins in partial differential equations. We show in this paper that they have applications in interior point methods for convex programming. Each homogeneous hyperbolic polynomial p has an associated open and convex cone called its hyperbolicity cone. We give an explicit representation of this cone in terms of polynomial inequalities. The function F(x) = −log p(x) is a logarithmically homogeneous self-concordant barrier function for the hyperbolicity cone with barrier parameter equal to the degree of p. The function F(x) possesses striking additional properties that are useful in designing long-step interior point methods. For example, we show that the long-step primal potential reduction methods of Nesterov and Todd and the surface-following methods of Nesterov and Nemirovskii extend to hyperbolic barrier functions. We also show that there exists a hyperbolic barrier function on every homogeneous cone.
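A standard special case may help make the barrier statement above concrete. The example below is chosen for illustration and is not taken from the abstract: the product polynomial is hyperbolic in the all-ones direction, its hyperbolicity cone is the open positive orthant, and the resulting barrier is the familiar logarithmic barrier with parameter $n$, matching the degree of $p$.

```latex
% Illustrative example (an assumption, not from the abstract):
% p(x) = x_1 x_2 \cdots x_n, hyperbolic in the direction e = (1, \dots, 1).
\[
  p(x) = \prod_{i=1}^{n} x_i , \qquad
  \Lambda_{++} = \{\, x \in \mathbb{R}^n : x_i > 0 \ \text{for all } i \,\}, \qquad
  F(x) = -\log p(x) = -\sum_{i=1}^{n} \log x_i .
\]
% Logarithmic homogeneity with parameter n = \deg p:
\[
  F(t x) = -\sum_{i=1}^{n} \log (t x_i) = F(x) - n \log t , \qquad t > 0 .
\]
```

Similarly, taking $p(X) = \det X$ on symmetric matrices gives the positive definite cone and the barrier $-\log\det X$.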
We prove the hard Lefschetz theorem and the Hodge-Riemann relations for a commutative ring associated to an arbitrary matroid M. We use the Hodge-Riemann relations to resolve a conjecture of Heron, Rota, and Welsh that postulates the log-concavity of the coefficients of the characteristic polynomial of M. We furthermore conclude that the $f$-vector of the independence complex of a matroid forms a log-concave sequence, proving a conjecture of Mason and Welsh for general matroids.
We develop bounds for the second largest eigenvalue and spectral gap of a reversible Markov chain. The bounds depend on geometric quantities such as the maximum degree, diameter and covering number of associated graphs. The bounds compare well with exact answers for a variety of simple chains and seem better than bounds derived through Cheeger-like inequalities. They offer improved rates of convergence for the random walk associated to approximate computation of the permanent.
We give an $m^{1+o(1)}\beta^{o(1)}$-time algorithm for generating uniformly random spanning trees in weighted graphs with max-to-min weight ratio $\beta$. In the process, we illustrate how fundamental tradeoffs in graph partitioning can be overcome by eliminating vertices from a graph using Schur complements of the associated Laplacian matrix.
We show fully polynomial time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum (J Stat Phys 1987) to be #P-hard, who also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but unlike them robustly degrade under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari et al. (FOCS 2020), providing a new tool for establishing spectral independence based on geometry of polynomials.
As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding (J Math Mech 1959) who showed homogeneous polynomials that have no roots in a half-plane must be log-concave over the positive orthant.
We show that the modified log-Sobolev constant for a natural Markov chain which converges to an r-homogeneous strongly log-concave distribution is at least 1/r. Applications include an asymptotically optimal mixing time bound for the bases-exchange walk for matroids, and a concentration bound for Lipschitz functions over these distributions.
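To make the bases-exchange walk concrete, here is a minimal sketch for the uniform distribution over spanning trees (bases of a graphic matroid). It assumes the usual down-up form of the walk: drop a uniformly random element, then add a uniformly random element that restores a basis. The function names and the union-find independence oracle are illustrative choices, not taken from the paper.

```python
import random


def is_forest(n, edges):
    """Independence oracle for the graphic matroid: True if edges are acyclic."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


def bases_exchange_step(n, ground_set, basis, rng):
    """One down-up step: drop a uniform element, then add a uniform valid one."""
    dropped = rng.choice(sorted(basis))
    rest = basis - {dropped}
    # The dropped element is always a valid candidate, so the list is non-empty.
    candidates = [e for e in ground_set
                  if e not in rest and is_forest(n, rest | {e})]
    return rest | {rng.choice(candidates)}


rng = random.Random(0)
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
tree = {(0, 1), (1, 2), (2, 3)}
for _ in range(200):
    tree = bases_exchange_step(4, k4_edges, tree, rng)
```

Every state visited is a spanning tree of the complete graph on four vertices; how many steps are needed for the walk to be close to uniform is exactly what mixing time bounds of this kind address.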
We give new lower and upper bounds on the permanent of a doubly stochastic matrix. Combined with previous work, this improves on the deterministic approximation factor. We also give a combinatorial application of the lower bound, proving S. Friedland's "Asymptotic Lower Matching Conjecture" for the monomer-dimer problem.
A polynomial p ∈ ℝ[z_1,…,z_n] is real stable if it has no roots in the upper half of the complex plane. Gurvits's permanent inequality gives a lower bound on the coefficient of the z_1z_2…z_n monomial of a real stable polynomial p with nonnegative coefficients. This fundamental inequality has been used to attack several counting and optimization problems. Here, we study a more general question: Given a stable multilinear polynomial p with nonnegative coefficients and a set of monomials S, we show that if the polynomial obtained by summing up all monomials in S is real stable, then we can lower bound the sum of coefficients of monomials of p that are in S. We also prove generalizations of this theorem to (real stable) polynomials that are not multilinear. We use our theorem to give a new proof of Schrijver's inequality on the number of perfect matchings of a regular bipartite graph, generalize a recent result of Nikolov and Singh, and give deterministic polynomial time approximation algorithms for several counting problems.
Several fundamental optimization and counting problems arising in computer science, mathematics and physics can be reduced to one of the following computational tasks involving polynomials and set systems: given oracle access to an m-variate real polynomial g and to a family of (multi-)subsets ℬ of [m], (1) compute the sum of coefficients of monomials in g corresponding to all the sets that appear in ℬ, or (2) find S ∈ ℬ such that the monomial in g corresponding to S has the largest coefficient in g. Special cases of these problems, such as computing permanents and mixed discriminants, sampling from determinantal point processes, and maximizing sub-determinants with combinatorial constraints have been topics of much recent interest in theoretical computer science.
We establish universal modified log-Sobolev inequalities for reversible Markov chains on the boolean lattice {0,1}^n, when the invariant law π satisfies a form of negative dependence known as the stochastic covering property. This condition is strictly weaker than the strong Rayleigh property, and is satisfied in particular by all determinantal measures, as well as the uniform distribution over the set of bases of any balanced matroid. In the special case where π is k-homogeneous, our results imply the celebrated concentration inequality for Lipschitz functions due to Pemantle and Peres (Combin. Probab. Comput. 23 (2014) 140–160). As another application, we deduce that the natural Monte-Carlo Markov chain used to sample from π has mixing time at most kn log log(1/π(x)) when initialized in state x. To the best of our knowledge, this is the first work relating negative dependence and modified log-Sobolev inequalities.
Let $A \in \Omega_n$ be a doubly-stochastic $n \times n$ matrix. Alexander Schrijver proved in 1998 the following remarkable inequality: $per(\widetilde{A}) \geq \prod_{1 \leq i,j \leq n} (1- A(i,j))$, where $\widetilde{A}(i,j) := A(i,j)(1-A(i,j))$ for $1 \leq i,j \leq n$. We use Schrijver's inequality to prove the following lower bound: $\frac{per(A)}{F(A)} \geq 1$, where $F(A) := \prod_{1 \leq i,j \leq n} (1- A(i,j))^{1- A(i,j)}$. We use this new lower bound to prove S. Friedland's Asymptotic Lower Matching Conjecture (LAMC) on the monomer-dimer problem. We use some ideas of our proof of (LAMC) to disprove the [Lu, Mohr, Szekely] positive correlation conjecture. We present explicit doubly-stochastic $n \times n$ matrices $A$ with the ratio $\frac{per(A)}{F(A)} = \sqrt{2}^{n}$; we conjecture that $\max_{A \in \Omega_n}\frac{per(A)}{F(A)} \approx (\sqrt{2})^{n}$ and give some examples supporting the conjecture. If true, the conjecture (and other ones stated in the paper) would imply a deterministic poly-time algorithm to approximate the permanent of $n \times n$ nonnegative matrices within the relative factor $(\sqrt{2})^{n}$. The best current such factor is $e^n$.
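The lower bound per(A) ≥ F(A) can be checked numerically on tiny matrices. The sketch below uses a brute-force permanent, which is feasible only for very small n, and the uniform doubly-stochastic matrix as a test case; the function names are hypothetical and the matrix is chosen purely for illustration.

```python
from itertools import permutations
from math import prod


def permanent(A):
    """Brute-force permanent: sum over all permutations (tiny n only)."""
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))


def lower_bound_F(A):
    """F(A) = prod over i,j of (1 - A[i][j]) ** (1 - A[i][j])."""
    return prod((1 - a) ** (1 - a) for row in A for a in row)


n = 3
uniform = [[1 / n] * n for _ in range(n)]  # every entry equals 1/n
ratio = permanent(uniform) / lower_bound_F(uniform)
```

For this matrix the permanent is 3!/3^3 and F(A) = (2/3)^6, so the ratio is above 1, consistent with the stated bound.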
On Mixing of Markov Chains: Coupling, Spectral Independence, and Entropy Factorization. Antonio Blanca, Pietro Caputo, Zongchen Chen, Daniel Parisi, Daniel Štefankovič, and Eric Vigoda. Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 3670–3692. DOI: https://doi.org/10.1137/1.9781611977073.145. Abstract: For general spin systems, we prove that a contractive coupling for an arbitrary local Markov chain implies optimal bounds on the mixing time and the modified log-Sobolev constant for a large class of Markov chains including the Glauber dynamics, arbitrary heat-bath block dynamics, and the Swendsen-Wang dynamics. This reveals a novel connection between probabilistic techniques for bounding the convergence to stationarity and analytic tools for analyzing the decay of relative entropy. As a corollary of our general results, we obtain O(n log n) mixing time and Ω(1/n) modified log-Sobolev constant of the Glauber dynamics for sampling random q-colorings of an n-vertex graph with constant maximum degree Δ when q > (11/6 − ε_0)Δ for some fixed ε_0 > 0. We also obtain O(log n) mixing time and Ω(1) modified log-Sobolev constant of the Swendsen-Wang dynamics for the ferromagnetic Ising model on an n-vertex graph of constant maximum degree when the parameters of the system lie in the tree uniqueness region. At the heart of our results are new techniques for establishing spectral independence of the spin system and block factorization of the relative entropy.
On one hand we prove that a contractive coupling of any local Markov chain implies spectral independence of the Gibbs distribution. On the other hand we show that spectral independence implies factorization of entropy for arbitrary blocks, establishing optimal bounds on the modified log-Sobolev constant of the corresponding block dynamics.
We show that the integrality gap of the natural LP relaxation of the Asymmetric Traveling Salesman Problem is polyloglog(n). In other words, there is a polynomial time algorithm that approximates the value of the optimum tour within a factor of polyloglog(n), where polyloglog(n) is a bounded degree polynomial of loglog(n). We prove this by showing that any k-edge-connected unweighted graph has a polyloglog(n)/k-thin spanning tree. Our main new ingredient is a procedure, albeit an exponentially sized convex program, that "transforms" graphs that do not admit any spectrally thin trees into those that provably have spectrally thin trees. More precisely, given a k-edge-connected graph G = (V, E) where k ≥ 7 log(n), we show that there is a matrix D that "preserves" the structure of all cuts of G such that for a set F ⊆ E that induces an Ω(k)-edge-connected graph, the effective resistance of every edge in F w.r.t. D is at most polylog(k)/k. Then, we use our extension of the seminal work of Marcus, Spielman, and Srivastava [1], fully explained in [2], to prove the existence of a polylog(k)/k-spectrally thin tree with respect to D. Such a tree is polylog(k)/k-combinatorially thin with respect to G as D preserves the structure of cuts of G.
The spectral independence approach of Anari et al. (2020) utilized recent results on high-dimensional expanders of Alev and Lau (2020) and established rapid mixing of the Glauber dynamics for the hard-core model defined on weighted independent sets. We develop the spectral independence approach for colorings, and obtain new algorithmic results for the corresponding counting/sampling problems. Let α∗ ≈ 1.763 denote the solution to exp(1/x) = x and let α > α∗. We prove that, for any triangle-free graph G = (V, E) with maximum degree Δ, for all q ≥ αΔ + 1, the mixing time of the Glauber dynamics for q-colorings is polynomial in n = |V|, with the exponent of the polynomial independent of Δ and q. In comparison, previous approximate counting results for colorings held for a similar range of q (asymptotically in Δ) but with larger girth requirement or with a running time where the polynomial exponent depended on Δ and q (exponentially). One further feature of using the spectral independence approach to study colorings is that it avoids many of the technical complications in previous approaches caused by coupling arguments or by passing to the complex plane; the key improvement on the running time is based on relatively simple combinatorial arguments which are then translated into spectral bounds.
We give a fully polynomial-time randomized approximation scheme (FPRAS) for the number of bases in bicircular matroids. This is a natural class of matroids for which counting bases exactly is #P-hard and yet approximate counting can be done efficiently.
We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in Õ(n^{5/3} m^{1/3}) time. The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm due to Colbourn et al. that runs in matrix multiplication time, O(n^ω). For the special case of unweighted graphs, this improves upon the best previously known running time of Õ(min{n^ω, m√n, m^{4/3}}) for m ≫ n^{7/4} (Colbourn et al. '96, Kelner-Madry '09, Madry et al. '15).
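The target distribution described above can be written out exactly on a toy graph by enumerating all spanning trees. The following sketch is only a brute-force reference (exponential time, useful for sanity-checking a sampler on small inputs), not the fast algorithm of the abstract; the function names and the triangle example are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def is_spanning_tree(n, edges):
    """True if the given edges are acyclic and there are exactly n - 1 of them."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return len(edges) == n - 1


def tree_distribution(n, weights):
    """Map each spanning tree to a probability proportional to its weight product."""
    trees = [t for t in combinations(sorted(weights), n - 1)
             if is_spanning_tree(n, t)]
    mass = {t: prod(Fraction(weights[e]) for e in t) for t in trees}
    total = sum(mass.values())
    return {t: m / total for t, m in mass.items()}


# Triangle with edge weights 1, 2, 3: every pair of edges is a spanning tree.
triangle = {(0, 1): 1, (1, 2): 2, (0, 2): 3}
dist = tree_distribution(3, triangle)
```

In the triangle, the three trees have weight products 2, 3 and 6, so their probabilities are 2/11, 3/11 and 6/11.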
We give a probabilistic introduction to determinantal and permanental point processes. Determinantal processes arise in physics (fermions, eigenvalues of random matrices) and in combinatorics (nonintersecting paths, random spanning trees). They have the striking property that the number of points in a region D is a sum of independent Bernoulli random variables, with parameters which are eigenvalues of the relevant operator on L^2(D). Moreover, any determinantal process can be represented as a mixture of determinantal projection processes. We give a simple explanation for these known facts, and establish analogous representations for permanental processes, with geometric variables replacing the Bernoulli variables. These representations lead to simple proofs of existence criteria and central limit theorems, and unify known results on the distribution of absolute values in certain processes with radially symmetric distributions.
We construct a quantum-inspired classical algorithm for computing the permanent of Hermitian positive semidefinite matrices, by exploiting a connection between these mathematical structures and the boson sampling model. Specifically, the permanent of a Hermitian positive semidefinite matrix can be expressed in terms of the expected value of a random variable, which stands for a specific photon-counting probability when measuring a linear-optically evolved random multimode coherent state. Our algorithm then approximates the matrix permanent from the corresponding sample mean and is shown to run in polynomial time for various sets of Hermitian positive semidefinite matrices, achieving a precision that improves over known techniques. This work illustrates how quantum optics may benefit algorithms development.
Strongly Rayleigh distributions are natural generalizations of product and determinantal probability distributions and satisfy the strongest form of negative dependence properties. We show that the “natural” Monte Carlo Markov Chain (MCMC) algorithm mixes rapidly in the support of a homogeneous strongly Rayleigh distribution. As a byproduct, our proof implies Markov chains can be used to efficiently generate approximate samples of a k-determinantal point process. This answers an open question raised by Deshpande and Rademacher (2010) which was studied recently by Kang (2013); Li et al. (2015); Rebeschini and Karbasi (2015).
For some positive constant ε_0, we give a (3/2 − ε_0)-approximation algorithm for the following problem: given a graph G_0 = (V, E_0), find the shortest tour that visits every vertex at least once. This is a special case of the metric traveling salesman problem when the underlying metric is defined by shortest path distances in G_0. The result improves on the 3/2-approximation algorithm due to Christofides [13] for this special case. Similar to Christofides, our algorithm finds a spanning tree whose cost is upper bounded by the optimum, then it finds the minimum cost Eulerian augmentation (or T-join) of that tree. The main difference is in the selection of the spanning tree. Except in certain cases where the solution of LP is nearly integral, we select the spanning tree randomly by sampling from a maximum entropy distribution defined by the linear programming relaxation.
Despite the simplicity of the algorithm, the analysis builds on a variety of ideas such as properties of strongly Rayleigh measures from probability theory, graph theoretical results on the structure of near minimum cuts, and the integrality of the T-join polytope from polyhedral theory. Also, as a byproduct of our result, we show new properties of the near minimum cuts of any graph, which may be of independent interest.
Catalan numbers arise in many enumerative contexts as the counting sequence of combinatorial structures. In this work, we consider natural Markov chains on some of the realizations of the Catalan sequence. While our main result is in deriving an O(n^2 log n) bound on the mixing time in L^2 (and hence total variation) distance for the random transposition chain on Dyck paths, we raise several open questions, including the optimality of the above bound. The novelty in our proof is in establishing a certain negative correlation property among random bases of lattice path matroids, including the so-called Catalan matroid which can be defined using Dyck paths.
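A random transposition chain on Dyck paths can be sketched as follows. The exact transition rule is an assumption here (two positions chosen independently and uniformly, with the swap accepted only when the result is still a Dyck path), and the function names are hypothetical; the paper's chain may differ in details such as laziness.

```python
import random


def is_dyck(path):
    """True if path (a list of +1/-1 steps) never dips below 0 and ends at 0."""
    height = 0
    for step in path:
        height += step
        if height < 0:
            return False
    return height == 0


def transposition_step(path, rng):
    """Pick two positions uniformly; swap them if the result is a Dyck path."""
    i, j = rng.randrange(len(path)), rng.randrange(len(path))
    proposal = list(path)
    proposal[i], proposal[j] = proposal[j], proposal[i]
    return proposal if is_dyck(proposal) else list(path)


rng = random.Random(0)
path = [1] * 5 + [-1] * 5  # start from the "mountain" path of semilength 5
for _ in range(1000):
    path = transposition_step(path, rng)
```

Because the proposal is symmetric and rejected moves leave the state unchanged, the uniform distribution over Dyck paths is stationary for this version of the chain.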
Friedland's Lower Matching Conjecture asserts that if $G$ is a $d$-regular bipartite graph on $v(G)=2n$ vertices, and $m_k(G)$ denotes the number of matchings of size $k$, then $$m_k(G)\geq {n \choose k}^2\left(\frac{d-p}{d}\right)^{n(d-p)}(dp)^{np},$$ where $p=\frac{k}{n}$. When $p=1$, this conjecture reduces to a theorem of Schrijver which says that a $d$-regular bipartite graph on $v(G)=2n$ vertices has at least $$\left(\frac{(d-1)^{d-1}}{d^{d-2}}\right)^n$$ perfect matchings. L. Gurvits proved an asymptotic version of the Lower Matching Conjecture, namely he proved that $$\frac{\ln m_k(G)}{v(G)}\geq \frac{1}{2}\left(p\ln \left(\frac{d}{p}\right)+(d-p)\ln \left(1-\frac{p}{d}\right)-2(1-p)\ln (1-p)\right)+o_{v(G)}(1).$$ In this paper, we prove the Lower Matching Conjecture. In fact, we will prove a slightly stronger statement which gives an extra $c_p\sqrt{n}$ factor compared to the conjecture if $p$ is separated away from $0$ and $1$, and is tight up to a constant factor if $p$ is separated away from $1$. We will also give a new proof of Gurvits's and Schrijver's theorems, and we extend these theorems to $(a,b)$-biregular bipartite graphs.
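The reduction to Schrijver's theorem at $p=1$ (that is, $k=n$) can be verified directly by substituting into the conjectured bound: $$ {n \choose n}^2\left(\frac{d-1}{d}\right)^{n(d-1)} d^{\,n} = \left(\frac{(d-1)^{d-1}}{d^{d-1}}\cdot d\right)^{n} = \left(\frac{(d-1)^{d-1}}{d^{d-2}}\right)^{n}. $$ As a small sanity check chosen here for illustration, the complete bipartite graph $K_{3,3}$ has $d=3$, $n=3$ and exactly $3!=6$ perfect matchings, while the bound evaluates to $(4/3)^3 = 64/27 \approx 2.37$.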
We consider finite-state Markov chains that can be naturally decomposed into smaller “projection” and “restriction” chains. Possibly this decomposition will be inductive, in that the restriction chains will be smaller copies of the initial chain. We provide expressions for Poincaré (resp. log-Sobolev) constants of the initial Markov chain in terms of Poincaré (resp. log-Sobolev) constants of the projection and restriction chains, together with a further parameter. In the case of the Poincaré constant, our bound is always at least as good as existing ones and, depending on the value of the extra parameter, may be much better. There appears to be no previously published decomposition result for the log-Sobolev constant. Our proofs are elementary and self-contained.
Generating random spanning trees more quickly than the cover time. Author: David Bruce Wilson, Department of Mathematics, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts. STOC '96: Proceedings of the twenty-eighth annual ACM symposium on Theory of Computing, July 1996, pages 296–303. https://doi.org/10.1145/237814.237880. Published: 01 July 1996.
Convex optimization problems arise frequently in many different fields. A comprehensive introduction to the subject, this book shows in detail how such problems can be solved numerically with great efficiency. The focus is on recognizing convex optimization problems and then finding the most appropriate technique for solving them. The text contains many worked examples and homework exercises and will appeal to students, researchers and practitioners in fields such as engineering, computer science, mathematics, statistics, finance, and economics.
Subset selection problems ask for a small, diverse yet representative subset of the given data. When pairwise similarities are captured by a kernel, the determinants of submatrices provide a measure of diversity or independence of items within a subset. Matroid theory gives another notion of independence, thus giving rise to optimization and sampling questions about Determinantal Point Processes (DPPs) under matroid constraints. Partition constraints, as a special case, arise naturally when incorporating additional labeling or clustering information, besides the kernel, in DPPs. Finding the maximum determinant submatrix under matroid constraints on its row/column indices has been previously studied. However, the corresponding question of sampling from DPPs under matroid constraints has been unresolved, beyond the simple cardinality constrained k-DPPs. We give the first polynomial time algorithm to sample exactly from DPPs under partition constraints, for any constant number of partitions. We complement this by a complexity theoretic barrier that rules out such a result under general matroid constraints. Our experiments indicate that partition-constrained DPPs offer more flexibility and more diversity than k-DPPs and their naive extensions, while being reasonably efficient in running time. We also show that a simple greedy initialization followed by local search gives improved approximation guarantees for the problem of MAP inference from k-DPPs on well-conditioned kernels. Our experiments show that this improvement is significant for larger values of k, supporting our theoretical result.