Positive Definite Matrices

Authors

Rajendra Bhatia
Type: Book
Publication Date: 2009-12-31
Citations: 1098
DOI: https://doi.org/10.1515/9781400827787

Abstract

This book represents the first synthesis of the considerable body of new research into positive definite matrices. These matrices play the same role in noncommutative analysis as positive real numbers do in classical analysis. They have theoretical and computational uses across a broad spectrum of disciplines, including calculus, electrical engineering, statistics, physics, numerical analysis, quantum information theory, and geometry. Through detailed explanations and an authoritative and inspiring writing style, Rajendra Bhatia carefully develops general techniques that have wide applications in the study of such matrices. Bhatia introduces several key topics in functional analysis, operator theory, harmonic analysis, and differential geometry--all built around the central theme of positive definite matrices. He discusses positive and completely positive linear maps, and presents major theorems with simple and direct proofs. He examines matrix means and their applications, and shows how to use positive definite functions to derive operator inequalities that he and others proved in recent years. He guides the reader through the differential geometry of the manifold of positive definite matrices, and explains recent work on the geometric mean of several matrices. Positive Definite Matrices is an informative and useful reference book for mathematicians and other researchers and practitioners. The numerous exercises and notes at the end of each chapter also make it the ideal textbook for graduate-level courses.
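The central objects of the book can be illustrated concretely. Below is a minimal Python sketch (our illustration, not from the book; the function names are our own) that tests positive definiteness via a Cholesky attempt and computes the two-matrix geometric mean A # B, one of the matrix means the book studies:

```python
import numpy as np

def is_positive_definite(A):
    """Test positive definiteness of a Hermitian matrix via a Cholesky attempt."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

def spd_sqrt(A):
    """Principal square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def geometric_mean(A, B):
    """Two-matrix geometric mean A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    Ah = spd_sqrt(A)
    Ahi = np.linalg.inv(Ah)
    return Ah @ spd_sqrt(Ahi @ B @ Ahi) @ Ah

A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])
G = geometric_mean(A, B)
# G is again positive definite, and det(G) = sqrt(det(A) * det(B)),
# the matrix analogue of the scalar geometric mean.
```

This mirrors the sense in which positive definite matrices behave like positive numbers: the mean is symmetric in A and B and reduces to the scalar geometric mean when A and B commute.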

Locations

  • Princeton University Press eBooks
Linear algebra is a fundamental tool in many fields, including mathematics and statistics, computer science, economics, and the physical and biological sciences. This undergraduate textbook offers a complete second course in linear algebra, tailored to help students transition from basic theory to advanced topics and applications. Concise chapters promote a focused progression through essential ideas, and contain many examples and illustrative graphics. In addition, each chapter contains a bullet list summarising important concepts, and the book includes over 600 exercises to aid the reader's understanding. Topics are derived and discussed in detail, including the singular value decomposition, the Jordan canonical form, the spectral theorem, the QR factorization, normal matrices, Hermitian matrices (of interest to physics students), and positive definite matrices (of interest to statistics students).
(1970). Positive Definite Matrices. The American Mathematical Monthly: Vol. 77, No. 3, pp. 259-264.
The study of positive-definite matrices has focused on Hermitian matrices, that is, square matrices with complex (or real) entries that are equal to their own conjugate transposes. In the classical setting, positive-definite matrices enjoy a multitude of equivalent definitions and properties. We investigate when a square, symmetric matrix with entries coming from a finite field can be called "positive-definite" and discuss which of the classical equivalences and implications carry over.
We construct several examples of positive definite functions, and use the positive definite matrices arising from them to derive several inequalities for norms of operators. 1991 Mathematics Subject Classification 42A82, 47A63, 15A45, 15A60.
Totally positive matrices constitute a particular class of matrices, the study of which was initiated by analysts because of its many applications in diverse areas. This account of the subject is comprehensive and thorough, with careful treatment of the central properties of totally positive matrices, full proofs and a complete bibliography. The history of the subject is also described: in particular, the book ends with a tribute to the four people who have made the most notable contributions to the history of total positivity: I. J. Schoenberg, M. G. Krein, F. R. Gantmacher and S. Karlin. This monograph will appeal to those with an interest in matrix theory, to those who use or have used total positivity, and to anyone who wishes to learn about this rich and interesting subject.
This work is dedicated to the study of the set of positive definite symmetric matrices as an example of a Bruhat-Tits space. We prove that as a metric space it is complete and obeys the semi-parallelogram law. We also study the matrix exponential function, defined on the set of symmetric matrices. In the first chapter, we state the semi-parallelogram law and define Bruhat-Tits metric spaces. Furthermore, we give a fixed point theorem and present two proofs; the first by Bruhat and Tits, whereas the second was given by Serre. In the second chapter we study positive definite matrices and define the trace scalar product on symmetric matrices. Afterwards we state the metric-increasing property of the exponential function. We conclude the chapter by proving the basic theorem that the positive definite symmetric matrices constitute a Bruhat-Tits space. In the third chapter, we prove the metric-increasing property of the exponential map, as stated in the second chapter. In the introduction we provide the necessary background and results, most of which are stated without proof, that are useful for our main work, with emphasis on the exponential function, since it is a fundamental tool of our work. Finally, in the appendix, we briefly discuss the space of positive definite symmetric matrices as a Riemannian manifold and we state an equivalent-condition theorem from which we can conclude that this space has semi-negative curvature.
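The semi-parallelogram law mentioned here can be checked numerically. The following sketch (our illustration, not from the thesis) uses the standard Riemannian distance d(A, B) = ||log(A^{-1/2} B A^{-1/2})||_F on positive definite matrices, with the geometric mean as the geodesic midpoint:

```python
import numpy as np

def spd_fun(A, f):
    """Apply a scalar function to a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def dist(A, B):
    """Riemannian distance d(A, B) = ||log(A^{-1/2} B A^{-1/2})||_F."""
    Ai = spd_fun(A, lambda w: 1.0 / np.sqrt(w))
    return np.linalg.norm(spd_fun(Ai @ B @ Ai, np.log), 'fro')

def midpoint(A, B):
    """Geodesic midpoint of A and B: the geometric mean A # B."""
    Ah = spd_fun(A, np.sqrt)
    Ai = spd_fun(A, lambda w: 1.0 / np.sqrt(w))
    return Ah @ spd_fun(Ai @ B @ Ai, np.sqrt) @ Ah

rng = np.random.default_rng(0)
def rand_spd(n):
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

X1, X2, Z = rand_spd(3), rand_spd(3), rand_spd(3)
M = midpoint(X1, X2)
lhs = dist(X1, X2)**2 + 4 * dist(Z, M)**2
rhs = 2 * dist(Z, X1)**2 + 2 * dist(Z, X2)**2
# Semi-parallelogram law (nonpositive curvature): lhs <= rhs for every Z.
```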
For q-dimensional data, penalized versions of the sample covariance matrix are important when the sample size is small or modest relative to q. Since the negative log-likelihood under multivariate normal sampling is convex in Σ^{-1}, the inverse of the covariance matrix, it is common to consider additive penalties which are also convex in Σ^{-1}. More recently, Deng and Tsui and Yu et al. have proposed penalties which are strictly functions of the roots of Σ and are convex in log Σ, but not in Σ^{-1}. The resulting penalized optimization problems, though, are neither convex in log Σ nor in Σ^{-1}. In this article, however, we show these penalized optimization problems to be geodesically convex in Σ. This allows us to establish the existence and uniqueness of the corresponding penalized covariance matrices. More generally, we show that geodesic convexity in Σ is equivalent to convexity in log Σ for penalties which are functions of the roots of Σ. In addition, when using such penalties, the resulting penalized optimization problem reduces to a q-dimensional convex optimization problem on the logs of the roots of Σ, which can then be readily solved via Newton's algorithm. Supplementary materials for this article are available online.
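The reduction described in the last sentences can be sketched as follows, assuming a hypothetical penalty λ Σ_i (log d_i)^2 on the log-roots (the specific penalties of Deng and Tsui and Yu et al. may differ). Since such a penalty is orthogonally invariant, the minimizer shares eigenvectors with the sample covariance S, and the problem splits into one-dimensional convex problems solved by Newton's method:

```python
import numpy as np

def penalized_covariance(S, lam, iters=50):
    """Penalized covariance estimate for the (hypothetical) penalty
    lam * sum_i (log d_i)^2 on the roots d_i of Sigma.

    Because the penalty depends only on the roots, the minimizer of
    log det(Sigma) + tr(S Sigma^{-1}) + penalty shares eigenvectors with S,
    and the problem splits into 1-D convex problems in x_i = log d_i:
        f(x) = x + s * exp(-x) + lam * x**2,
    each solved here by Newton's method."""
    s, V = np.linalg.eigh(S)
    x = np.log(np.maximum(s, 1e-8))          # start at the sample log-roots
    for _ in range(iters):
        grad = 1.0 - s * np.exp(-x) + 2.0 * lam * x
        hess = s * np.exp(-x) + 2.0 * lam    # > 0: the 1-D problems are convex
        x -= grad / hess
    return (V * np.exp(x)) @ V.T

S = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_hat = penalized_covariance(S, lam=1.0)  # log-roots shrunk toward 0
```

With lam = 0 the penalty vanishes and the sample covariance S is recovered; increasing lam pulls the roots of the estimate toward 1.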
Given two symmetric and positive semidefinite square matrices $A, B$, is it true that any matrix given as the product of $m$ copies of $A$ and $n$ copies of $B$ in a particular sequence must be dominated in the spectral norm by the ordered matrix product $A^m B^n$? For example, is $$\| AABAABABB \| \leq \| AAAAABBBB \|?$$ Drury [Electron. J. Linear Algebra 18 (2009), pp. 13–20] has characterized precisely which disordered words have the property that an inequality of this type holds for all matrices $A, B$. However, the 1-parameter family of counterexamples Drury constructs for these characterizations is comprised of $3 \times 3$ matrices, and thus as stated the characterization applies only for $N \times N$ matrices with $N \geq 3$. In contrast, we prove that for $2 \times 2$ matrices, the general rearrangement inequality holds for all disordered words. We also show that for larger $N \times N$ matrices, the general rearrangement inequality holds for all disordered words for most $A, B$ (in a sense of full measure) that are sufficiently small perturbations of the identity.
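The $2 \times 2$ case of the rearrangement inequality can be spot-checked numerically; this sketch (our own, with one arbitrary disordered word) compares a disordered product against the ordered product $A^5 B^4$ in the spectral norm:

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_psd(n):
    M = rng.standard_normal((n, n))
    return M @ M.T

def word_norm(word, A, B):
    """Spectral norm of the matrix product spelled out by `word`."""
    P = np.eye(A.shape[0])
    for c in word:
        P = P @ (A if c == 'A' else B)
    return np.linalg.norm(P, 2)

A, B = rand_psd(2), rand_psd(2)
lhs = word_norm('AABAABABB', A, B)   # a disordered word in A, B
rhs = word_norm('AAAAABBBB', A, B)   # the ordered word A^5 B^4
# For 2x2 PSD matrices the paper proves lhs <= rhs for every disordered word.
```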
Abstract: In this article, two inequalities related to $2\times 2$ block sector partial transpose matrices are proved, and we also present a unitarily invariant norm inequality for the Hua matrix which is sharper than an existing result.
For a matrix $X\in \mathbb{R}^{n\times p}$, we provide an analytic formula that keeps track of an orthonormal basis for the range of $X$ under rank-one modifications. More precisely, we consider rank-one adaptations $X_{new} = X + ab^T$ of a given $X$ with known matrix factorization $X = UW$, where $U\in \mathbb{R}^{n\times p}$ is column-orthogonal and $W\in \mathbb{R}^{p\times p}$ is invertible. Arguably, the most important methods that produce such factorizations are the singular value decomposition (SVD), where $X = UW = U(\Sigma V^T)$, and the QR-decomposition, where $X = UW = QR$. We give a geometric description of rank-one adaptations and derive a closed-form expression for the geodesic line that travels from the subspace $\mathcal{S} = \mathrm{ran}(X)$ to the subspace $\mathcal{S}_{new} = \mathrm{ran}(X_{new}) = \mathrm{ran}(U_{new}W_{new})$. This leads to update formulas for orthogonal matrix decompositions, where both $U_{new}$ and $W_{new}$ are obtained via elementary rank-one matrix updates in $\mathcal{O}(np)$ time for $n\gg p$. Moreover, this allows us to determine the subspace distance and the Riemannian midpoint between the subspaces $\mathcal{S}$ and $\mathcal{S}_{new}$ without additional computational effort.
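The geometric picture (a rank-one adaptation moves $\mathrm{ran}(X)$ along a geodesic with at most one nonzero principal angle, since the two ranges share a $(p-1)$-dimensional subspace) can be verified directly; the sketch below uses plain QR and SVD rather than the paper's $\mathcal{O}(np)$ update formulas:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 5
X = rng.standard_normal((n, p))
a = rng.standard_normal(n)
b = rng.standard_normal(p)
X_new = X + np.outer(a, b)                  # rank-one adaptation X + a b^T

# Orthonormal bases for ran(X) and ran(X_new) via thin QR
U, _ = np.linalg.qr(X)
U_new, _ = np.linalg.qr(X_new)

# Principal angles between the two ranges, from the SVD of U^T U_new
sigma = np.clip(np.linalg.svd(U.T @ U_new, compute_uv=False), -1.0, 1.0)
theta = np.arccos(sigma)                    # ascending: smallest angles first
dist = np.linalg.norm(theta)                # Riemannian distance on the Grassmannian

# ran(X_new) lies inside span(ran(X), a), so the two p-dimensional ranges
# share a (p-1)-dimensional subspace: all principal angles but one vanish.
```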
Many statistical settings call for estimating a population parameter, most typically the population mean, based on a sample of matrices. The most natural estimate of the population mean is the arithmetic mean, but there are many other matrix means that may behave differently, especially in high dimensions. Here we consider the matrix harmonic mean as an alternative to the arithmetic matrix mean. We show that in certain high-dimensional regimes, the harmonic mean yields an improvement over the arithmetic mean in estimation error as measured by the operator norm. Counter-intuitively, studying the asymptotic behavior of these two matrix means in a spiked covariance estimation problem, we find that this improvement in operator norm error does not imply better recovery of the leading eigenvector. We also show that a Rao-Blackwellized version of the harmonic mean is equivalent to a linear shrinkage estimator studied previously in the high-dimensional covariance estimation literature, while applying a similar Rao-Blackwellization to regularized sample covariance matrices yields a novel nonlinear shrinkage estimator. Simulations complement the theoretical results, illustrating the conditions under which the harmonic matrix mean yields an empirically better estimate.
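The two means compared in this work are simple to state; a minimal sketch (ours, ignoring the Rao-Blackwellization) illustrates the operator AM-HM inequality, i.e., that the harmonic mean never exceeds the arithmetic mean in the Loewner order:

```python
import numpy as np

def arithmetic_mean(mats):
    return sum(mats) / len(mats)

def harmonic_mean(mats):
    """Matrix harmonic mean: the inverse of the average of the inverses."""
    return np.linalg.inv(arithmetic_mean([np.linalg.inv(M) for M in mats]))

rng = np.random.default_rng(3)
mats = []
for _ in range(10):
    G = rng.standard_normal((4, 4))
    mats.append(G @ G.T + np.eye(4))        # random positive definite samples

A_bar = arithmetic_mean(mats)
H_bar = harmonic_mean(mats)
# Operator AM-HM inequality: A_bar - H_bar is positive semidefinite.
```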
This paper's main objective is to find new upper bounds for the norm of the sum of two Hilbert space operators and their Kronecker product. The obtained results extend some previously known results from the set of positive operators to arbitrary ones and refine several existing bounds. In particular, applications of the established bounds include refining celebrated numerical radius inequalities and the operator Cauchy–Schwarz norm inequality.
Covariance matrices have found success in several computer vision applications, including activity recognition, visual surveillance, and diffusion tensor imaging. This is because they provide an easy platform for fusing multiple features compactly. An important task in all of these applications is to compare two covariance matrices using a (dis)similarity function, for which the common choice is the Riemannian metric on the manifold inhabited by these matrices. As this Riemannian manifold is not flat, the dissimilarities should take into account the curvature of the manifold. As a result, such distance computations tend to slow down, especially when the matrix dimensions are large or gradients are required. Further, suitability of the metric to enable efficient nearest neighbor retrieval is an important requirement in the contemporary times of big data analytics. To alleviate these difficulties, this paper proposes a novel dissimilarity measure for covariances, the Jensen-Bregman LogDet Divergence (JBLD). This divergence enjoys several desirable theoretical properties and at the same time is computationally less demanding (compared to standard measures). Utilizing the fact that the square root of JBLD is a metric, we address the problem of efficient nearest neighbor retrieval on large covariance datasets via a metric tree data structure. To this end, we propose a K-Means clustering algorithm on JBLD. We demonstrate the superior performance of JBLD on covariance datasets from several computer vision applications.
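The divergence itself is inexpensive to compute, needing only determinants rather than eigenvalues or matrix logarithms. A minimal sketch (ours) of J(X, Y) = log det((X+Y)/2) - (1/2) log det(XY) and the metric property of its square root:

```python
import numpy as np

def jbld(X, Y):
    """Jensen-Bregman LogDet Divergence:
    J(X, Y) = log det((X + Y) / 2) - (1/2) log det(X Y)."""
    _, ld_mid = np.linalg.slogdet((X + Y) / 2)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_mid - 0.5 * (ld_x + ld_y)

def jbld_metric(X, Y):
    """The square root of JBLD is a metric on positive definite matrices."""
    return np.sqrt(max(jbld(X, Y), 0.0))

rng = np.random.default_rng(4)
def rand_spd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

X, Y, Z = rand_spd(5), rand_spd(5), rand_spd(5)
# jbld is symmetric, vanishes iff X == Y, and jbld_metric obeys the triangle
# inequality: jbld_metric(X, Z) <= jbld_metric(X, Y) + jbld_metric(Y, Z).
```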
We analyse linear maps of operator algebras mapping the set of rank-k projectors onto the set of rank-l projectors surjectively. A complete characterisation of such maps for prime is provided. A particular case corresponding to is well known as Wigner's theorem. Hence our result may be considered as a generalisation of this celebrated result of Wigner.
The auxiliary function of a classical channel appears in two fundamental quantities, the random coding exponent and the sphere-packing exponent, which yield upper and lower bounds on the error probability … The auxiliary function of a classical channel appears in two fundamental quantities, the random coding exponent and the sphere-packing exponent, which yield upper and lower bounds on the error probability of decoding, respectively. A crucial property of the auxiliary function is its concavity, and this property consequently leads to several important results in finite blocklength analysis. In this paper, we prove that the auxiliary function of a classical-quantum channel also enjoys the same concavity property, extending an earlier partial result to its full generality. We also prove that the auxiliary function satisfies the data-processing inequality, among various other important properties. Furthermore, we show that the concavity property of the auxiliary function enables a geometric interpretation of the random coding exponent and the sphere-packing exponent of a classical-quantum channel. The key component in our proof is an important result from the theory of matrix geometric means.
In the context of control and estimation under information constraints, restoration entropy measures the minimal required data rate above which the state of a system can be estimated so that the estimation quality does not degrade over time and, conversely, can be improved. The remote observer here is assumed to receive its data through a communication channel of finite bit-rate capacity. In this paper, we provide a new characterization of the restoration entropy which does not require computing any temporal limit, i.e., an asymptotic quantity. Our new formula is based on the idea of finding a specific Riemannian metric on the state space which makes the metric-dependent upper estimate of the restoration entropy as tight as one wishes.
We present a systematic study of the geometric Rényi divergence (GRD), also known as the maximal Rényi divergence, from the point of view of quantum information theory. We show that this divergence, together with its extension to channels, has many appealing structural properties. For example, we prove a chain rule inequality that immediately implies the amortization collapse for the geometric Rényi divergence, addressing an open question by Berta et al. [arXiv:1808.01498, Equation (55)] in the area of quantum channel discrimination. As applications, we explore various channel capacity problems and construct new channel information measures based on the geometric Rényi divergence, sharpening the previously best-known bounds based on the max-relative entropy while still keeping the new bounds single-letter efficiently computable. A plethora of examples are investigated and the improvements are evident for almost all cases.
In this paper, we prove the convexity of trace functionals $$(A,B,C)\mapsto \text{Tr}|B^{p}AC^{q}|^{s},$$ for parameters $(p,q,s)$ that are best possible, where $B$ and $C$ are any $n$-by-$n$ positive definite matrices, and … In this paper, we prove the convexity of trace functionals $$(A,B,C)\mapsto \text{Tr}|B^{p}AC^{q}|^{s},$$ for parameters $(p,q,s)$ that are best possible, where $B$ and $C$ are any $n$-by-$n$ positive definite matrices, and $A$ is any $n$-by-$n$ matrix. We also obtain the monotonicity versions of trace functionals of this type. As applications, we extend some results in \cite{HP12quasi,CFL16some} and resolve a conjecture in \cite{RZ14} in the matrix setting. Other conjectures in \cite{RZ14} will also be discussed. We also show that some related trace functionals are not concave in general. Such concavity results were expected to hold in different problems.
In this paper, we analyze the process of "assembling" new matrix geometric means from existing ones, through function composition or limit processes. We show that for n = 4 a new matrix mean exists which is simpler to compute than the existing ones. Moreover, we show that for n > 4 the existing proving strategies cannot provide a mean computationally simpler than the existing ones.
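One classical way of "assembling" a mean for several matrices from the two-variable mean is an Ando-Li-Mathias-style limit process; the sketch below (our illustration of the general idea, not the n = 4 construction of this paper) iterates (A, B, C) → (B#C, A#C, A#B) to a common limit:

```python
import numpy as np

def spd_sqrt(A):
    """Principal square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def gmean2(A, B):
    """Two-matrix geometric mean A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    Ah = spd_sqrt(A)
    Ahi = np.linalg.inv(Ah)
    return Ah @ spd_sqrt(Ahi @ B @ Ahi) @ Ah

def gmean3(A, B, C, tol=1e-10, iters=300):
    """Assemble a three-matrix mean as a limit process: the iteration
    (A, B, C) -> (B#C, A#C, A#B) contracts to a common limit."""
    for _ in range(iters):
        A1, B1, C1 = gmean2(B, C), gmean2(A, C), gmean2(A, B)
        done = np.linalg.norm(A1 - A) < tol
        A, B, C = A1, B1, C1
        if done:
            break
    return A

rng = np.random.default_rng(5)
def rand_spd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

A0, B0, C0 = rand_spd(3), rand_spd(3), rand_spd(3)
G = gmean3(A0, B0, C0)
# Each step preserves det(A)det(B)det(C), so at the common limit
# det(G)^3 == det(A0) * det(B0) * det(C0).
```

The iteration converges only linearly, which is one reason computationally simpler constructions for n matrices are of interest.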