Accelerating scientific computations with mixed precision algorithms

Type: Article

Publication Date: 2008-11-14

Citations: 177

DOI: https://doi.org/10.1016/j.cpc.2008.11.005

Abstract

On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here applies not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.

Program summary

Program title: ITER-REF
Catalogue identifier: AECO_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AECO_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 7211
No. of bytes in distributed program, including test data, etc.: 41 862
Distribution format: tar.gz
Programming language: FORTRAN 77
Computer: desktop, server
Operating system: Unix/Linux
RAM: 512 Mbytes
Classification: 4.8
External routines: BLAS (optional)
Nature of problem: On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution.
Solution method: Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved.
A common approach to the solution of linear systems, either dense or sparse, is to perform an LU factorization of the coefficient matrix using Gaussian elimination: the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is generally used to improve numerical stability, resulting in the factorization PA = LU, where P is a permutation matrix. The system is then solved by first solving Ly = Pb (forward substitution) and then Ux = y (backward substitution). Due to round-off errors, the computed solution x carries a numerical error magnified by the condition number of the coefficient matrix A. To improve the computed solution, an iterative process can be applied that produces a correction to the computed solution at each iteration; this is the method commonly known as iterative refinement. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision.
Running time: seconds/minutes
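The scheme described above can be sketched in a few lines. The distributed program ITER-REF is FORTRAN 77 with optional BLAS; the following is only an illustrative pure-Python sketch (not the distributed code), which emulates single precision by rounding every arithmetic result to binary32. The factorization and the correction solves run in the emulated single precision, while residuals and solution updates are kept in double precision.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def lu_factor_single(A):
    """LU factorization with partial pivoting, PA = LU, rounding every
    arithmetic result to single precision. L and U are packed in one
    matrix; piv records the row permutation P."""
    n = len(A)
    LU = [[f32(v) for v in row] for row in A]
    piv = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        LU[k], LU[p] = LU[p], LU[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            LU[i][k] = f32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = f32(LU[i][j] - f32(LU[i][k] * LU[k][j]))
    return LU, piv

def lu_solve(LU, piv, b, rnd):
    """Solve Ly = Pb (forward substitution), then Ux = y (backward
    substitution); `rnd` sets the working precision (f32 here)."""
    n = len(b)
    y = [rnd(b[p]) for p in piv]           # apply P, then Ly = Pb
    for i in range(n):
        for j in range(i):
            y[i] = rnd(y[i] - rnd(LU[i][j] * y[j]))
    x = y[:]                               # Ux = y
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] = rnd(x[i] - rnd(LU[i][j] * x[j]))
        x[i] = rnd(x[i] / LU[i][i])
    return x

def mixed_precision_solve(A, b, iters=10):
    """Factor once in single precision, then refine: residuals in double
    precision, corrections via the cheap single precision factors."""
    LU, piv = lu_factor_single(A)
    x = lu_solve(LU, piv, b, f32)          # initial single precision solve
    for _ in range(iters):
        # residual r = b - Ax, computed in double precision
        r = [bi - sum(aij * xj for aij, xj in zip(row, x))
             for row, bi in zip(A, b)]
        z = lu_solve(LU, piv, r, f32)      # correction from existing factors
        x = [xi + zi for xi, zi in zip(x, z)]  # update in double precision
    return x
```

For a well-conditioned system the refined solution reaches double precision accuracy after a handful of iterations, even though the expensive O(n^3) factorization was performed only once, in single precision; a production code would also test the residual norm and stop early rather than run a fixed iteration count.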

Locations

  ‱ Computer Physics Communications
  ‱ arXiv (Cornell University)
  ‱ Estudo Geral (Universidade de Coimbra)
  ‱ Portuguese National Funding Agency for Science, Research and Technology (RCAAP Project by FCT)
  ‱ HAL (Le Centre pour la Communication Scientifique Directe)
  ‱ DataCite API

Similar Works

  ‱ A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic (2020). Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin Carson, Terry Cojean, Jack Dongarra, Mark Gates, Thomas GrĂŒtzmacher, Nicholas J. Higham, Xiaoye Sherry Li
  ‱ MPCR: Multi- And Mixed-Precision Computations (2024). David Helmy
  ‱ Peer Review #1 of "Performance impact of precision reduction in sparse linear systems solvers (v0.2)" (2022). Nicholas J. Higham, Craig Lucas, Françoise Tisseur, Mawussi Zounon
  ‱ Peer Review #2 of "Performance impact of precision reduction in sparse linear systems solvers (v0.2)" (2022). Takeshi Nanri
  ‱ Peer Review #1 of "Performance impact of precision reduction in sparse linear systems solvers (v0.1)" (2022). Nicholas J. Higham, Craig Lucas, Françoise Tisseur, Mawussi Zounon
  ‱ Peer Review #2 of "Performance impact of precision reduction in sparse linear systems solvers (v0.1)" (2022). Takeshi Nanri
  ‱ Mixed-precision numerics in scientific applications: survey and perspectives (2024). Aditya Kashi, Hao LĂŒ, Wesley Brewer, David Rogers, Michael J. Matheson, Mallikarjun Shankar, Feiyi Wang
  ‱ Performance Evaluation of Mixed Precision Algorithms for Solving Sparse Linear Systems (2020). Mawussi Zounon, Nicholas J. Higham, Craig Lucas, Françoise Tisseur
  ‱ The Impact of Multicore on Math Software and Exploiting Single Precision Computing to Obtain Double Precision Results (2006). Jack Dongarra
  ‱ Fast and reliable solutions for numerical linear algebra solvers in high-performance computing (2012). Marc Baboulin
  ‱ Numerical methods for the accelerated resolution of large scale linear systems on massively parallel hybrid architecture (2015). Abal‐Kassim Cheik Ahamed
  ‱ Performance Enhancement Strategies for Sparse Matrix-Vector Multiplication (SpMV) and Iterative Linear Solvers (2022). Thaha Mohammed, Rashid Mehmood
  ‱ Mixed Precision Numerical Linear Algebra (Final Report) (2022). Erin Carson
  ‱ Mixed-Precision Schemes for Linear Algebra Kernels on GPUs (2021). Gordon Euhyun Moon, Sivasankaran Rajamanickam
  ‱ A Study of Mixed Precision Strategies for GMRES on GPUs (2021). Jennifer Loe, Christian Glusa, Ichitaro Yamazaki, Erik G. Boman, Sivasankaran Rajamanickam
  ‱ Synergistic CPU-FPGA Acceleration of Sparse Linear Algebra (2020). Mohammadreza Soltaniyeh, Richard P. Martin, Santosh Nagarakatte
  ‱ Sparse Linear System Solvers on GPUs: Parallel Preconditioning, Workload Balancing, and Communication Reduction (2019). Goran Flegar
  ‱ Mixed precision iterative solver for the solution of large systems of linear equations in electromagnetic exploration (2016). Samuel RodrĂ­guez, Mauricio Hanzich, Vladimir Puzyrev, Santiago Castaño FernĂĄndez

Works That Cite This (48)

  ‱ The Impact of Compilation Flags and Choosing Single- or Double-Precision Variables in Linear Systems Solvers (2023). Rafaela C. Brum, Maria Clicia Castro, Cristiane Oliveira de Faria
  ‱ Algebraic Error Analysis for Mixed-Precision Multigrid Solvers (2021). Stephen F. McCormick, Joseph Benzaken, Rasmus Tamstorf
  ‱ QRkit: Sparse, Composable QR Decompositions for Efficient and Stable Solutions to Problems in Computer Vision (2018). Jan Svoboda, Thomas J. Cashman, Andrew Fitzgibbon
  ‱ Combining Sparse Approximate Factorizations with Mixed-precision Iterative Refinement (2023). Patrick Amestoy, Alfredo Buttari, Nicholas J. Higham, Jean-Yves L’Excellent, ThĂ©o Mary, Bastien VieublĂ©
  ‱ Mixed‐Precision for Linear Solvers in Global Geophysical Flows (2022). Jan Ackmann, Peter Dueben, T. N. Palmer, Piotr K. Smolarkiewicz
  ‱ GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement (2012). Hartwig Anzt, Piotr Ɓuszczek, Jack Dongarra, Vincent Heuveline
  ‱ Efficient implementation of symplectic implicit Runge-Kutta schemes with simplified Newton iterations (2017). Mikel Antoñana, Joseba Makazaga, Ander Murua
  ‱ Iterative diagonalization of symmetric matrices in mixed precision (2011). Eiji Tsuchida, Yoong‐Kee Choe
  ‱ Mixed-Precision Parallel Linear Programming Solver (2010). Mujahed Eleyat, Lasse Natvig
  ‱ Leveraging Mixed Precision in Exponential Time Integration Methods (2023). Cody J. Balos, Steven Roberts, David J. Gardner