Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an $m \times n$ matrix. (i) For a dense input matrix, randomized algorithms require $\bigO(mn \log(k))$ floating-point operations (flops) in contrast to $\bigO(mnk)$ for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to $\bigO(k)$ passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
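To make the two-stage framework described in the abstract concrete, the following is a minimal NumPy sketch of a randomized partial SVD: random sampling identifies a subspace that captures most of the action of the matrix, the matrix is compressed to that subspace, and the small reduced matrix is factored deterministically. The function name randomized_svd, the oversampling parameter p, and the power-iteration count q are illustrative assumptions rather than notation taken from the paper, and a plain Gaussian test matrix is used instead of the structured random matrices that give the $\bigO(mn \log(k))$ flop count for dense inputs.

import numpy as np

def randomized_svd(A, k, p=10, q=2, rng=None):
    # Illustrative sketch; parameter names p (oversampling) and q (power
    # iterations) are assumptions, not the paper's notation.
    rng = np.random.default_rng(rng)
    n = A.shape[1]

    # Stage A: random sampling to identify a subspace capturing most of the
    # action of A.
    Omega = rng.standard_normal((n, k + p))   # Gaussian test matrix
    Y = A @ Omega                             # sample the range of A
    Q, _ = np.linalg.qr(Y)                    # orthonormal basis for the sample
    for _ in range(q):                        # optional power iterations for slowly decaying spectra
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)

    # Stage B: compress A to the subspace and factor the small reduced matrix
    # with a deterministic (classical) SVD.
    B = Q.T @ A                               # (k + p) x n reduced matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                                # lift the left factors back to the original space
    return U[:, :k], s[:k], Vt[:k, :]

# Example usage: rank-10 approximation of a 2000 x 500 test matrix.
A = np.random.default_rng(0).standard_normal((2000, 500))
U, s, Vt = randomized_svd(A, k=10)
residual = np.linalg.norm(A - (U * s) @ Vt)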
@article{halko2011finding,
abstract = {
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an $m \times n$ matrix. (i) For a dense input matrix, randomized algorithms require $\bigO(mn \log(k))$ floating-point operations (flops) in contrast to $ \bigO(mnk)$ for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to $\bigO(k)$ passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.},
author = {Halko, N. and Martinsson, P. G. and Tropp, J. A.},
doi = {10.1137/090771806},
journal = {{SIAM} Review},
keywords = {15b52-random-matrices-algebraic-aspects 60b20-random-matrices-probabilistic-aspects 62-07-data-analysis 65f20-overdetermined-systems-pseudoinverses 65f30-other-matrix-algorithms 65f99-numerical-linear-algebra-none-of-the-above 65y05-parallel-computation 68w20-randomized-algorithms 68w30-symbolic-computation-algebraic-computation},
number = 2,
pages = {217--288},
publisher = {Society for Industrial {\&} Applied Mathematics ({SIAM})},
title = {Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions},
url = {http://epubs.siam.org/doi/abs/10.1137/090771806},
volume = 53,
year = 2011
}