Anatomy of High-performance Matrix Multiplication

Abstract

We present the basic principles that underlie the high-performance implementation of the matrix-matrix multiplication that is part of the widely used GotoBLAS library. Design decisions are justified by successively refining a model of architectures with multilevel memories. A simple but effective algorithm for executing this operation results. Implementations on a broad selection of architectures are shown to achieve near-peak performance.

BibTeX key: Goto:2008:AHM:1356052.1356053
Entry type: article
Address: New York, NY, USA
Year: 2008
Month: may
Journal: ACM Trans. Math. Softw.
Number: 3
Pages: 12:1--12:25
Publisher: ACM
Volume: 34
acmid: 1356053
numpages: 25
articleno: 12
issn: 0098-3500
issue_date: May 2008
DOI: 10.1145/1356052.1356053
URL: http://doi.acm.org/10.1145/1356052.1356053

