Author of the publication

Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels.

PMBS@SC, pages 1-6. IEEE, (2019)

Other publications of authors with the same name

MD-Bench: Engineering the in-core performance of short-range molecular dynamics kernels from state-of-the-art simulation packages. CoRR, (2023)

MD-Bench: A performance-focused prototyping harness for state-of-the-art short-range molecular dynamics algorithms. Future Gener. Comput. Syst., (December 2023)

Execution-Cache-Memory modeling and performance tuning of sparse matrix-vector multiplication and Lattice quantum chromodynamics on A64FX. Concurr. Comput. Pract. Exp., (2022)

CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion. CoRR, (2023)

Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels. PMBS@SC, pages 1-6. IEEE, (2019)

Dynamic Tensor Linearization and Time Slicing for Efficient Factorization of Infinite Data Streams. IPDPS, pages 402-412. IEEE, (2023)

Efficient, out-of-memory sparse MTTKRP on massively parallel architectures. ICS, pages 26:1-26:13. ACM, (2022)

ECM modeling and performance tuning of SpMV and Lattice QCD on A64FX. CoRR, (2021)

ALTO: adaptive linearized storage of sparse tensors. ICS, pages 404-416. ACM, (2021)

Reproducibility report: Team SegFAUlt @ SCC 2016. Parallel Comput., (2017)