Compiler-Based Data Prefetching and Streaming Non-temporal
Store Generation for the Intel(R) Xeon Phi(TM) Coprocessor
R. Krishnaiyer, E. Kultursay, P. Chawla, S. Preis, A. Zvezdin, and H. Saito. Proceedings of the 2013 IEEE 27th International Symposium on
Parallel and Distributed Processing Workshops and PhD Forum, pages 1575--1586. Washington, DC, USA, IEEE Computer Society (2013)