Article

Simulating cortical networks on heterogeneous multi-GPU systems

A. Nere, S. Franey, A. Hashmi, and M. Lipasti.
Journal of Parallel and Distributed Computing (2012). Article in press.
DOI: 10.1016/j.jpdc.2012.02.006

Abstract

Recent advances in neuroscientific understanding have highlighted the highly parallel computation power of the mammalian neocortex. In this paper we describe a GPGPU-accelerated implementation of an intelligent learning model inspired by the structural and functional properties of the neocortex. Furthermore, we consider two inefficiencies inherent to our initial implementation and propose software optimizations to mitigate such problems. Analysis of our application’s behavior and performance provides important insights into the GPGPU architecture, including the number of cores, the memory system, atomic operations, and the global thread scheduler. Additionally, we create a runtime profiling tool for the cortical network that proportionally distributes work across the host CPU as well as multiple GPGPUs available to the system. Using the profiling tool with these optimizations on Nvidia’s CUDA framework, we achieve up to 60× speedup over a single-threaded CPU implementation of the model.

BibTeX key: nere_simulating_2012
entry type: article
year: 2012
journal: Journal of Parallel and Distributed Computing
urldate: 2012-08-28
issn: 0743-7315
bdsk-url-1: http://www.sciencedirect.com/science/article/pii/S0743731512000408
bdsk-url-2: http://dx.doi.org/10.1016/j.jpdc.2012.02.006
DOI: 10.1016/j.jpdc.2012.02.006
url: http://www.sciencedirect.com/science/article/pii/S0743731512000408
note: Article in press

Cite this publication

@article{nere_simulating_2012,
  abstract = {Recent advances in neuroscientific understanding have highlighted the highly parallel computation power of the mammalian neocortex. In this paper we describe a {GPGPU-accelerated} implementation of an intelligent learning model inspired by the structural and functional properties of the neocortex. Furthermore, we consider two inefficiencies inherent to our initial implementation and propose software optimizations to mitigate such problems. Analysis of our application{\textquoteright}s behavior and performance provides important insights into the {GPGPU} architecture, including the number of cores, the memory system, atomic operations, and the global thread scheduler. Additionally, we create a runtime profiling tool for the cortical network that proportionally distributes work across the host {CPU} as well as multiple {GPGPUs} available to the system. Using the profiling tool with these optimizations on Nvidia{\textquoteright}s {CUDA} framework, we achieve up to 60{\texttimes}~speedup over a single-threaded {CPU} implementation of the model.},
  added-at = {2014-01-19T06:56:06.000+0100},
  author = {Nere, Andrew and Franey, Sean and Hashmi, Atif and Lipasti, Mikko},
  bdsk-url-1 = {http://www.sciencedirect.com/science/article/pii/S0743731512000408},
  bdsk-url-2 = {http://dx.doi.org/10.1016/j.jpdc.2012.02.006},
  biburl = {https://www.bibsonomy.org/bibtex/21566acd6f5ca69c63fdbf146157845ba/neurokernel},
  doi = {10.1016/j.jpdc.2012.02.006},
  interhash = {7348a9130e73f963a04e6078d5f2544f},
  intrahash = {1566acd6f5ca69c63fdbf146157845ba},
  issn = {0743-7315},
  journal = {Journal of Parallel and Distributed Computing},
  keywords = {cuda gpu simulation},
  note = {Article in press},
  timestamp = {2014-01-19T06:56:06.000+0100},
  title = {Simulating cortical networks on heterogeneous multi-{GPU} systems},
  url = {http://www.sciencedirect.com/science/article/pii/S0743731512000408},
  urldate = {2012-08-28},
  year = 2012
}
