Article

Simulating cortical networks on heterogeneous multi-GPU systems

Journal of Parallel and Distributed Computing (2012). Article in press.
DOI: 10.1016/j.jpdc.2012.02.006

Abstract

Recent advances in neuroscientific understanding have highlighted the highly parallel computational power of the mammalian neocortex. In this paper we describe a GPGPU-accelerated implementation of an intelligent learning model inspired by the structural and functional properties of the neocortex. Furthermore, we consider two inefficiencies inherent to our initial implementation and propose software optimizations to mitigate such problems. Analysis of our application’s behavior and performance provides important insights into the GPGPU architecture, including the number of cores, the memory system, atomic operations, and the global thread scheduler. Additionally, we create a runtime profiling tool for the cortical network that proportionally distributes work across the host CPU as well as multiple GPGPUs available to the system. Using the profiling tool with these optimizations on Nvidia’s CUDA framework, we achieve up to 60× speedup over a single-threaded CPU implementation of the model.
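
The abstract does not say how the profiling tool divides work, so the following is only an illustrative sketch of what proportional distribution across a CPU and several GPUs could look like. It assumes that each device's throughput has already been measured by a short profiling run. The function name `partition_work`, the device labels, and the throughput figures are hypothetical and are not taken from the paper.

```python
from typing import Dict


def partition_work(num_units: int, throughput: Dict[str, float]) -> Dict[str, int]:
    """Split num_units work items across devices in proportion to throughput.

    throughput maps a device label to a measured rate (work items per second).
    This is an assumed scheme (largest-remainder rounding), not the paper's.
    """
    if num_units < 0:
        raise ValueError("num_units must be non-negative")
    if any(rate < 0 for rate in throughput.values()):
        raise ValueError("throughput values must be non-negative")
    total = sum(throughput.values())
    if total <= 0:
        raise ValueError("at least one device needs positive throughput")

    # Ideal (fractional) share for each device.
    exact = {dev: num_units * rate / total for dev, rate in throughput.items()}
    # Start from the rounded-down shares.
    shares = {dev: int(x) for dev, x in exact.items()}
    # Hand the leftover items to the devices with the largest fractional parts.
    leftover = num_units - sum(shares.values())
    by_fraction = sorted(exact, key=lambda dev: exact[dev] - shares[dev], reverse=True)
    for dev in by_fraction[:leftover]:
        shares[dev] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical rates: two GPUs that are 6x and 3x faster than the host CPU.
    print(partition_work(1000, {"cpu": 1.0, "gpu0": 6.0, "gpu1": 3.0}))
```

Rounding with the largest-remainder rule keeps the shares summing exactly to the total, so no work item is dropped or assigned twice when the ideal shares are fractional.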
