Marvin is a deep learning framework designed first and foremost to be hackable. It is naively simple for fast prototyping, is written in basic C/C++, and depends only on CUDA and cuDNN.
This first post in a series on CUDA C and C++ covers the basic concepts of parallel programming on the CUDA platform with C/C++.
int i = blockDim.x * blockIdx.x + threadIdx.x;
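The index expression above maps each (block, thread) pair of a one-dimensional launch to a unique global element index. The following is a minimal sketch in plain C++ that emulates that arithmetic on the CPU, so it can be checked without a GPU. The names `LaunchPoint`, `globalIndex`, `blocksFor`, and `emulateLaunch` are hypothetical helpers introduced only for illustration. They are not part of the CUDA API, and the nested loop only stands in for what the hardware would run in parallel.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for CUDA's built-in blockDim.x, blockIdx.x and
// threadIdx.x (illustration only; real kernels read the built-ins directly).
struct LaunchPoint {
    unsigned blockDimX;   // number of threads per block
    unsigned blockIdxX;   // index of this block within the grid
    unsigned threadIdxX;  // index of this thread within its block
};

// Same arithmetic as the index expression: blockDim.x * blockIdx.x + threadIdx.x
inline unsigned globalIndex(const LaunchPoint& p) {
    return p.blockDimX * p.blockIdxX + p.threadIdxX;
}

// Number of blocks needed to cover n elements (ceiling division).
inline unsigned blocksFor(unsigned n, unsigned blockDimX) {
    return (n + blockDimX - 1) / blockDimX;
}

// Sequential emulation of a 1D launch over n elements.
// Returns, for each element, how many emulated threads touched it.
std::vector<int> emulateLaunch(unsigned n, unsigned blockDimX) {
    assert(blockDimX > 0);
    std::vector<int> hits(n, 0);
    const unsigned gridDimX = blocksFor(n, blockDimX);
    for (unsigned b = 0; b < gridDimX; ++b) {
        for (unsigned t = 0; t < blockDimX; ++t) {
            const unsigned i = globalIndex(LaunchPoint{blockDimX, b, t});
            if (i < n) {   // bounds guard: the last block may overshoot n
                ++hits[i];
            }
        }
    }
    return hits;
}
```

With 1000 elements and 256 threads per block, `blocksFor` yields 4 blocks (1024 emulated threads), and the bounds guard discards the 24 indices past the end, so every element is visited exactly once. A real kernel typically needs the same `i < n` check whenever the element count is not a multiple of the block size.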
A. Dallmann, P. Beck, and J. von Gudenberg. Parallel Processing and Applied Mathematics, volume 8385 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2014)