An example, pulled from *CUDA by Example* by J. Sanders and E. Kandrot. The CUDA runtime model allows two grid dimensions (blocks per grid) and three block dimensions (threads per block). My question is: what do the dimensions of the last argument to `CUDAFunctionLoad` mean, what does the optional last argument to a `CUDAFunction` mean, and how does one use the total dimensionality of 5 that is permissible in CUDA?

The `gridDim` variable holds the number of thread blocks in each dimension of the grid, and `blockDim` holds the number of threads in each dimension of a block. Note that any dimension missing from the constructor is assumed to be 1. The CUDA data type `dim3` is used to define the number of threads in our block. For the GOL kernel we specify a two-dimensional block size to better suit our problem's geometry; for copying, a simple one-dimensional size is best.
Launch the kernel (`<<<` and `>>>` are CUDA runtime syntax for the execution configuration):

```cpp
dim3 grid(2, 2);    // number of blocks
dim3 block(8, 2);   // threads per block
hellocuda<<<grid, block>>>();
```

![CUDA dim3 grid and block](http://www.geeks3d.com/public/jegx/200906/cupp.jpg)
There is now a follow-on question here: A simple experiment to understand CUDAFunctionLoad.

```cpp
dim3 dimBlock(4, 8, 8);        // 256 threads per (3D) block
size_t sharedMemBytes = 64;    // 64 bytes of shared memory
```
```cpp
// define Grid, Block
kernel<<<Grid, Block>>>(...);
```

`Grid` gives the dimension and size of the grid (in blocks); `Block` gives the dimension and size of each block (in threads). Note that any unspecified `dim3` field initializes to 1.

I have a related question/request here: Looking for a working Mathematica CUDA port of NVIDIA's nbody.cu.
![CUDA grid and block organization](https://face2ai.com/CUDA-F-2-3-%E7%BB%84%E7%BB%87%E5%B9%B6%E8%A1%8C%E7%BA%BF%E7%A8%8B/2_2.png)
The purpose of the question is to understand how Mathematica interfaces with CUDA's architecture. This is a follow-up question to: CUDA: setting grid dimensions.