CUDA Kernels and long vectors
4 views (last 30 days)
Show older comments
At the moment i'm writing my first programs with MATLAB and Cuda. I wrote a kernel in .cu file, compiled to ptx and execute it with a feval function.
I have a ThreadBlockSize of 1024 and a maximum gridsize of [65535 65535].
Everything works fine so far, but i have some troubles with the indexing in my kernel.
When I want to add vectors with length 10e7 i am not able to get the index in the kernels right.
In my m-File I set the my grid size like
nBlocks = ceil(nPoints / k_calc.ThreadBlockSize(1));
if nBlocks <= 65535
k_calc.GridSize = nBlocks;
k_calc.GridSize = [65535 ceil(nBlocks/65535)];
In my example with a vector with 10e7 elements, 10e7 is bigger than 65535*1024, so i have a gridsize of [65535 2].
In my cu-kernel I tried the index
int idx = blockDim.x * blockIdx.x + threadIdx.x;
but this is wrong for elements with index greater than 65535*1024. Which cuda-variable tells me in which row of my grid i am?
gridDim.x gives me only the Dimensions not the current location as far as i know.
Thank you very much, Raphael
Ben Tordoff on 26 Feb 2013
Edited: Ben Tordoff on 26 Feb 2013
If you want to go down the "x" dimension first, you probably want
int const globalBlockIdx = blockIdx.y * gridDim.x + blockIdx.x;
int const globalThreadIdx = globalBlockIdx * blockDim.x + threadIdx.x;
or something similar. This assumes your grid is only 2D and your blocks are only 1D.
Find more on Get Started with GPU Coder in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!Start Hunting!