parallel.gpu.CUDAKernel() not working

King Fisher on 27 Nov 2011
I have written the following .cu code:
__global__ void addToVector(float * pi, float c, int vecLen) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if ( idx < vecLen ) {
        pi[idx] += c;
    }
}
I then compiled it with the following command, which generates a code.ptx file:
nvcc -ptx code.cu
The problem is that when I use these files to create a kernel object with
k = parallel.gpu.CUDAKernel('code.ptx','code.cu')
the following message appears:
??? Error using ==> iParseToken at 266
Unsupported type in argument specification "fahad.cu".
Error in ==>
/usr/local*/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>iParseCPrototype at 192
Error in ==>
/usr/local/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>handleKernelArgs at 79
Note that there is no problem with the hardware or drivers; the gpuArray command works fine.
Could anybody help me with this?

Answers (1)

Edric Ellis on 28 Nov 2011
CUDAKernel is very restrictive about precisely which types of argument it accepts; in particular, it only accepts basic "C" types (and const/pointer variants). Do you have any other __global__ entry points in the .cu file that you compiled to .ptx?
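To illustrate the restriction, here is a minimal sketch of a kernel whose signature uses only basic C types (a pointer, a float, and an int), together with a standalone host check that runs it outside MATLAB. The extern "C" wrapper, the CHECK macro, the countCorrect helper, and the test values are assumptions added for illustration and are not part of the original post. The extern "C" wrapper only keeps the PTX entry name unmangled (addToVector rather than _Z11addToVectorPffi); it is not claimed to be required by CUDAKernel.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err_ = (call);                               \
        if (err_ != cudaSuccess) {                               \
            std::fprintf(stderr, "CUDA error: %s\n",             \
                         cudaGetErrorString(err_));              \
            std::exit(EXIT_FAILURE);                             \
        }                                                        \
    } while (0)

// Only basic C types appear in the signature.
// extern "C" (an addition, not in the original post) avoids C++ name mangling.
extern "C" __global__ void addToVector(float *pi, float c, int vecLen) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < vecLen) {
        pi[idx] += c;
    }
}

// Runs the kernel on n zeros and returns how many elements equal c afterwards.
int countCorrect(int n, float c) {
    float *host = (float *)std::calloc(n, sizeof(float));
    assert(host != NULL);
    float *dev = NULL;
    CHECK(cudaMalloc((void **)&dev, n * sizeof(float)));
    CHECK(cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up so all n are covered
    addToVector<<<blocks, threads>>>(dev, c, n);
    CHECK(cudaGetLastError());
    // cudaMemcpy waits for the kernel to finish before copying back.
    CHECK(cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost));

    int ok = 0;
    for (int i = 0; i < n; ++i) {
        if (host[i] == c) ++ok;
    }
    CHECK(cudaFree(dev));
    std::free(host);
    return ok;
}

int main() {
    int n = 1000;
    int ok = countCorrect(n, 2.5f);
    std::printf("%d of %d elements correct\n", ok, n);
    return ok == n ? 0 : 1;
}
```

If this standalone check passes, the kernel itself is probably sound, which would point toward the way the arguments or the .ptx file are being handled on the MATLAB side.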
  2 Comments
King Fisher on 29 Nov 2011
I have the following simple .cu code:
__global__ void addToVector(float * pi, float c, int vecLen) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if ( idx < vecLen ) {
        pi[idx] += c;
    }
}
After compiling it with the command nvcc -ptx code.cu, I got the following .ptx code:
.version 1.4
.target sm_10, map_f64_to_f32
// compiled with /usr/local/cuda/open64/lib//be
// nvopencc 4.0 built on 2011-05-12
//-----------------------------------------------------------
// Compiling /tmp/tmpxft_00006747_00000000-9_fahad.cpp3.i (/tmp/ccBI#.8BZJzq)
//-----------------------------------------------------------
//-----------------------------------------------------------
// Options:
//-----------------------------------------------------------
// Target:ptx, ISA:sm_10, Endian:little, Pointer Size:64
// -O3 (Optimization level)
// -g0 (Debug level)
// -m2 (Report advisories)
//-----------------------------------------------------------
.file 1 "<command-line>"
.file 2 "/tmp/tmpxft_00006747_00000000-8_fahad.cudafe2.gpu"
.file 3 "/usr/lib/gcc/x86_64-redhat-linux/4.4.5/include/stddef.h"
.file 4 "/usr/local/cuda/bin/../include/crt/device_runtime.h"
.file 5 "/usr/local/cuda/bin/../include/host_defines.h"
.file 6 "/usr/local/cuda/bin/../include/builtin_types.h"
.file 7 "/usr/local/cuda/bin/../include/device_types.h"
.file 8 "/usr/local/cuda/bin/../include/driver_types.h"
.file 9 "/usr/local/cuda/bin/../include/surface_types.h"
.file 10 "/usr/local/cuda/bin/../include/texture_types.h"
.file 11 "/usr/local/cuda/bin/../include/vector_types.h"
.file 12 "/usr/local/cuda/bin/../include/device_launch_parameters.h"
.file 13 "/usr/local/cuda/bin/../include/crt/storage_class.h"
.file 14 "/usr/include/bits/types.h"
.file 15 "/usr/include/time.h"
.file 16 "fahad.cu"
.file 17 "/usr/local/cuda/bin/../include/common_functions.h"
.file 18 "/usr/local/cuda/bin/../include/math_functions.h"
.file 19 "/usr/local/cuda/bin/../include/math_constants.h"
.file 20 "/usr/local/cuda/bin/../include/device_functions.h"
.file 21 "/usr/local/cuda/bin/../include/sm_11_atomic_functions.h"
.file 22 "/usr/local/cuda/bin/../include/sm_12_atomic_functions.h"
.file 23 "/usr/local/cuda/bin/../include/sm_13_double_functions.h"
.file 24 "/usr/local/cuda/bin/../include/sm_20_atomic_functions.h"
.file 25 "/usr/local/cuda/bin/../include/sm_20_intrinsics.h"
.file 26 "/usr/local/cuda/bin/../include/surface_functions.h"
.file 27 "/usr/local/cuda/bin/../include/texture_fetch_functions.h"
.file 28 "/usr/local/cuda/bin/../include/math_functions_dbl_ptx1.h"
.entry _Z11addToVectorPffi (
.param .u64 __cudaparm__Z11addToVectorPffi_pi,
.param .f32 __cudaparm__Z11addToVectorPffi_c,
.param .s32 __cudaparm__Z11addToVectorPffi_vecLen)
{
.reg .u16 %rh<4>;
.reg .u32 %r<6>;
.reg .u64 %rd<6>;
.reg .f32 %f<5>;
.reg .pred %p<3>;
.loc 16 1 0
$LDWbegin__Z11addToVectorPffi:
mov.u16 %rh1, %ctaid.x;
mov.u16 %rh2, %ntid.x;
mul.wide.u16 %r1, %rh1, %rh2;
cvt.u32.u16 %r2, %tid.x;
add.u32 %r3, %r2, %r1;
ld.param.s32 %r4, [__cudaparm__Z11addToVectorPffi_vecLen];
setp.le.s32 %p1, %r4, %r3;
@%p1 bra $Lt_0_1026;
.loc 16 4 0
ld.param.u64 %rd1, [__cudaparm__Z11addToVectorPffi_pi];
cvt.s64.s32 %rd2, %r3;
mul.wide.s32 %rd3, %r3, 4;
add.u64 %rd4, %rd1, %rd3;
ld.global.f32 %f1, [%rd4+0];
ld.param.f32 %f2, [__cudaparm__Z11addToVectorPffi_c];
add.f32 %f3, %f1, %f2;
st.global.f32 [%rd4+0], %f3;
$Lt_0_1026:
.loc 16 6 0
exit;
$LDWend__Z11addToVectorPffi:
} // _Z11addToVectorPffi
The error occurs when I run
k = parallel.gpu.CUDAKernel('code.ptx','code.cu')
which produces the following message:
??? Error using ==> iParseToken at 266
Unsupported type in argument specification "fahad.cu".
Error in ==>
/usr/local*/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>iParseCPrototype at 192
Error in ==>
/usr/local/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>handleKernelArgs at 79
Edric Ellis on 29 Nov 2011
I'm not sure, but it looks like you're using a CUDA-4 compiler and R2011a. R2011a shipped with the CUDA 3.2 toolkit, so you might have better luck using that version of NVCC.
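One way to check which toolkit a given nvcc belongs to is to print the version it compiles against. The sketch below uses the CUDA runtime calls cudaRuntimeGetVersion and cudaDriverGetVersion and the CUDART_VERSION macro; the helper names versionMajor and versionMinor are illustrative. It assumes the usual encoding of 1000 * major + 10 * minor, under which CUDA 3.2 is 3020 and CUDA 4.0 is 4000.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Decode the toolkit's integer version (1000 * major + 10 * minor).
int versionMajor(int v) { return v / 1000; }
int versionMinor(int v) { return (v % 1000) / 10; }

int main() {
    int runtimeVersion = 0;
    int driverVersion = 0;
    // Both calls return cudaSuccess when the query works.
    if (cudaRuntimeGetVersion(&runtimeVersion) != cudaSuccess ||
        cudaDriverGetVersion(&driverVersion) != cudaSuccess) {
        std::fprintf(stderr, "Could not query CUDA versions\n");
        return 1;
    }
    // CUDART_VERSION is fixed at compile time by the toolkit headers nvcc uses.
    std::printf("Compiled against toolkit %d.%d\n",
                versionMajor(CUDART_VERSION), versionMinor(CUDART_VERSION));
    std::printf("Runtime library %d.%d, driver supports up to %d.%d\n",
                versionMajor(runtimeVersion), versionMinor(runtimeVersion),
                versionMajor(driverVersion), versionMinor(driverVersion));
    return 0;
}
```

Running nvcc --version on the command line reports the toolkit version as well, and the header of the .ptx listing earlier in this thread already names nvopencc 4.0.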
