MATLAB Answers

Allocating pinned memory in matlab mex with CUDA

11 views (last 30 days)
Ander Biguri
Ander Biguri on 18 Feb 2019
Commented: Ander Biguri on 13 Aug 2019
I have an application where I call my own CUDA fucntions from a mex. However, the memory transferred can be very big (both input and output) and that means that pinned memory can help me speed up the process quite a lot.
I have seen several posts in teh internet and in hre mentioning that you can not use pinned memory (cudaMallocHost) with MTALAB variables, however all these are from 2017 or older. Now that we are in 2019, and the parallel computing toolbox, CUDA and MATLAB have changed a lot, is this still true? Can pinned memory not be used still? For applications where memory is critical this is a big drawback.
  6 Comments
Ander Biguri
Ander Biguri on 13 Aug 2019
Hi Matt,
You are absolutely right. In fact, a small test that I did not long ago showed that particularly for SART (which updates images projection by ptojection), an acceleration of x10 is expected if the memory trasnfer is removed and all the data is kept in the GPU.
For industrial/scientific sizes of images, SART would still be very slow and not recoomended. For medical images, this improvement may be very welcomed.
Now, about modifying TIGRE: it may be a challenging task.
Recenltly I updated TIGRE (https://arxiv.org/pdf/1905.03748.pdf) to work with multi-GPUs where the trasnfer to CPU may be required, as TIGRE now will break up the problem in chuncks if it does not fit the GPU, thus allowing for recosntruction bigger than before. Modifying this version is quite a huge workload as it would require quite big changes in the CUDA side, as there is a lot of memory management involved.
However, modifying the older single-GPU version will likely be considerably easier. Some changes in the CUDA code will be required (as its who passes memory in and out of the GPU), but there are just few lines to do the job. If you were to modify it to have dedicated gpuArrays and succeed, we could find a way to add it to the TIGRE code, and I could add some logic for when to use each of the versions (depending on problem size, number of GPUs, etc). If you are up for the task, please feel free to email me and we can discuss it further.
Ander

Sign in to comment.

Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!