Developer at MathWorks

Professional Interests: parallel computing, numerical analysis, distributed arrays, GPU

Answered

GPU, Interpn and Meshgrid

If you look at the help for gpuArray/interpn in R2014a it shows an example of 4D interpolation. My theory is that your inputs t...

6 years ago | 0

Answered

Can I have 8 workers on a quad-core processor?

When determining the default size of the local cluster, MATLAB uses the number of true cores you have (hyperthreading is ignored...

6 years ago | 5

| accepted

Answered

MATLAB R2014a in Ubuntu 14.04 doesn't recognize NVIDIA GeForce GTX770

Yes, it is because you are using the "nouveau" driver. You need to get a compute driver from the <http://www.nvidia.com/Downloa...

6 years ago | 0

| accepted

Answered

Parfor loop variable cannot be classified

The parfor variable classification is getting confused by the type of indexing you are performing with the microMovie variable....

6 years ago | 2

| accepted

Answered

question regarding what is allowed in a parfor loop

This documentation is trying to explain when local variables defined in a parfor loop body are available for use outside the par...

6 years ago | 0

| accepted

Answered

Parfor error with bsxfun

I think that the parfor analysis is getting a bit confused by the multidimensional indexing into var2. If you re-write the code...

6 years ago | 1

| accepted

Answered

classification error: slicing struct for parfor loop

It might help to store parforloopData(i,1) in a temporary variable inside the loop, use the temporary variable everywhere, and t...

6 years ago | 1

| accepted

Answered

Hyperthreading & Number of Cores. Parallel computing toolbox

Under the Parallel Menu, choose "Manage Cluster Profiles...". This will pull up the Cluster Profile Manager window. Choose the...

6 years ago | 10

| accepted

Answered

CPU vs GPU - Is it reasonable?

While I can't comment exactly on the CPU/GPU comparison for your specific setup, I can say that the general rule of thumb in GPU...

6 years ago | 3

| accepted

Answered

Parallelization of matrix definition

Without more context for your problem I cannot be sure, but you may find using distributed arrays inside an spmd block to be mor...

6 years ago | 1

Answered

Error using gpuArray/arrayfun ... Use of functional workspace is not supported.

As of the latest MATLAB release (R2013b) arrayfun on the GPU does not support anonymous functions that have been created using v...

6 years ago | 1

Answered

When calling the matlabpool to work in parallel, does each worker work in single-thread mode?

Workers are started up in single-thread mode in order to avoid over-subscribing the machine and negatively impacting performance...

6 years ago | 1

| accepted

Answered

gpuArray columnwise operations on matrix?

It looks like you could set up all the data in one pass. You might try organizing your data such that the matrix to use for c...

7 years ago | 0

Answered

Are parallel for loops available on GPUs?

While there is no general capability to execute for loops directly on the GPU, if all of the operations inside the for loop are ...

7 years ago | 0

| accepted

Answered

Can MATLAB's Parallel Computing Toolbox LU-factorize a large sparse matrix on the GPU?

The Parallel Computing Toolbox does not support sparse matrices on the GPU.

7 years ago | 1

Answered

Matrix multiply slices of 3D matrices

If you have MATLAB R2013b, you can use the new gpuArray pagefun function like so: C = pagefun(@mtimes, A, B);

7 years ago | 1

| accepted

Answered

Memory leaks in simple MEX function

There is a known memory leak in R2013a only. If possible, I would apply the patch associated with external bug report 954239 to...

7 years ago | 2

Answered

Matrix operations with Parallel Computing Toolbox

A number of MATLAB functions are multithreaded, and matrix multiplication (*) is one of them. This is what you are seeing durin...

7 years ago | 1

Answered

Benchmarking A\b on the GPU runs on CPU in parallel?

The CPU portion of this benchmark is multithreaded, so you should see more than one core working during the linear system solve....

7 years ago | 0

Answered

Does copying from gpu to host generally take longer than copying from host to gpu?

I think you have arrived at the same conclusion as this blog post on GPU performance. It provides a lot of detail on how to pro...

7 years ago | 1

| accepted

Answered

why code using parallel processing has longer running time than the other?

There is some confusion here regarding distributed arrays, both in the initial post and in the answer by David Sanchez. Distrib...

7 years ago | 0

Answered

Can I run mex functions (containing no CUDA code) using arrayfun

You cannot run a MEX function using gpuArray/arrayfun. In fact, this is specified in the help for gpuArray/arrayfun: FUN...

7 years ago | 1

Answered

why using gpu.Array.zeros I have error Undefined variable "parallel" or class "parallel.gpu.gpuArray.zeros" ?

What version of MATLAB are you using? In R2010b-R2012a releases, the GPU object was named parallel.gpu.GPUArray. In those ...

7 years ago | 1

| accepted

Answered

1) Why for dense matrices it's not useful to split B? 2) What are the factors that determine the speed up for the sparse case?

I'll have to leave your second question for someone else to answer, as I am not sure of the details. As for your first question...

7 years ago | 2

| accepted

Answered

Is it possible to perform parallel computing with lsqnonlin?

In the latest release (R2013a), lsqnonlin does not support parallel computations. The most up-to-date information on what opt...

7 years ago | 1

| accepted

Answered

Missing USEParallel Parameter from BLOCKPROC function in my matlab

The UseParallel option to blockproc was added in MATLAB version R2011b. That is why you are seeing an "unknown parameter" err...

7 years ago | 0

| accepted

Answered

Need help about how to use gpu in for loop

I do not believe that there is GPU support for the functions you have mentioned from the Signal Processing Toolbox (modulate, de...

7 years ago | 0

Answered

Matlab + CUDA slow in solving matrix-vector equation A*x=B

To get accurate timings for GPU calculations you need to be sure to wait for the GPU to finish. You should modify all your timi...

7 years ago | 1

| accepted

Answered

Codistributed arrays taking too long to run

Here is an example of how to properly benchmark an operation on distributed arrays: <http://www.mathworks.com/help/distcomp/e...

7 years ago | 0

| accepted

Answered

When to use codistributed arrays

Distributed arrays are most useful when you do not have enough memory to store an entire array on a single machine. By distribu...

7 years ago | 0

| accepted