Use coder to convert parfor-loop to mex, but mex version is much slower

I converted a function containing a parfor-loop into a MEX file to get some speed-up, but the MEX version of this function is 5x slower than the MATLAB version.
Before running, I used parpool(20) to set up a parallel pool manually. While the MEX version of the function is running, the CPU load increases but the parallel workers are not working (the pool status indicator in the bottom-left corner of MATLAB is blue instead of green). Besides, the pool automatically shuts down while the MEX function is still running.
It seems the parfor-loop does not work after codegen. Is there any way to check whether these workers are running?
[EDIT]:
When I use the MEX version of parfor, there is a single MATLAB process with a CPU usage of 4000%, but when I use the m-code version of parfor, there are 20 processes, each with a CPU usage of 100%. It seems parfor after codegen uses multi-threading instead of multi-processing, and this multi-threading is much slower than multi-processing.
  2 Comments
Ji Lee on 28 Apr 2021
Edited: Ji Lee on 28 Apr 2021
MATLAB Coder produces multi-threaded code using OpenMP that is independent of the parallel worker pool in MATLAB that your original code would use. As such, the status of the worker pool neither affects nor reflects the behavior of any MEX, C, and C++ code produced by MATLAB Coder.
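As a minimal sketch of how the pool itself can be inspected from the command line: gcp('nocreate') returns the current pool without starting a new one, and IdleTimeout is the number of minutes after which an unused pool shuts itself down. Because the MEX file does not use the pool, an idle timeout is a plausible (but unconfirmed) explanation for the shutdown described in the question.

```matlab
% Query the current parallel pool without creating one.
pool = gcp('nocreate');

if isempty(pool)
    disp('No parallel pool is open.');
else
    % NumWorkers and IdleTimeout are properties of the pool object.
    fprintf('Pool has %d workers, idle timeout: %g min.\n', ...
        pool.NumWorkers, pool.IdleTimeout);

    % Optional: lengthen the timeout so the pool survives a long MEX run.
    % pool.IdleTimeout = 120;
end
```

None of this changes how the generated MEX runs; it only shows the state of the MATLAB worker pool.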
That aside, see the bottom section of the document linked below, which summarizes the restrictions governing successful use of parfor with code generation. The restrictions mentioned in the document manifest differently: some trigger fallbacks to non-parallelized code, others may produce warnings, and yet others may error out at code generation time.
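To see which of these cases applies, it may help to generate the MEX with a report and with OpenMP explicitly enabled. The sketch below assumes a hypothetical entry-point function fitAll.m that takes one double matrix; the function name and input type are placeholders, not taken from the original code.

```matlab
% MEX configuration object for MATLAB Coder.
cfg = coder.config('mex');

% EnableOpenMP is true by default; when it is false, parfor-loops
% are generated as ordinary serial for-loops.
cfg.EnableOpenMP = true;

% Hypothetical input type: a double matrix of unbounded variable size.
exampleInput = coder.typeof(0, [Inf Inf]);

% fitAll is a placeholder name for the function containing the parfor-loop.
% The -report option produces a report listing code generation messages.
codegen -config cfg -args {exampleInput} -report fitAll
```

Inside the entry point, the form parfor (i = 1:n, numThreads) can also be used to cap the number of threads the generated code uses.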
If you verify that none of the situations mentioned in the document apply to you, it would then be helpful to see an example (trimmed down or sanitized if appropriate) that demonstrates the nature of the code you are using with MATLAB Coder.
Xingwang Yong on 29 Apr 2021
Edited: Xingwang Yong on 29 Apr 2021
Thanks Ji.
Before I posted the question here, I had read the doc you mentioned, and it did not help much.
Basically, I am trying to fit thousands of curves inside the parfor-loop using lsqcurvefit(). I profiled the code before and after using Coder.
Before: I used the m-code lsqcurvefit() and converted only the objective function to MEX. Fitting one curve took 1.6 s, of which nearly 1.2 s was spent in objective function evaluation.
After: I used the codegen'd lsqcurvefit() (the objective function is converted to MEX automatically). Fitting one curve takes 2.6 s, of which 1.4 s is spent in objective function evaluation. That is to say, 1.2 s is spent in other parts of lsqcurvefit(), compared with roughly 0.4 s for the m-code lsqcurvefit().
From this initial experience, the slow-down may be attributable to lsqcurvefit() rather than parfor, but I cannot say for sure. When I tried to fit a simple exponential decay using the codegen'd lsqcurvefit(), it did give some speed-up. So the slow-down may be related to the combination of lsqcurvefit() and my objective function (which is highly sophisticated: it has no analytical formula and relies on numerical simulation).
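For reference, the simple exponential-decay comparison could look roughly like the sketch below. This is an illustration, not the original code: the function name fitDecay, the model, and the data are all made up. It also assumes that code generation for lsqcurvefit requires the 'levenberg-marquardt' algorithm and explicit (possibly empty) bounds, so the same options are used for both the m-code and the MEX run.

```matlab
% ---- fitDecay.m (hypothetical entry point) ----
function p = fitDecay(t, y, p0) %#codegen
% Fit y = p(1) * exp(-p(2) * t) with lsqcurvefit.
opts = optimoptions('lsqcurvefit', 'Algorithm', 'levenberg-marquardt');
model = @(p, t) p(1) * exp(-p(2) * t);
p = lsqcurvefit(model, p0, t, y, [], [], opts);
end

% ---- compareFit.m (script, run separately) ----
t  = linspace(0, 5, 200)';       % sample times
y  = 3 * exp(-0.8 * t);          % noise-free synthetic data
p0 = [1; 1];                     % initial guess

% Generate fitDecay_mex for these fixed-size inputs.
codegen -args {t,y,p0} -report fitDecay

% Time the m-code and the generated MEX on the same problem.
tMatlab = timeit(@() fitDecay(t, y, p0));
tMex    = timeit(@() fitDecay_mex(t, y, p0));
fprintf('m-code: %.4f s, MEX: %.4f s\n', tMatlab, tMex);
```

Swapping the toy model for the real objective function in the same harness would show whether the extra time follows lsqcurvefit() or the objective function.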
I am trying to extract a minimal reproducible snippet from my code to demonstrate this issue. I will post it here if I manage to do that.


Answers (0)

Release

R2020b
