Script using parfor spawns way too many threads

Greetings,
I have a Matlab program whose main task is "embarrassingly parallel" - the inner loop that does the brunt of the work consists of entirely independent iterations. Thus, it is a perfect candidate for using parfor. So, depending on which machine I'm running on, I start a parpool, give it the number of physical cores as an argument, and indeed, it starts running the loop in parallel.
The problem is, if you monitor the thread usage, it spawns far more threads than the number of workers in the parpool, to the point that on certain machines, the OS eventually cuts Matlab off.
My best working theory is that something inside the loop must be inherently multithreaded, so that each iteration is itself creating some explosion of threads in the course of executing the loop iteration. But, even supposing that were true:
  • How would I track down which line or function call is spawning new threads?
  • If I can identify it, how does one stop Matlab from multithreading inside the parfor loop?
Thanks in advance for your assistance.
EDIT: I tried paring it down to the bare essentials to narrow down the problem. Below is an example parfor loop from the parfor documentation, adjusted to create 8 parallel workers. If I keep the Resource Monitor open and run it, each worker shows up as its own Matlab process, but each is reported as using ~100 threads.
parpool(8)
parfor i=1:8, c(:,i) = eig(rand(1000)); end
  4 Comments
Edric Ellis on 30 Jun 2022
MATLAB processes make a distinction between "computational" threads and other threads. The setting NumThreads controls "computational" threads. This means that if you get a worker to perform a large matrix operation (for example), then it will do so using only a single thread. (Your desktop MATLAB will use multiple threads here). The other threads are a whole bunch of background threads, which mostly should not be performing significant amounts of work. These are present to orchestrate communication between client and workers, and a bunch of other stuff.
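One way to check the "computational" thread count described above is to ask each worker directly. The following is a minimal sketch, assuming Parallel Computing Toolbox is installed and a process-based pool is running (or can be auto-created by gcp); the variable names are illustrative only.

```matlab
% Sketch: query the number of computational threads on every pool worker.
pool = gcp();                                    % current pool, or start the default one
f = parfevalOnAll(pool, @maxNumCompThreads, 1);  % run on all workers, one output each
nThreads = fetchOutputs(f);                      % one row per worker
disp(nThreads.');                                % show the per-worker values
```

If the values are already small, the large thread counts seen in a process monitor are more likely the background threads mentioned above than computational ones. The NumThreads setting referred to in this comment is a property of the cluster object returned by parcluster.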
How many workers do you need to start before problems start arising? I would not expect running 8 workers on a single machine to cause problems, but maybe you are running more?
MATLAB worker processes consume memory. If you can use parpool("threads"), then this will avoid some of the extra memory and thread overhead. Not all MATLAB code is supported there though.
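As a rough illustration of the thread-based pool suggestion, the loop from the question could be run like this. It is a sketch only: it assumes R2020a or later, and that every function called in the real loop body is supported on thread workers, which needs to be checked case by case.

```matlab
% Sketch: run the example loop on a thread-based pool instead of worker processes.
delete(gcp("nocreate"));           % close any existing pool first
pool = parpool("threads");         % workers are threads inside this MATLAB process

c = zeros(1000, 8);                % preallocate the sliced output
parfor i = 1:8
    c(:, i) = eig(rand(1000));     % same body as the example in the question
end
```

Because the workers live inside the client process, there are no separate Matlab worker processes to show up in the process monitor.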
Matthew Lockner on 2 Jul 2022
To answer the question of how many workers I need to start before problems arise: it is system-dependent. I have run the code on 32-core nodes where the job is granted the full node; as I recall, I was told I would have to use "ulimit -Su 20480" to ensure a good submission. I'm not an expert, but I gather the command raises the number of threads the OS will grant before it refuses any more. In contrast, I have a cluster with nodes of 96 total cores, where I find that I cannot reliably start more than 16 processes without this issue arising. On my personal 8-core PC, it seems to handle 8 processes just fine.
The fun part is that the code does scale with the number of cores - not necessarily as 1/N, but significantly. I will try your suggestion of using parpool("threads") and see what effect that has. It seems I mostly just need to convince the OS to accept a flood of threads and it will all run fine.


Answers (1)

Bruno Luong on 29 Jun 2022
It sounds counterproductive to turn off the low-level multithreading and move the parallelism into your parfor loop.
I would rather split the parfor into two nested loops: an inner parfor over a block of fewer iterations, and an outer standard for loop that repeats until all the indices are done.
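The blocked structure suggested here might look like the sketch below. The total iteration count N, the block size, and the squaring used as the loop body are all hypothetical placeholders, not taken from the original program.

```matlab
% Sketch: outer serial for loop over blocks, inner parfor within each block.
N = 64;                 % total number of iterations (hypothetical)
blockSize = 8;          % iterations handled per parfor call (hypothetical)
results = zeros(1, N);

for first = 1:blockSize:N
    last = min(first + blockSize - 1, N);
    idx = first:last;                    % global indices in this block
    blockOut = zeros(1, numel(idx));     % per-block sliced output
    parfor k = 1:numel(idx)
        blockOut(k) = idx(k)^2;          % placeholder for the real work
    end
    results(idx) = blockOut;             % copy the block back into place
end
```

Each parfor call then only has a small batch of iterations in flight, at the cost of a synchronization point at the end of every block.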

Release

R2020a
