process based parpool: keep the data in the workers

I have a routine that creates a large matrix in compressed format as a nested cell structure and subsequently performs a number of matrix-vector products with this matrix. I am trying to parallelize this routine but I'm running into problems. Since the routine calls several mex functions to create the matrix, I cannot use a thread-based pool. So I use parpool('local') and parfor, looping over the sub-blocks and letting the workers fill the corresponding cells. When this is done I close the pool and open a thread-based pool for the multiplications, which do not need any mex functions.
All this works well for moderate-size problems (1 to 10 GB), but for larger problems the process increasingly slows down, until the parallelization no longer shows any speed-up for problems of 150 GB. I have tested without parallelization, and it isn't the filling of the matrix that is slowing things down; it looks like it's the workers sending their slices of the matrix back to the client. Is there any solution to this? Ideally the slices would stay on the workers, and each worker would do its part of the mat-vec multiplications in subsequent parfor calls, but I can't find a way to do this.
Thanks for any help

Accepted Answer

Edric Ellis on 22 Jul 2022
Mike has already suggested looking at parfeval. The other option, which may be appropriate for your problem, is to use spmd. This is quite a different parallel programming model, and converting a program from for or parfor to spmd is decidedly non-trivial. However, the advantage of spmd is that it is explicitly designed for keeping data on workers.
Without more details of your particular application, it's hard to know exactly how things might go with spmd. Some things to consider though:
  • The for-drange construct lets you do something sort of similar to a parfor loop inside your spmd block: it automatically divides up a global range into worker-specific pieces.
  • The Composite data type lets you leave data on the workers after the end of the spmd block.
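
The two points above can be sketched as follows. This is only a minimal illustration, not the original poster's code: the block contents, the sizes n and nBlocks, and the variable names are made-up placeholders. It builds the blocks inside spmd with a for-drange loop, leaves them on the workers as a Composite, and sends back only the small product vectors.

```matlab
% Minimal sketch with placeholder data (n, nBlocks and the blocks are hypothetical).
pool = gcp();                        % reuse or start the default process-based pool
n = 8;                               % number of columns of the full matrix
nBlocks = 6;                         % number of row blocks
x = ones(n, 1);                      % vector to multiply with

spmd
    blocks = {};                     % blocks owned by this worker
    for k = drange(1:nBlocks)        % each worker gets its own share of 1:nBlocks
        blocks{end+1} = k * ones(2, n); %#ok<AGROW> stand-in for the mex-based fill
    end
end
% On the client, blocks is now a Composite; the data itself stays on the workers.

spmd
    yLocal = zeros(0, 1);
    for j = 1:numel(blocks)          % plain for loop over this worker's blocks
        yLocal = [yLocal; blocks{j} * x]; %#ok<AGROW>
    end
end

% Only the small result vectors are copied back to the client.
parts = cell(length(yLocal), 1);
for w = 1:length(yLocal)
    parts{w} = yLocal{w};
end
yAll = vertcat(parts{:});
```

The second spmd block can be repeated for every new vector x; the blocks are not transferred again.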

More Answers (2)

Mike Croucher on 21 Jul 2022
parfor is great: it auto-parallelises for us and takes care of a lot of things, like data transfer to/from workers. At some point, however, we can find ourselves fighting against, or being limited by, its automatic choices.
Whenever this happens to me, I start looking at other constructs in MATLAB's parallel language. Could you recast it to use parfeval for example?
The function you'd run on parfeval might then do something like
  • create its part of the matrix
  • do the matrix-vector products as part of a for loop (no need for parfor); the parallelisation will come from running lots of these functions simultaneously
  • return only what is required
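
As a rough sketch of those three steps, assuming a hypothetical helper blockProducts and placeholder sizes (none of these names come from the question):

```matlab
% Minimal sketch with placeholder data; blockProducts is a hypothetical helper.
pool = gcp();                        % reuse or start the default pool
n = 8; nBlocks = 4; nProducts = 3;   % hypothetical sizes
x = ones(n, 1);

futures(1:nBlocks) = parallel.FevalFuture;   % preallocate the future array
for k = 1:nBlocks
    % One task per block; each task returns a single small output.
    futures(k) = parfeval(pool, @blockProducts, 1, k, n, x, nProducts);
end
results = fetchOutputs(futures);     % outputs are concatenated vertically

% Local function: must sit at the end of the script or in its own file.
function y = blockProducts(k, n, x, nProducts)
    A = k * ones(2, n);              % stand-in for creating this part of the matrix
    y = zeros(2, 1);
    for p = 1:nProducts              % ordinary for loop, no parfor needed
        y = y + A * x;               % stand-in for the repeated products
    end
end                                  % the block A never leaves the worker
```

The design point is that each task both creates and consumes its block, so only y crosses the worker-client boundary.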
parfeval isn't suitable for all problem types but it's often the first thing I reach for when I run out of steam with parfor.

alex heldring on 22 Jul 2022
Thank you very much, both of you, for your answers. As I was trying to follow up on Mike's idea, which possibly works but is not straightforward to implement, Edric's answer arrived. I decided to quickly try this out, simply by replacing parfor with spmd and looping over the spmd 'labindex' counter. And it worked right away!
Essentially I'm following the example in the parallel.pool.Constant entry of the MATLAB Help Center, under the section 'Make parallel.pool.Constant from Composite'. So I'm using parfor after all, for the matrix-vector products, but everything stays on the workers, just like I wanted.
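
For readers who want to see the shape of that pattern, here is a minimal sketch under assumed placeholder data (the block contents, n and nProducts are invented for illustration and are not the actual code used here). Note that parfor does not let you choose which worker runs which iteration, so each iteration sees the block of whichever worker executes it.

```matlab
% Minimal sketch with placeholder data (n, nProducts and the blocks are hypothetical).
pool = gcp();                        % process-based pool
n = 8;
x = ones(n, 1);

spmd
    % labindex is called spmdIndex in newer releases.
    A = labindex * ones(2, n);       % stand-in for the slice built on this worker
end

% Wrap the Composite in a Constant; every worker must hold a value for A.
C = parallel.pool.Constant(A);

nProducts = 4;
norms = zeros(nProducts, 1);
parfor p = 1:nProducts
    % C.Value is the slice stored on the worker running this iteration.
    norms(p) = norm(C.Value * (p * x));
end
```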
Thanks again for your help,
