Parfor, variable is indexed in different way

Hi all,
I am converting my code to use parfor and I got an error variable is indexed in adifferent ways. My sniplet of code is shown below. I am not sure what is the problem with rry, it is just getting the index from ny.
Thank you.
ny = 100; %currently is a random number
rry = zeros(ny,1);
parfor ii = 1:n
if ii <= ny
diffy = abs(Ystar - ymean(ii));
[~,rry(ii)] = min(diffy);
end
ry = rry(mod(ii-1,ny)+1);
rhoheight = Ystar(ry);
end
Error: Unable to classify the variable 'rry' in the body of the parfor-loop. For more information, see Parallel for Loops in MATLAB, "Solve Variable Classification Issues in parfor-Loops".

 Accepted Answer

Each iteration of a parfor loop should be independent of every other. It is not obvious (neither for me, nor for Matlab) why this would be the case here. So let's run some tests:
ny = 100; %currently is a random number
rry = zeros(ny,1);
n=200; % you didn't define this
Ystar = rand(size(rry)); % or this
ymean = rand(1,n); % or this
rry_index=nan(1,n); % keep track of the indices used
for ii = 1:n
if ii <= ny
diffy = abs(Ystar - ymean(ii));
[~,rry(ii)] = min(diffy);
end
ry = rry(mod(ii-1,ny)+1);
rhoheight = Ystar(ry);
% Keep track of the indices, this may overwrite old indices
rry_index(ii)=mod(ii-1,ny)+1;
if rry_index(ii)~=ii
error('you need multiple elements of rry')
end
end
you need multiple elements of rry
if any(isnan(rry_index))
error('iterations are not independent')
end
So, if n>ny, you will need multiple elements of rry, otherwise you can simply use ii as the index. You might consider assigning the elements of rry with a for-loop, and running the remainder of the iterations in a parfor-loops (i.e. ii=(ny+1):n).

10 Comments

Hi Rik,
Thanks for your reply. After doucle checking the index of rry, I am sure it is changing when ii is changing. However, the output of rry(mod(ii-1,ny)+1) is consistent for ny step that mean the value of ry is the same in the horizontal direction (ymean is a m x n matrix and reshape to 1D).
Do you mind to write a sniplet of code which running the remainder of the iterations in a parfor-loops (i.e. ii=(ny+1):n).
Thank you.
Best,
Lam
Apparently I'm not making my point very clearly. Let's convert the contents of your loop to a function:
function [diffy,ry,rhoheight,rry]=LoopContent(ii,ny,Ystar,ymean,ryy)
if ii <= ny
diffy = abs(Ystar - ymean(ii));
[~,rry(ii)] = min(diffy);
end
ry = rry(mod(ii-1,ny)+1);
rhoheight = Ystar(ry);
end
This function has 1 variable that is both an input and an output: rry. This is not allowed in a parfor loop.
Luckily, rry only changes when ii<=ny, so we can still use a parfor loop if we first fill rry.
% First fill rry
for ii=1:ny
[diffy,ry,rhoheight,rry] = LoopContent(ii,ny,Ystar,ymean,ryy);
end
% Now run the remainder of the iterations in parallel.
parfor ii=(ny+1):n
[diffy,ry,rhoheight] = LoopContent(ii,ny,Ystar,ymean,ryy);
end
Since you don't show what you are actually doing with the output, I can't tell which variables are actually important to keep after the loop. Currently, the parfor loop doesn't do anything useful, since all variables are overwritten every iteration.
It doesn't help that you decided to remove all comments before you posted your code. Please leave them in next time. They are a great tool to make sure you explain what your code is doing and why. (and if you don't have comments yet: you will forget what this code does, so without comments you will have to redo everything when you look at this code again in 6 months)
Hi Rik,
Thank you for your fast reply. Below are the actual code of mine as the sniplet of code is creating confusion.
First I need to trace the index (ry) of a parcel to obtain rhoheight from the temperature field. The temperature field will be sorted in descending order is called rhosortedprofile. rystar is the index of the parcel after sorting.
I will need to calculate the area under the curve (rhostarprofile) with the limit of ry:rystar or rystar:ry. I am usage mod the get the remainder to map back the ry, since the value is the same for ny step.
The reason I wanted to use parfor is because when n>150000000, the code runs very slow. The file temp.mat can be downloaded.
I appreciate your helps.
Thank you,
Lam
load temp.mat
ny = 133;
rry = zeros(ny,1);
for k = 1:n <------- %I wanted to implement parfor here
rho = tmean(k); % selected parcel
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Find the index of the parcel after sorting
rystar = idxnew(k); % rystar is index of C_sorted(:,1) (sorted 1D Tmean array)
sortedrhoheight = Ystar(rystar); % height of the parcel in the sorted field
% Getting the rhoheight close to Ystar
if k <= ny
diffy = abs(Ystar - ymean(k));
[~,rry(k)] = min(diffy);
end
ry = rry(mod(k-1,ny)+1);
rhoheight = Ystar(ry);
rhobar = (1/(rhoheight-sortedrhoheight)) * sum(rhostarprofile(ry:rystar,1) .* C_sorted(ry:rystar,2));
end
n>150000000
Your code is writing up to rry(150000000) but then it is reading from rry(1:133) to get ry. But if the first ny iterations of the parfor have not been written to yet, then if parfor allowed the code you would be reading from zeros -- and the results of iterations would depend on the race of whether the iterations 1:ny have been done yet. This is non-deterministic, and not permitted by parfor.
@Rik shows how to deal with this by doing the first ny iterations in serial, so that the first ny entries will have been populated ahead of time -- and then to do the remaining iterations writing reading from the first 1:ny results and writing to a different variable to prevent conflicts.
This function has 1 variable that is both an input and an output: rry. This is not allowed in a parfor loop.
It is permitted to have a variable that is both input and output -- but there are strict limitations on indexing to prevent the possibility of reading or writing unexpected locations. The limitations are stricter than would be mathematically necessary to prevent problems. I gave an example earlier of reading and writing the same variable.
Hi Walter,
Thanks for your reply. I am doing it as @Rik suggestion. However, I encounter error Output argument "diffy" (and possibly others) not assigned a value in the execution with "untitled5>LoopContent" function.
I am still trying to figure it out but currently I have no clue what mistake I made.
Thank you.
You should change the signature to only the variables you actually want to keep, so perhaps this:
function rry=LoopContent(ii,ny,Ystar,ymean,ryy)
I don't see any variables you are keeping (besides rry). That is probably not what you want. Which variable do you want to keep? It has to be something you're indexing with k. It sounds as if you might want to keep rhobar, but you're overwriting it every iteration, so it's values are lost.
Hi Rik,
I need ry to get rhoheight
rhoheight = Ystar(ry);
and ry is determined from
ry = rry(mod(k-1,ny)+1);
and the index of rry is the one prevent parfor to work.
I have broken down the code for the n <=ny using serial mode and n > ny to parallel.
load temp.mat
ny = 133;
rry = zeros(ny,1);
for k = 1:n <------- %I wanted to implement parfor here
rho = tmean(k); % selected parcel
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Find the index of the parcel after sorting
rystar = idxnew(k); % rystar is index of C_sorted(:,1) (sorted 1D Tmean array)
sortedrhoheight = Ystar(rystar); % height of the parcel in the sorted field
% Getting the rhoheight close to Ystar
if k <= ny
diffy = abs(Ystar - ymean(k));
[~,rry(k)] = min(diffy);
end
ry = rry(mod(k-1,ny)+1);
rhoheight = Ystar(ry);
rhobar = (1/(rhoheight-sortedrhoheight)) * sum(rhostarprofile(ry:rystar,1) .* C_sorted(ry:rystar,2));
end
parfor k = (ny+1):n
k % Check if k is correct
rho = tmean(k); % selected fluid parcel
rystar = idxnew(k); % rystar = rystar1, is index of C_sort(:,1) (sorted 1D Tmean array)
sortedrhoheight = Ystar(rystar);
[RY,RH] = LoopContent(k,ny,Ystar,ymean,rry)
rhobar = (1/(RH-sortedrhoheight)) * -1 * sum(rhostarprofile(RY:rystar,1) .* C_sorted(RY:rystar,2));
ea(k) = (rhoheight - sortedrhoheight) * (rho - rhobar);
end
function [ry,rhoheight] = LoopContent(k,ny,Ystar,ymean,rry)
if k <= ny
diffy = abs(Ystar - ymean(k));
[~,rry(k)] = min(diffy);
end
ry = rry(mod(k-1,ny)+1);
rhoheight = Ystar(ry);
end
The code works, but the k is incorrect. It does not begin from 134 but other number.
Thank you.
The core feature of a parallel loop is that you are not guaranteed an execution order. The iterations are divided over the workers and the results are assembled when all workers are done. You attempt to determine what the first k is, which doesn't make sense in this context. The first k may or may not be ny+1. You are not promised an order, only an execution.
My only advice would be to pre-allocate ea.
And to write comments. Your current code is one big clump of code with variable names that are barely understandable. Do yourself a huge favor and write comments, especially when writing a function.
You should aim for half of your code to be green. You will not get there, but you should aim for it. When you do, you will see how that forces you to think about what your code is doing eacht step of the way.
K3iTH
K3iTH on 8 Aug 2023
Edited: K3iTH on 8 Aug 2023
Thanks @Rik. I have preallocation all the required variables but I did not show in the code above because there is a lot of variables. I simplified the code above so it uis easy to understant.
The code works flawlessly and runs about three times faster. Previsouly 260s, now 85s.
Thank you so much.
Best,
K3iTH
Glad to be of help.

Sign in to comment.

More Answers (1)

[~,rry(ii)] = min(diffy);
You are indexing rry by the expression ii that involves the parfor index variable.
ry = rry(mod(ii-1,ny)+1);
That tries to index rry at a different expression mod(ii-1,ny)+1 -- which is common code to be indexing into a circular buffer of length ny
rry = zeros(ny,1);
parfor ii = 1:n
if n is strictly <= ny, then the boundaries of the circular buffer will never be met, and the circular buffer indexing would be unneeded.
If n > ny then the boundaries of the circular buffer would be encountered, and you would be attempting to index into a different part of the circular buffer, to a location that might not even have been written into yet other than through the initial zeros. It would be a race condition as to which value would be read back -- if the high-valued iterations of for ii are done first then the circular buffer would end up talking about entries that are still zero, but if some of the lower-valued iterations of for ii are done then the circular buffer logic might pick up those results.
parfor recognizes the potential for uncertainty about what the mod index would do for you, and refuses to allow the code at all.
But really, parfor is pretty simple about the analysis, and what it really insists on is that if you index the same variable by expressions involving the parfor index variable, that the indexing must use exactly the same indexing expression. So for example you could have two places that referred to rry(5*ii-7) but you could not have one place that referred to rry(5*ii-7) and another that referred to rry(5*ii-6-i) even if mathematically you could prove that the values would be the same.
Because of this restriction, if you need to refer to more than one location in the same variable, then you will often end up needing to extract a subset of the variable into a temporary variable using a : index, then indexing the temporary variable -- and if you need to change multiple locations in the variable then write the changes into a temporary variable and afterwards assign the new values to the original variable using the same : index expression as was used to read the values. For example
thisrow = Data(ii,:);
thisrow(3) = 19; thisrow(11) = 22;
Data(ii,:) = thisrow;

Categories

Tags

Asked:

on 7 Aug 2023

Commented:

Rik
on 8 Aug 2023

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!