if loop within for loop for statistical analysis of data

Question

Kosta on 21 Jan 2017


https://au.mathworks.com/matlabcentral/answers/321354-if-loop-within-for-loop-for-statistical-analysis-of-data

Edited: Stephen23 on 21 Jan 2017

Accepted Answer: Stephen23

Hi,

I have code with data consisting of a very large column vector of the form:

P_b=[2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];

For that vector, I would like to group all consecutive non-NaN values, i.e. [2;3;4;5;6], [3;4;5;6], etc., fit a normal distribution to each group, extract the mean, and collect the results in a vector. This vector should contain the means of all the 'grouped' data of P_b.

This may sound complicated, but it shouldn't be. I have written the code below, but an odd problem arises: MATLAB does not recognise the variable 'avg' when, at the end of the for-loop, I try to save all the loop results in a vector. However, when I run the code without that last line, it seems to recognise 'avg'. Any ideas? Thanks in advance for your help. The code is below.

P_pdf=[];
%Indices with NaN
idxnan=find(isnan(P_b));
for i=1:size(idxnan,1)-1
%Indices of numeric values
idxlow=idxnan(i)+1;
idxup=idxnan(i+1)-1;
%Group P_b Matrices according to NaN values
P_mat=P_b(idxlow:idxup);  
    %Reject empty matrices and treat singular values
    if size(P_mat)==[1,1];
    avg=P_mat;
    elseif size(P_mat)==[0,0];
    avg=NaN;
    %Create distribution fit 
    pdf=fitdist(P_mat,'Normal');
    avg=pdf.mu;
    end    
P_pdf=[P_pdf;avg];
end

Answer 1

Stephen23 on 21 Jan 2017

https://au.mathworks.com/matlabcentral/answers/321354-if-loop-within-for-loop-for-statistical-analysis-of-data#answer_251418

Edited: Stephen23 on 21 Jan 2017

This is a classic example of how badly formatted code makes buggy code. When the code is formatted using MATLAB's default formatting rules (select all, ctrl+i), then the cause is much easier to spot:

P_pdf = [];
%Indices with NaN
idxnan = find(isnan(P_b));
for i = 1:size(idxnan, 1) - 1
    %Indices of numeric values
    idxlow = idxnan(i) + 1;
    idxup = idxnan(i + 1) - 1;
    %Group P_b Matrices according to NaN values
    P_mat = P_b(idxlow:idxup);
    %
    %Reject empty matrices and treat singular values
    if size(P_mat) == [1, 1];
        avg = P_mat;
    elseif size(P_mat) == [0, 0];
        avg = NaN;
        %Create distribution fit
        pdf = fitdist(P_mat, 'Normal');
        avg = pdf.mu;
    end
    P_pdf = [P_pdf; avg];
end

Now it is clear that there is an if and an elseif, but no else: if neither of these conditions is fulfilled, avg never gets defined. The error is due to testing the matrix size like this:

size(P_mat) == [0, 0]

which is never going to be true when P_mat is created by indexing like this:

P_mat = P_b(idxlow:idxup);

Try it yourself at home:

 >> V = 1:3;
 >> size(V(2:1))
 ans =
    1   0

So that test ==[0, 0] will always fail. The logic is bad anyway: surely you want to test for non-empty vectors and apply the fit to them?
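
To make the two pitfalls above concrete, here is a minimal sketch. The names tf, branchTaken, E and szE are illustrative and not part of the original post. It shows that comparing sizes yields a logical vector, that if only runs its branch when every element is true, and that an empty selection from a column vector is 0-by-1 rather than 0-by-0.

```matlab
P_mat = [2;3;4;5;6];          % a typical non-empty group (5-by-1)

% Elementwise comparison: size(P_mat) is [5 1], so this is [false true]
tf = size(P_mat) == [1, 1];

% "if" treats a non-scalar condition as true only when ALL elements are nonzero
branchTaken = false;
if tf
    branchTaken = true;       % not reached for a 5-by-1 vector
end

% An empty index range on a column vector gives a 0-by-1 result, not 0-by-0
E = P_mat(2:1);
szE = size(E);
```

With these values neither the [1,1] test nor a [0,0] test passes, so a loop body with only if and elseif leaves avg unassigned. The built-in functions isscalar and isempty express the intended checks without depending on the exact shape of the empty result.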

Here is a slightly more robust version of your loop:

P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idn = isnan(P_b);
idd = diff(idn);
idb = find([~idn(1);idd<0])
ide = find([idd>0;~idn(end)])
out = NaN(size(idb));
for k = 1:numel(idb)
    tmp = P_b(idb(k):ide(k));
    pdf = fitdist(tmp,'Normal'); % untested, I don't have fitdist
    out(k) = pdf.mu;             % untested
end
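
As a quick check of how idb and ide are built, the sketch below applies the same index logic to a short test vector x (a made-up example, not data from the question) that contains two consecutive NaN values.

```matlab
x = [1;2;NaN;NaN;5;NaN];          % hypothetical test vector with sequential NaN
idn = isnan(x);                   % [0;0;1;1;0;1]
idd = diff(idn);                  % +1 where a run ends, -1 where a run starts

% A run starts at element 1 if it is not NaN, or right after a NaN-to-number step
idb = find([~idn(1); idd<0]);     % run start indices
% A run ends just before a number-to-NaN step, or at the last element if not NaN
ide = find([idd>0; ~idn(end)]);   % run end indices

% Collect each run so the grouping can be inspected
runs = arrayfun(@(b,e) x(b:e), idb, ide, 'UniformOutput', false);
```

Here idb is [1;5] and ide is [2;5], so the runs are [1;2] and 5. The doubled NaN produces no empty group, which is why this loop needs no special case for empty selections.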

Personally I would not write all of that code: I would simply split the input vector using accumarray, and then use cellfun to do whatever processing:

P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idx = isnan(P_b);
idy = cumsum([1;diff(idx)>0]);
C = accumarray(idy(~idx),P_b(~idx),[],@(n){n});
D = cellfun(@(v)fitdist(v,'Normal'),C); % untested: I don't have fitdist
P_pdf = arrayfun(@(s)s.mu,D)            % untested

It might be necessary to make cellfun return a cell array:

D = cellfun(@(v)fitdist(v,'Normal'),C,'Uni',0); % untested
P_pdf = cellfun(@(s)s.mu,D)                     % untested
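
If only the mean is needed, a toolbox-free variant is possible. This is a sketch under one assumption: that the mu of a normal distribution fitted by fitdist equals the sample mean of the data, so mean can stand in for the fit. The variable name P_mean is illustrative and not from the original answer.

```matlab
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idx = isnan(P_b);
% Group number for every element: increases by one at each number-to-NaN step
idy = cumsum([1; diff(idx)>0]);
% Assumption: mean replaces fitdist(...).mu, avoiding the Statistics Toolbox
P_mean = accumarray(idy(~idx), P_b(~idx), [], @mean);
```

For the example vector this gives one value per group of consecutive numbers: 4, 4.5, 3 and 3. If the standard deviation or other fit parameters are also needed, the fitdist versions above remain the way to go.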

2 Comments

Kosta on 21 Jan 2017

Thanks, fixing this stupid mistake of mine does indeed solve part of the problem. However, I still can't get this to work. For some reason P_mat does not seem to be handled by the if statement every time, resulting in a blank P_mat.

Stephen23 on 21 Jan 2017

Edited: Stephen23 on 21 Jan 2017

Check how large the selection is like this:

P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idn = isnan(P_b);
idd = diff(idn);
idb = find([~idn(1);idd<0])
ide = find([idd>0;~idn(end)])
out = NaN(size(idb));
for k = 1:numel(idb)
    tmp = P_b(idb(k):ide(k));
    if isempty(tmp)
        out(k) = NaN;
    elseif isscalar(tmp)
        out(k) = tmp;
    else
        pdf = fitdist(tmp,'Normal');
        out(k) = pdf.mu;
    end
end

Answer 2

Kosta on 21 Jan 2017

https://au.mathworks.com/matlabcentral/answers/321354-if-loop-within-for-loop-for-statistical-analysis-of-data#answer_251419

Got this whole thing working like this finally. Thanks again for your help:

P_pdf=[];
%Indices with NaN
idxnan=find(isnan(P_b));
for i=1:size(idxnan,1)-1
    %Indices of numeric values
    idxlow=idxnan(i)+1;
    idxup=idxnan(i+1)-1;
    %Group Power Matrices according to NaN values
    P_mat=P_b(idxlow:idxup);
    %Reject empty matrices and treat singular values
    if size(P_mat)==[1,1];
        avg=P_mat;
    elseif size(P_mat)==size(zeros(0,1));
        avg=NaN;
    else
        %Create distribution fit
        pdf=fitdist(P_mat,'Normal');
        avg=pdf.mu;
    end
    %Append the group mean to the results
    P_pdf=[P_pdf;avg];
end

1 Comment

Stephen23 on 21 Jan 2017

Edited: Stephen23 on 21 Jan 2017

Note that this code is not robust (e.g. it cannot cope with sequential NaN), nor efficient due to the concatenation inside the loop. In particular this is very poor code:

size(P_mat)==size(zeros(0,1))

Hard to read, hard to comprehend, and pointlessly complicated. See my answer and comments for much simpler code.
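
For the efficiency point, one possible shape of the same NaN-to-NaN loop with a preallocated output is sketched below. It is not code from the thread: nGroups is an illustrative name, and mean is used in place of fitdist(...).mu on the assumption that the fitted normal mean equals the sample mean.

```matlab
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idxnan = find(isnan(P_b));
nGroups = numel(idxnan) - 1;      % one group between each pair of NaN values
P_pdf = NaN(nGroups, 1);          % preallocate; NaN doubles as the empty-group result
for i = 1:nGroups
    P_mat = P_b(idxnan(i)+1 : idxnan(i+1)-1);
    if ~isempty(P_mat)
        % Assumption: mean stands in for fitdist(P_mat,'Normal').mu
        P_pdf(i) = mean(P_mat);
    end
end
```

Preallocating avoids growing P_pdf on every iteration, and isempty replaces the size comparison. Like the loop it is modelled on, this only visits groups that lie between two NaN values, so the leading run [2;3;4;5;6] before the first NaN is not included. The result here is 4.5, 3 and 3.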
