How to calculate the median of a column depending on the value of another column?

INTRODUCTION: I have two columns of values. The values in the first column stay constant over runs of consecutive rows, and the values in the second column are arbitrary.
GOAL: I want to build a third column containing the median of the second column for each group of rows that share the same value in the first column.
EXAMPLE:
A=[1 3;
1 2;
1 3;
2 4;
2 4;
2 3;
2 4;
3 5;
3 1;
3 1;
3 1;
3 2;
4 3;
4 2];
B1=median(A(1:3,2));
B2=median(A(4:7, 2));
B3=median(A(8:12, 2));
B4=median(A(13:14, 2));
B=[B1 B2 B3 B4]';
PROBLEM: The number of rows m is typically much larger than 14, which makes it impossible to write the commands B1 through BN by hand.
I wonder if someone could tell me how to write a few lines of code that do this automatically.
Thank you in advance for your help
Emerson

 Accepted Answer

B = accumarray(A(:,1),A(:,2),[],@median);
out = [A, B(A(:,1))];
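
For readers unfamiliar with accumarray, here is a minimal sketch of the two lines above applied to the example matrix from the question. It assumes, as the answer does, that the first column holds positive integers, since accumarray uses them directly as output subscripts.

```matlab
% Example data from the question: column 1 is the group label, column 2 the value
A = [1 3; 1 2; 1 3; 2 4; 2 4; 2 3; 2 4; 3 5; 3 1; 3 1; 3 1; 3 2; 4 3; 4 2];

% One median per group label; B(g) is the median of all rows whose label is g
B = accumarray(A(:,1), A(:,2), [], @median);   % B = [3; 4; 1; 2.5]

% Look up each row's group median to build the third column
out = [A, B(A(:,1))];
```

Because the lookup goes through the labels rather than through row positions, the rows of a group do not have to be adjacent for this to work.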

2 Comments

Hi Andrei, I have a problem:
The values of the first column may be non-integers, for example 1.5 instead of 1, and likewise for the other rows. In this case I get the following error:
Error using accumarray
First input SUBS must contain positive integer subscripts.
Do you know what to change in the command to make it more general?
Thank you
Emerson
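Yes,
% c maps each row to the index of its value in the sorted list of unique values
[~,~,c] = unique(A(:,1));
B = accumarray(c,A(:,2),[],@median);
% index B with c, not with A(:,1), because A(:,1) may now hold non-integers
out = [A, B(c)];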
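
As an illustration of the non-integer case, here is a small sketch with made-up data (the matrix below is hypothetical, not from the question). The third output of unique replaces each label with a positive integer index, which is what accumarray needs, and the same index is then used to map the group medians back onto the rows.

```matlab
% Hypothetical data: non-integer, unsorted group labels in column 1
A = [1.5 3; 1.5 2; 1.5 3; 2.25 4; 2.25 4; 0.5 3; 0.5 5; 1.5 7];

% labels holds the sorted unique values; c(i) is the position of A(i,1) in labels
[labels, ~, c] = unique(A(:,1));               % labels = [0.5; 1.5; 2.25]

% One median per unique label, in the same order as labels
B = accumarray(c, A(:,2), [], @median);        % B = [4; 3; 4]

% Map each row back to its group median through c
out = [A, B(c)];
```

The pair [labels, B] gives a compact table of one median per distinct value, if that is preferred over the expanded third column.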


More Answers (1)

B=squeeze(median(reshape(A(:,2),3,1,size(A,1)/3)))
% A must contain a multiple of 3 rows; if it does not, pad it with NaN or zero rows by running this code first
nc=mod(size(A,1),3);
if nc>0;
A=[A;nan(3-nc,2)]
end

5 Comments

If you want to pad with zeros, use
zeros(3-nc,2)
instead of
nan(3-nc,2)
Thank you Azzi Abdelmalek,
unfortunately your suggestion does not do what I want, because the number of rows in each group IS NOT a multiple of three.
In the example I gave, the first column contains:
1 three times, 2 four times, 3 five times and 4 two times. The group sizes are not multiples of any particular number.
I hope there is a way to modify your suggestion to do what I need.
Thank you again for your attention
Emerson
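OK, try this:
% idx(k) is the first row of the k-th group (equal values must sit in consecutive rows)
[~,idx]=unique(A(:,1),'stable');
% each row of idx1 is [first row, last row] of one group
idx1=[idx [diff(idx)-1+idx(1:end-1); size(A,1)]];
for k=1:size(idx1,1)
B(k)=median(A(idx1(k,1):idx1(k,2),2));
end
B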
Thank you Azzi,
your suggestion works now. The only thing I don't understand is what idx1 is doing.
I also get a red underline at B(k)=..... with the message:
The variable B appears to change size on every iteration (within a script). Consider preallocating for speed.
I don't understand what that means.
Thank you again for your help
Emerson
For our example:
idx1 =
1 3 % the value 1 occupies rows 1 to 3
4 7 % the value 2 occupies rows 4 to 7
8 12 % the value 3 occupies rows 8 to 12
13 14 % the value 4 occupies rows 13 to 14
B changes size because it grows inside the loop: first B(1), then B(2), ... up to B(k).
To preallocate (useful when working with a big array), add this line before the loop:
B=zeros(1,numel(idx));
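
Putting the pieces of this comment thread together, a preallocated version of the loop might look like the sketch below. It uses the example matrix from the question and assumes, as the loop above does, that equal values in the first column sit in consecutive rows. The variable name last is introduced here for illustration.

```matlab
% Example data from the question
A = [1 3; 1 2; 1 3; 2 4; 2 4; 2 3; 2 4; 3 5; 3 1; 3 1; 3 1; 3 2; 4 3; 4 2];

% First row of each group, in order of appearance
[~, idx] = unique(A(:,1), 'stable');           % idx = [1; 4; 8; 13]

% Last row of each group: one row before the next group starts
last = [idx(2:end) - 1; size(A,1)];            % last = [3; 7; 12; 14]

% Preallocate so B does not grow on every iteration
B = zeros(numel(idx), 1);
for k = 1:numel(idx)
    B(k) = median(A(idx(k):last(k), 2));
end
```

With B allocated to its final size before the loop, the editor warning about B changing size no longer applies.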
