How to aggregate rows in a matrix

If i have a matrix like the following example (or please see attached file)
2008,5,4,9,40,0,0.0,0.0,0.0,15.9
2008,5,4,9,41,0,0.0,0.0,1.0,16.3
2008,8,5,16,20,0,0.0,0.0,1.0,2.9
2008,8,5,16,21,0,0.0,0.0,1.0,2.9
and so on for a full year. The matrix represents year, month, date of month, hour, minute, zone1,zone2,zone3 and zone4 How can i create a program that takes an hour for example and aggregates zone1,zone2zone3 and zone4 for that full choose hour ? I have to end up with two vectors. One with year,month,date of month, hour, minute and one with the aggregated zone1,zone2,zone3,zone4 The original matrix consists of data for a Whole year. I was think of something with Unique but then i am not sure how i can aggregate the data!
I have use for example [day, ~, subs] = unique(m(:, 1:3), 'rows') Then i get all the days. Then i can use sumdayzone1 = accumarray(subs, m(:, 7), [], @sum) but that only gives me the sum of zone1 on the choosen day. How can i do all zone together so i end up with only one vector to the corrensponding day ? I am a total newbie in this so any kind of help would be nice, thanks

1 Comment

I can't find the file... can you attach it again, please?

Sign in to comment.

 Accepted Answer

Guillaume
Guillaume on 2 Dec 2014
Edited: Guillaume on 4 Dec 2014
accumarray only operates on vectors, so you have no choice but to call it once per column:
[day, ~, subs] = unique(m(:, 1:3), 'rows');
colzones = [7 8 9 10];
sumdayzones = cell2mat(arrayfun(@(col) accumarray(subs, m(:, col), [], @sum), colzones, 'UniformOutput', false));
or with a loop
[day, ~, subs] = unique(m(:, 1:3), 'rows');
colzones = [7 8 9 10];
sumdayzones = zeros(size(day, 1), numel(colzones));
for colidx = 1:numel(colzones)
sumdayzones(:, colidx) = accumarray(subs, m(:, colzones(colidx)), [], @sum);
end

8 Comments

When i run both codes i get different error messages!
Error using arrayfun All of the input arguments must be of the same size and shape. Previous inputs had size 4 in dimension 2. Input #3 has size 12.
In an assignment A(I) = B, the number of elements in B and I must be the same.
can you please help me explain what is wrong !
Sorry about that!
In the first case, I made a typo in 'UniformOutput'.
In the 2nd case, I made two mistakes (and a typo you must have spotted). I used the wrong size to initialise sumdayzones and I forgot a colon in the assignment in the loop.
I've fixed everything now and also improved the first case a bit.
This Works great. Could you be chance also tell me if it is possible to get the full vector with the starting period included in the result ? Or how to create a new vector with all the starting periods. I mean the first array used in the calculating day 1 day 2 day 3 etc..
I have one more question regards this function. I also need it to make an hour of the day average. Meaning that i need the average of lets say every from 14-15 for every day of the year. and for all 24 hours of the day. I need an output vector of 24*4 and another vector that indicated the beginning the periods. I am such a newbie here.
Hope you can help me :o)
To get the starting period (the day), just concatenate it with the sums:
sumdayzones = [day sumdayzones];
To get the average per hour (or month, or whatever) you just need to change the criteria you pass to unique. For hours, you just want to get the unique hours, so it's:
[hour, ~, subs] = unique(m(:, 4), 'rows'); %there's only one column so 'rows' is now optional
I don't get that with concatenate it with the sums, sorry If i place it in the function the vector get 20 columns and it just looks really strange.
day returned by:
[day, ~, subs] = unique(m(:, 1:3), 'rows');
is only three columns. sumdayzones is only 4, therefore the concatenation is only 7 columns.

Sign in to comment.

More Answers (2)

[day, ~, subs] = unique(m(:, 1:3), 'rows');
zones_colind = [7:10];
for i = 1:size(day, 1)
m_day(i,:) = [day(i, :) sum(m(find(subs == i), zones_colind))];
end

4 Comments

Could you tell me how i could display all the first six arrays in the result and not only 1:3 as choose or 1:2 ?
Thorsten
Thorsten on 4 Dec 2014
Edited: Thorsten on 4 Dec 2014
The first six columns are year, month, day, hour, minutes, seconds. If you aggregate the data for a day, why do you want to display hour, minutes, seconds? That does not make sense. Or what do you want to achieve?
It might not make sense but that is what i have to do. Get to vectors one with the aggregated results and one vector with the starting point of the aggregation which is the first six columns
Like this?
[day, ~, subs] = unique(m(:, 1:3), 'rows');
zones_colind = [7:10];
for i = 1:size(day, 1)
ind = find(subs == i);
m_day(i,:) = [day(i, :) m(ind(1), 4:6) sum(m(ind, zones_colind))];
end

Sign in to comment.

N00TN00t
N00TN00t on 19 Jun 2017
Edited: N00TN00t on 19 Jun 2017
What if we want to aggregate the time vector, not the zones?
So for aggregating for the months, we want a matrix showing columns [1:6] of the original matrix, but each row showing only one month:
2008,1,0,0,0,0
2008,2,0,0,0,0
2008,3,0,0,0,0
2008,4,0,0,0,0
2008,5,0,0,0,0
2008,6,0,0,0,0
2008,7,0,0,0,0
2008,8,0,0,0,0
2008,9,0,0,0,0
2008,10,0,0,0,0
2008,11,0,0,0,0
2008,12,0,0,0,0

Asked:

on 2 Dec 2014

Edited:

on 19 Jun 2017

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!