Finding average at each point

I have data, say: 1 2 3 4 5 6 7 8
(an 8x1 vector). I want to replace each number with the average of itself and its two neighbours (one below and one above), i.e. sum the three and divide by 3. For the first and the last numbers, the average should be taken over the number and its single neighbour (the next one below or the one above, respectively), divided by 2. I want to write a script that does this for this data and can also be used for large data (say a 1000x1 vector). Thanks in advance.

 Accepted Answer

vector=rand(1000,1);
halfWindow=1; % 1 since you said one below and one above. You can change this
s=ones(2*halfWindow+1,1)/(2*halfWindow+1);
vector_average=conv(vector,s,'valid');
vector_average=[sum(vector(1:1+halfWindow,1))./(halfWindow+1); ...
vector_average(:); ...
sum(vector(end-halfWindow:end,1))./(halfWindow+1)];

7 Comments

Thanks so much for the answer. Really helpful. I would like it if you could explain what each line does, just as you did for line 2.
s=ones(2*halfWindow+1,1)/(2*halfWindow+1);
creates your averaging filter (if halfWindow=1 it gives you [1/3 1/3 1/3] as a column vector)
vector_average=conv(vector,s,'valid');
convolves the averaging filter with your data, so you get 1/3 of the one below, 1/3 of the one above and 1/3 of the current number. 'valid' returns only the part of the convolution where all the numbers exist. As you pointed out, at the beginning there is no one above and at the end there is no one below. By default convolution assumes zeros in those cases; the 'valid' keyword tells conv not to return those values.
The last part pads the average to fix the first and last elements (by the way, this last part works only for halfWindow=1):
sum(vector(1:1+halfWindow,1))./(halfWindow+1);
gives the average of the first two numbers
and
sum(vector(end-halfWindow:end,1))./(halfWindow+1)];
gives the average of the last two numbers, and
vector_average=[sum(vector(1:1+halfWindow,1))./(halfWindow+1); ...
vector_average(:); ...
sum(vector(end-halfWindow:end,1))./(halfWindow+1)];
adds these two numbers to the previously calculated average to get the whole thing.
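As a quick check of the steps described above, here is a minimal sketch that runs the same code on the 8-element example from the question instead of random data. The variable names follow the answer, except for interior, which is introduced here only for readability.

```matlab
% Sketch: the accepted answer applied to the question's example vector.
vector = (1:8)';                                  % example data from the question
halfWindow = 1;                                   % one neighbour on each side
s = ones(2*halfWindow+1,1)/(2*halfWindow+1);      % averaging filter, three weights of 1/3
interior = conv(vector, s, 'valid');              % 3-point means for elements 2..7
vector_average = [sum(vector(1:1+halfWindow))/(halfWindow+1); ...   % mean of first two
                  interior(:); ...
                  sum(vector(end-halfWindow:end))/(halfWindow+1)];  % mean of last two
disp(vector_average.')                            % roughly 1.5 2 3 4 5 6 7 7.5
```

The result has the same length as the input, with 2-point means at both ends and 3-point means everywhere else.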
Thanks, Mohammad. However, I have a large vector (1000x1) and I intend to use a larger halfWindow, say 5 (5 above, 5 below). That implies that the first element will be itself plus the 5 elements below it, divided by 6; the second element will be itself plus the one element above it and the 5 elements below it, divided by 7; and so on. Only from the 6th element onward can I simply use conv, which takes the average of the 5 elements above, the 5 below and the element itself, divided by 11. The same applies to the last 5 elements at the end. Is there any advice on how to code this? I really hope you understand my question. Thanks.
n=1000;
vector=rand(n,1);
halfWindow=5;
s=ones(2*halfWindow+1,1)/(2*halfWindow+1);
vector_average=conv(vector,s,'valid');
firstBatch=zeros(halfWindow,1);
lastBatch=zeros(halfWindow,1);
for i=1:halfWindow
firstBatch(i)=sum(vector(1:i+halfWindow,1))./(i+halfWindow);
lastBatch(end-i+1,1)=sum(vector(end-i-halfWindow+1:end,1))./(i+halfWindow);
end
vector_average=[firstBatch; ...
vector_average(:); ...
lastBatch];
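One way to gain confidence in the edge handling above is to compare it against a brute-force loop that applies the question's definition directly: average every element that falls inside the window, clipped to the ends of the vector. This is only a sketch; the names reference, lo, hi, middle and maxDiff are made up for the illustration.

```matlab
% Sketch: brute-force reference for the shrinking-window average.
n = 1000;
vector = rand(n,1);
halfWindow = 5;

% Conv-based version, as in the code above.
s = ones(2*halfWindow+1,1)/(2*halfWindow+1);
middle = conv(vector, s, 'valid');
firstBatch = zeros(halfWindow,1);
lastBatch = zeros(halfWindow,1);
for i = 1:halfWindow
    firstBatch(i) = sum(vector(1:i+halfWindow))/(i+halfWindow);
    lastBatch(end-i+1) = sum(vector(end-i-halfWindow+1:end))/(i+halfWindow);
end
vector_average = [firstBatch; middle; lastBatch];

% Direct definition: mean over the window, clipped at both ends.
reference = zeros(n,1);
for k = 1:n
    lo = max(1, k-halfWindow);        % clip the window at the start
    hi = min(n, k+halfWindow);        % clip the window at the end
    reference(k) = mean(vector(lo:hi));
end

maxDiff = max(abs(vector_average - reference));   % should be at round-off level
```

The loop is slower than conv but is easy to read, which makes it a convenient reference when changing halfWindow.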
dpb on 8 Oct 2014
Edited: dpb on 8 Oct 2014
@Mohammad: That's about as good as it gets for the OP's definition of the end effects. The question is why one would use that definition; it's pretty unusual.
@Mayowa: Why wouldn't one include the higher points in the initial point, though?
I suggest you look at the results of
c=conv(v,ones(1,N)/N,'same');
and if you don't want the first few points then take something like (with hW the half-window)
a=c(1+fix(hW/2):end-fix(hW/2));
or something similar. It won't be identical to the computation above but I'd wager it'll be close enough and it's a much less convoluted (so to speak :) ) implementation.
Mohammad Abouali on 9 Oct 2014
Edited: Mohammad Abouali on 9 Oct 2014
If we use 'same' in the convolution, then depending on the window size a couple of items at the beginning and at the end are going to be averaged with zeros.
Let's say you have temperatures
300, 301, 302, ...
If halfWindow is 1 then the first entry would be calculated like this
sum([0 300 301].*[1/3 1/3 1/3])=200.333
So the average goes down artificially.
One approach is to shrink the window size in the end zones (as we did in the code above); another approach is to use the padarray function to replicate the boundary values. The second one can still be questionable (it depends on the problem and the boundary conditions).
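To make the two boundary treatments concrete, here is a small sketch on a made-up temperature series. It replicates the end values by plain concatenation rather than by calling padarray, so that it does not depend on padarray being available; T, k, zeroPadded and replicated are illustrative names only.

```matlab
% Sketch: zero padding versus replicating the boundary values.
T = [300 301 302 303 304];                        % hypothetical temperature series
k = ones(1,3)/3;                                  % 3-point averaging kernel (halfWindow = 1)
zeroPadded = conv(T, k, 'same');                  % ends are averaged with an implicit 0
replicated = conv([T(1) T T(end)], k, 'valid');   % repeat each end value once by hand
disp(zeroPadded(1))                               % about 200.33 = (0+300+301)/3
disp(replicated(1))                               % about 300.33 = (300+300+301)/3
```

Replication keeps the edge values in a physically plausible range, but it weights the boundary sample twice, which is why it may still be questionable for some problems.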
Yabbut... :) I just suggested to OP he investigate 'cuz likelihood is it won't make a lick of difference on what those 2-5 or so BC points are in a series that is several K in length. And, I suggested he just throw perhaps half of those away to minimize the amount of actual zero padding that does occur.
Of course, he can just use the 'valid' option and not have any end effects.... :)

More Answers (3)

Stephen23 on 8 Oct 2014
Edited: Stephen23 on 8 Oct 2014
You could try convolution:
>> A = 1:8;
>> [mean(A(1:2)),conv(A,[1,1,1],'valid')/3,mean(A(end-1:end))]
The end conditions require some special handling (and are of dubious mathematical value), so the simpler solution is to output only the positions where three values are averaged:
>> conv(A,[1,1,1],'valid')/3
Another advantage of this is that the length of your moving window can be adjustable (not hardcoded):
>> N = 4;
>> conv(A,ones(1,N),'valid')/N

2 Comments

Stephen23 on 8 Oct 2014
Edited: Stephen23 on 8 Oct 2014
Mayowa: a convolution is like a "moving window" multiply.
You can think of it like two vectors that are aligned together at one end: the corresponding elements are multiplied and the products summed, then the vectors are shifted by one position relative to each other, and the process repeats until they are aligned at the other end. Dividing each resulting sum by the length of the shorter "window" vector gives you the means of the original values of the longer vector.
The end values (which you defined as special cases) are calculated separately using the simple mean function.
Except that you have to note that in convolution the second vector is flipped. For symmetric vectors (as in this case) it doesn't matter, but if you are using a non-symmetric kernel then it does.
The operation that does not flip the kernel is cross-correlation.
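The flipping can be seen with a small non-symmetric kernel. The sketch below uses only conv and fliplr; the values of a and k are arbitrary, and manual and symmetricSame are illustrative names.

```matlab
% Sketch: convolution flips the kernel, a sliding dot product does not.
a = [1 2 3 4];
k = [1 0 -1];                                  % non-symmetric kernel
c = conv(a, k, 'valid');                       % kernel is flipped: gives [2 2]
r = conv(a, fliplr(k), 'valid');               % pre-flip to cancel it: gives [-2 -2]
manual = [sum(a(1:3).*k), sum(a(2:4).*k)];     % explicit sliding dot product, also [-2 -2]

% For a symmetric averaging kernel the flip makes no difference.
ksym = [1 1 1]/3;
symmetricSame = isequal(conv(a, ksym, 'valid'), conv(a, fliplr(ksym), 'valid'));
```

For moving averages the kernel is symmetric, so the distinction only matters once the weights differ from side to side.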

What you're asking, except for the edges, is just a simple convolution by the vector [1/3 1/3 1/3]:
a = randi(100, 1, 20) % for example
b = conv(a, [1/3 1/3 1/3], 'same');
% now adjust for the edges, using the original data a (not the smoothed b):
b(1) = (a(1) + a(2)) / 2;
b(end) = (a(end) + a(end-1)) / 2

1 Comment

Thanks for the answer. In a situation where instead of 3 I have 9, does that imply I have to adjust the first and last 4 elements? Is there a way to do that without repeating the last 2 lines 4 times?
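One possible way to avoid per-element edge fixes, not proposed in the thread itself, is to divide the 'same' convolution of the data by the 'same' convolution of a vector of ones. The second convolution counts how many real samples fall inside each window, so the edges get divided by the smaller count automatically. A minimal sketch, assuming an odd window length w; kernel, windowSum and windowCount are hypothetical names:

```matlab
% Sketch: edge-corrected moving average without a loop (odd window length assumed).
a = randi(100, 1, 20);                               % example data, as in the answer
w = 9;                                               % window length: 4 neighbours on each side
kernel = ones(1, w);
windowSum = conv(a, kernel, 'same');                 % window sums, zeros assumed past the ends
windowCount = conv(ones(size(a)), kernel, 'same');   % number of real samples in each window
b = windowSum ./ windowCount;                        % mean over the samples that actually exist
```

Here b(1) is the mean of a(1:5), b(2) the mean of a(1:6), and so on until the full 9-point window fits. Newer MATLAB releases also provide movmean, whose default endpoint handling shrinks the window in a similar way; check the documentation for your release before relying on it.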

dpb on 8 Oct 2014
Edited: dpb on 8 Oct 2014
>> [mean(v(1:2)) conv(v,[1 1 1]/3,'valid') mean(v(end-1:end))]
ans =
1.5000 2.0000 3.0000 4.0000 5.0000 6.0000 7.0000 7.5000
More general for N points is
a=[mean(v(1:N-1)) conv(v,ones(1,N)/N,'valid') mean(v(end-(N-2):end))];

2 Comments

Stephen23 on 8 Oct 2014
Edited: Stephen23 on 8 Oct 2014
This generates the error "Error: Unexpected MATLAB expression.", pointing at that space character. An extra closing bracket after the mean is required.
Although this still misses the requirement "...for the first and the last numbers the average will be..."
Missed the closing paren on the mean(). The extension for the end effects ought to be self-evident, although I didn't write it, granted.
Edited Answer to correct oversight and typo...

Asked: 8 Oct 2014

Commented: dpb on 9 Oct 2014
