MATLAB Answers

Efficient moving average of scattered data

Chad Greene on 28 Jun 2016
Answered: Chris Turnes on 9 Mar 2017
I have some scattered data and I'd like to compute something similar to a moving average, where I average all values within some radius of each point. I can do this with a loop, but I'd like a more efficient approach. Any ideas?
Here's a working example I'd like to make more efficient:
x = randi(100,45,1) + 20+3*randn(45,1) ;
y = 15*sind(x) + randn(size(x)) + 3;
figure
plot(x,y,'bo')
radius = 10;
ymean = NaN(size(x));
for k = 1:length(x)
    % Indices of all points within the specified radius:
    ind = abs(x-x(k))<radius;
    % Mean of y values within the radius:
    ymean(k) = mean(y(ind));
end
hold on
plot(x,ymean,'ks')
legend('scattered data','radial average','location','southeast')

  1 Comment

Walter Roberson on 28 Jun 2016
When I read the title I thought you might mean "sparse", and was thinking about how I might do an efficient moving average on sparse data.


Accepted Answer

Chad Greene on 30 Jun 2016
I turned this into a generalized function called scatstat1, which is available on the File Exchange.


More Answers (2)

Chris Turnes on 9 Mar 2017
If you can upgrade to R2017a, this functionality can now be achieved through the 'SamplePoints' name-value pair in the moving statistics. For your example, you would do something like movmean(y, 2*radius, 'SamplePoints', x); (though you'd need to sort your x values first).
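A minimal sketch of this suggestion applied to the data from the question, assuming R2017a or later. The variable names `xs`, `ys`, `order`, and `ymean_sorted` are illustrative and not part of the original answer. The sample points must be sorted and unique, so x is sorted first and the result is mapped back to the original order afterward. Points lying exactly one radius away may be handled differently at the window edges than by the strict `<` comparison in the loop.

```matlab
% Example data from the question
x = randi(100,45,1) + 20 + 3*randn(45,1);
y = 15*sind(x) + randn(size(x)) + 3;
radius = 10;

% SamplePoints must be sorted (and unique), so sort x and reorder y to match
[xs, order] = sort(x);
ys = y(order);

% Moving mean over a window of total width 2*radius around each sample point
ymean_sorted = movmean(ys, 2*radius, 'SamplePoints', xs);

% Map the result back to the original (unsorted) order of x
ymean = NaN(size(x));
ymean(order) = ymean_sorted;
```

After this, `plot(x,ymean,'ks')` can be used exactly as in the question.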



Walter Roberson on 28 Jun 2016
Use pdist() to get all of the distances simultaneously, compare them to the radius, and store the resulting mask. Multiply the mask by repmat() of the y values and sum along a dimension. Sum the mask along the same dimension and divide the value sum by that count. The result should be the moving average.
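The masking steps above can be sketched as follows. This is only an illustration under stated assumptions, not the answerer's exact code: because x is one-dimensional, it computes the pairwise distances by direct subtraction instead of pdist() (which returns the distances as a condensed vector that would need squareform() to become a matrix), and it uses a matrix product in place of repmat() followed by a sum. Implicit expansion requires R2016b or later.

```matlab
% Example data from the question
x = randi(100,45,1) + 20 + 3*randn(45,1);
y = 15*sind(x) + randn(size(x)) + 3;
radius = 10;

% N-by-N logical mask: mask(i,j) is true when point j lies within radius of point i.
% Implicit expansion (R2016b+); on older releases use bsxfun(@minus, x, x.') instead.
mask = abs(x - x.') < radius;

% Row i of the product is the sum of the y values within radius of point i;
% dividing by the row count of the mask turns that sum into a mean.
ymean = (double(mask) * y) ./ sum(mask, 2);
```

Every point lies within its own radius, so the count is never zero. The mask needs memory proportional to N^2, which becomes a practical limit for large N.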

  3 Comments

Chad Greene on 30 Jun 2016
Interesting idea! I got your solution working, but for N = 20,000 points the pdist function takes a while. As it turns out, looping is faster.
Walter Roberson on 30 Jun 2016
I wonder if looping pdist2() would be efficient? Eh, it probably just adds unnecessary overhead to a simple Euclidean calculation.
Chad Greene on 1 Jul 2016
It also adds a Statistics Toolbox dependency. I'll have to keep pdist in mind for future applications though. Thanks for the suggestion!
