How do I interpolate 1D-data if I do not have unique values?

I have various size distributions in the same format with fixed size classes, and sometimes cummulative_share contains the value 0 more than once. Then I cannot use interp1 to interpolate the size at 50% share. How can I fix that issue? Thanks for any help.
Example:
>> size_classes= [0.1 ; 0.2 ; 0.3 ; 0.4]
cummulative_share= [0 ; 0 ; 10; 100]
d50=interp1(cummulative_share, size_classes, 0.5)
size_classes =
0.1000
0.2000
0.3000
0.4000
cummulative_share =
0
0
10
100
Error using griddedInterpolant
The grid vectors must contain unique points.
Error in interp1 (line 161)
F = griddedInterpolant(X,V,method);

1 Comment

Thank you! Your method seems to work well for my case. Cheers.


 Accepted Answer

The usual way to deal with that is to create an increasing vector of eps values and add it to the vector with the duplicates.
This assignment:
cummulative_share = cumsum(ones(size(cummulative_share)))*eps + cummulative_share;
will do exactly that.
Example:
size_classes= [0.1 ; 0.2 ; 0.3 ; 0.4];
cummulative_share= [0 ; 0 ; 10; 100];
cummulative_share = cumsum(ones(size(cummulative_share)))*eps + cummulative_share;
d50=interp1(cummulative_share, size_classes, 0.5)
d50 =
205.0000e-003

7 Comments

That solves it well. Thank you.
Now I have the same issue again, this time with the value 100 appearing multiple times in the vector "cummulative_share". Is there a simple way around the fact that eps+100 evaluates to 100?
1*eps+100
ans =
100
My approach should work the same way with your repeated values of 100, although you may have to increase the offset value.
This appears to work:
v = [0; 0; 0; 100; 100; 100; 100] % Initial Repeated Vector
ve = cumsum(ones(size(v))).*v*eps % Scaled Offset For Non-Zero Elements
ve = ve + cumsum(ones(size(v))).*(v==0)*eps % Add Scaled Offset For Zero Elements
vi = v + ve % Interpolation Vector
test1 = vi - v % Check Result (Delete)
test2 = diff(vi) % Check Incremental Increase (Delete)
The idea is to add an ‘infinitesimal’ incremental offset. The constraint is that ‘infinitesimal’ has to be within the range of double-precision floating-point addition, and has to work for both zero and non-zero repeated values. It should not affect the interpolation results, since few (if any) real-world data will have that level of precision. The offset addition could be created as a function file (with ‘v’ as the input and ‘vi’ as the output) if you have this problem frequently with your data.
I kept the two ‘test’ assignments in my posted code to demonstrate the incremental addition. Delete them in the code you use.
Also, the offset values are small enough (by design in my approach here) that it will not be noticeable with most format settings.
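A minimal sketch of that function idea, written as an anonymous function so it runs inside a script. The name makeUnique is hypothetical and does not appear in the code above. It folds the two offset lines into one expression and assumes v is a non-negative column vector sorted in ascending order.

```matlab
% Hypothetical helper (the name makeUnique is not from the thread).
% Assumes v is a non-negative column vector sorted in ascending order.
% Same arithmetic as the two 've' lines: index .* v * eps for non-zero
% elements, index * eps for zero elements.
makeUnique = @(v) v + cumsum(ones(size(v))) .* (v + (v == 0)) * eps;

v  = [0; 0; 0; 100; 100; 100; 100];   % repeated values at both ends
vi = makeUnique(v);                   % copy of v with tiny increasing offsets
isIncreasing = all(diff(vi) > 0);     % check that no duplicates remain
```

The result vi can then be passed to interp1 in place of v.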
Adding eps to your elements to make them non-dups will only cause terribly nasty things to happen to your interpolant if you use a spline interpolant.
Note that this is NOT the usual way to solve the problem. In fact, it can be a very bad idea.
@John — I don’t doubt that you’re correct. I know of no other way to approach this problem.
No, I think he's right: a single eps is too small and gets rounded off when added to bigger values:
>> (10+eps)==10
ans =
logical
1
I used 1e-9 in my use case, but that only works with numbers < 1000. The offset should not be more than about 15 orders of magnitude smaller than the values it is added to, because that is roughly the precision of double-precision floats.
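One way to avoid picking a fixed offset by hand is MATLAB's eps(x), which returns the floating-point spacing at the magnitude of x. The sketch below is only an illustration with made-up variable names (k, vi), and it assumes v is sorted in ascending order. It makes the points distinct for interp1, but the objections to near-duplicate points raised in this thread still apply.

```matlab
% eps with no argument is the spacing at 1.0, so it vanishes next to 10.
lostOffset = (10 + eps) == 10;        % true: the offset is rounded away
keptOffset = (10 + eps(10)) ~= 10;    % true: eps(10) is the spacing at 10

% Sketch: shift the k-th element by k spacings of its own magnitude.
% Assumes v is sorted ascending; k and vi are illustrative names.
v  = [0; 0; 10; 100; 100; 100];
k  = (0:numel(v)-1)';                 % 0, 1, 2, ... as a column
vi = v + k .* eps(v);                 % offsets scale with each value
```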


More Answers (1)

When you have duplicates, so multiple x's with different values for y, interpolation makes no sense since an interpolant requires that it returns the value of y at the given x. If you have multiple y's there, then what do you return?
Even if all of the y values at that point are the same, interpolation tools will not be easily able to diagnose this fact. So they get upset at you.
If you add eps to the x's to make them distinct, then interpolation will be numerically difficult. There will be nasty things that happen. Worse, if you are using a spline method to interpolate, expect insanity, because the spline interpolant will be wildly oscillatory. Adding eps is a BAD idea. Sorry, but it is.
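To illustrate the spline problem with the data from the question, the sketch below applies the eps offset from the accepted answer and compares linear and spline interpolation at an arbitrarily chosen query point of 5. The variable names xe, yLinear and ySpline are made up for this example.

```matlab
x  = [0; 0; 10; 100];                 % cummulative_share from the question
y  = [0.1; 0.2; 0.3; 0.4];            % size_classes from the question
xe = x + cumsum(ones(size(x))) * eps; % eps offsets as in the accepted answer

yLinear = interp1(xe, y, 5);            % linear: stays between 0.2 and 0.3
ySpline = interp1(xe, y, 5, 'spline');  % spline: dominated by the huge slope
                                        % between the two nearly equal points
```

The first two points differ by 0.1 in y over a distance of eps in x, so the cubic has to reproduce a slope on the order of 1e14 there, and it carries that far away from those points.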
Instead, recognize that if you have multiple y values for a given x, that interpolation is impossible. Instead, just form the average value of y for any reps. Replace the multiple reps with a SINGLE value, with the average y at that point. This is now fully consistent with interpolation. It is as good as you can do. In fact, it is this scheme that is usually recommended by those who do interpolation. (Spoken as a person who worked with interpolation tools for 29 years, and was the go-to person for this class of problems, tasked with providing those tools to a large corporation that used interpolation extensively.)
I've provided a tool (consolidator, found on the File Exchange) that does exactly what you need, replacing all replicates with a single point.
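For readers who want to try the averaging idea with built-in functions only, here is a minimal sketch using unique and accumarray. It is not the consolidator tool itself. The names xu, yu and idx are illustrative, and the query value of 50 assumes the shares are given in percent.

```matlab
x = [0; 0; 10; 100];                  % cummulative_share with a repeated 0
y = [0.1; 0.2; 0.3; 0.4];             % size_classes

[xu, ~, idx] = unique(x);             % distinct x values and group index
yu = accumarray(idx, y, [], @mean);   % average y within each group

d50 = interp1(xu, yu, 50);            % size at 50 % cumulative share
```

Here the two points at x = 0 collapse into a single point with y = 0.15, and interp1 runs without the uniqueness error.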
