# Interpolate percentages - maintain sum of rows =1, maintain all values >0

Edward Byers on 20 Apr 2013
Hi
I would like to interpolate data that are percentages, that change at various points in time.
For each row (year) in the data, the columns must sum to 100%,
i.e.
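| Year | A | B | C | D | Sum |
|------|----|----|----|----|-----|
| 1990 | 10 | 40 | 40 | 10 | 100 |
| 1995 | 5 | 35 | 32 | 18 | 100 |
| 1998 | 0 | 32 | 44 | 24 | 100 |
| 2005 | 0 | 37 | 55 | 8 | 100 |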
The data above is just an example.
I have been using interp1, but I understand that it interpolates each column independently, so there is no linkage between the columns.
When I use linear interpolation, this works fine: all points are at or above 0 and each row sums to 100.
Using other methods:
• cubic - all points are at or above 0, but the rows do not sum to 100.
• spline - some points go negative, but the rows sum to 100.
The problem is that the approximation oscillates around 0
What other interpolation method can I use (that isn't linear) where I can set conditions to prevent the approximation going below 0?
Thanks
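The behaviour described above can be reproduced with a short script. This is only a sketch: it uses the example table from the question, picks yearly query points as an illustration, and uses 'pchip' in place of 'cubic' on the assumption that the two were the same method in interp1 at the time of the question.

```matlab
% Example data from the question: rows are years, columns are shares A-D (%)
years = [1990; 1995; 1998; 2005];
P = [10 40 40 10;
      5 35 32 18;
      0 32 44 24;
      0 37 55  8];
yq = (1990:2005)';                      % yearly query points (illustrative choice)

% 'pchip' is assumed here to be what 'cubic' meant in older interp1 releases
methods = {'linear', 'pchip', 'spline'};
for k = 1:numel(methods)
    Q = interp1(years, P, yq, methods{k});   % each column is interpolated independently
    fprintf('%-6s  min value = %8.3f   max |row sum - 100| = %.3g\n', ...
        methods{k}, min(Q(:)), max(abs(sum(Q, 2) - 100)));
end
```

The linear and spline interpolants depend linearly on the data, which is why their row sums stay at 100. The pchip interpolant adjusts its slopes nonlinearly to preserve shape, so it avoids undershoot but gives up the row-sum property.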
Mel on 21 Apr 2013
If you are trying to predict future fuel usage, you may want to include other factors. For example, unless they can make non-diesel generators and ship them to the Arctic fairly inexpensively, you will always have a baseline diesel amount (I know it is not a lot of people, but Canada's Arctic pretty much runs off of diesel at the moment). If you are looking at a particular area (instead of the world), you may want to consider other factors in that area.
If your rows do not sum to 100, you could always multiply them by a scalar quantity so that they do (sort of like normalizing the sum).
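As a rough sketch of that rescaling idea, one could interpolate with a shape-preserving method and then renormalize each row. The data is the example table from the question; the yearly query grid and the choice of 'pchip' are assumptions made for illustration.

```matlab
years = [1990; 1995; 1998; 2005];
P = [10 40 40 10;
      5 35 32 18;
      0 32 44 24;
      0 37 55  8];
yq = (1990:2005)';                       % yearly query points (illustrative choice)

Q  = interp1(years, P, yq, 'pchip');     % shape-preserving, column by column
Q  = max(Q, 0);                          % guard against tiny negative round-off
rs = sum(Q, 2);                          % row sums before rescaling
Qn = 100 * bsxfun(@rdivide, Q, rs);      % scale every row so it sums to 100
```

At the original data years the rows already sum to 100, so the scale factor there is 1 and the rescaled curves still pass through the data. Between data years the rescaling bends the curves slightly, so the result is no longer a pure pchip interpolant.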
Maybe look at what shape your initial data has and see what form fits it best (semilinear, linear, quasilinear, quadratic, exponential, etc.) To keep data from following an exact curve (so it fluctuates from year to year), you can add in random normal noise with an appropriate std.
You could also set up a projection scheme using differential equations, such that if the usage of a particular fuel went up a lot recently it will continue to or not continue to, depending on how you set up your equations.

Matt J on 21 Apr 2013
Edited: Matt J on 21 Apr 2013
Raised cosine interpolation would be a possibility. For data xi spaced apart by 1, this would be
y(x) = sum_i y_i h(x-xi)
where
h(x) = 0.5*cos(pi*x)+0.5, abs(x)<=1
     = 0, otherwise
so that h(0)=1 and h(x)=0 at abs(x)=1.
but AFAIK there is no pre-packaged MATLAB function that offers it. You would have to code it from scratch. What's wrong with linear interpolation? What do you mean by "oscillates around zero"?
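A minimal sketch of coding it from scratch for unit-spaced data, taking the kernel as h(x) = 0.5*cos(pi*x) + 0.5 on abs(x) <= 1 so that h(0) = 1 and h(1) = h(-1) = 0. The sample values and query grid below are made up for illustration.

```matlab
% Raised-cosine kernel: 1 at x = 0, 0 at |x| = 1, and 0 outside [-1, 1]
h = @(x) (0.5*cos(pi*x) + 0.5) .* (abs(x) <= 1);

xi = 0:4;                          % unit-spaced sample locations (illustrative)
yi = [10 5 0 0 3];                 % nonnegative sample values (illustrative)
x  = linspace(0, 4, 81);           % query points

% y(x) = sum_i y_i * h(x - x_i), evaluated for all query points at once
D = bsxfun(@minus, x(:), xi);      % numel(x)-by-numel(xi) matrix of x - x_i
y = h(D) * yi(:);                  % weighted sum over the samples
```

Between two neighbouring samples only two kernel values are nonzero, and they are nonnegative and add up to 1. Each interpolated value is therefore a weighted average of its two neighbours, which keeps it at or above 0 and, when applied column by column, keeps row sums unchanged.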
Matt J on 21 Apr 2013
For non-uniformly spaced data the extension is fairly simple. E.g., interpolating at x with x1<=x<=x2, one would do
delta = x2 - x1;
weight = 0.5*cos(pi*(x - x1)/delta) + 0.5;   % 1 at x = x1, 0 at x = x2
y = weight*y1 + (1 - weight)*y2;
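Applied to the example table from the question, the non-uniform version might look like the sketch below. The weight is taken as 0.5*cos(pi*(x - x1)/delta) + 0.5, which is 1 at x1 and 0 at x2. The yearly query grid and the interval lookup via bsxfun are illustrative choices, not part of the original suggestion.

```matlab
years = [1990; 1995; 1998; 2005];
P = [10 40 40 10;
      5 35 32 18;
      0 32 44 24;
      0 37 55  8];
yq = (1990:2005)';                               % yearly query points (illustrative)

% Index k of the bracketing interval [years(k), years(k+1)] for each query
k  = sum(bsxfun(@ge, yq, years(1:end-1).'), 2);  % values in 1..numel(years)-1
x1 = years(k);
x2 = years(k + 1);

% Raised-cosine weight: 1 at x1, 0 at x2, with zero slope at both ends
w = 0.5*cos(pi*(yq - x1)./(x2 - x1)) + 0.5;

% Blend the two bracketing rows; all columns share the same weight
Q = bsxfun(@times, w, P(k, :)) + bsxfun(@times, 1 - w, P(k + 1, :));
```

Because every column in a row uses the same weight, and the weight lies between 0 and 1, each interpolated row is a convex combination of two data rows. That keeps every value at or above 0 and every row sum at 100. The trade-off is that each curve is flat (zero slope) at the data years.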