Fitting Data with a Discontinuity

This example shows how to fit an equation to data which has a sudden discontinuity using the Curve Fitting Toolbox.

Setting Up the Data

Load in the sample data, which is noisy and contains a discontinuity.

sample = sample';
x = (1:numel(sample))';

clf
plot(x,sample)
set(gca,'Ylim',[-.05 .05])
legend({'data'}) Smoothing the Data

Smooth the data using a moving average filter of length 200. Please note that CUMSUM can be used instead of using a moving average filter.

y = smooth(sample, 10, 'rlowess');
filt_len = 200;
a = 1;
b = (1/filt_len)*ones(1,filt_len);
mov_avg = filter(b,a,y);

hold on
plot(mov_avg,'r','LineWidth',2)
legend({'data','moving average of length 200'}) Finding the Discontinuity

Find the difference between element(i) and element (i+100) because the changes are very gradual, but there is continuous decrease when there is sudden drop as shown in the red line.

diff100 = mov_avg(1:end-100) - mov_avg(101:end);
[~,p] = max(abs(diff100));

disp(p)
1498

Creating the Heaviside Step Function

The Heaviside step function is defined: It's implemented here as heaviside.m:

type heaviside
function result = heaviside(input)
result = input;
result(input > 1) = 1;
result(input < 1) = 0;
result = result(:);

Creating the Custom Equation

Create a custom equation which incorporates the Heaviside tuned to p.

equation = ['a*heaviside(x-', num2str(p),')*exp(-b*(x-', num2str(p),'))+c'];

disp(equation)
a*heaviside(x-1498)*exp(-b*(x-1498))+c

Fitting the Data

st_ = [0 0 0];
outliers = excludedata(x,sample,'range',[-0.03 0.03]);
fo_ = fitoptions('method','NonlinearLeastSquares',...
'Lower',[-1 -1 -1],...
'Upper',[1 1 1], ...
'Startpoint',st_, ...
'exclude', outliers);
ft_ = fittype(equation, ...
'dependent',{'y'},'independent',{'x'},...
'coefficients',{'a', 'b', 'c'}); % Fit this model using new data
[cf, gof] = fit(x,sample,ft_,fo_);

plot(x, cf(x), 'g', 'linewidth',2);
legend({'data', 'moving average of length 200','fitted curve'}) An Alternative: Using an Continuous Approximation of the Heaviside

Fitting to the Heaviside function is tough because it produces a discontinuous result. An approach to model it could be to use a continuous replacement.

alternativeEquation = 'a*(1+.5*erf((x-p)/s))*exp(-b*(x-p))+c';

disp(alternativeEquation)
a*(1+.5*erf((x-p)/s))*exp(-b*(x-p))+c

Notice that heaviside(x-p) is replaced with (1+.5*erf((x-p)/s))).

This also goes from 0 up to 1, it rises most quickly at p, and it approaches the Heaviside function as s approaches 0. A large starting value for s will help the algorithm along.

ft2 = fittype(alternativeEquation, ...
'dependent',{'y'},'independent',{'x'}, ...
'coefficients',{'a','b','c','p','s'});
fo2 = fitoptions('method','NonlinearLeastSquares',...
'Lower',[-1 -1 -1],...
'Upper',[1 1 1], ...
'Startpoint',[0 0 0 1000 1000], ...
'exclude', outliers);
[cf2, gof2] = fit(x,sample,ft2,fo2);

clf
plot(x, sample);
hold on
plot(x,cf2(x),'g','linewidth',2);
set(gca,'Ylim',[-.05 .05])
legend({'data','fitted curve'}) 