Fitting Data with a Discontinuity

This example shows how to fit an equation to data which has a sudden discontinuity using the Curve Fitting Toolbox.


Setting Up the Data

Load in the sample data, which is noisy and contains a discontinuity.

load sample_data sample;
sample = sample';
x = (1:numel(sample))';

set(gca,'Ylim',[-.05 .05])

Smoothing the Data

Smooth the data using a moving average filter of length 200. Please note that CUMSUM can be used instead of using a moving average filter.

y = smooth(sample, 10, 'rlowess');
filt_len = 200;
a = 1;
b = (1/filt_len)*ones(1,filt_len);
mov_avg = filter(b,a,y);

hold on
legend({'data','moving average of length 200'})

Finding the Discontinuity

Find the difference between element(i) and element (i+100) because the changes are very gradual, but there is continuous decrease when there is sudden drop as shown in the red line.

diff100 = mov_avg(1:end-100) - mov_avg(101:end);
[~,p] = max(abs(diff100));


Creating the Heaviside Step Function

The Heaviside step function is defined:

$$H(x) = \displaystyle \left\{\begin {array}{ll} 0 & \mbox{when}\,\, x< 0, \\
1/2 & \mbox{when}\,\, x= 0,\\
1 & \mbox{when}\,\, x> 0.\\
\end{array} \right.$$

It's implemented here as heaviside.m:

type heaviside
function result = heaviside(input)
result = input;
result(input > 1) = 1;
result(input < 1) = 0;
result = result(:);

Creating the Custom Equation

Create a custom equation which incorporates the Heaviside tuned to p.

equation = ['a*heaviside(x-', num2str(p),')*exp(-b*(x-', num2str(p),'))+c'];


Fitting the Data

st_ = [0 0 0];
outliers = excludedata(x,sample,'range',[-0.03 0.03]);
fo_ = fitoptions('method','NonlinearLeastSquares',...
    'Lower',[-1 -1 -1],...
    'Upper',[1 1 1], ...
    'Startpoint',st_, ...
    'exclude', outliers);
ft_ = fittype(equation, ...
    'coefficients',{'a', 'b', 'c'}); % Fit this model using new data
[cf, gof] = fit(x,sample,ft_,fo_);

plot(x, cf(x), 'g', 'linewidth',2);
legend({'data', 'moving average of length 200','fitted curve'})

An Alternative: Using an Continuous Approximation of the Heaviside

Fitting to the Heaviside function is tough because it produces a discontinuous result. An approach to model it could be to use a continuous replacement.

alternativeEquation = 'a*(1+.5*erf((x-p)/s))*exp(-b*(x-p))+c';


Notice that heaviside(x-p) is replaced with (1+.5*erf((x-p)/s))).

This also goes from 0 up to 1, it rises most quickly at p, and it approaches the Heaviside function as s approaches 0. A large starting value for s will help the algorithm along.

ft2 = fittype(alternativeEquation, ...
    'dependent',{'y'},'independent',{'x'}, ...
fo2 = fitoptions('method','NonlinearLeastSquares',...
    'Lower',[-1 -1 -1],...
    'Upper',[1 1 1], ...
    'Startpoint',[0 0 0 1000 1000], ...
    'exclude', outliers);
[cf2, gof2] = fit(x,sample,ft2,fo2);

plot(x, sample);
hold on
set(gca,'Ylim',[-.05 .05])
legend({'data','fitted curve'})