Second derivative from a smoothing spline fit

Question

Angelavtc on 5 Apr 2020

0
Link

Direct link to this question

https://au.mathworks.com/matlabcentral/answers/515523-second-derivative-from-a-smoothing-spline-fit

Commented: darova on 6 Apr 2020

Accepted Answer: darova

data.xlsx

Open in MATLAB Online

Hi!

I have the following fit curve that I approximate using the Curve Fitting toolbox:

And I want to find the points (Volume, Price) where the curve changes from concave to convex. Is there an easy way to find this?

Here is my code and my data file.

Thanks in advance!

%% Fit: 'Jan 2012'.
[xData, yData] = prepareCurveData( x_jan_12_s, Price_jan_12_s );
% Set up fittype and options.
ft = fittype( 'smoothingspline' );
excludedPoints = excludedata( xData, yData, 'Indices', [2 276] );
opts = fitoptions( 'Method', 'SmoothingSpline' );
opts.SmoothingParam = 8.24530273269464e-08;
opts.Exclude = excludedPoints;
% Fit model to data.
[fitresult{2}, gof(2)] = fit( xData, yData, ft, opts );
% Plot fit with data.
figure( 'Name', 'Jan 2012 Supply France Peak' );
plot( fitresult{2}, 'k');
% Label axes
xlabel( 'Volume [MWh]', 'Interpreter', 'none' );
ylabel( 'Price', 'Interpreter', 'none' );
%Also this curve fit can be read as:
f = fit(xData, yData,'smoothingspline','SmoothingParam',8.24530273269464e-08)

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

darova on 5 Apr 2020

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/515523-second-derivative-from-a-smoothing-spline-fit#answer_424144

Open in MATLAB Online

Here you go

n = 50;
x = linspace(0,10,n);
y = sin(x).*x;
d2y = diff(y,2)./(x(2)-x(1))/2;          % second derivative
ix = abs( diff(sign(d2y)) ) > 0.01;      % find changes of sign
ind = 1 + find(ix);                      % appropriate indices in X,Y
fv.vertices = [x(2:end-1); y(2:end-1)]'; % without border nodes
fv.faces = [1:n-3; 2:n-2]';              % connections
fv.faceVertexCData = d2y(:);             % color in nodes
fv.edgecolor = 'interp';                 % interpolate colors
plot(x(ind),y(ind),'or')
patch(fv)
axis equal
colorbar

6 Comments
Show 4 older commentsHide 4 older comments

Angelavtc on 6 Apr 2020

what do you mean @darova? could you please be more precise? Sorry, I am kind of new using matlab and everything is really useful to learn more. Thanks in advance!

darova on 6 Apr 2020

Open in MATLAB Online

I mean that if you want only two colors: negative and positive, you can use sign function (returns only -1,0 and +1)

fv.faceVertexCData = sign(d2y(:));             % color in nodes (only -1 and 1 two colors)

Maybe you want to turn off color interpolation also

fv.edgecolor = 'flat'; % interpolate colors

Sign in to comment.

Answer 2

John D'Errico on 5 Apr 2020

1
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/515523-second-derivative-from-a-smoothing-spline-fit#answer_424198

Edited: John D'Errico on 5 Apr 2020

Open in MATLAB Online

It is important to recognize that IF you have the curve fitting toolbox, then there are various ways to accomplish that you were looking for, arguably better ways.

As well, be careful in how you use splines. The curve shown in your question seems to be one that classical splines will often fail on, since they will exhibit ringing. Not unlike the Gibbs phenomena I recall from working with Fourier series. Sharp transitions in a curve will cause serious problems when you want to model the data.

plot(t,p,'o-')

UGH. You have ONE data point down at the bottom, then a massive transition to a flat region. This is EXACTLY the kind of data where you do NOT want to use a smoothing spline!!!!!!!!

Note that bump at the top the initial transition in your curve as plotted? That is not part of your curve. It is spurious garbage, introduced by the spline. The spline fit down there is simply wrong, faiing to match any kind of behavior.

Are you kidding? I'm sorry, but you are kidding yourself in this. A smoothing spline is a terribly poor choice to fit that data, IF you include that first data point. It does very little smoothing in the rest of the curve, while introducing garbage at the bottom.

You would be far better off if you just completely dropped the first data point from any analysis. Surprisingly, it seems like you decided to drop the SECOND data point in your data? Given that the second data point is actually consistent with the rest of the data, this seems utterly strange.

I can possibly understand that you decided to drop the next to last data point, as the time stamp listed for that point seems a bit strange. At the same time, the curve is clearly starting to roll over at the top.

ft = fittype( 'smoothingspline' );
excludedPoints = excludedata( xData, yData, 'Indices', [1 276] );
opts = fitoptions( 'Method', 'SmoothingSpline' );
opts.SmoothingParam = 8.e-08;
opts.Exclude = excludedPoints;
[fitresult, gof] = fit( xData,yData, ft, opts );
plot(fitresult)
hold on
plot(xData,yData,'go')

Now, it appears you want to find the point where the curve transitions from concave to convex? I would point out there are actually several transitions in curvature here. Lets look at the second derivative. This is a spline. You don't need to use an approximate differentiation using a finite difference!!!! You have the curve fitting toolbox, and it can do the differentiation for you.

help cfit
  cfit   Curve Fit Object
 
    A cfit object encapsulates the result of fitting a curve to data.
    Use the FIT function or CFTOOL to create cfit objects.
 
    You can treat a cfit object as a function to make predictions or
    evaluation the curve at values of X. For example to predict what
    the population was in 1905, then use
 
        load census
        cf = fit( cdate, pop, 'poly2' );
        yhat = cf( 1905 )
 
    cfit methods:
        coeffvalues   - Get values of coefficients.
        confint       - Compute prediction intervals for coefficients.
        differentiate - Compute values of derivatives.
        feval         - Evaluate at new observations.
        integrate     - Compute values of integral.
        plot          - Plot a curve fit.
        predint       - Compute prediction intervals for a curve fit or for 
                        new observations.
        probvalues    - Get values of problem parameters.
 
    See also: fit, fittype, sfit.
    Documentation for cfit
>> help cfit/differentiate
 differentiate  Differentiate a fit result object.
    DERIV1 = differentiate(FITOBJ,X) differentiates the model FITOBJ at the
    points specified by X and returns the result in DERIV1. FITOBJ is a Fit
    object generated by the FIT or CFIT function. X is a vector. DERIV1 is
    a vector with the same size as X. Mathematically speaking, DERIV1 =
    D(FITOBJ)/D(X).
 
    [DERIV1,DERIV2] = differentiate(FITOBJ, X) computes the first and
    second derivatives, DERIV1 and DERIV2 respectively, of the model
    FITOBJ.
 
    See also cfit/integrate, fit, cfit.
    Documentation for cfit/differentiate

Differentiation is thus quite easy.

xdiff = linspace(xData(2),xData(end),500);
[~,fitdiff2] = differentiate(fitresult,xdiff);
plot(xdiff,fitdiff2,'-')

Ugh. What point are you trying to locate now? That transition point would be where the second derivative changes sign. So next, I'll add a reference line at zero, and then expand the y axis to show the behavior near y==0.

So, exactly where do you think this curve exhibits the behavior you want to see? Remember, a postive second derivative region is one where the curve is concave upwards. Even having dropped that first data point, you will not easily find the point you want to find. I think you are kidding yourself.

Derivative estimation is a noise amplification process, a classically ill-posed problem. Estimation of the second derivative is thus also classicly ill-posed, on steroids.

3 Comments
Show 1 older commentHide 1 older comment

John D'Errico on 6 Apr 2020

Edited: John D'Errico on 6 Apr 2020

Yes. The data you showed was just noisy enough that it becomes difficult to know cleanly where your curvature changes. Were your data more noisy yet, this becomes more difficult. Then you would need to use more smoothing. As I said, differentiation is a noise amplifier. It really does not matter that much how you differentiate the spline, except that if you use an approximate derivative, then you introduce MORE problems into the problem, because it is now only an approximation.

Anyway, it can become more a problem not of how you differentiate the spline, but of how you build the spline. Too little smoothing can yield a very noisy second derivative estimate, thus making it very difficult to locate what you want. Too much smoothing can blur things too greatly, making it difficult to find where this event happens.

Ill-posed problems are ones that can amplify even the smallest, tiniest noise in the data. Differentiation is exactly that, ill-posed in nature. If you think about it, the definition of a derivative has you divide a tiny change in y, by an also tiny change in x, thus deltay/deltax. That is clearly ill-posed. But now, you want to compute a second derivative, which is most easily viewed as the derivative of the first derivative. So you are now amplifying the noise in a process that already amplifies noise in your data.

This is why I say how you compute the second derivative might be of less importance than how you built the spline. At the same time, since it is so easy to compute the second derivative (splines really are easy to differentiate if you understand how to work with them) there is no reason to use an approximation, which can at best just muddy the waters more than they are already.

And finally, even in the example we have here, you need to see that garbage in the data is important to clean up, if you are trying to find a subtle signal in problem data. You also need to understand how spline modeling works, in the sense I discussed in my answer that splines don't handle rapid transients well. When faced with a rapid change in behavior, a spline will tend to tend to respond with ringing (rapid bell-like oscillations near the transient) in the model. That ringing can be significant, as we saw in the data you gave us.

Angelavtc on 6 Apr 2020

Edited: Angelavtc on 6 Apr 2020

Wonderful @John D'Errico, thanks for the explanation, it is really useful for me, specially because I dont have that level of experience using splines :) Have a great day!

Sign in to comment.

Second derivative from a smoothing spline fit

0 Comments
Show -2 older commentsHide -2 older comments

Accepted Answer

6 Comments
Show 4 older commentsHide 4 older comments

More Answers (1)

3 Comments
Show 1 older commentHide 1 older comment

See Also

Categories

Tags

Products

Community Treasure Hunt

Second derivative from a smoothing spline fit

0 Comments Show -2 older commentsHide -2 older comments

Accepted Answer

6 Comments Show 4 older commentsHide 4 older comments

More Answers (1)

3 Comments Show 1 older commentHide 1 older comment

See Also

Categories

Tags

Products

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

6 Comments
Show 4 older commentsHide 4 older comments

3 Comments
Show 1 older commentHide 1 older comment