Second derivative from a smoothing spline fit

50 views (last 30 days)
Hi!
I have the following fit curve that I approximate using the Curve Fitting toolbox:
And I want to find the points (Volume, Price) where the curve changes from concave to convex. Is there an easy way to find this?
Here is my code and my data file.
Thanks in advance!
%% Fit: 'Jan 2012'.
[xData, yData] = prepareCurveData( x_jan_12_s, Price_jan_12_s );
% Set up fittype and options.
ft = fittype( 'smoothingspline' );
excludedPoints = excludedata( xData, yData, 'Indices', [2 276] );
opts = fitoptions( 'Method', 'SmoothingSpline' );
opts.SmoothingParam = 8.24530273269464e-08;
opts.Exclude = excludedPoints;
% Fit model to data.
[fitresult{2}, gof(2)] = fit( xData, yData, ft, opts );
% Plot fit with data.
figure( 'Name', 'Jan 2012 Supply France Peak' );
plot( fitresult{2}, 'k');
% Label axes
xlabel( 'Volume [MWh]', 'Interpreter', 'none' );
ylabel( 'Price', 'Interpreter', 'none' );
%Also this curve fit can be read as:
f = fit(xData, yData,'smoothingspline','SmoothingParam',8.24530273269464e-08)

Accepted Answer

darova
darova on 5 Apr 2020
Here you go
n = 50;
x = linspace(0,10,n);
y = sin(x).*x;
d2y = diff(y,2)./(x(2)-x(1))/2; % second derivative
ix = abs( diff(sign(d2y)) ) > 0.01; % find changes of sign
ind = 1 + find(ix); % appropriate indices in X,Y
fv.vertices = [x(2:end-1); y(2:end-1)]'; % without border nodes
fv.faces = [1:n-3; 2:n-2]'; % connections
fv.faceVertexCData = d2y(:); % color in nodes
fv.edgecolor = 'interp'; % interpolate colors
plot(x(ind),y(ind),'or')
patch(fv)
axis equal
colorbar
  6 Comments
Angelavtc
Angelavtc on 6 Apr 2020
what do you mean @darova? could you please be more precise? Sorry, I am kind of new using matlab and everything is really useful to learn more. Thanks in advance!
darova
darova on 6 Apr 2020
I mean that if you want only two colors: negative and positive, you can use sign function (returns only -1,0 and +1)
fv.faceVertexCData = sign(d2y(:)); % color in nodes (only -1 and 1 two colors)
Maybe you want to turn off color interpolation also
fv.edgecolor = 'flat'; % interpolate colors

Sign in to comment.

More Answers (1)

John D'Errico
John D'Errico on 5 Apr 2020
Edited: John D'Errico on 5 Apr 2020
It is important to recognize that IF you have the curve fitting toolbox, then there are various ways to accomplish that you were looking for, arguably better ways.
As well, be careful in how you use splines. The curve shown in your question seems to be one that classical splines will often fail on, since they will exhibit ringing. Not unlike the Gibbs phenomena I recall from working with Fourier series. Sharp transitions in a curve will cause serious problems when you want to model the data.
plot(t,p,'o-')
UGH. You have ONE data point down at the bottom, then a massive transition to a flat region. This is EXACTLY the kind of data where you do NOT want to use a smoothing spline!!!!!!!!
Note that bump at the top the initial transition in your curve as plotted? That is not part of your curve. It is spurious garbage, introduced by the spline. The spline fit down there is simply wrong, faiing to match any kind of behavior.
Are you kidding? I'm sorry, but you are kidding yourself in this. A smoothing spline is a terribly poor choice to fit that data, IF you include that first data point. It does very little smoothing in the rest of the curve, while introducing garbage at the bottom.
You would be far better off if you just completely dropped the first data point from any analysis. Surprisingly, it seems like you decided to drop the SECOND data point in your data? Given that the second data point is actually consistent with the rest of the data, this seems utterly strange.
I can possibly understand that you decided to drop the next to last data point, as the time stamp listed for that point seems a bit strange. At the same time, the curve is clearly starting to roll over at the top.
ft = fittype( 'smoothingspline' );
excludedPoints = excludedata( xData, yData, 'Indices', [1 276] );
opts = fitoptions( 'Method', 'SmoothingSpline' );
opts.SmoothingParam = 8.e-08;
opts.Exclude = excludedPoints;
[fitresult, gof] = fit( xData,yData, ft, opts );
plot(fitresult)
hold on
plot(xData,yData,'go')
Now, it appears you want to find the point where the curve transitions from concave to convex? I would point out there are actually several transitions in curvature here. Lets look at the second derivative. This is a spline. You don't need to use an approximate differentiation using a finite difference!!!! You have the curve fitting toolbox, and it can do the differentiation for you.
help cfit
cfit Curve Fit Object
A cfit object encapsulates the result of fitting a curve to data.
Use the FIT function or CFTOOL to create cfit objects.
You can treat a cfit object as a function to make predictions or
evaluation the curve at values of X. For example to predict what
the population was in 1905, then use
load census
cf = fit( cdate, pop, 'poly2' );
yhat = cf( 1905 )
cfit methods:
coeffvalues - Get values of coefficients.
confint - Compute prediction intervals for coefficients.
differentiate - Compute values of derivatives.
feval - Evaluate at new observations.
integrate - Compute values of integral.
plot - Plot a curve fit.
predint - Compute prediction intervals for a curve fit or for
new observations.
probvalues - Get values of problem parameters.
See also: fit, fittype, sfit.
Documentation for cfit
>> help cfit/differentiate
differentiate Differentiate a fit result object.
DERIV1 = differentiate(FITOBJ,X) differentiates the model FITOBJ at the
points specified by X and returns the result in DERIV1. FITOBJ is a Fit
object generated by the FIT or CFIT function. X is a vector. DERIV1 is
a vector with the same size as X. Mathematically speaking, DERIV1 =
D(FITOBJ)/D(X).
[DERIV1,DERIV2] = differentiate(FITOBJ, X) computes the first and
second derivatives, DERIV1 and DERIV2 respectively, of the model
FITOBJ.
See also cfit/integrate, fit, cfit.
Documentation for cfit/differentiate
Differentiation is thus quite easy.
xdiff = linspace(xData(2),xData(end),500);
[~,fitdiff2] = differentiate(fitresult,xdiff);
plot(xdiff,fitdiff2,'-')
Ugh. What point are you trying to locate now? That transition point would be where the second derivative changes sign. So next, I'll add a reference line at zero, and then expand the y axis to show the behavior near y==0.
So, exactly where do you think this curve exhibits the behavior you want to see? Remember, a postive second derivative region is one where the curve is concave upwards. Even having dropped that first data point, you will not easily find the point you want to find. I think you are kidding yourself.
Derivative estimation is a noise amplification process, a classically ill-posed problem. Estimation of the second derivative is thus also classicly ill-posed, on steroids.
  3 Comments
John D'Errico
John D'Errico on 6 Apr 2020
Edited: John D'Errico on 6 Apr 2020
Yes. The data you showed was just noisy enough that it becomes difficult to know cleanly where your curvature changes. Were your data more noisy yet, this becomes more difficult. Then you would need to use more smoothing. As I said, differentiation is a noise amplifier. It really does not matter that much how you differentiate the spline, except that if you use an approximate derivative, then you introduce MORE problems into the problem, because it is now only an approximation.
Anyway, it can become more a problem not of how you differentiate the spline, but of how you build the spline. Too little smoothing can yield a very noisy second derivative estimate, thus making it very difficult to locate what you want. Too much smoothing can blur things too greatly, making it difficult to find where this event happens.
Ill-posed problems are ones that can amplify even the smallest, tiniest noise in the data. Differentiation is exactly that, ill-posed in nature. If you think about it, the definition of a derivative has you divide a tiny change in y, by an also tiny change in x, thus deltay/deltax. That is clearly ill-posed. But now, you want to compute a second derivative, which is most easily viewed as the derivative of the first derivative. So you are now amplifying the noise in a process that already amplifies noise in your data.
This is why I say how you compute the second derivative might be of less importance than how you built the spline. At the same time, since it is so easy to compute the second derivative (splines really are easy to differentiate if you understand how to work with them) there is no reason to use an approximation, which can at best just muddy the waters more than they are already.
And finally, even in the example we have here, you need to see that garbage in the data is important to clean up, if you are trying to find a subtle signal in problem data. You also need to understand how spline modeling works, in the sense I discussed in my answer that splines don't handle rapid transients well. When faced with a rapid change in behavior, a spline will tend to tend to respond with ringing (rapid bell-like oscillations near the transient) in the model. That ringing can be significant, as we saw in the data you gave us.
Angelavtc
Angelavtc on 6 Apr 2020
Edited: Angelavtc on 6 Apr 2020
Wonderful @John D'Errico, thanks for the explanation, it is really useful for me, specially because I dont have that level of experience using splines :) Have a great day!

Sign in to comment.

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!