Accurate numerical differentiation for large time series data!

30 views (last 30 days)
I'm trying to compute the derivative of a time series. I attached its data to this question. I looked for same question in ‘Ask & Answer’ of Matlab central. First of all, I would summarize most of them here:
1. Forward or backward derivation may not be good enough. They could remain smooth, if f(Xi) on both side of Xi is equal.
2. Derivative over large interval (Delta-X) using Matlab-Function diff (), central derivation. This function has higher order of accuracy, but still dominated easily by noise.
3. Fit a function (model for curve which should be fitted) to the data and then derivate the function. The good choices of models are:
• Interpolating spline --> The Matlab command spline works in Matlab workspace as a cubic spline interpolation, which make a polynomial of 3. Order between two contiguous points. It involves a simply solved linear system of equation and therefore avoids oscillation. However, it can’t do the differentiation around a singularity point. Lagrange Interpolating Polynomial can do it poorly specially with higher order polynomials. But, the more data used in interpolation, the higher the degree of resulting polynomial, which leads to greater oscillation between data points. Specially when the interval (Delta-X) between two contiguous point is the same over the whole of data points. Therefore, use 3. Or 5. order of LIP. For noisily data lower order approximation will be the best.
• PCHIP --> If we want to avoid the high oscillation between each data point and the problem of differentiation around the singularity point, use PCHIP (as a spline which represent a curve with singularities). It is a far better approximation than simple cubic spline. However, both methods are not sensitive to noise. I know that differentiation amplifies any noise in my data. There are two Suggestions to avoiding noise:
* if worry about noise, use smoothing spline and then differentiate the spline model. If your data has noise, you need the kind of smoothing.
*
* Before using Spline-Toolbox, pre-filter (signal processing toolbox) the data. Use some filter design like Cremez, low pass filter (is known as moving average filtering in Matlab). Filtering of data the problem more complex.
4. Use mean over a period instead of original data (I didn’t still understand, what does it mean!!)
5. Savitsky-Golay are useful to build derivative estimates of time series. They are fast and efficient for long series of data, as they can be computed using the function filter.
6. Least-square approximation together with Sobolev-norm. Least-square fitting has the advantage of smoothing out or filtering the errors. Information regarding to the data:
• There isn’t any singularity in the data. • Data may have noise • Long series of data (size of vector= (9817*1)
However, I am still confused to use above mentioned methods! Should I use only smoothing spline through Curve fitting tool? Which command should be used to derivate the fitted model? If I use the command f=spline (x,y) (cubic spline interpolation), I could use the command ‘fnder’ to derivate f easily, but it isn’t sensitive to the noise. I don’t know at all, how to use filters such as Savitsky-Golay, Cremez. I’m sorry for the long History! I would be pleased, if anyone would help me!

Accepted Answer

John D'Errico
John D'Errico on 28 Apr 2016
Edited: John D'Errico on 28 Apr 2016
I am surely one of the people who answered at least some of the questions you have found, but I'm not sure what is your problem. It looks like you are overwhelmed with what you have not really learned.
As you have learned, there are many ways to solve a problem of this sort. But since you have a long time series, the best way to solve the problem with a noisy series is Savitsky-Golay. It does smoothing as well as differentiation in one step, and to some extent, you can control the amount of smoothing to be done.
You don't even need to know how to write such a tool. Just look on the file exchange. I've even got one tool on there for this class of problem:<http://www.mathworks.com/matlabcentral/fileexchange/16997-movingslope movingslope>.
The call will be as simple as:
dydx = movingslope(y,supportlength,modelorder);
Here modelorder is an integer, which defines the order of the local polynomial model to be employed. modelorder=1 or modelorder=2 will be entirely adequate almost always. With very long sliding windows, you might use a somewhat higher order model, perhaps as much as 4.
supportlength is an integer. larger values will give more smoothing. It must be larger than modelorder. If your data is quite noisy, then you will need a long supportlength.
As an example, with VERY noisy data...
x = 0:10000;
y = sin(x/500) + randn(size(x))/10;
plot(x,y,'.')
yprime = movingslope(y,500,2);
The result should look reasonably close to 0.002*cos(0.002*x). In fact, it did quite well, except for those peaks and valleys, where the window length was too large for a locally quadratic model to be sufficient. So using an even longer window length to get sufficient smoothing, plus a 4th order model, we can get this estimate:
yprime = movingslope(y,1000,4);
plot(x,yprime,'b-',x,0.002*cos(x*0.002),'-r')
It missed just a bit near the start, due to the relatively high curvature there, and just natural randomness of the data.
  2 Comments
Mahdi
Mahdi on 1 May 2016
Hi John, tanks a lot for your answer. The function, which I downloded from the following link didn't work properly:
output of your function is just a line! I have written a small code for comparing of the precision of diff() derivative and your function. Please read and run this M-file.
John D'Errico
John D'Errico on 2 May 2016
Edited: John D'Errico on 2 May 2016
I'm sorry, but why do you think that what you did was an accurate test? Yes. It did work properly. You did not bother to read the help, so your test was invalid. You compared apples to oranges, and found a difference.
You had several serious problems in that test. This is what you did:
x=-10:0.067:10;
...
y_prime_matlab=movingslope(y,299,2);
How many points are in the vector x?
numel(x)
ans =
299
Why would you use as sliding window of length 299 on a vector of length 299? There is no room for it to slide.
So your comparison is simply incorrect, for more than one reason!
The second problem in your test is the failure to use the 4th argument to movingslope.
Your series has an increment of 0.067. In the test case I showed on that answer, my series had an increment of 1.
Movingslope works on a time series, so the normal expectation is the points have a stride of 1, so the difference in x between each point. If the series has astride of 0.067, how can movingslope possibly know how you generated it? The only way is to provide that information, and that is why there is a 4th input argument to movingslope. You are using movingslope to differentiate a function, not a time series. So only you and god knows what the stride is, unless you provide that information.
Instead, the call to movingslope for your problem might have been something like this:
y_prime_matlab=movingslope(y,4,3,0.067);
Here I arbitrarily chose a short window length because you have no noise in the series. The polynomial model order is one less than the window length. Since there is no need to allow for noise, this makes a higher order approximation to the estimate of the derivative.
You are differentiating a function. I picked a sliding cubic, to use to estimate the derivative. Note that there will be some larger errors at the ends, because the estimate is not as good there. One might play with the window length and order a bit. So, for example, try out these variations:
y_prime_matlab=movingslope(y,2,1,0.067);
y_prime_matlab=movingslope(y,3,2,0.067);
y_prime_matlab=movingslope(y,5,4,0.067);
y_prime_matlab=movingslope(y,6,5,0.067);
The longer windows can support a nigher order approximation, but a longer window has more edge effects at each end.
Sorry, but if you are going to test a function, you should read the help first.

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!