Finding singular outliers in data that also contains steep but sustained changes
I want to find outliers of the type where a single value sticks out significantly from the surrounding data.
The complicating factor is that the data sometimes contains steep changes. These are followed by many data points that continue the sudden new trend, so they are not singular data points.
But that throws out far too many of the data points in the steep changes, not only the singular outliers.
Steven Lord on 6 Dec 2022
I think the outlier detection and removal functions in MATLAB are the right tools for you to use. Choosing the right parameters (detection method and thresholds) can be a challenge. That's one of the purposes for which the Clean Outlier Data task was created.
Open the Live Editor. Read in your data then open the task as per the instructions in the Open the Task section on that documentation page. Then tell the task the data on which it should operate and experiment with the various detection methods and parameters for those detection methods until they detect the points that you want to be considered outliers without ignoring those that look outlier-like but aren't. Once you have the parameters set the way you want, you can look at the code so you can use it for a different but similar data set in the future.
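As a rough idea of what such an experiment could look like at the command line, here is a minimal sketch using isoutlier and filloutliers with the 'movmedian' method. The synthetic signal, the spike and step positions, and the window length win are made-up assumptions for illustration; they are not taken from the question's data and will need tuning.

```matlab
% Synthetic signal (made-up values): slow drift, a sustained step, one spike
k = 1:100;
y = 0.01*k;                     % slow linear drift
y(51:end) = y(51:end) + 10;     % steep change that persists
y(30) = y(30) + 8;              % singular outlier

win = 5;   % hypothetical window length in samples; tune for your data

% Flag points more than 3 local scaled MAD from the local median
tf = isoutlier(y, 'movmedian', win);

% Replace flagged points by linear interpolation between unflagged neighbors
yClean = filloutliers(y, 'linear', 'movmedian', win);

disp(find(tf))   % indices flagged as outliers
```

The window is kept short relative to how long the step persists, so the local median follows the new level after the step while a single spike still stands out against its neighbors.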
More Answers (1)
Mathieu NOE on 6 Dec 2022
This is my result so far.
It does not look at the data in the first and last 10% of the time vector, so the focus is on the burst of peaks in the second half.
x = (1:numel(Pressure));
[dy, ddy] = firstsecondderivatives(x,Pressure);
% ignore the first and last 10% of the samples (of total signal duration)
n_start = round(0.1*numel(Pressure));
n_end = round(0.1*numel(Pressure));
ddy(1:n_start) = 0;
ddy(end-n_end+1:end) = 0; % exactly n_end samples at the end
ddy = abs(ddy);
threshold = 1;
x_zc = round(find_zc(x,ddy,threshold));
% keep only the first and last crossing to get the start / stop index of the window,
% widen the window by 100 samples before and after,
% and clamp it to the valid index range
x_zc = [max(x_zc(1)-100, 1), min(x_zc(end)+100, numel(Pressure))];
y_filtered = Pressure ;
y_filtered(x_zc(1):x_zc(end)) = filloutliers(Pressure(x_zc(1):x_zc(end)),'linear','movmean',100);
function Zx = find_zc(x,y,threshold)
% positive slope threshold crossing detection, using linear interpolation
y = y - threshold;
zci = @(data) find(diff(sign(data))>0); % returns indices of positive-going zero crossings
ix = zci(y); % indices just before each positive-going crossing of y
ZeroX = @(x0,y0,x1,y1) x0 - (y0.*(x0 - x1))./(y0 - y1); % interpolated x value of the crossing
Zx = ZeroX(x(ix),y(ix),x(ix+1),y(ix+1));
end

function [dy, ddy] = firstsecondderivatives(x,y)
% The function calculates the first & second derivative of a function that is given by a set
% of points. The first derivatives at the first and last points are calculated by
% the 3 point forward and 3 point backward finite difference scheme respectively.
% The first derivatives at all the other points are calculated by the 2 point
% central approach.
% The second derivatives at the first and last points are calculated by
% the 4 point forward and 4 point backward finite difference scheme respectively.
% The second derivatives at all the other points are calculated by the 3 point
% central approach.
% All formulas assume uniformly spaced x and at least 4 data points.
% Input variables:
% x: vector with the x data points.
% y: vector with the f(x) data points.
% Output variables:
% dy: vector with the first derivative at each point.
% ddy: vector with the second derivative at each point.
n = length(x);
dy = zeros(size(y)); % preallocate with the same shape as y
ddy = zeros(size(y));
dy(1) = (-3*y(1) + 4*y(2) - y(3)) / (2*(x(2) - x(1))); % First derivative
ddy(1) = (2*y(1) - 5*y(2) + 4*y(3) - y(4)) / (x(2) - x(1))^2; % Second derivative
for i = 2:n-1
    dy(i) = (y(i+1) - y(i-1)) / (x(i+1) - x(i-1));
    ddy(i) = (y(i-1) - 2*y(i) + y(i+1)) / (x(i-1) - x(i))^2;
end
dy(n) = (y(n-2) - 4*y(n-1) + 3*y(n)) / (2*(x(n) - x(n-1)));
ddy(n) = (-y(n-3) + 4*y(n-2) - 5*y(n-1) + 2*y(n)) / (x(n) - x(n-1))^2;
end
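To see why a threshold on the second derivative alone reacts to both kinds of event, it may help to compare a spike and a step on a toy signal. The sketch below is not part of the code above. It uses diff(y,2), which for unit spacing matches the 3 point central second difference at the interior points, and then adds a hypothetical sign-change test on the first differences to single out spikes. The signal values and the threshold thr are made-up assumptions.

```matlab
% Synthetic, unit-spaced test signal (all values are made up for illustration)
y = zeros(1,40);
y(21:end) = 5;      % sustained step starting at sample 21
y(10) = 5;          % singular spike at sample 10

% 3 point central second difference for unit spacing: y(i-1) - 2*y(i) + y(i+1)
d2 = [0, diff(y,2), 0];    % pad so that d2(i) lines up with y(i)

disp(d2(9:11))      % response around the spike
disp(d2(20:21))     % response around the step

% Hypothetical singular-spike test: the jump into the point and the jump
% out of it have opposite signs and are both larger than thr
thr = 1;                                  % assumed threshold, tune for your data
d1 = diff(y);                             % first differences
into = d1(1:end-1);                       % y(i) - y(i-1)
outof = d1(2:end);                        % y(i+1) - y(i)
isSpike = [false, (into.*outof < 0) & (min(abs(into), abs(outof)) > thr), false];

disp(find(isSpike)) % indices that pass the spike test
```

Both events produce a large |d2|, so thresholding it marks the step as well. The sign-change test passes only where the signal jumps away and immediately jumps back, which a sustained step does not do.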