How to replace values above a given percentile by nans
Show older comments
Hi,
I am trying to replace values above the 99th percentile in a table by nans by using Star Strider excellent little function (https://www.mathworks.com/matlabcentral/answers/127944-how-to-pick-the-j-th-percentile-of-a-vector). An example of my code looks like so:
% Table with nan
t = array2table(vertcat(horzcat([1:1000]', [1001:2000]'), NaN(10, 2)));
% Replace values above 99th percentile by nan
pct = @(v,p) interp1(linspace(0.5/length(v), 1-0.5/length(v), length(v))', sort(v), p*0.01, 'spline');
t.Var1(t.Var1 > pct(t.Var1, 99)) = nan;
t.Var2(t.Var2 > pct(t.Var2, 99)) = nan;
The problem of course is that there are nan values scattered across multiple variables in the table and I therefore receive the following error message: Warning: Columns of data containing NaN values have been ignored during interpolation.
Does anyone has an idea as to how I could replace values above the 99th percentile in a table by nans in a table where there are already multiple nans ? Please note that I do not have the Statistic toolbox.
Accepted Answer
More Answers (1)
per isakson
on 15 Aug 2019
Edited: per isakson
on 15 Aug 2019
Ignore the NaNs explicitely when calculating the percentile. Try
%%
% Table with nan
t = array2table(vertcat(horzcat([1:1000]', [1001:2000]'), NaN(10, 2)));
% Replace values above 99th percentile by nan
t.Var1(t.Var1 > pct(t.Var1, 99)) = nan;
t.Var2(t.Var2 > pct(t.Var2, 99)) = nan;
function z = pct(v,p)
v( isnan(v) ) = [];
z = interp1(linspace(0.5/length(v), 1-0.5/length(v), length(v))' ...
, sort(v), p*0.01, 'spline');
end
1 Comment
If you're going to use the one-liner appoximation, you can keep the function handle and explicitly ignore the NaNs from within the call to the function.
t = array2table(vertcat(horzcat((1:1000)', (1001:2000)'), NaN(10, 2)));
pct = @(v,p) interp1(linspace(0.5/length(v), 1-0.5/length(v), length(v))', sort(v), p*0.01, 'spline');
t.Var1(t.Var1 > pct(t.Var1(~isnan(t.Var1)), 99)) = nan;
t.Var2(t.Var2 > pct(t.Var2(~isnan(t.Var2)), 99)) = nan;
% {_____here____}
Categories
Find more on Data Preprocessing in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!