How do I loop over all columns in a cell array?
The following code calculates the outliers above 3 standard deviations for a single column in balls_data{2,1}(:,1).
data = balls_data{2,1};
mean_val = mean(cat(1, data{:})); % calculate the mean
std_val = std(cat(1, data{:})); % calculate the standard deviation
threshold = 2*std_val; % set the threshold for outlier detection
outliers = cellfun(@(x) x(x > mean_val + threshold | x < mean_val - threshold), data, 'UniformOutput', false); % find the outliers
I now need to run the same code over all columns in balls_data{2,1}.
Could someone help me make this into a loop?
Thanks!
4 Comments
"The following code caluclates the outliers above 3 standard deviations for a single column in balls_data{2,1}(:,1)."
No. The code calculates outliers among all elements of balls_data{2,1}, not just those in the first column.
Anyway, it's easier to work with a numeric array than a cell array containing all scalar numerics.
load('balls_data.mat')
all(cellfun(@isscalar,balls_data{2,1}),'all') % all cells contain a scalar
Original cell array approach, with additional code to get the actual outliers from the outliers cell array:
data = balls_data{2,1};
mean_val = mean(cat(1, data{:})); % calculate the mean
std_val = std(cat(1, data{:})); % calculate the standard deviation
threshold = 2*std_val; % set the threshold for outlier detection
outliers = cellfun(@(x) x(x > mean_val + threshold | x < mean_val - threshold), data, 'UniformOutput', false); % find the outliers
outliers = cat(1,outliers{:});
disp(outliers);
Numeric array (in this case a simple vector) approach:
data = cat(1, balls_data{2,1}{:}); % make a column vector, then everything else is easier
mean_val = mean(data); % calculate the mean
std_val = std(data); % calculate the standard deviation
threshold = 2*std_val; % set the threshold for outlier detection
outliers2 = data(data > mean_val + threshold | data < mean_val - threshold);
disp(outliers2);
isequal(outliers,outliers2)
Now, what exactly are you trying to do? The question specifies balls_data{2,1} but the answers given so far iterate over all elements of balls_data, i.e., balls_data{1,1}, balls_data{2,1}, balls_data{3,1}, ...
Do you want the outliers for each element of balls_data, or something else?
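If the goal is one set of outliers per column of balls_data{2,1}, as the question title suggests, the numeric-array approach extends to a short loop. The following is only a sketch: demo is a made-up cell array of scalars standing in for balls_data{2,1} (so it runs without balls_data.mat), and the variable names M, demo, nCols, and col are illustrative.

```matlab
% Hypothetical stand-in for balls_data{2,1}: a 10-by-3 cell array of scalars.
M = [[ones(9,1); 100], [5*ones(9,1); -50], (1:10)'];
demo = num2cell(M);

data = cell2mat(demo);        % numeric matrix, same size as the cell array
nCols = size(data, 2);
outliers = cell(1, nCols);    % one cell per column, since the counts can differ

for k = 1:nCols
    col = data(:, k);                 % k-th column as a vector
    mean_val = mean(col);             % mean of this column only
    threshold = 2*std(col);           % two standard deviations, as in the question
    % Same test as x > mean_val + threshold | x < mean_val - threshold
    outliers{k} = col(abs(col - mean_val) > threshold);
end
```

The results are kept in a cell array because each column can have a different number of outliers. Replace demo with balls_data{2,1} to run this on the real data.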
Voss
on 17 Mar 2023
Sounds like rmoutliers would work for that.
From the rmoutliers documentation: "If A is a matrix, then rmoutliers detects outliers in each column of A separately and removes the entire row."
data = cell2mat(balls_data{2,1});
data_no_outliers = rmoutliers(data,'mean');
Note that for rmoutliers(_,'mean'), "Outliers are defined as elements more than three standard deviations from the mean." And your code was using two standard deviations.
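To match the two-standard-deviation threshold from the question, isoutlier and rmoutliers accept a 'ThresholdFactor' name-value argument. Below is a minimal sketch on made-up data, assuming a MATLAB release that supports that option. The matrix M is hypothetical and stands in for cell2mat(balls_data{2,1}).

```matlab
% Hypothetical numeric data: 10 rows, 3 columns, one extreme value in columns 1 and 2.
M = [[ones(9,1); 100], [5*ones(9,1); -50], (1:10)'];

% Logical mask of outliers, computed column by column,
% using 2 standard deviations from the column mean.
mask = isoutlier(M, 'mean', 'ThresholdFactor', 2);

% Remove every row that contains an outlier in any column.
M_clean = rmoutliers(M, 'mean', 'ThresholdFactor', 2);

% Number of outliers found in each column.
counts = sum(mask, 1);
```

The mask from isoutlier is useful when you want to know which values are outliers without deleting entire rows.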
Answers (3)
load('balls_data.mat')
balls_data is a cell array of cell arrays, except for two cells, which contain empty 0-by-0 numeric matrices.
disp(balls_data)
First, convert the cell arrays of scalars in balls_data to matrices, using cell2mat:
balls_data_mat = cellfun(@cell2mat,balls_data,'UniformOutput',false);
balls_data_mat is a cell array of matrices. Note that cell2mat([]) returns [], so the cells that held empty 0-by-0 matrices are unchanged, while every other cell now holds a numeric matrix.
disp(balls_data_mat)
Second, remove the rows containing outliers in each matrix:
balls_data_mat_no_outliers = cellfun(@(x)rmoutliers(x,'mean'),balls_data_mat,'UniformOutput',false);
balls_data_mat_no_outliers is a cell array of matrices. Now the matrices each have fewer rows than they did (except the empty ones, which remain empty), because the rows containing any outlier have been removed.
disp(balls_data_mat_no_outliers)
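The two steps above can be tried without balls_data.mat on a small made-up example. This is only a sketch: nested is a hypothetical stand-in for balls_data, with one empty cell to mimic the empty entries described above.

```matlab
% Hypothetical stand-in for balls_data: a cell array of cell arrays of scalars,
% with one empty 0-by-0 numeric matrix in the middle.
nested = {num2cell([ones(20,1); 100]); []; num2cell((1:10)')};

% Step 1: convert each inner cell array of scalars to a numeric matrix.
mats = cellfun(@cell2mat, nested, 'UniformOutput', false);

% Step 2: remove rows containing outliers (default 'mean' rule: 3 standard deviations).
cleaned = cellfun(@(x) rmoutliers(x, 'mean'), mats, 'UniformOutput', false);

% Compare the number of rows before and after.
rows_before = cellfun(@(x) size(x,1), mats);
rows_after = cellfun(@(x) size(x,1), cleaned);
```

Comparing rows_before with rows_after shows how many rows were dropped from each matrix. The empty cell stays empty.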
Sulaymon Eshkabilov
on 16 Mar 2023
Here is the complete code to calculate all outliers:
load('balls_data.mat')
% Note: balls_data is a cell array that also contains some empty cells,
% so this step removes them first:
balls_data_0 = balls_data(~cellfun('isempty',balls_data)); % remove empty cells
outliers = cell(numel(balls_data_0),1); % preallocate one cell per data set
for ii = 1:numel(balls_data_0)
    data = balls_data_0{ii};
    mean_val = mean(cat(1, data{:})); % calculate the mean
    std_val = std(cat(1, data{:})); % calculate the standard deviation
    threshold = 2*std_val; % set the threshold for outlier detection
    outliers{ii} = cellfun(@(x) x(x > mean_val + threshold | x < mean_val - threshold), data, 'UniformOutput', false); % find the outliers
end
4 Comments
lil brain
on 16 Mar 2023
Here is how to keep the empty cells in place:
load('balls_data.mat')
IDX = ~cellfun('isempty',balls_data);
for ii = 1:length(balls_data)
    if IDX(ii)
        data = balls_data{ii,1};
        mean_val = mean(cat(1, data{:})); % calculate the mean
        std_val = std(cat(1, data{:})); % calculate the standard deviation
        threshold = 2*std_val; % set the threshold for outlier detection
        outliers{ii,:} = cellfun(@(x) x(x > mean_val + threshold | x < mean_val - threshold), data, 'UniformOutput', false); % find the outliers
    else
        outliers{ii,:} = []; % Will contain an empty cell where balls_data is empty
    end
end
size(outliers)
Sulaymon Eshkabilov
on 16 Mar 2023
Edited: Sulaymon Eshkabilov
on 17 Mar 2023
--
lil brain
on 17 Mar 2023
Use this code exactly as given here, and you will not get the errors shown in your messages:
load('balls_data.mat')
IDX = ~cellfun('isempty',balls_data);
for ii = 1:length(balls_data)
    if IDX(ii)
        data = balls_data{ii,1};
        mean_val = mean(cat(1, data{:})); % calculate the mean
        std_val = std(cat(1, data{:})); % calculate the standard deviation
        threshold = 2*std_val; % set the threshold for outlier detection
        outliers{ii,:} = cellfun(@(x) x(x > mean_val + threshold | x < mean_val - threshold), data, 'UniformOutput', false); % find the outliers
    else
        outliers{ii,:} = []; % Will contain an empty cell where balls_data is empty
    end
end
size(outliers)
size(balls_data)
2 Comments
Sulaymon Eshkabilov
on 17 Mar 2023
Edited: Sulaymon Eshkabilov
on 18 Mar 2023
Good luck
lil brain
on 17 Mar 2023