How can I get unique entries and their counts and place them back into the table?
When running the code given below, I get this error:
     [uniqueEntries, ~, entryGroupIndices] = unique(x);
                                           ↑
Error: Unsupported use of the '=' operator. To compare values for equality, use '=='. To specify name-value arguments, check that name is a valid identifier with no surrounding quotes.
I think it is due to (x) not being defined or not existing.
% Sample data: create a table
data = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
             {'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
             'VariableNames', {'Fruits', 'Var2'});
% Group the data by 'Fruits' and collect Var2 entries
summaryTable = groupsummary(data, 'Fruits', @(x) {x.Var2}, 'IncludeEmptyGroups', true);
% Create a function to count unique entries and their occurrences
countUniqueEntries = @(x) {
    % Get unique entries and their counts
    [uniqueEntries, ~, entryGroupIndices] = unique(x);
    entryCounts = histcounts(entryGroupIndices, 'BinMethod', 'integers');
    % Create a table with unique entries and their counts
    table(uniqueEntries, entryCounts', 'VariableNames', {'UniqueEntries', 'Counts'})
};
% Apply the function to each group using cellfun
countTables = cellfun(countUniqueEntries, summaryTable.GroupCount, 'UniformOutput', false);
% Create the final result table
resultTable = table(summaryTable.Fruits, countTables, 'VariableNames', {'Fruits', 'Counts'});
% Display the results
disp('Unique Fruits and Their Counts:');
disp(resultTable);
The output should look something like this:
 Fruits    Counts
_______    _______
'apple'    [3x2 table]
'banana'   [2x2 table]
'kiwi'     [1x2 table]
'orange'   [1x2 table]
I would love to get the results without having to loop.
It would also be helpful if I could sort the counts in each counts table in descending order. Thank you for the help.
Answers (2)
Stephen23 on 18 Apr 2025 (edited 18 Apr 2025)
"I think is due to (x) not being defined or non existing."
No, it is because you invented some syntax when defining the anonymous function here:
countUniqueEntries = @(x) {
    % Get unique entries and their counts
    [uniqueEntries, ~, entryGroupIndices] = unique(x);
    entryCounts = histcounts(entryGroupIndices, 'BinMethod', 'integers');
    % Create a table with unique entries and their counts
    table(uniqueEntries, entryCounts', 'VariableNames', {'UniqueEntries', 'Counts'})
};
Curly braces define a cell array. Inside that cell array you called various functions (which is allowed inside curly braces) and attempted to assign their outputs to variables (which is definitely not allowed inside curly braces). It is not valid syntax to perform assignment inside the cell array operator (nor, for that matter, inside any other operators):
{x=sqrt(2)} % this is invalid syntax
Your attempt to use an anonymous function like that will not work, because the body of an anonymous function must be a single expression. Write a normal function in an M-file, where you can make as many variable assignments as you wish.
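As a rough sketch of what that could look like (not necessarily what the original poster ultimately needs), the script below keeps the name countUniqueEntries from the question but turns it into a local function, and uses findgroups and splitapply instead of groupsummary and cellfun. It assumes R2016b or later so that a local function can sit at the end of a script, and it assumes Var2 is a cell array of character vectors, as in the sample data.

```matlab
% Sample data from the question
data = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
             {'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
             'VariableNames', {'Fruits', 'Var2'});

% Assign a group number to every row, one group per distinct fruit
[groupIdx, fruits] = findgroups(data.Fruits);

% Build one count table per group; each table is wrapped in a cell so that
% splitapply can stack the results into a column
countTables = splitapply(@(v) {countUniqueEntries(v)}, data.Var2, groupIdx);

% Nested result: one row per fruit, one count table per row
resultTable = table(fruits, countTables, 'VariableNames', {'Fruits', 'Counts'});
disp(resultTable)

function counts = countUniqueEntries(x)
% Count how often each distinct entry of x occurs.
% Assumption: x is a cell array of character vectors.
[uniqueEntries, ~, idx] = unique(x(:));      % idx maps each entry to its unique value
entryCounts = accumarray(idx, 1);            % occurrences per unique value
counts = table(uniqueEntries, entryCounts, ...
               'VariableNames', {'UniqueEntries', 'Counts'});
counts = sortrows(counts, 'Counts', 'descend');  % largest count first
end
```

Because the function is an ordinary function, the assignments that were illegal inside the curly braces are fine here, and the descending sort asked for in the question is a single sortrows call.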
I doubt that using nested tables like that will make processing your data easier: https://xyproblem.info/
2 Comments
Stephen23 on 18 Apr 2025
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
    {'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
    'VariableNames', {'Fruits', 'Var2'})
U = groupsummary(T,'Fruits')
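If the counts per Fruits/Var2 combination are what is really wanted, one possible flat alternative to nested tables is to group by both variables at once. This is only a sketch under that assumption; it reuses the sample data and relies on the GroupCount variable that groupsummary adds to its output.

```matlab
% Sample data from the question
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
    {'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
    'VariableNames', {'Fruits', 'Var2'});

% One row per distinct (Fruits, Var2) pair, with its number of occurrences
U = groupsummary(T, {'Fruits', 'Var2'});

% Largest groups first
U = sortrows(U, 'GroupCount', 'descend');
disp(U)
```

The result stays an ordinary table, so it can be filtered, joined, or sorted further without unpacking anything.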
Walter Roberson on 19 Apr 2025
To be more explicit:
@(x) { CODE } does not define a code block. It defines an anonymous function that returns a cell array of expressions. The individual expressions must return (possibly empty) values, and must not be assignment statements or control statements.
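For illustration, here is a small sketch of what the curly-brace form legitimately does. The handle name describe and the sample input are made up for this example; each element inside the braces is an expression that returns a value, so the call simply returns a cell array.

```matlab
% Hypothetical input, just for illustration
x = {'a'; 'b'; 'a'};

% Valid: the braces build a 1-by-2 cell array from two expressions
describe = @(x) {numel(x), unique(x)};

out = describe(x);
n = out{1};   % number of entries
u = out{2};   % distinct entries, as a cell array
```

Nothing is assigned inside the braces; the only assignments happen outside the anonymous function, after it returns.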
dpb on 19 Apr 2025 (edited 19 Apr 2025)
"...also be helpful If I can sort the counts in the counts Table 'descending'. "
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
    'VariableNames', {'Fruits'});
T=addvars(T,~matches(T.Fruits,'banana'),'NewVariableNames',{'Round'});
T=convertvars(T,{'Fruits'},'categorical');
U=sortrows(groupsummary(T,'Fruits',@(x)all(x),{'Round'}),'GroupCount','descend');
U=renamevars(U,{'fun1_Round'},{'Round'})  % fix up the annoying funN_ prefix, which can't be turned off
% alternative is to munge the variable names directly...
%U.Properties.VariableNames=strrep(U.Properties.VariableNames,'fun1_','');
% general alternative: use a pattern to automate more than one (needs replace, since strrep does not accept patterns)
%pat="fun"+digitsPattern+"_";
%U.Properties.VariableNames=replace(U.Properties.VariableNames,pat,'');
Although the actual logic for determining the logical state is unstated, I took a guess as to why 'banana' is different...
NOTA BENE that to bring along other variable(s) in the summary, one has to be able to reduce them to one statistic per group, which all() does above for the characteristic variable. As noted, it would be nice if groupsummary also had the option to set 'OutputVariableNames', as rowfun does.
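For comparison, a sketch of the rowfun route mentioned above, which lets the output name be set directly through 'OutputVariableNames'. It rebuilds the same data and keeps the same guess about 'banana'; whether that guess matches the real logic is still an assumption.

```matlab
% Same sample data, with Fruits as categorical
T = table(categorical({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}), ...
    'VariableNames', {'Fruits'});

% Same guess as above: everything except 'banana' counts as round
T.Round = T.Fruits ~= 'banana';

% One row per fruit; rowfun adds GroupCount and names the output directly
U = rowfun(@all, T, 'InputVariables', 'Round', ...
           'GroupingVariables', 'Fruits', ...
           'OutputVariableNames', 'Round');

% Largest groups first
U = sortrows(U, 'GroupCount', 'descend');
disp(U)
```

This avoids the renamevars step, at the cost of having to list the input variables explicitly.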