Trying to create a table with a char array as variable names

I'm trying to create a table with one column for each text, one for its length, and one column per letter of the alphabet holding the number of occurrences of that letter in the text.
As I have not yet understood how I can create a table with named variables, what I'm trying to do is as follows:
alphabet.letters = ('A':'Z')';
mytexttable = cell2table(cell(numel(texts(:,1)),28), 'VariableNames', {'Text', 'length', alphabet.letters'});
This only works if I change the number of columns of the table to 3, as this gives me the variables 'Text', 'length' and 'ABCDEF...'.
I tried a couple of other variations, including every combination of curly braces, square brackets and parentheses and row/column arguments, 'A':'Z', char(cellstr(alphabet.letters)'), reshape(alphabet.letters, 1, 26) and a few more that I have already forgotten, but I only manage to get error messages or three variables, with the third named after the whole alphabet.
I'm pretty sure I'm completely overlooking a rather easy solution, but I can't figure it out. Can you folks help me out, please?

Accepted Answer

Cris LaPierre on 15 Nov 2021
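Your letters need to be separated, because a single char vector is treated as one variable name. One way to do this is to put them in a column vector and convert that into a string array, which gives one string per letter. Note that the other variable names will then also have to be strings (double quotes), and you would use square brackets instead of curly braces to combine them.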
letters=string(('A':'Z')')'
letters = 1×26 string array
"A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
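As a minimal sketch of how the separated letters combine with the other names, the block below concatenates them into one 1×28 string array and uses it to preallocate a table. The `table('Size', ...)` preallocation, the variable `nTexts` and the chosen column types are assumptions for illustration and are not taken from the question.

```matlab
% Separate the alphabet into 26 one-character strings (1x26 string array).
letters = string(('A':'Z')')';

% Square brackets concatenate strings into a single 1x28 string array.
varNames = ["Text", "length", letters];

% Hypothetical number of texts, for illustration only.
nTexts = 2;

% Assumed column types: one string column followed by 27 numeric columns.
varTypes = ["string", repmat("double", 1, 27)];

% Preallocate a table with one named variable per entry of varNames.
mytexttable = table('Size', [nTexts, numel(varNames)], ...
    'VariableTypes', varTypes, 'VariableNames', varNames);

% Cell-array alternative for the names: num2cell splits the char vector
% into 26 one-character cells, which are appended to the other two names.
varNamesCell = [{'Text', 'length'}, num2cell('A':'Z')];
```

Either `varNames` or `varNamesCell` can be passed as the 'VariableNames' value; the point is that each letter must be its own element.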
Actually building a table can be simple, but it would be helpful to know what data you already have. If you already have the data, and just want to turn it into a table, you could do this.
texts = ["any number of characters will do";"another text"];
Len = strlength(texts);
counts = randi(5,2,26);
mytexttable = table(texts,Len,counts);
mytexttable = splitvars(mytexttable,'counts','NewVariableNames',letters)
mytexttable = 2×28 table
    texts                                 Len   A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
    __________________________________    ___   _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _
    "any number of characters will do"    32    3  1  1  3  5  2  4  3  4  5  1  1  4  1  3  5  2  2  3  1  1  2  2  5  3  3
    "another text"                        12    2  4  2  1  2  4  4  5  4  5  5  2  3  2  1  1  2  2  4  1  3  2  2  3  1  4
  2 Comments
nope nope on 16 Nov 2021
Thank you for this excellent reply! The very first line of code was what I needed, and the rest (especially the strlength command) actually shortened my code at another part.
The data I'm using is just some text imported from an Excel file, so your examples are the kind of data that is expected. It's the final assignment of a training I'm doing, to show that we can handle the basics of MATLAB (minus some formatting and syntax on my part, haha). For it I'm coding some basic text analysis, such as the average number of letters per word compared to the global average.
Cris LaPierre on 16 Nov 2021
strlength counts spaces, too. If you just want to count letters, look into the count function.
texts = ["any number of characters will do";"another text"];
len = strlength(texts)
len = 2×1
    32
    12
num = count(texts,lettersPattern(1))
num = 2×1
    27
    11
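The same count function can also produce the per-letter columns. The following is a rough sketch, assuming a case-insensitive count is wanted; the loop and the names `counts` and `T` are illustrative choices, not something prescribed in this thread.

```matlab
texts = ["any number of characters will do"; "another text"];
letters = string(('A':'Z')')';

% One row per text, one column per letter.
counts = zeros(numel(texts), numel(letters));
for k = 1:numel(letters)
    % count returns one value per element of texts;
    % 'IgnoreCase' makes "A" match both 'a' and 'A'.
    counts(:, k) = count(texts, letters(k), 'IgnoreCase', true);
end

% Assemble the table, then split the 26-column matrix into named variables.
T = table(texts, strlength(texts), counts, ...
    'VariableNames', {'Text', 'length', 'counts'});
T = splitvars(T, 'counts', 'NewVariableNames', letters);
```

For these two sample texts the row sums of `counts` match the totals returned by `count(texts,lettersPattern(1))`, since the texts contain only unaccented Latin letters and spaces.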


More Answers (1)

Adam Danz on 15 Nov 2021
Edited: Adam Danz on 15 Nov 2021
This demo creates the table you described using random text. histcounts is used to count the number of each letter. The counts are case insensitive.
To learn how the random text was generated, see this answer.
  • texts is a cell array of random text.
  • T is the output table.
% Generate fake text using natural frequencies of
% each letter in English
freq = [ 8.167 1.492 2.782 4.253 12.702 2.228 2.015 6.094 6.966 ...
0.153 0.772 4.025 2.406 6.749 7.507 1.929 0.095 5.987 6.327 ...
9.056 2.758 0.978 2.36 0.15 1.974 0.074]./100;
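% Note: wordLen is passed to arrayfun, but the anonymous function does not use it (w is ignored).
% randsample requires the Statistics and Machine Learning Toolbox.
wordLen = randi(5,20,1);
nchar = randi(15,20,1)+6;
texts = arrayfun(@(n,w){char(randsample([32,'a':'z'],n,true,[.2,freq]))}, nchar, wordLen);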
% Count number of characters in each text (spaces included).
% Stored as textLen so the built-in length function is not shadowed;
% the table variable is still named 'length'.
textLen = cellfun(@numel, texts);
% Create preallocated table T
letterCounts = zeros(numel(texts),26);
bins = 97:123; % a - z (lower case) + 1 to cover last bin
letterNames = compose('%s',bins')';
tempTbl = array2table(letterCounts,'VariableNames', letterNames(1:end-1));
Text = string(texts);
T = [table(Text,textLen,'VariableNames',{'Text','length'}), tempTbl];
% Loop through each text, count each letter, populate table
for i = 1:numel(texts)
    cnt = histcounts(double(lower(texts{i})),bins);
    T{i,3:end} = cnt; % assumes 'a' starts in col 3
end
% Display results
T
T = 20×28 table
    Text                      length    a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
    ______________________    ______    _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _
    " nt matri ihe oamc i"    20        2  0  1  0  1  0  0  1  3  0  0  0  2  1  1  0  0  1  0  2  0  0  0  0  0  0
    " ugfn "                  7         0  0  0  0  0  1  1  0  0  0  0  0  0  1  0  0  0  0  0  0  1  0  0  0  0  0
    " ryiosd"                 7         0  0  0  1  0  0  0  0  1  0  0  0  0  0  1  0  0  1  1  0  0  0  0  0  1  0
    " hewhaa yw "             12        2  0  0  0  1  0  0  2  0  0  0  0  0  0  0  0  0  0  0  0  0  0  2  0  1  0
    "uswleen "                10        0  0  0  0  2  0  0  0  0  0  0  1  0  1  0  0  0  0  1  0  1  0  1  0  0  0
    " ew dd"                  7         0  0  0  2  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0
    "eiulontfaoeh "           13        1  0  0  0  2  1  0  1  1  0  0  1  0  1  2  0  0  0  0  1  1  0  0  0  0  0
    "roniyruexint hahn "      20        1  0  0  0  1  0  0  2  2  0  0  0  0  3  1  0  0  2  0  1  1  0  0  1  1  0
    "eonnn iilthti oncu"      18        0  0  1  0  1  0  0  1  3  0  0  1  0  4  2  0  0  0  0  2  1  0  0  0  0  0
    "nptci d d ds"            15        0  0  1  3  0  0  0  0  1  0  0  0  0  1  0  1  0  0  1  1  0  0  0  0  0  0
    "esilrhnkg"               9         0  0  0  0  1  0  1  1  1  0  1  1  0  1  0  0  0  1  1  0  0  0  0  0  0  0
    "toeepns"                 7         0  0  0  0  2  0  0  0  0  0  0  0  0  1  1  1  0  0  1  1  0  0  0  0  0  0
    "eoatyypitu "             11        1  0  0  0  1  0  0  0  1  0  0  0  0  0  1  1  0  0  0  2  1  0  0  0  2  0
    "ooienoba"                8         1  1  0  0  1  0  0  0  1  0  0  0  0  1  3  0  0  0  0  0  0  0  0  0  0  0
    "k le ouicieaoe"          14        1  0  1  0  3  0  0  0  2  0  1  1  0  0  2  0  0  0  0  0  1  0  0  0  0  0
    "nlasi htmou"             11        1  0  0  0  0  0  0  1  1  0  0  1  1  1  1  0  0  0  1  1  1  0  0  0  0  0
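To see how the histcounts step maps characters to letter columns, here is a small sketch on a single text. The sample string and the names `edges`, `codes` and `nL` are made up for illustration.

```matlab
txt = 'Hello World';              % hypothetical sample text
edges = 97:123;                   % double('a') is 97, double('z') is 122
codes = double(lower(txt));       % lower-case, then convert to character codes
cnt = histcounts(codes, edges);   % 1x26 counts, one bin per letter a..z

% Values outside the edges, such as the space (code 32), are not counted.
% The last bin includes its right edge, so '{' (code 123) would be
% counted together with 'z' if it appeared in the text.
nL = cnt(double('l') - 96);       % count for the letter 'l'
```

Because each bin is one character code wide, `cnt(k)` is the number of occurrences of the k-th letter, which is why the loop above can assign `cnt` directly to columns 3 to 28 of the table.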
  1 Comment
nope nope on 16 Nov 2021
Thank you for this very detailed reply, I'm amazed! It covers way more than I needed, though, and I'm somewhat in a rush to finish my project (a training assignment that I have to deliver by Thursday), so I'm afraid I'll have to pick it apart and analyze what you did and how it works after that =/


Release

R2021b
