Dynamically calling variables (it's not what this sounds like!)

Hi everyone,
I have five files that have almost the same name except for one term that is added in the end, like "filename_A", "filename_B" etc., and the columns in these files are almost the same up to that last term as well, e.g.
filename_A: column1_A column2_A column3_A ...
filename_B: column1_B column2_B column3_B ...
I would like to loop through the different files and the different columns, but I cannot figure out how to dynamically bring up the names in the loop with sprintf or similar functions. I know people strongly advise against dynamically creating variables. But is that also the case for calling them?
I suppose an alternative would be to have created my files/named the columns in a better way, but I don't know how. Any help is greatly appreciated because I run into this problem all the time with the way my data is structured!

5 Comments

Hello,
you can loop over the files as in the example below and then store the results in structures, which is better coding practice.
Here we loop over Excel files, but of course adapt it to your data file extension.
The first code is a demo of looping over the files in a directory:
fileDir = pwd;
fileNames = dir(fullfile(fileDir,'data*.xlsx')); % get list of data files in directory
fileNames_sorted = natsortfiles({fileNames.name}); % sort file names into natural order
% (https://fr.mathworks.com/matlabcentral/fileexchange/47434-natural-order-filename-sort)
M = length(fileNames_sorted);
out_data = [];
for f = 1:M
    % option # 1 for numeric data only using importdata
    raw = importdata(fullfile(fileDir, fileNames_sorted{f}));
    % create structure
    S{f}.filename = fileNames_sorted{f};
    S{f}.data = raw; % or column1_A / column2_A etc...
end
The second code shows different ways of assigning data to variables.
Combining both codes should give you the answer to your problem:
a = readcell('test.txt', "Delimiter", ":");
% option 1 : create a structure :
for ci = 1:size(a,1)
    Varnames{ci} = matlab.lang.makeValidName(a{ci,1});
    myStruct.(Varnames{ci}) = a{ci,2};
end
% option 2 using assignin (in function variableCreator) :
for ci = 1:size(a,1)
    % change blanks in variable names to underscores (otherwise
    % variableCreator will throw an error message)
    str = strrep(a{ci,1}, ' ', '_');
    val = a{ci,2};
    if ischar(val)
        disp(['error : non numeric data in line : ' int2str(ci)]);
        val = NaN;
    end
    variableCreator(str, val)
end
clear str val ci % keep "a", it is still needed for option 3
% option 3 creating a table - one line engine !
T = array2table(a)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function variableCreator(newVar, variable)
    assignin('caller', newVar, variable);
end
@Mathieu NOE: note that NATSORTFILES accepts the DIR structure directly, so you can simplify the code to:
S = dir(fullfile(..));
S = natsortfiles(S); % no need to extract the name field here!
And then within the loop simply use the same structure to save the imported file data, rather than creating a new cell array:
for k = ..
S(k).data = ..
end
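For anyone who wants to try the idea end to end, here is a minimal sketch under a few assumptions: the file names (filename_A.csv, filename_B.csv), the column names, and the values are made up and written to a temporary folder so the example runs on its own, readtable is used instead of importdata, and natsortfiles is left out because it is a File Exchange download (so the files come back in whatever order dir returns them).

```matlab
% Create two small example files in a temporary folder (hypothetical data).
fileDir = tempname;
mkdir(fileDir);
for s = ["A" "B"]
    T = table((1:3)', (4:6)', 'VariableNames', "column" + (1:2) + "_" + s);
    writetable(T, fullfile(fileDir, "filename_" + s + ".csv"));
end

% List the files and store each imported table in the same DIR structure.
S = dir(fullfile(fileDir, 'filename_*.csv'));
for k = 1:numel(S)
    S(k).data = readtable(fullfile(S(k).folder, S(k).name));
end

% Each element now carries both the file name and its data.
for k = 1:numel(S)
    fprintf('%s has %d rows and %d columns\n', ...
        S(k).name, height(S(k).data), width(S(k).data));
end
```

Keeping the data inside S means the file name and its contents stay together, so no separate cell array has to be kept in sync.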
Thank you for reminding me of this point. Great function!!
all the best
Thank you so much, this works too! I could only accept one answer so I picked Dave B's because it works best with the rest of my script (which I did not show here), but I saved your solution for future use!
Yes, sure. There are many ways to tackle each specific problem, and it's completely normal that you chose the one that best matches your application.
I learn a lot from others' answers myself, and that's the positive aspect of this forum. There's always something new to (re)discover, and I also keep interesting answers on my PC for future use.
all the best


 Accepted Answer

I think, if I read your question correctly, you're saying:
  • You have files with a suffix
  • The suffix determines the names of variables in a table?
  • You want to do something where you can use the filename's suffix to refer to the table variable names
If that's the case, I suspect the tricky bit is the syntax for referring to a table variable using a name. (Note the parentheses below; the general syntax is table.(dynamicname), which also works for structs.)
t = table(zeros(5,1),ones(5,1),'VariableNames',["Var_A" "Var_B"]);
suffixes=["_A" "_B"];
for i = 1:numel(suffixes)
disp(t.("Var" + suffixes(i)))
end
But if I missed something, it would certainly be helpful if you included some example data files or a reduced example to show what you're trying to accomplish.
(And yes, I think most folks would agree it'd be better not to create the files this way, but it's hard to say how you should change that code because you didn't give much info. Maybe that's a separate question?)
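To connect this back to the original question (several files, each with suffixed column names), here is a hedged sketch of looping over both the tables and their columns. The struct tables, its field names, and the sample values are made up for illustration; in practice each field would hold the table read from one file.

```matlab
% Hypothetical stand-ins for the tables read from filename_A and filename_B.
tables.A = table([1;2;3], [10;20;30], 'VariableNames', ["column1_A" "column2_A"]);
tables.B = table([4;5;6], [40;50;60], 'VariableNames', ["column1_B" "column2_B"]);

suffixes = ["A" "B"];
colMeans = zeros(numel(suffixes), 2);   % one row per file, one column per variable

for i = 1:numel(suffixes)
    t = tables.(suffixes(i));                        % dynamic field name picks the table
    for j = 1:2
        colName = "column" + j + "_" + suffixes(i);  % e.g. "column2_B"
        colMeans(i, j) = mean(t.(colName));          % dynamic variable name picks the column
    end
end

disp(colMeans)
```

The same .(name) syntax is used twice here: once on the struct to pick a table and once on the table to pick a column, so no variables are ever created or looked up by name with eval.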

5 Comments

That's correct! Maybe yes it helps if I provide more insight: Here's what one of the five tables looks like and as you can see, the name "Dahl" comes up in almost all columns. The other tables have a different name there but the rest of the column names is the same:
Here's a part of the code, C is the name of the table shown above:
for z_select = min(C.voxel_coord_z_Dahl):max(C.voxel_coord_z_Dahl) % specify z coordinate in question
k = C.voxel_coord_z_Dahl == z_select; % find indexes of rows with this z coordinate
Age_select = C.Age(k); % find corresponding Age value in that row
MT_select = C.Dahl_MT(k); % find corresponding MT value in that row
correl = corrcoef(Age_select,MT_select); % calculate correlation of age and MT
correl_table.MT(correl_table.z == z_select) = correl(1,2); % copy correlation coefficient into new table for plot
end
...what I'm trying to do is to find the rows in that table that contain a particular value (= particular coordinate) and then I want to pick the corresponding values from other columns for that row. And then several other operations, e.g. calculate the correlation of two values from that row, plot them etc. I want to do this for 5 different tables, in all of them "Dahl" is replaced by a different name but everything else stays.
The problem is that the name which is changing 5 times (here "Dahl") is mixed with other strings; if the columns were just named "Dahl" I could have used indexing differently... Maybe someone has an idea how to name or restructure the columns to avoid this
Table arrays allow you to index into variables with names, but they also allow you to index into variables with numeric indices if that's more convenient. Let's look at a slightly more complicated table:
t = table(zeros(5,1),NaN(5, 1), ones(5,1),'VariableNames',["Var_A", "X", "Var_B"])
t = 5×3 table
    Var_A     X     Var_B
    _____    ___    _____
      0      NaN      1
      0      NaN      1
      0      NaN      1
      0      NaN      1
      0      NaN      1
We can get and work with the list of variable names from the table.
variableNames = t.Properties.VariableNames
variableNames = 1×3 cell array
{'Var_A'} {'X'} {'Var_B'}
startsWithVar = find(startsWith(variableNames, "Var"))
startsWithVar = 1×2
1 3
Now we can use the variable names or the variable indices to extract data from the table t.
for k = startsWithVar
thename = variableNames{k};
% Index using the variable number k
fprintf("The first row of variable %s contains %d.\n", ...
thename, t{1, k})
% or index using the variable's name thename
fprintf("The first row of variable %s also contains %d.\n", ...
thename, t{1, thename})
% or index using the variable's name thename dynamically and then an index
fprintf("The first row of variable %s contains %d as well.\n", ...
thename, t.(thename)(1))
end
The first row of variable Var_A contains 0.
The first row of variable Var_A also contains 0.
The first row of variable Var_A contains 0 as well.
The first row of variable Var_B contains 1.
The first row of variable Var_B also contains 1.
The first row of variable Var_B contains 1 as well.
I think now I'm a little more confused; this is what my limited example was getting at above. So I think I'm missing why that does not resolve the issue:
names = ["Dahl" "ASDF" "Moose" "Goose"]
for i = 1:numel(names)
    ind = t.("voxel_coord_z_" + names(i)) < 0; % dynamic!
    mtval = t.(names(i) + "_MT")(ind);
    pdval = t.(names(i) + "_PD")(ind);
    c(i) = corr(mtval,pdval)
end
As far as the cleaner strategy...it's not 100% clear to me why you need Dahl in there at all. But a nice structure might be to think of the name "Dahl" as a bit of data. Then you can have all of your data in one table, with a variable (column) that says whether it's Dahl or something else:
name | voxel_coord_z | R2s | MT
Dahl | -10           | .5  | 32
Dahl | -11           | .3  | 22
Vahl | -20           | .7  | 12
...
The advantage here is that instead of having a bunch of separate tables, you can have a single table and refer to separate subsets. Hope that makes sense!
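A rough sketch of how existing tables could be reshaped into that layout is below. The two small tables and the extra Vahl row are invented for illustration, and it assumes the name always appears in a column name either as a "_Name" suffix or a "Name_" prefix, as in the columns shown earlier in this thread.

```matlab
% Two hypothetical tables whose column names embed the name ("Dahl", "Vahl").
tDahl = table([-10;-11], [0.5;0.3], [32;22], ...
    'VariableNames', ["voxel_coord_z_Dahl" "Dahl_R2s" "Dahl_MT"]);
tVahl = table([-20;-21], [0.7;0.6], [12;15], ...
    'VariableNames', ["voxel_coord_z_Vahl" "Vahl_R2s" "Vahl_MT"]);

names  = ["Dahl" "Vahl"];
pieces = {tDahl, tVahl};

for i = 1:numel(names)
    p = pieces{i};
    % Strip the name (with its separating underscore) from every column name.
    p.Properties.VariableNames = erase(p.Properties.VariableNames, ...
        ["_" + names(i), names(i) + "_"]);
    % Record the name as data instead.
    p.name = repmat(names(i), height(p), 1);
    pieces{i} = p;
end

allData = vertcat(pieces{:});                   % columns: voxel_coord_z, R2s, MT, name
dahlMT  = allData.MT(allData.name == "Dahl");   % pick one subset by name
```

Once the columns share the same names, the per-name loop reduces to a logical index on allData.name, and functions such as groupsummary can operate on all names at once.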
Thank you so much for your help, I did not know at all that you can use something like
mtval = t.(var(i) + "_MT")(ind);
and it solved the issue perfectly! It's very elegant and short as well, thanks! Also the tip about using the name as a bit of data in an extra column is helpful for the next time I create these tables
@Steven Lord Thanks a lot for your answer, I tried it and it works! While I think Dave B's solution is more convenient in my case, I can see a lot of use cases in my script for your solution and I saved it in a script for the future, so thank you very much!


More Answers (0)

Products

Release

R2021a

Asked:

RP
on 9 Nov 2021

Commented:

on 10 Nov 2021
