Create multiple subtables from multiple .tsv tables

I have 120 .tsv files (see the example in "sub-m0001_file.tsv"). The path is the same for all the files except for the 9th folder. See the paths for the first two .tsv files below:
/f1/f2/f3/f4/f5/f6/f7/f8/sub-m0001/f10/f11/file.tsv
/f1/f2/f3/f4/f5/f6/f7/f8/sub-m0002/f10/f11/file.tsv
How can I get subtables (i.e., 1 table per file) including only the following six columns: 'trans_x', 'trans_y', 'trans_z', 'rot_x', 'rot_y', 'rot_z'?
The following code does it only for the first .tsv file. Any hint to go recursively over the 120 .tsv files?
mat = dir('/f1/f2/f3/f4/f5/f6/f7/f8/sub-m*/f10/f11/file.tsv');
for files_i = 1:length(mat)
    data = fullfile(mat(files_i).name);
    x = readtable(data,"FileType","text",'Delimiter', '\t');
    vars = {'trans_x' 'trans_y' 'trans_z' 'rot_x' 'rot_y' 'rot_z'};
    new_x = x(:,vars);
end
Then, I need to store each file in a folder which filename corresponds to sub-m*. for instance (example sub-m0001_subfile.txt) see:
/new/path/sub-m0001/sub-m0001_subfile.txt
/new/path/sub-m0002/sub-m0002_subfile.txt
Many thanks in advance

Accepted Answer

Stephen23 on 29 Jan 2025
Edited: Stephen23 on 30 Jan 2025
"The following code does it only for the first .tsv file. Any hint to go recursively over the 120 .tsv files? "
There is nothing in your code that in any way stores or saves the data from each iteration, so your code iterates through each file, imports the file data, and then discards/overwrites the file data on the next loop iteration. So in the end it might look as if it only imported data from one file. But looking at the value of files_i would tell you how many files it has iterated over.
Solution: either use indexing to allocate the imported data into one array (e.g. a cell array or structure array) or export the data into files on each iteration.
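A minimal sketch of the first option (storing the imported data with indexing), assuming a small sandbox under tempdir with two dummy files. The folder name tsv_demo, the extra column 'other', and the variable names root and C are hypothetical and exist only so that the snippet runs on its own; the real paths from the question would replace them.

```matlab
% Build a small sandbox with two dummy .tsv files (hypothetical data).
root = fullfile(tempdir,'tsv_demo');
V = {'trans_x','trans_y','trans_z','rot_x','rot_y','rot_z'};
for n = 1:2
    P = fullfile(root,sprintf('sub-m%04d',n),'f10','f11');
    if ~isfolder(P), mkdir(P); end
    T = array2table(n*ones(3,7),'VariableNames',[V,{'other'}]);
    writetable(T,fullfile(P,'file.tsv'),'FileType','text','Delimiter','\t')
end

% Import every file and keep one subtable per file.
S = dir(fullfile(root,'sub-m*','f10','f11','file.tsv'));
C = cell(1,numel(S));          % preallocate one cell per file
for k = 1:numel(S)
    F = fullfile(S(k).folder,S(k).name);
    T = readtable(F,'FileType','text','Delimiter','\t');
    C{k} = T(:,V);             % indexing with k, so nothing is overwritten
end
```

After the loop, C{k} holds the six-column subtable of the k-th file, in the same order as S.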
"I need to store each file in a folder which filename corresponds to sub-m*. for instance (example sub-m0001_subfile.txt) "
Then you need to export the table data. For example:
V = {'trans_x','trans_y','trans_z','rot_x','rot_y','rot_z'};
S = dir('/f1/f2/f3/f4/f5/f6/f7/f8/sub-m*/f10/f11/*file.tsv');
for k = 1:numel(S)
    % import file data:
    F = fullfile(S(k).folder,S(k).name);
    T = readtable(F,"FileType","text",'Delimiter', '\t');
    % optional: store imported filedata:
    S(k).data = T;
    % export table data:
    G = extractBefore(S(k).name,'_');
    H = fullfile('/new/path',G,[G,'_subfile.txt']);
    U = T(:,V);
    writetable(U,H)
end
The data is all stored in the structure S. You can access this using indexing, e.g. the 2nd file:
S(2).folder % location
S(2).name % filename
S(2).data % imported file data
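One point to watch when exporting: writetable does not create missing folders, so each destination folder has to exist before the file is written. The sketch below is one possible variant, not the code from the answer. It assumes that the subject ID appears only in the folder path (as in the two example paths in the question) and it uses hypothetical sandbox folders tsv_in and tsv_out under tempdir in place of the real input path and /new/path.

```matlab
% Hypothetical sandbox: two input files under tempdir (not the real paths).
inRoot  = fullfile(tempdir,'tsv_in');
outRoot = fullfile(tempdir,'tsv_out');
V = {'trans_x','trans_y','trans_z','rot_x','rot_y','rot_z'};
for n = 1:2
    P = fullfile(inRoot,sprintf('sub-m%04d',n),'f10','f11');
    if ~isfolder(P), mkdir(P); end
    writetable(array2table(rand(4,6),'VariableNames',V), ...
        fullfile(P,'file.tsv'),'FileType','text','Delimiter','\t')
end

S = dir(fullfile(inRoot,'sub-m*','f10','f11','file.tsv'));
for k = 1:numel(S)
    T = readtable(fullfile(S(k).folder,S(k).name), ...
        'FileType','text','Delimiter','\t');
    % Subject ID taken from the folder path, since here the file name has no prefix.
    E = regexp(S(k).folder,'sub-m\d+','match','once');
    D = fullfile(outRoot,E);
    if ~isfolder(D), mkdir(D); end   % create the destination folder before writing
    writetable(T(:,V),fullfile(D,[E,'_subfile.txt']))
end
```

The isfolder check avoids the warning that mkdir issues when the folder already exists, which matters if the loop is run more than once.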
  6 Comments
Stephen23 on 30 Jan 2025
Edited: Stephen23 on 30 Jan 2025
"Issue 2: "E" does not work properly"
Please show the exact path and code that you used. It works as expected here:
P='./f1/f2/f3/f4/f5/f6/f7/f8/sub-m0001/f10/f11'; mkdir(P); dlmwrite(fullfile(P,'file.tsv'),1)
P='./f1/f2/f3/f4/f5/f6/f7/f8/sub-m0002/f10/f11'; mkdir(P); dlmwrite(fullfile(P,'file.tsv'),2)
S = dir('./f1/f2/f3/f4/f5/f6/f7/f8/sub-m*/f10/f11/file.tsv');
for k = 1:numel(S)
E = regexprep(S(k).folder,{'^.*/f8/','/f10/.*$'},'')
end
E = 'sub-m0001'
E = 'sub-m0002'
Which means that you are doing something different to what you explained or showed, e.g. your folder names are not really f1, f2, etc. Guessing important information like this is much less reliable than it being written down.
In any case, here are alternative approaches that might work for your (duplicated?) folder names:
for k = 1:numel(S)
E = regexprep(S(k).folder,{'^.*/f8/','/.*$'},'')
end
E = 'sub-m0001'
E = 'sub-m0002'
for k = 1:numel(S)
E = regexp(S(k).folder,'sub-m\d+','match','once')
end
E = 'sub-m0001'
E = 'sub-m0002'
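If the neighbouring folder names are uncertain, a regex-free variant is also possible. This is only a sketch under the assumption that exactly one folder in the path starts with sub-m; the path string p below is hypothetical and follows the pattern given in the question.

```matlab
% Hypothetical folder path, following the pattern in the question.
p = '/f1/f2/f3/f4/f5/f6/f7/f8/sub-m0001/f10/f11';
parts = strsplit(p,'/');                 % split the path into folder names
E = parts{startsWith(parts,'sub-m')};    % pick the one folder that starts with sub-m
```

Inside the loop, S(k).folder would take the place of p, and filesep would replace '/' on systems that use a different separator.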
julian gaviria on 30 Jan 2025
Moved: Voss on 30 Jan 2025
Issue 1: "I get the following error because, indeed, the output (destination) file does not exist, it must be created."
You were right: the problem was an incomplete folder name in the second expression, the one anchored to the end of the input text. For example:
Incorrect:
E = regexprep(S(k).folder,{'^.*/f8/','/f10/.*$'},'')
Correct:
E = regexprep(S(k).folder,{'^.*/f8/','/f101a/.*$'},'')
Issue 2: ""E" does not work properly"
Thanks a lot for the input. "mkdir" was the solution:
H = fullfile('/new/path',E);
mkdir(H)
N = fullfile(H,[G,'.txt']);
U = T(:,V);
writetable(U,N, 'WriteVariableNames',0)

Release

R2022b
