Load text file from subfolder

Hello Creative People,
I'm having trouble loading data from subfolders. There is a main folder called "fiber optics" that contains about 200k subfolders, and each subfolder contains only one text file. How can I load the text file from each subfolder one after another and plot it?

Accepted Answer

Walter Roberson on 9 Sep 2020
Edited: Walter Roberson on 9 Sep 2020
projectdir = 'fiber optics';
% '**' makes dir search every subfolder recursively (needs R2016b or later)
dinfo = dir(fullfile(projectdir, '**', '*.txt'));
filenames = fullfile({dinfo.folder}, {dinfo.name});
nfiles = length(filenames);
datum = cell(nfiles,1);
foldernames = cell(nfiles,1);
for K = 1 : nfiles
    thisfile = filenames{K};
    fullparent = fileparts(thisfile);          % full path of the containing subfolder
    [~, foldername] = fileparts(fullparent);   % just the subfolder's own name
    datum{K} = load(thisfile);
    foldernames{K} = foldername;
end
Now you have a cell array, datum, and a cell array foldernames, and can plot as appropriate, using the foldernames as labels.
I doubt that you want 200k lines on one plot, so I will not try to guess how you want to do the plotting.

13 Comments

Sohel Rana on 9 Sep 2020
Edited: Sohel Rana on 9 Sep 2020
Hi Walter,
Thank you for your quick response. I copied and pasted your code, but it shows the following error.
"Unrecognized function or variable 'name'.
Error in Important (line 3)
filenames = fullfile({dinfo.folder}, {dinfo,name}); "
For simplicity, say my main folder is named fiber optics and it contains two subfolders, "FBG1" and "FBG2". The subfolders contain one text file each, "fp1.txt" and "fp2.txt" respectively. How can I modify the code? I'm asking such a simple question because I'm a beginner in MATLAB.
I fixed a typing mistake; I had a comma where I should have had a period.
If the file name is unique over all of the folders, then you probably prefer to record the file names instead of the folder names. In that case it would be
projectdir = 'fiber optics';
dinfo = dir(fullfile(projectdir, '**', '*.txt'));
filenames = fullfile({dinfo.folder}, {dinfo.name});
nfiles = length(filenames);
datum = cell(nfiles,1);
labels = cell(nfiles,1);
for K = 1 : nfiles
    thisfile = filenames{K};
    [~, label] = fileparts(thisfile);
    datum{K} = load(thisfile);
    labels{K} = label;
end
Hi Walter,
Both the file name and the folder name are unique in my case. Suppose there are two subfolders, FBG1 and FBG2, where FBG1 has a txt file fp1 and FBG2 has a txt file fp2. From the code, how would I know which folder's txt file it is taking? How can I plot from there? Suppose I want to plot the data from fp1.
[found, idx] = ismember('fp1', labels);
do_some_plotting( datum{idx} )
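Note that ismember returns idx = 0 when the name is not among the labels, in which case datum{idx} raises an indexing error. A minimal guarded sketch follows; get_by_label is a hypothetical helper name (not something from this thread), and it assumes labels is a cell array of character vectors as built by the loop above.

```matlab
function data = get_by_label(datum, labels, name)
    % Hypothetical helper: return the data whose label matches name exactly.
    [found, idx] = ismember(name, labels);
    if ~found
        % Fail with a readable message instead of an indexing error
        error('No file labelled "%s" was loaded.', name);
    end
    data = datum{idx};
end
```

It could then be used as do_some_plotting( get_by_label(datum, labels, 'fp1') ).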
If you want to record both the folder name and the file name, you can merge the code I posted above, into
projectdir = 'fiber optics';
dinfo = dir(fullfile(projectdir, '**', '*.txt'));
filenames = fullfile({dinfo.folder}, {dinfo.name});
nfiles = length(filenames);
datum = cell(nfiles,1);
labels = cell(nfiles,1);
foldernames = cell(nfiles,1);
for K = 1 : nfiles
    thisfile = filenames{K};
    [fullparent, label] = fileparts(thisfile);
    [~, foldername] = fileparts(fullparent);
    datum{K} = load(thisfile);
    labels{K} = label;
    foldernames{K} = foldername;
end
Hi Walter,
It really worked for me. Thank you so much. Could you please help me a little more? How can I remove the first row of each text file? Each text file contains characters instead of numbers in its first row/line.
Replace
datum{K} = load(thisfile);
with
fid = fopen(thisfile);
datum{K} = cell2mat( textscan(fid, '', 'headerlines', 1) );
fclose(fid);
Or if you have R2019a or later,
datum{K} = readmatrix(thisfile, 'NumHeaderLines', 1);
Superb! I really appreciate your help. You have been very helpful.
Hi Walter,
I'm running into a problem again. Every time I run the code, it shows the following error.
"Too many open files. Close files to prevent MATLAB instability.
Caused by:
Message Catalog MATLAB: File IO was not loaded from the file. Please check file location, format or contents."
There were 50k subfolders in total. I then reduced the number to 10k, but it still shows the same problem. I also ran the code with 100 subfolders and got the same problem. It runs when there are fewer than 50 subfolders. Could you please help me resolve the problem?
Somehow you are missing an fclose for each fopen. Every file opened with fopen inside the loop needs a matching fclose(fid) before the next iteration.
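One way to make sure each file gets closed, even if reading it fails partway, is to tie the fclose to an onCleanup object. This is a minimal sketch rather than code from this thread: the helper name read_skip_header is hypothetical, and it assumes each file is purely numeric after a single header line.

```matlab
function data = read_skip_header(thisfile)
    % Hypothetical helper: read a numeric text file, skipping one header line.
    fid = fopen(thisfile, 'r');
    if fid == -1
        error('Could not open %s', thisfile);
    end
    % onCleanup calls fclose when 'closer' goes out of scope,
    % whether the function returns normally or errors out.
    closer = onCleanup(@() fclose(fid));
    data = cell2mat( textscan(fid, '', 'headerlines', 1) );
end
```

Inside the loop, the read then becomes datum{K} = read_skip_header(thisfile); so no file identifier outlives its iteration.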
Hi Walter,
Can I get the title or legend in each figure according to its file name? Suppose the text file name is "fp1"; when I plot it, it should show the text file name as the title/legend.
Yes, for the K'th plot you can use labels{K}
title(labels{K})
or
plot(datum{K}, 'DisplayName', labels{K})
and later
legend show
I'm not directly plotting datum{K}. Each txt file has three columns, and I need to plot columns 1 and 3. Column 1 is the x-axis value and column 3 is the y-axis value. I used your recent code but did not get any title in the plot. Datum is a cell array, and for each cell I will get a graph. For example, when I used the following code:
plot(datum{1,1}(:,1), datum{1,1}(:,3))
I got a graph for the first cell of datum. However, how will it automatically show the file name as the title/legend for this graph?
title(labels{K})
to have it show up in the title.
Use labels{K} as the 'DisplayName' option on whatever plot() you do use, if instead you want the name to show up when you
legend show
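Putting these pieces together for the three-column files described above, a minimal sketch might look like the following. It assumes datum and labels were filled in by the loading loop earlier in this thread, with column 1 as x and column 3 as y. The helper name plot_first_files and the nshow limit are illustrative choices, not part of the original answer.

```matlab
function plot_first_files(datum, labels, nshow)
    % Hypothetical helper: plot column 1 (x) against column 3 (y) for the
    % first nshow files, one figure per file, labelled with the file name.
    nshow = min(nshow, numel(datum));
    for K = 1 : nshow
        figure;
        plot(datum{K}(:,1), datum{K}(:,3), 'DisplayName', labels{K});
        title(labels{K}, 'Interpreter', 'none');  % show underscores in the title literally
        legend show
    end
end
```

For example, plot_first_files(datum, labels, 5) opens at most five figures, which avoids creating one figure for every one of the thousands of files.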

More Answers (1)

BaseDir = 'BaseDir'; % Change string to base folder
tmp = dir(BaseDir); % Obtains list of files and folders in BaseDir
SubDir = {tmp.name}; % Extract list of names
SubDir = SubDir([tmp.isdir]); % Select entries that are directories
SubDir(ismember(SubDir, {'.', '..'})) = []; % Ignore the '.' and '..' entries
Data = cell(size(SubDir)); % Preallocate data cell array
% Loop through subfolders and perform custom load function, saving output to cell
for n=1:length(SubDir)
    FullDir = fullfile(BaseDir, SubDir{n}); % Create full folder name
    fnames = dir(fullfile(FullDir, '*.txt')); % Obtain filename (assuming 'txt' extension)
    % Either change LOAD_FUNCTION to a built-in text read function or to your own function that simply takes the filename as input
    Data{n} = LOAD_FUNCTION(fullfile(FullDir, fnames(1).name));
end
The concept is to first obtain the list of subfolders, then to loop through each subfolder and find the filename before loading it.
If the filename is the same and known in advance, you can simply use that.
If there are multiple files in the folder, you can loop through each file.
If the output is always the same size and format, you can output to a pre-allocated array rather than a cell array - I've assumed the worst-case scenario of different outputs from each file.
You need to change the BaseDir folder and the LOAD_FUNCTION function name as appropriate.
