How can I run code on multiple data files (.txt) sequentially from a folder?

Hi All,
I would like to write a script that opens .txt files from a folder in groups (10 or 20 files at a time, in time order). The folder contains several thousand data files. The script below works, but each time I have to select the 10 or 20 files manually. Is there a way to read the files 10 or 20 at a time automatically, so that I can work through every file in the folder? I think there should be a smart way to do this in MATLAB. Can anybody please help?
Thanks in advance
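[file_list, path_] = uigetfile('*.txt', 'Grab the file first', 'files Want to Select automatic', 'MultiSelect', 'on');
for i = 1:length(file_list)   % loop over every selected file, not only the last one
    original_path = fullfile(path_, file_list{i})   % file_list is a cell array when several files are selected
end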
N.B. I use MATLAB R2018b.

 Accepted Answer

per_group = 10;
p = dir('CTS_main-128361*.txt');
nFiles = numel(p);
for base_k = 1:per_group:nFiles
    group_end = min(nFiles, base_k+per_group-1);
    group_size = group_end - base_k + 1;
    A_cat = cell(group_size, 1);
    for k = base_k : group_end
        A = dlmread(p(k).name);
        A_cat{k} = cell2mat(A);
    end
    A = vertcat(A_cat{:});
    %do something with this group of 10
end
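dir returns names in the order the operating system reports them (typically alphabetical), which matches time order only while every timestamp has the same number of digits. A minimal sketch of sorting by the numeric timestamp first, assuming the file names follow the pattern CTS_main-<digits>.txt; the hard-coded names are hypothetical stand-ins for {p.name}:

```matlab
% Hypothetical file names standing in for {p.name} from dir('CTS_main-*.txt')
names = {'CTS_main-1283619000.txt', 'CTS_main-999000.txt', 'CTS_main-1283609000.txt'};

% Pull the first run of digits out of each name and convert it to a number
stamps = str2double(regexp(names, '\d+', 'match', 'once'));

% Sort numerically so that groups follow time order, not character order
[stamps, order] = sort(stamps);
names = names(order);
```

With the struct array from dir, the same permutation would be applied as p = p(order) before the grouping loop.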

15 Comments

Thank you for your support. However, I want to analyze the data with respect to the GPS time in the file names, e.g. 'CTS_main-1283609000.txt'. How do I convert this GPS time 1283609000 (each file holds 1000 s of data) to my local time (GMT+6)? When I plot the data of one group, how do I convert that group's GPS times to local times? The GPS times look like
1283609000
1283619000
1283629000, and so on.
format long g
T = [1283609000; 1283609000; 1283619000; 1283629000]
T = 4×1
    1283609000
    1283609000
    1283619000
    1283629000
TT = datetime(T, 'convertfrom', 'posixtime', 'TimeZone', 'UTC+6')
TT = 4×1 datetime array
   04-Sep-2010 20:03:20
   04-Sep-2010 20:03:20
   04-Sep-2010 22:50:00
   05-Sep-2010 01:36:40
That works for an individual file or for a group of 10. The GPS times are part of the file names.
(i) Is there a way to extract these times from the file names of each group (and how should the times be formatted when there are very many files)?
(ii) Say I plot one group's data versus time (10 × 1000 s ≈ 167 minutes, with ticks every 500 seconds). I would like the axis to show these 167 minutes as UTC time, or at least to put ticks at the 10 UTC timestamps.
format long g
T= [1283610000; 1283611000; 1283612000; ........; 1283619000]
This is what I have so far, but it converts the timestamps of all the files in the directory at once:
myDir = uigetdir; % Get the files from directory
files = dir(fullfile(myDir,'CTS_main-*.txt'));
files = struct2table(files); % Total file info as a Table
s=files{:,1}; % 1st column of the table
timeXtract=regexp(s,'\d+','match');
timeStamp=str2double(cat(1,timeXtract{:}));
desiredTimeStamp = timeStamp(:,1);
utcTimeStamp = datetime(desiredTimeStamp, 'convertfrom', 'posixtime', 'TimeZone', 'UTC+6');
Can you help me do the same per group, as in the code above? Thanks.
per_group = 10;
p = dir('CTS_main-*.txt');
nFiles = numel(p);
for base_k = 1:per_group:nFiles
    group_end = min(nFiles, base_k+per_group-1);
    group_size = group_end - base_k + 1;
    A_cat = cell(group_size, 1);
    A_timestamp = zeros(group_size,1);
    for k = base_k : group_end
        thisfile = p(k).name;
        timestamp = str2double(regexp(thisfile, '\d+', 'match', 'once'));
        A = dlmread(thisfile);
        A_cat{k} = cell2mat(A);
        A_timestamp(k) = datetime(timestamp, 'convertfrom', 'posixtime', 'TimeZone','UTC+6');
    end
    A = vertcat(A_cat{:});
    %do something with this group of 10
end
Without preallocating (i.e. with "utcTimeStamp = zeros(1,group_size);" commented out) it works fine. But when I preallocate with "utcTimeStamp = zeros(1,group_size);", I get this error message: "The following error occurred converting from datetime to double:
Undefined function 'double' for input arguments of type 'datetime'. To
convert from datetimes to numeric, first subtract off a datetime
origin, then convert to numeric using the SECONDS, MINUTES, HOURS,
DAYS, or YEARS functions." I have no idea how to fix this!
N.B. A_cat{k} = A; % no need for 'cell2mat', as dlmread already returns a matrix
You are right, I am not sure now why I put in the cell2mat
I got it from your code https://www.mathworks.com/matlabcentral/answers/782808-how-can-i-run-the-code-on-multiple-data-files-txt-sequentially-from-a-folder#comment_1417708
I appreciate your help so far (and Mohammad Sami's), thank you very much. OK, if I use
A_timestamp = NaT(group_size,1); then A_timestamp(k) = datetime(timestamp, 'convertfrom', 'posixtime', 'TimeZone','UTC+6'); gives an error message. It runs if I replace 'TimeZone','UTC+6' with 'InputFormat','yyyy-MM-dd'.
What was the error message when you tried to use the TimeZone? Using TimeZone worked when I tested it in R2020b.
Thanks for asking. The message is
'Cannot combine or compare a datetime array with a time zone with one without a time zone.'
It appears when I use "A_timestamp(k) = datetime(timestamp, 'convertfrom', 'posixtime', 'TimeZone','UTC+6');".
However, there are no issues if I use "A_timestamp = datetime(timestamp, 'convertfrom', 'posixtime', 'TimeZone','UTC+6')" or
"A_timestamp(k) = datetime(timestamp, 'convertfrom', 'posixtime', 'InputFormat','yyyy-MM-dd')". With the latter, though, most entries show NaT rather than a time:
NaT
NaT
NaT
NaT
NaT
NaT
NaT
NaT
04-Sep-2010 16:33:20
04-Sep-2010 16:50:00
But I don't want to see these NaT values. How can I get time values like those in the last 2 rows? Can you help me fix this problem?
N.B. I use MATLAB R2018b.
Ah, this is getting obscure ;-)
A_timestamp = NaT(group_size, 1, 'TimeZone', 'UTC+6');
That should fix the "Cannot combine or compare a datetime array with a time zone with one without a time zone" error.
The 8 leading NaT suggest that the first 8 entries do not have valid POSIX timestamps. If you try
per_group = 10;
p = dir('CTS_main-*.txt');
nFiles = numel(p);
for base_k = 1:per_group:nFiles
    group_end = min(nFiles, base_k+per_group-1);
    group_size = group_end - base_k + 1;
    A_cat = cell(group_size, 1);
    A_timestamp = NaT(group_size, 1, 'TimeZone', 'UTC+6'); %obscure!
    for k = base_k : group_end
        thisfile = p(k).name;
        timestamp = str2double(regexp(thisfile, '\d+', 'match', 'once'))
        A = dlmread(thisfile);
        A_cat{k} = A;
        A_timestamp(k) = datetime(timestamp, 'convertfrom', 'posixtime', 'TimeZone','UTC+6');
    end
    A = vertcat(A_cat{:});
    %do something with this group of 10
end
then what shows up when it displays the initial timestamps ?
I am speculating at the moment that some of the filenames might not have any digits at all.
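The time-zone mismatch can be reproduced without any files. A minimal sketch, using the offset '+06:00' as an assumed equivalent of the 'UTC+6' used in this thread; the variable names are illustrative:

```matlab
stamp = 1283619000;   % timestamp taken from one of the file names in the thread
t = datetime(stamp, 'ConvertFrom', 'posixtime', 'TimeZone', '+06:00');

% An unzoned NaT array cannot receive a zoned datetime
unzoned = NaT(2, 1);
try
    unzoned(1) = t;
    failed = false;
catch
    failed = true;    % the "Cannot combine or compare ..." error lands here
end

% Preallocating with the same time zone makes the assignment legal
zoned = NaT(2, 1, 'TimeZone', '+06:00');
zoned(1) = t;
```

Entries that are never assigned stay NaT, so any NaT left after the loop points to an index that the loop did not write to.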
1. In your code, if I choose per_group = 2, then in the first pass of the outer loop A_cat is shown as 2 1; 2 1, while in the next pass it is 3 1; 4 1 (which are in order). I don't understand why. What change in the code would make the first pass give 1 1; 2 1, in order like the others?
per_group=2; % 2 rather than 10, to track down the bug in the script
A_cat= 2 1
2 1;
% while for the next pass A_cat=
3 1;
4 1 % and so on, which are in order
2. 'A_timestamp' contains only the timestamps of the current pass; it does not keep the earlier ones. The earlier entries are NaT, as follows:
NaT (corresponds to cell {2,1})
NaT (corresponds to cell {2,1})
04-Sep-2010 16:33:20 (corresponds to cell {3,1})
04-Sep-2010 16:50:00 (corresponds to cell {4,1})
Can you please tell me the reason?
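A likely explanation for both observations, offered as a reading of the code rather than something confirmed in this thread: A_cat and A_timestamp are allocated with group_size entries, but they are indexed with k, which keeps counting across groups. From the second group on, the assignments land past the preallocated entries, so the arrays grow and the leading entries stay empty or NaT. A minimal sketch of indexing relative to the start of the group, with hypothetical hard-coded timestamps in place of the file names; the local index j is an introduced name:

```matlab
per_group = 2;
stamps = [1283616000; 1283617000; 1283618000; 1283619000];   % hypothetical values
nFiles = numel(stamps);

for base_k = 1:per_group:nFiles
    group_end  = min(nFiles, base_k + per_group - 1);
    group_size = group_end - base_k + 1;
    A_timestamp = NaT(group_size, 1, 'TimeZone', 'UTC');
    for k = base_k:group_end
        j = k - base_k + 1;   % position inside the current group: 1..group_size
        A_timestamp(j) = datetime(stamps(k), 'ConvertFrom', 'posixtime', 'TimeZone', 'UTC');
    end
    % A_timestamp now has exactly group_size entries and no leading NaT
end
```

The same index j would apply to A_cat{j}. To keep the timestamps of every group, a separate array of length nFiles, indexed by k, could be allocated before the outer loop.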
3. Another issue: the timestamp conversion we use presumably counts from midnight on 1970-01-01, but I need the reference to be Jan 06, 1980.
A_timestamp(k) = datetime(timestamp, 'convertfrom', 'posixtime', 'TimeZone','UTC+6');
For this reason my time here (04-Sep-2010 16:50:00) is 10 years behind! Do I need to convert the time with "gps2utc", or do you have some magical script?
The third issue was solved with the help of other people, i.e.
A_timestamp(k) = datetime(1980,1,6,0, 0, timestamp,'TimeZone', 'UTC'); % It works well
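One caveat on the GPS epoch: GPS time does not include leap seconds, so counting seconds from 06-Jan-1980 gives GPS time, which has been ahead of UTC by 18 seconds since the start of 2017. A minimal sketch that applies a fixed offset; the offset is an assumption that holds only for timestamps from 01-Jan-2017 onward and until another leap second is introduced, and the variable names are illustrative:

```matlab
gps_seconds = 1283609000;    % seconds since the GPS epoch, taken from a file name
gps_epoch   = datetime(1980, 1, 6, 0, 0, 0, 'TimeZone', 'UTC');
leap_offset = 18;            % assumed GPS-minus-UTC offset in seconds (dates from 2017 on)

t_gps = gps_epoch + seconds(gps_seconds);   % GPS time expressed as a calendar date
t_utc = t_gps - seconds(leap_offset);       % corrected for leap seconds
t_local = t_utc;
t_local.TimeZone = '+06:00';                % same instant, displayed in local time (GMT+6)
```

Whether 18 seconds matter depends on the analysis; for tick labels on a 167-minute plot the uncorrected value may be close enough.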
I want my xlabel to read "1 sec mean pulse from 04-Sep-2010 17:50:00 UTC (1283619000=timestamp)". What should the script look like? num2str doesn't work, and the 'timestamp' value to show would be different each time. Could you please help me with this?
The error message is "Error using sprintf
Unable to convert 'datetime' value to 'int64'."
Thanks in advance
title(sprintf('(%d-%d)-th Files Avg. 1s Pulse:2nd Sensor',base_k, base_k+per_group-1));
xlabel(sprintf('1s mean Pulse from UTC time #(%d to %d)',utcTimeStamp(1st of base_k), utcTimeStamp(end of base_k))); % how to insert this "A_timestamp" value in xlabel
%% How to see the xlabel as "1s mean Pulse from UTC time # (04-Sep-2010 20:20:00 to 04-Sep-2010 22:50:00)"
xlabel(sprintf('1s mean Pulse from UTC time #(%d to %d)', posixtime(utcTimeStamp(1st of base_k)), posixtime(utcTimeStamp(end of base_k))));
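sprintf cannot format a datetime with %d, but it can take text. A minimal sketch that converts the first and last timestamps of a group to character vectors with char and inserts them with %s; A_timestamp here is a hypothetical two-element stand-in for one group's timestamps:

```matlab
A_timestamp = datetime([1283609000; 1283619000], 'ConvertFrom', 'posixtime', 'TimeZone', 'UTC');

fmt = 'dd-MMM-yyyy HH:mm:ss';               % datetime display format
first_txt = char(A_timestamp(1), fmt);      % text for the first file of the group
last_txt  = char(A_timestamp(end), fmt);    % text for the last file of the group

label = sprintf('1s mean Pulse from UTC time # (%s to %s)', first_txt, last_txt);
% xlabel(label)   % uncomment once the plot exists
```

The raw number can be added to the same label with a %d placeholder fed by posixtime(A_timestamp(end)), since posixtime returns a plain double.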


More Answers (1)

From your question I assume that you want to read all the files, not just 10-20 of them.
You can use uigetdir to choose the folder that you want to load the files from.
Then use the dir function to list all the .txt files inside the selected folder.
p = uigetdir; % updated var from path to p
files = dir(fullfile(p,'*.txt'));
data = cell(length(files),1);
for i = 1:length(files)
    filepath = fullfile(files(i).folder,files(i).name);
    data{i} = readtable(filepath); % assuming your data can be read with readtable.
end
% do something with your data.

5 Comments

Note that using path as a variable name shadows the important inbuilt path function.
Hi Sami,
Thanks for your support. However, your code does not provide me what I'm looking for. To select all the files inside a folder I use the following code :
close all; clc; tic
p=dir('*.txt'); % Call the directory
nFiles=numel(p);
A_cat=cell(nFiles,1);
for k=1:nFiles
    A=dlmread(p(k).name); disp(size(A_cat))
    A_cat{k} = A;
end
A = vertcat(A_cat{:}); toc
But I don't want to read all the files in a directory at once. Say I read the first 20 data files and run my MATLAB code on them, then read the next 20 .txt files, then the next 20, and so on.
So I'm looking for MATLAB commands
i) to select the first 20 .txt files (in order of time) programmatically (with uigetfile or uigetdir a dialog box opens and I have to select the files or directory by hand, which I want to avoid), and
ii) to then run a for loop that completes the reading of all the files in the directory.
Thank you very much
p=dir('CTS_main-128361*.txt');
nFiles=numel(p);
A_cat=cell(nFiles,1);
for k=1:nFiles
    A=dlmread(p(k).name);
    %A=textscan(p(k).name, '%f %f')
    A_cat{k} = cell2mat(A); disp(size(A_cat))
end
A = vertcat(A_cat{:});
With this code I can read 10 files (e.g. ...61000, ....619000........ ), but how can I read the next 10 files, and so on?
p = uigetdir; % changed var from path to p
% based on your comments assuming your file names are like
% CTS_main-1234567890.txt
files = dir(fullfile(p,'CTS_main-*.txt'));
files = struct2table(files);
files.id = str2double(extractBetween(files.name,'CTS_main-','.txt'));
files.g = floor(files.id / 10000); % create file grouping. adjust groups here
ug = unique(files.g);
for j = 1:length(ug)
    % load one group of data
    k = files.g == ug(j);
    filesg = files(k,:);
    data = cell(height(filesg),1); % height gives the number of rows of a table
    for i = 1:height(filesg)
        filepath = fullfile(filesg.folder{i},filesg.name{i});
        data{i} = readtable(filepath); % assuming your data can be read with readtable.
    end
    data = vertcat(data{:});
    % do something with the current group of data
end
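To see what the floor-based grouping does, here is a minimal sketch on hypothetical timestamps, with no files involved. Dividing by 10000 puts files whose timestamps share the same leading six digits into one group, which for files spaced 1000 seconds apart means up to 10 files per group:

```matlab
id = (1283609000:1000:1283621000)';    % hypothetical timestamps, one per file
g  = floor(id / 10000);                % group key: the timestamp without its last four digits
[ug, ~, group_index] = unique(g);      % group_index(i) is the group number of file i
counts = accumarray(group_index, 1);   % number of files in each group
```

The first group is short here because the grouping follows the digits of the timestamp, not the position in the file list; the fixed-size loop in the accepted answer groups by position instead.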



Release: R2018b
Asked: SA on 24 Mar 2021
Edited: SA on 2 Apr 2021
