
How to glean the files I want from so many data files?

Dear MATLAB Experts,
I have been stuck on this problem for a couple of days and cannot think of a way to resolve it. I would appreciate any help.
I have about 100000 files whose names follow this format:
sprintf("%dfile%d.csv", identifier 1, identifier 2)
where identifier 1 takes the values 1, 2, 3, 4, 5, ..., 50 (in order) and identifier 2 can be any number. Some examples are
1file1234.csv
1file2003.csv
1file11111111.csv
2file6667.csv
2file99999.csv
3file1.csv
3file10000.csv
.
.
.
50file3456.csv
50file123456.csv
The files I am interested in are the ones that meet this condition:
for each identifier 1 (from 1 to 50), only the file with the smallest identifier 2 is of interest. All others should be deleted. In the above examples, we need
1file1234.csv
2file6667.csv
3file1.csv
.
.
.
50file3456.csv
How can I write MATLAB code that does this for me?
Thank you so much in advance.

Accepted Answer

Image Analyst on 30 Jan 2017
Use dir() on the folder to get the filenames. Then use sscanf() to extract the two numbers. Easy, but let us know if you can't do it.
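As a rough sketch of the sscanf() step, assuming every name follows the pattern from the question: the sample names below are copied from the question, and in practice the list would come from dir(), e.g. dinfo = dir('*file*.csv'); names = {dinfo.name};

```matlab
% Sample names from the question; replace with {dinfo.name} from dir() in practice.
names = {'1file1234.csv', '1file2003.csv', '2file6667.csv', '3file1.csv'};

ids = zeros(numel(names), 2);   % one row per file: [identifier 1, identifier 2]
for k = 1:numel(names)
    % sscanf returns a 2-by-1 column, so transpose it into a row
    ids(k, :) = sscanf(names{k}, '%dfile%d.csv').';
end
disp(ids)
```

Each row of ids then holds the two numbers for one file, which is all that is needed to compare files that share the same identifier 1.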
  5 Comments
Walter Roberson on 31 Jan 2017
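dinfo = dir('*file*.csv');        % all matching files in the current folder
filelist = {dinfo.name};
bestids = inf(1,50);              % smallest identifier 2 seen so far for each identifier 1
bestnames = cell(1,50);           % name of the file holding that smallest identifier 2
for K = 1 : length(filelist)
    this_file = filelist{K};
    ids = sscanf(this_file, '%dfile%d.csv');   % [identifier 1; identifier 2]
    if numel(ids) ~= 2
        continue;                 % name does not match the pattern; leave it alone
    end
    oldbest = bestids(ids(1));
    if ids(2) >= oldbest
        delete(this_file);        % not better than the current best
    else
        oldbestname = bestnames{ids(1)};
        if ~isempty(oldbestname)
            delete(oldbestname);  % the previous best is superseded
        end
        bestids(ids(1)) = ids(2);
        bestnames{ids(1)} = this_file;
    end
end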
At each step, if identifier 2 of the current file is not smaller than the best we have seen so far for its identifier 1, delete the current file; if it is smaller, delete the previous best and keep the file we just encountered.
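Because delete() cannot be undone, it may be worth doing a dry run first. The following is only a sketch of the same selection rule that builds the keep and delete lists without touching any file. The variable names (keep, to_keep, to_delete) and the sample list are illustrative, and it assumes every name parses into exactly two numbers.

```matlab
% Illustrative sample list; in practice: dinfo = dir('*file*.csv'); names = {dinfo.name};
names = {'1file1234.csv', '1file2003.csv', '1file11111111.csv', ...
         '2file6667.csv', '2file99999.csv', '3file1.csv', '3file10000.csv'};

% Parse both identifiers from every name (assumes each name yields two numbers).
ids = cellfun(@(s) sscanf(s, '%dfile%d.csv'), names, 'UniformOutput', false);
ids = [ids{:}];                 % 2-by-N: row 1 = identifier 1, row 2 = identifier 2

keep = false(1, numel(names));
for g = unique(ids(1, :))
    members = find(ids(1, :) == g);      % files sharing this identifier 1
    [~, pos] = min(ids(2, members));     % smallest identifier 2 within the group
    keep(members(pos)) = true;
end

to_keep   = names(keep);        % one file per identifier 1
to_delete = names(~keep);       % everything else; nothing is deleted here
```

After checking to_delete by eye, the files could be removed with delete(to_delete{:}).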
Homayoon on 1 Feb 2017
Thank you so much Walter. I now understand the code. Thanks.
