Using regular expressions with the 'dir' function
Hi, I have a folder (MainFolder) that contains other subfolders (Subfolder1, Subfolder2, Subfolder3, ....). Each of those subfolders contains different png images. The names of those images have the format 'somerandomnameOD1.png' or 'somerandomnameOS1.png'. The last number in the name varies from 1 to 6.
I am trying to use the dir function with a regular expression to extract the paths of all images whose names end with 'OD' or 'OS' followed by the number 1, 2 or 6 (OD1, OD2, OD6, OS1, OS2, OS6), from all the subfolders within 'MainFolder':
a=dir('C:\folder\folder2\Desktop\MainFolder\**/*O(D|S) (1|2|3).png');
But the result is an empty struct.
I know I can simply use this:
a=dir('C:\folder\folder2\Desktop\MainFolder\**/*.png');
and then filter that structure for the names that I need.
But I would like to know if it is possible to use regular expressions with the 'dir' function.
Thanks.
Accepted Answer
dpb on 21 Sep 2022 (edited 21 Sep 2022)
"I would like to know if it is possible to use regular expressions with the 'dir' function."
Nope. Unsupported by the OS; "filename globbing" isn't the same thing as a regular expression.
Only the '*' and '?' wildcards are supported.
There are ways to do it with find or grep, but it's simpler to retrieve the listing with whatever narrowing the limited wildcards allow and then apply regexp or other matching tools to that result than to build the commands for the shell.
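A minimal sketch of that "list with wildcards, then filter with regexp" approach follows. It builds a throwaway folder tree so it can run anywhere; the folder and file names are made up for illustration, and the pattern assumes the digits wanted are 1, 2 and 6 as described in the question. It also assumes a release where dir supports '**' and returns a folder field.

```matlab
% Build a small temporary folder tree (hypothetical names, illustration only).
rootDir = tempname;
names = {'scanOD1.png', 'scanOS6.png', 'scanOD3.png', 'notes.txt'};
for k = 1:2
    sub = fullfile(rootDir, sprintf('Subfolder%d', k));
    mkdir(sub);
    for n = 1:numel(names)
        fclose(fopen(fullfile(sub, names{n}), 'w'));  % create an empty file
    end
end

% Step 1: let dir narrow things as far as wildcards allow (all PNGs, recursively).
listing = dir(fullfile(rootDir, '**', '*.png'));

% Step 2: keep only names ending in OD or OS followed by 1, 2 or 6.
pattern = 'O[DS][126]\.png$';
keep = ~cellfun(@isempty, regexp({listing.name}, pattern, 'once'));
matches = listing(keep);

% Step 3: assemble full paths from the folder and name fields.
paths = fullfile({matches.folder}, {matches.name});
disp(paths(:));

% Clean up the temporary tree.
rmdir(rootDir, 's');
```

For the real data, rootDir would be the MainFolder path from the question and the file-creation loop would be dropped. Changing the character class [126] adjusts which trailing numbers are accepted.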
1 Comment
Walter Roberson on 21 Sep 2022
Right, the support for ? and * and ** is provided by dir() itself. dir() calls into operating system functions to retrieve the contents of the directory, and those functions do not support wildcards or patterns; they just return a list of what is present in the directory, so dir() handles the filtering.
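To illustrate the idea of filtering an unfiltered listing against a wildcard, here is a rough sketch. It is not how dir() is actually implemented; it only shows that a glob-style pattern can be applied after the fact to a plain list of names. The file names are made up, and regexptranslate is used to turn the wildcard into a regular expression.

```matlab
% Names as an unfiltered directory listing might return them (made-up examples).
allNames = {'imgOD1.png', 'imgOS2.png', 'readme.txt', 'imgOD1.png.bak'};

% Translate a glob-style wildcard into a regular expression and anchor it
% so that the whole name has to match.
globPattern = '*.png';
rx = ['^' regexptranslate('wildcard', globPattern) '$'];

% Apply the pattern to the plain list of names.
isMatch = ~cellfun(@isempty, regexp(allNames, rx, 'once'));
matched = allNames(isMatch);
disp(matched(:));
```

Since the filtering happens on a plain list of names in either case, swapping the translated wildcard for a hand-written regular expression gives the extra matching power that dir() itself does not offer.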
Historically, there have been differences between operating systems in whether a filename pattern of * means the same thing as *.*. Unix globbing (filename processing) says that *.* requires an actual period to be present as part of the name, which is different from the historic DOS implementation of 8+3 filenames. The 8+3 filenames literally do not store the period, so 'ABC' and 'ABC.' were indistinguishable: both were stored internally as ABC followed by 5 spaces for the 8 part and 3 spaces for the extension part. And historically, directories in that time frame had a DIR extension that was routinely hidden, so DEF.DIR was the formal name for folder DEF, but eventually the extension for directories stopped being used.
All of which is to say that the processing of wildcards in names is more complicated than one might expect at first, and you need to know which release and which operating system to figure out the precise details. But it has been well over a decade since unix-style filename globbing characters were paid attention to.