Test existence of files with EXIST
The command exist(FileName, 'file') seems sufficient for checking whether a file exists. I therefore used this code to check whether the input of a function is an existing file (thanks to David, who found the bug):
function Hash = DataHash(Data)
...
if exist(Data, 'file') ~= 2
error('File not found: %s.', Data);
end
The help text of exist explains when the value 2 is returned:
2 if A is an M-file on MATLAB's search path. It also returns 2 when A is
the full pathname to a file or when A is the name of an ordinary file on
MATLAB's search path
But "when A is the full pathname to a file" does not hold when A is a MEX-, MDL- or P-file, because in these cases 3, 4 or 6 is returned, respectively. So let's try to improve the check:
if ~any(exist(Data, 'file') == [2, 3, 4, 6])
error('File not found: %s.', Data);
end
But even then, exist() is smarter than expected:
File1 = fullfile(matlabroot, 'toolbox', 'matlab', 'graph2d', 'plot');
File2 = fullfile(matlabroot, 'toolbox', 'matlab', 'graph2d', 'plot.m');
File3 = fullfile(matlabroot, 'toolbox', 'signal', 'signal', '@dspdata', 'plot');
File4 = fullfile(matlabroot, 'toolbox', 'signal', 'signal', '@dspdata', 'plot.m');
exist(File1, 'file') % 0 !
exist(File2, 'file') % 2
exist(File3, 'file') % 2 !
exist(File4, 'file') % 2
I guess that File1 is not recognized because plot is a built-in function, while @dspdata\plot (File3) is not. But File3 is not an existing file either:
fopen(File1, 'r') % -1
fopen(File2, 'r') % 3
fopen(File3, 'r') % -1 !! in spite of: exist(File3, 'file') ~= 0
fopen(File4, 'r') % 4
fclose('all')
So how can we check the existence of a file in a simple and reliable way?
function Ex = FileExist(FileName)
FID = fopen(FileName, 'r');
if FID == -1
Ex = false;
else
Ex = true;
fclose(FID);
end
But there are still exceptions, because fopen() is smart as well:
cd(tempdir);
fopen('plot.m', 'r') % 3, file is *found*!
Here fopen() searches all folders of the MATLAB path, although only the current folder should be searched. As a side effect, fopen(name, 'r') is relatively slow. Another idea:
cd(tempdir);
fopen('plot.m', 'r+') % -1, file is not found
This is faster than the 'r' mode, especially if folders of the path are stored on network drives, and requesting write access restricts the search to the current folder. But it fails if the current user does not have write privileges for the file.
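Another way to keep fopen() from searching the path, without requesting write access, is to pass an absolute name. This is only a minimal sketch: it assumes the name is relative to the current folder, and it does not handle names that are already absolute.

```matlab
cd(tempdir);
Name = 'plot.m';                          % relative name, as in the example above
FID  = fopen(fullfile(pwd, Name), 'r');   % absolute name: no search along the path
if FID ~= -1
    fclose(FID);                          % only reached if tempdir really contains plot.m
end
```

With an absolute name the read-only mode is sufficient, so missing write privileges do not matter.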
The next approach:
function Ex = FileExist(FileName)
dirFile = dir(FileName);
if length(dirFile) == 1
Ex = ~(dirFile.isdir);
else
Ex = false;
end
I could not find a file for which this test fails. It is very slow if FileName is a folder on a network drive that contains a large number of files, but this is a rare case, so I prefer this test.
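One input that does pass the dir() test without being a file name is a wildcard pattern: dir('*.txt') returns a single non-folder entry if exactly one file matches. A hedged variant that guards against this could look as follows. The name FileExistDir and the decision to reject every name containing '*' are assumptions for illustration, and on Linux a '*' can legitimately be part of a file name.

```matlab
function Ex = FileExistDir(FileName)
% Hypothetical variant of the dir() based test that rejects wildcard patterns.
if any(FileName == '*')
    Ex = false;    % assumption: a pattern is never treated as a file name
    return;
end
dirFile = dir(FileName);
% The short-circuit && accesses isdir only if there is exactly one entry.
Ex = (numel(dirFile) == 1) && ~dirFile.isdir;
end
```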
Finally, a C-Mex function using either GetFileAttributes under Windows or _open or _wopen under Linux/MacOS is faster: 10% for existing files, 90% for missing files. But handling the Unicode strings is not trivial: a wchar has 2 bytes under Windows and 4 bytes under Linux and MacOS, and under Linux wchars are not commonly used at all; UTF-8 encoded strings with 1-byte chars are the norm there. See Answers: Matlab string to wchar under Linux. I'm going to publish the Mex functions in the FEX, together with a DirExist(), because exist(name, 'dir') has similar problems.
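The size difference between the encodings can be seen from within MATLAB using unicode2native. This is only an illustration of the byte counts, not part of the Mex functions, and the file name is made up:

```matlab
% char(228) is a-umlaut, a character outside the ASCII range.
Str = ['file_', char(228), '.txt'];
BytesUTF8  = unicode2native(Str, 'UTF-8');      % 1 byte per ASCII char, 2 for char(228)
BytesUTF16 = unicode2native(Str, 'UTF-16LE');   % 2 bytes per char here, no byte order mark
fprintf('chars: %d, UTF-8 bytes: %d, UTF-16LE bytes: %d\n', ...
    numel(Str), numel(BytesUTF8), numel(BytesUTF16));
% prints: chars: 10, UTF-8 bytes: 11, UTF-16LE bytes: 20
```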
- Did you consider such effects caused by the smartness of exist() in your programs?
- Did a user of your programs run into troubles due to weak tests of file existence, e.g. when the resulting error messages are misleading?
- Do you or your programs profit or suffer from the smartness of exist() and fopen()?
- Do you think the behavior of these functions is explained clearly enough in the help and doc text?
- Do you want standard jobs solved reliably by simple commands in Matlab?
NOTE: Usage of the recursive font: I mean that smart is not smart.
Accepted Answer
Malcolm Lidierth
on 4 Nov 2012
Edited: Malcolm Lidierth
on 4 Nov 2012
Easy with Java:
File1 = fullfile(matlabroot,'toolbox','matlab','graph2d','plot');
File2 = fullfile(matlabroot, 'toolbox','matlab','graph2d','plot.m');
File3 = fullfile(matlabroot, 'toolbox','signal','signal','@dspdata','plot');
File4 = fullfile(matlabroot, 'toolbox','signal','signal','@dspdata','plot.m');
file=java.io.File(File1);
file.exists()
file=java.io.File(File2);
file.exists()
file=java.io.File(File3);
file.exists()
file=java.io.File(File4);
file.exists()
ans =
0
ans =
1
ans =
0
ans =
1
5 Comments
Malcolm Lidierth
on 4 Nov 2012
Edited: Malcolm Lidierth
on 4 Nov 2012
@Jan
File.isFile() alone will do: it returns false if the entry does not exist or is a folder.
There will always be extra overhead with Java, as the strings are passed as copies (to the java.lang.String constructor, then by reference to File), not as pointers (Java 9 may fix that).
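Wrapped into a function, the isFile() test could be sketched like this. The name FileExistJava is made up for illustration. The detour through pwd is an assumption that relative names should refer to MATLAB's current folder: Java resolves relative names against its own working directory, which need not follow cd.

```matlab
function Ex = FileExistJava(FileName)
% Hypothetical wrapper around java.io.File.isFile(); requires a running JVM.
F = java.io.File(FileName);
if ~F.isAbsolute()
    % Resolve relative names against MATLAB's current folder.
    F = java.io.File(pwd, FileName);
end
Ex = F.isFile();   % false for missing entries and for folders
end
```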
More Answers (1)
Daniel Shub
on 5 Nov 2012
I am not sure you are using EXIST how it was intended to be used. The H1 line is: %EXIST Check if variables or functions are defined. The documentation says little about checking if files exist. I agree that the argument names and output values are confusing. I think, however, that EXIST should not be used for checking if a file exists. Determining if a function exists seems harder than determining if a file exists, therefore I wouldn't expect it to compete in terms of speed.
3 Comments
Daniel Shub
on 6 Nov 2012
I agree that it isn't good practice, and it is these types of bugs that make me a FOSS supporter. That said, it may not be as bad as you think. From your example it seems that the problem with EXIST is that it can sometimes erroneously say that a file without an extension exists when it actually doesn't. Therefore any function that adds an extension automatically will be okay. In other functions EXIST may be used to throw a nice error message, and the function will error later when it tries to read from or write to the file. This again is not a huge problem. The problem is for functions that do not append an extension and create a new file (or follow an alternative processing path) when the current file does not exist. I think that use case might be rare.
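For the case where the existence test only serves to produce a readable error message, the message can also come from the failed open itself. A minimal sketch, with the function name and the error identifier invented for illustration:

```matlab
function Data = ReadWholeFile(FileName)
% Hypothetical helper: no existence test up front, the open itself is the test.
[FID, Msg] = fopen(FileName, 'r');
if FID == -1
    error('ReadWholeFile:CannotOpen', ...
        'Cannot open file: %s (%s)', FileName, Msg);
end
Data = fread(FID, Inf, '*uint8');   % file contents as a uint8 column vector
fclose(FID);
end
```

Note that fopen() in 'r' mode still searches the MATLAB path for relative names, as discussed in the question.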