Why does fread of a 2 GB file need more than 8 GB of RAM?
textscan is too slow.
So I want to load a 2 GB file into RAM with fread (which is fast) and then scan it.
fread works well with small files, but if I try fread(filename,'*char') on a 2 GB file, RAM usage spikes above my 8 GB limit and I get an out-of-memory error.
Any ideas?
Answers (3)
Jan
on 4 Jun 2013
Reading a 2 GB file into a char array requires 4 GB of RAM, because MATLAB uses 2-byte chars. Depending on how you store the data, the contents of a temporary array may then be copied, so that 8 GB is the expected memory consumption. I would actually expect that this copy can be avoided, so it would help if you showed us the code fragment.
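To illustrate the 2-byte-char point, here is a minimal sketch that reads the same file once as char and once as uint8 and compares the in-memory sizes. The temporary file and all variable names are made up for the demonstration, and whether raw bytes suit the later scanning step depends on the parser.

```matlab
% Create a small throwaway file (hypothetical stand-in for the 2 GB file).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, uint8(repmat('0123456789', 1, 1000)), 'uint8');  % 10,000 bytes
fclose(fid);

% Read as char: each ASCII byte on disk becomes one 2-byte element.
fid = fopen(fname, 'r');
asChar = fread(fid, Inf, '*char');
fclose(fid);

% Read as uint8: each byte on disk becomes one 1-byte element.
fid = fopen(fname, 'r');
asBytes = fread(fid, Inf, '*uint8');
fclose(fid);

% Compare the memory footprint of the two arrays.
infoChar  = whos('asChar');
infoBytes = whos('asBytes');
fprintf('char: %d bytes, uint8: %d bytes\n', infoChar.bytes, infoBytes.bytes);

delete(fname);
```

The char array should occupy twice as many bytes as the uint8 array, so keeping the raw data as uint8 and converting only the pieces being parsed is one way to halve the footprint.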
2 Comments
Jan
on 4 Jun 2013
Edited: Jan
on 4 Jun 2013
I've seen similar behavior in another fread implementation (not in MATLAB): the required final size was not determined with fseek. Instead, the file was read in chunks until the buffer was full, and then the buffer was reallocated at double the size. After the obvious drawbacks had been pointed out in a discussion, the author decided to replace the doubling method with a smarter Fibonacci sequence. :-)
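For comparison, a minimal sketch of determining the size up front with fseek and ftell, then requesting exactly that many bytes. Whether MATLAB's own fread grows its buffer internally is not known from this thread; the file and variable names here are hypothetical.

```matlab
% Throwaway 200-byte file (hypothetical test data).
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, uint8(1:200), 'uint8');
fclose(fid);

fid = fopen(fname, 'r');
fseek(fid, 0, 'eof');                  % jump to the end of the file
nBytes = ftell(fid);                   % position at the end = size in bytes
fseek(fid, 0, 'bof');                  % rewind before reading
data = fread(fid, nBytes, '*uint8');   % request exactly nBytes elements
fclose(fid);

delete(fname);
```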
Iain
on 4 Jun 2013
As Jan implied, passing variables around often leads to memory duplication: 2 GB arrays get copied when passed into functions.
The out-of-memory error normally comes up when MATLAB cannot find a single chunk of RAM big enough for a variable.
Use much smaller chunks of memory: read the file in and parse it in chunks of, say, 64 MB.
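A minimal sketch of the chunked approach, assuming a newline-delimited text file. It reads raw bytes in blocks, carries the incomplete last line over to the next block, and only hands whole lines to the parser. The line counting is a placeholder for the real parsing step, and the test file and variable names are hypothetical.

```matlab
% Throwaway test file: 1000 lines of "12345\n" (hypothetical data).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, uint8(repmat(sprintf('12345\n'), 1, 1000)), 'uint8');
fclose(fid);

chunkBytes = 64 * 2^20;          % 64 MB per read, as suggested above
carry  = zeros(1, 0, 'uint8');   % incomplete last line of the previous chunk
nLines = 0;

fid = fopen(fname, 'r');
while true
    block = fread(fid, chunkBytes, '*uint8').';   % row vector of raw bytes
    if isempty(block)
        break                                     % end of file reached
    end
    buf = [carry, block];
    lastNL = find(buf == 10, 1, 'last');          % 10 is the newline byte
    if isempty(lastNL)
        carry = buf;                              % no complete line yet
        continue
    end
    complete = buf(1:lastNL);                     % whole lines only
    carry    = buf(lastNL+1:end);                 % remainder for next pass
    % Placeholder for real parsing, e.g. textscan(char(complete), ...).
    nLines = nLines + sum(complete == 10);
end
fclose(fid);

if ~isempty(carry)
    nLines = nLines + 1;         % final line without a trailing newline
end

delete(fname);
```

With this layout, peak memory is roughly one chunk plus whatever the parser allocates, rather than a multiple of the full file size. Setting chunkBytes to a small value such as 100 is a quick way to exercise the carry-over logic on the test file.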
2 Comments
Walter Roberson
on 4 Jun 2013
The arrays will only get copied if they are modified; otherwise the data pointer will point to the original storage.
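A minimal sketch of this lazy-copy behavior using plain assignment; function arguments follow the same rule, so an input that is only read inside a function is not duplicated. The variable names are made up for the illustration.

```matlab
a = zeros(1, 1e6, 'uint8');   % about 1 MB of data
b = a;                        % no data copied yet: b shares a's storage
total = sum(b);               % reading b still does not copy anything
b(1) = 1;                     % first write to b triggers the actual copy

% a is unaffected by the write, because b now has its own storage.
fprintf('a(1) = %d, b(1) = %d\n', a(1), b(1));
```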