Loading a 600 MB file ends up astronomically large
Hey all!
I'm currently loading a CSV file that has both text and numeric data. The file is 600 MB, but when I load it, my RAM usage goes through the roof! (see attached) Why is this happening? I would have presumed it would just jump by 600 MB.
Any suggestions at all would be helpful!
Thanks!
Trevor

Answers (1)
dpb on 22 Feb 2021 (Edited: dpb on 23 Feb 2021)

Attach the file (or at least a portion of it).
Memory usage is only 1:1 with disk storage for numeric data types stored as binary stream data; otherwise there is overhead associated with higher-level storage types like cell arrays, structs, tables, etc.
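For the text columns of a mixed file like the one in the question, the overhead can be sketched as follows. This is a rough illustration only: the `labels` column and its contents are made up, and the on-disk size is estimated rather than measured. It relies on the fact that MATLAB stores `char` data at 2 bytes per character, while plain ASCII text in a CSV takes 1 byte per character.

```matlab
% Hypothetical text column: 1000 rows of a 4-character label.
labels = repmat({'PASS'}, 1000, 1);     % cell array of char vectors
chars  = char(labels);                  % 1000x4 char matrix, no per-cell headers

% Estimated CSV footprint: 4 ASCII bytes plus one newline byte per row.
diskBytes = 1000 * (4 + 1);

wCell = whos('labels');                 % in-memory size of the cell array
wChar = whos('chars');                  % in-memory size of the char matrix

fprintf('CSV on disk (estimated): %d bytes\n', diskBytes);
fprintf('char matrix in memory  : %d bytes\n', wChar.bytes);
fprintf('cell array in memory   : %d bytes\n', wCell.bytes);
```

The char matrix already costs more than the text on disk, and the cell array adds a per-cell header on top of that; the exact header size depends on the MATLAB release.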
While a .csv file on disk will generally occupy more space than the same numeric values in memory, owing to representation as character strings instead of internal storage, even that is not necessarily always true. Consider an integer array: as a double, each element takes 8 bytes in memory, but in a .csv file an integer only reaches 8 bytes at seven digits plus a comma, i.e. for values of 1E6 or more.
>> v=randi(100,10,1);
>> csvwrite('inttest.csv',v)
>> !dir inttest.csv
 Volume in drive C is OS 
 Volume Serial Number is 3260-4552 
 Directory of C:\...\MATLAB\Work 
02/22/2021  02:31 PM                28 inttest.csv 
               1 File(s)             28 bytes 
               0 Dir(s)  807,796,576,256 bytes free 
>> whos v
  Name       Size            Bytes  Class     Attributes
  v         10x1                80  double              
>> 
So this simple case resulted in an 80/28 = 2.86X memory multiplier.
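The same comparison can be scripted instead of read off the command window. The following is a minimal sketch, assuming a writable temporary folder; the file name comes from `tempname` and is not the `inttest.csv` used above, and the file size will vary with the random values and the platform's line endings.

```matlab
% Compare the on-disk CSV size with the in-memory size of a numeric column.
v = randi(100, 10, 1);            % ten random integers in 1..100, stored as double

fname = [tempname '.csv'];        % temporary file name (assumption: temp folder is writable)
csvwrite(fname, v);               % same writer as in the transcript above

d = dir(fname);                   % d.bytes is the file size on disk
w = whos('v');                    % w.bytes is the size in memory (8 bytes per double)

fprintf('disk: %d bytes, memory: %d bytes, ratio: %.2f\n', ...
        d.bytes, w.bytes, w.bytes / d.bytes);

delete(fname);                    % clean up the temporary file
```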
Now look at what happens if we turn it into a cell array:
>> c=num2cell(v)
c =
  10×1 cell array
    {[64]}
    {[51]}
    {[90]}
    {[32]}
    {[ 8]}
    {[88]}
    {[68]}
    {[78]}
    {[ 7]}
    {[19]}
>> whos c
  Name       Size            Bytes  Class    Attributes
  c         10x1              1200  cell               
>> 
Now it's 1200/28 = 42.86:1!
You can demonstrate the same for all storage classes other than the base numeric ones; there is no free lunch!
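One way to run that demonstration is sketched below: the same ten values are held in several containers and `whos` reports the footprint of each. The variable names are arbitrary, and the byte counts for the cell, struct, and table versions depend on the MATLAB release, so none are quoted here. The `uint8` copy is included only to show the opposite direction, and is valid here only because the values lie in 1..100.

```matlab
% Same ten values, different containers; compare in-memory footprints.
v  = randi(100, 10, 1);         % double column, 8 bytes per element
vi = uint8(v);                  % 1 byte per element (values fit in 0..255)
c  = num2cell(v);               % cell array, one cell per value
s  = struct('x', c);            % 10x1 struct array with one scalar field
t  = table(v);                  % table with a single variable

info = whos('v', 'vi', 'c', 's', 't');
for k = 1:numel(info)
    fprintf('%-3s %-7s %6d bytes\n', info(k).name, info(k).class, info(k).bytes);
end
```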