nominal array memory

Mike on 23 Aug 2011
I am having memory issues when I convert large cell arrays into nominal arrays. I am importing these cell arrays of strings from outside data sources, so they are necessary at first. I then convert them to nominal arrays, since each array contains only a few unique elements. A simple example to illustrate the point:
x = cell(10^7,1);
for i = 1:length(x)
    x{i} = 'H6';    % every cell gets the same two-character string
end
y = nominal(x);     % convert to a nominal array with a single level, 'H6'
Initially, MATLAB is using 0.53 GB; after x is created, it's at 0.58 GB. When y is created, my memory spikes to 2.2 GB, and after nominal finishes running, MATLAB is still using 1.5 GB. If I then type clear x, MATLAB drops to only 0.632 GB. The same memory problem happens if I try x = nominal(x); instead. I am wondering whether the garbage collector isn't deallocating memory properly, or whether it is some other issue.
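For anyone trying to reproduce these numbers from within MATLAB rather than from the OS task manager, the memory function (Windows-only) reports the process footprint directly; a minimal sketch:

m0 = memory;                          % snapshot before the conversion
y = nominal(x);
m1 = memory;                          % snapshot after
fprintf('MemUsedMATLAB: %.2f GB -> %.2f GB\n', ...
    m0.MemUsedMATLAB/2^30, m1.MemUsedMATLAB/2^30);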
Oleg Komarov on 23 Aug 2011
Your x is ~610 MB.

Answers (1)

Peter Perkins on 24 Aug 2011
Mike, I believe this has something to do with the intelligence that's built into cell arrays under the covers to not store multiple copies of the same thing. You've created a somewhat artificial example where there's only one unique string in the cell array, and MATLAB takes advantage of that, at least initially. However, by the time you're done converting to nominal, some temporary copy of that cell array of strings apparently does have a separate copy of each cell's string. I suspect that by clearing x, you are getting back lots of memory not so much because x is cleared, but because the clear operation forces garbage collection.
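A quick way to see the effect Peter describes is to compare what whos reports against the OS-level numbers: whos counts the array's full logical footprint, while the process barely grows when x is built because every cell can point at the same 'H6' data. A minimal sketch (the sharing is an internal optimization, so only this discrepancy is directly observable):

x = cell(10^7,1);
for i = 1:numel(x)
    x{i} = 'H6';            % same literal each time, so storage can be shared
end
s = whos('x');
fprintf('whos reports %.0f MB for x\n', s.bytes/2^20);
% ...yet per Mike's figures the process grew by only ~0.05 GB at this point.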
Mike on 24 Aug 2011
Oleg, x is technically not that large; however, as Peter indicates, and as I suspect, the garbage collector is not clearing temporary variables during the conversion. I wrote a wrapper that converts a cell array to nominal in 5*10^6-element chunks, and this avoids the memory blowup entirely (see the sketch below). I realize my example is artificial, but on my actual data, where I am converting cell arrays of length >3*10^8 with about five categories, MATLAB's RAM usage was spiking to 8 GB and persisting after the nominal array was created. A call to clear would give back the memory, but it took a while to clear these temporary copies. Additionally, while I have more than 8 GB on my machine, this prevents me from sharing my code with co-workers who have less RAM. Partitioning the cell array into chunks and concatenating the resulting nominal arrays avoids the issue. I am happy to share my code on the File Exchange when I get a chance, but I think this should probably be handled inside the nominal class, as it is clearly an undesirable issue.
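A minimal sketch of the chunked approach Mike describes (the function name, default chunk size, and structure here are illustrative assumptions, not his actual wrapper; nominal requires the Statistics Toolbox):

function y = nominalInChunks(x, chunkSize)
% Convert a large cell array of strings to nominal one chunk at a time,
% so any temporary copies made during conversion stay bounded in size.
if nargin < 2
    chunkSize = 5e6;                      % the chunk size Mike reports using
end
n = numel(x);
nChunks = ceil(n / chunkSize);
parts = cell(nChunks, 1);
for c = 1:nChunks
    idx = (c-1)*chunkSize + 1 : min(c*chunkSize, n);
    parts{c} = nominal(x(idx));           % convert just this chunk
end
y = vertcat(parts{:});                    % concatenation merges the level sets
end

Concatenating nominal arrays unions their level sets, so a chunk that happens to miss one of the categories still combines correctly with the rest.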
