table block growth slow

Hi,
I am trying to fill a table with mixed data containing cashflows. Each mortgage gets multiple lines. Preallocating a similar table does not seem to solve this problem. For preallocation I used repmat to create a table as large as the output, filled with dummy values. Because the preallocated version is harder to read and gave no performance boost, I decided to drop this approach.
Dynamically building the table starts out fairly fast, but after 35000 iterations, each of which adds a block of about 360 new records, the performance degrades very quickly. The whole run takes an hour or more.
enddataset = [enddataset; newblock]; % dynamic, slows over time
The tic/toc timings get slower and slower. Now the weird part is that when I use a struct it remains fast (11 to 12 minutes):
myStruct.(iterationLabel) = newblock; % does not slow over time
As I said, preallocating with repmat does not speed things up.
Is it possible to grow a table in a fast manner?

 Accepted Answer

Matt J on 22 Jun 2018
Edited: Matt J on 22 Jun 2018

If growing the object in struct form is fast enough for you, then a solution would be to build it as a struct first and then transform the final result using struct2table().

6 Comments

Alex B on 22 Jun 2018
Edited: Alex B on 22 Jun 2018
Hi Matt, thanks for answering, this is a good idea but it does not work since "fields in a scalar structure must have the same number of rows".
I do not recognize that quotation... You mean myStruct.iterationLabel does not contain the full data for the table column that it is to become?
Alex B on 22 Jun 2018
Edited: Alex B on 22 Jun 2018
In the following line, newblock is a table whose number of rows differs from block to block. Each block represents one cashflow schedule for a mortgage.
myStruct.(iterationLabel) = newblock;
When I then apply
struct2table(myStruct)
I get the following error:
Error using table.fromScalarStruct (line 318)
Fields in a scalar structure must have the same number of rows.
Error in struct2table (line 65)
t = table.fromScalarStruct(s);
So what I did next was convert it to a cell array:
test = struct2cell(myStruct);
Then I tried dynamic growth, which is terribly slow:
% first method: dynamic allocation
out = [];
for i = 1:numel(test)   % struct2cell returns an N-by-1 cell, so size(test,2) is 1
    out = [out; test{i}];
end
Then I tried preallocation (at least what I think is preallocation), and it is slow again:
% second method: preallocate, then copy each block into place
accum = 0;
for i = 1:numel(test)
    accum = accum + size(test{i},1);   % total number of rows
end
prealloc = repmat(test{1}(1,:),accum,1);
counter = 1;
for i = 1:numel(test)
    curr = test{i};
    nrec = size(curr,1);
    prealloc(counter:(counter+nrec-1),:) = curr;
    counter = counter+nrec;
end
This could be caused by the conversion to cells, but I do not think that is the case; I think it is simply very slow. The reason I insist on using a table is that I want to store strings. I will now revert to using only numeric data and try appending that to see whether it is faster. I will report the result.
The strange thing here is that the dynamic appending slows down when more is added. This is strange because it does not seem to be a memory allocation problem but something intrinsic: the larger the set being appended to, the longer each append takes. I would have expected an append to be little more than a pointer update somewhere, so I do not understand why it slows down, and I suspect it is possible to do this faster.
[edit] I just eliminated the strings and I am using a numeric matrix. Still slow.
Matt J on 22 Jun 2018
Edited: Matt J on 22 Jun 2018
"The strange thing here is that the dynamic appending slows down when more is added"
That is normal. Each time you append, MATLAB has to allocate a new, larger memory block for the entire object, to hold both the old data and the new data, so you are doing bigger and bigger memory allocations with each iteration.
I find it suspicious that your original table preallocation gave you no benefit. Perhaps you should show the code you used to implement it.
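The reallocation cost described above can be made visible with a small timing sketch. It uses a plain numeric matrix, and the block count, block size and the name newblock are illustrative stand-ins rather than the poster's actual data; the absolute timings will depend on the machine.

```matlab
% Illustrative sizes only (hypothetical, far smaller than the real run).
nBlocks   = 500;
blockRows = 360;
nCols     = 5;
newblock  = rand(blockRows, nCols);   % stand-in for one cashflow block

% Growing: every append allocates a larger array and copies the old rows.
tic
grown = zeros(0, nCols);
for k = 1:nBlocks
    grown = [grown; newblock]; %#ok<AGROW>
end
tGrow = toc;

% Preallocated: allocate once, then assign into row ranges.
tic
filled = zeros(nBlocks*blockRows, nCols);
for k = 1:nBlocks
    rows = (k-1)*blockRows + (1:blockRows);
    filled(rows, :) = newblock;
end
tPrealloc = toc;

fprintf('growing: %.2f s, preallocated: %.2f s\n', tGrow, tPrealloc);
```

Both loops produce the same matrix; only the allocation pattern differs, which is why the gap should widen as nBlocks increases.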
Hi Matt,
Your suspicion proved right. I did some reading, and indeed MATLAB copies the whole object on any dynamic growth. Instead of a table containing strings, I encoded the strings as numbers and now use a numeric matrix. My preallocation for the table must not have been working (or something else was wrong), since it did work for the numeric matrix. On top of that, the computation time went down even further (by 50%), so I am keeping this.
Solution, thanks to your help: use preallocation, and use a numeric matrix instead of a table object with strings.
Thanks Matt!
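For readers who want to try the same approach, here is a minimal sketch of encoding strings as numeric codes and filling a preallocated numeric matrix. The labels, column layout and block size are hypothetical; the thread does not show the actual encoding that was used, and unique is just one way to do it.

```matlab
% Hypothetical example data: one string label per block.
labels = ["fixed"; "variable"; "fixed"; "interest-only"];
[names, ~, codes] = unique(labels);   % codes(i) indexes into names

blockRows = 3;                         % illustrative block size
nBlocks   = numel(labels);
out = zeros(nBlocks*blockRows, 3);     % columns: type code, period, cashflow

for k = 1:nBlocks
    rows = (k-1)*blockRows + (1:blockRows);
    out(rows, 1) = codes(k);
    out(rows, 2) = (1:blockRows).';
    out(rows, 3) = 100 * k;            % placeholder cashflow values
end

% Decode back to strings only when needed.
decoded = names(out(:, 1));
```

Keeping the lookup vector names alongside the matrix means the strings can be restored at the end without ever storing them in the loop.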
I'm coming late to this thread, but two suggestions:
1) If you know the total size of the final table, you are much better off preallocating the right number of rows with zeros or whatever, and assigning into those rows. In R2018a, there's a new table constructor syntax for preallocation.
2) If that's not possible, you can save each block in a cell array of 360-row tables, and then vertcat(c{:}) all those tables into one at the end. This is usually quite fast.
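Both suggestions can be sketched side by side. The variable names, types and sizes below are made up for illustration, and the blocks are given a fixed size for simplicity; the 'Size' and 'VariableTypes' arguments are the table preallocation syntax available from R2018a.

```matlab
% Hypothetical column names and types, for illustration only.
varNames  = {'MortgageId', 'Period', 'Cashflow'};
varTypes  = {'string', 'double', 'double'};
nBlocks   = 4;
blockRows = 360;

% 1) Preallocate the full table (R2018a or later), then assign row ranges.
T = table('Size', [nBlocks*blockRows, numel(varNames)], ...
          'VariableTypes', varTypes, 'VariableNames', varNames);

% 2) Collect blocks in a preallocated cell array, concatenate once at the end.
c = cell(nBlocks, 1);

for k = 1:nBlocks
    % Stand-in for one cashflow schedule.
    newblock = table(repmat("M" + k, blockRows, 1), (1:blockRows).', ...
                     rand(blockRows, 1), 'VariableNames', varNames);
    rows = (k-1)*blockRows + (1:blockRows);
    T(rows, :) = newblock;   % suggestion 1
    c{k} = newblock;         % suggestion 2
end

T2 = vertcat(c{:});          % single concatenation at the end
```

The cell-array variant does not need the final row count up front, so it also fits blocks of varying length; each c{k} can simply hold however many rows that mortgage produces.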

Asked:

on 22 Jun 2018

Commented:

on 3 Jul 2018
