Readtable function has some kind of limitation on the number of rows and columns?

37 views (last 30 days)
Hello i am using the function readtable to read a table from excel that has ~1.000.000 rows and 42 columns but it doesnt work, matlab has some kind of limitation on the number of rows and columns?
Any way that i can load my data?
Thanks Tiago
A = readtable('test.xlsx);
x = A{:,1}; % 1st column is data time
y = A{:,2:end};
Even with 10 rows and 40 columns, when i execute the readtable, it creates me a table 0x0 no idea why.
A = readtable('u3300.xls');

Accepted Answer

Guillaume
Guillaume on 17 Jan 2018
Stepping through the xlsread code and looking at the content of the excel file in excel VBA editor reveals the reason why you get an empty table:
The excel file has actually two worksheets. Sheet 1, the one that matlab reads by default, has the attribute xlSheetVeryHidden meaning it can't be unhidden through the standard excel hide/unhide interface, only through VBA. Note: that very hidden sheet is called AspenPDXLAddin, so presumably is a left over of an Aspen process add-in.
You can blame who you want here. Maybe matlab shouldn't try to read hidden or very hidden worksheets, maybe Aspen shouldn't hide their worksheets, or maybe excel shouldn't allow worksheets to be very hidden. Anyway, the workaround is simple, specify the worksheet to read:
readtable('U-3300 Process.xlsx', 'Sheet', 'U-3300 Process')
However, you then have the problem that excel reports that the used range of the 'U-300 process' worksheet is '$1:$969121', so matlab tries to use that range and ends up trying to allocating a 969121 x 16384 array which would require >127 GB of memory. So of course, out of memory error. Again, you can share the blame as you want. Maybe matlab shouldn't use the UsedRange property of the worksheet as it's known to be unreliabl, but then excel should provide a reliable way of getting the used range. Or maybe Aspen shouldn't have filled the sheet in such a way that the used range is calculated incorrectly. The reason the used range is reported as such is because column XFD of the spreadsheet has a custom width. (You can quickly see the used range in excel by pressing ctrl+end).
In the end, you need to both provide the sheet and the range for matlab to read that file correctly:
readtable('U-3300 Process.xlsx', 'Sheet', 'U-3300 Process', 'Range', 'A1:AP969121')
Note that it takes a while because excel takes a long time to open the file. Then readtable spend a long time finding out the type of the columns. If possible, I would recommend a different format than excel to get the data out of the Aspen software. Otherwise, you may be better off using xlsread, then converting to table.
  10 Comments
Guillaume
Guillaume on 18 Jan 2018
Note: if memory is really an issue, it would be possible to write an xlsread alternative that would use less memory (at the expense of time) by reading chunks of the spreadsheet then immediately converting them into a matrix rather than the method used now that reads everything into a cell array (requiring 4 GB) then converting the whole cell array into a matrix (which only requires 400 MB) at the end.
Tiago Dias
Tiago Dias on 18 Jan 2018
And how to do that? because yes memory is a problem, because after loading all the data and making the plots for each variable, i end up with 500 mb of memory to to things that i need to do like cleaning out points and stuff.
I will still try to use the xlsread, because i jsut tried it the readtable on my laptop it works but i have 500 mb of memory available for the rest

Sign in to comment.

More Answers (2)

Guillaume
Guillaume on 16 Jan 2018
"but it doesnt work" is a useless statement on its own. In what way it doesnt work. You get an error? (if so, which one?), you get an incorrect result (in what way?), your computer explodes?
Matlab does not have any limitation on the size of a table other than your computer memory. A million row and 42 columns of numbers is certainly not a limitation, it only requires ~336 MB.
  7 Comments
Tiago Dias
Tiago Dias on 17 Jan 2018
thats not a problem, readtable reads whatever is on excel, and i can just load the first three columns, the fourth acuses low memory and it aborts there

Sign in to comment.


Peter Perkins
Peter Perkins on 16 Jan 2018
You have not provided any sample of what's in your spreadsheet, so pretty hard to say what's wrong. Best guess is you have a lot of text, which is causing memory issues. In recent versions of MATLAB, you can use the 'TextType' and 'DatetimeType' to use less memory in the imported file.
  3 Comments
Tiago Dias
Tiago Dias on 17 Jan 2018
even with this excel sheet it creates me a table 0x0 no idea why, becuase my first column is time, and i choose in the excel "date" and the numbers as numbers
Walter Roberson
Walter Roberson on 17 Jan 2018
Your u3300.xls file is corrupt according to a couple of different utilities, but Excel itself is willing to read it. You could try looking at https://blogs.msdn.microsoft.com/officeinteroperability/2011/07/12/microsoft-office-binary-file-format-validator-is-now-available/ for a tool that might be able to tell you what the problems are with the file.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!