MATLAB Answers

readtable can't get variable names from csv if different number of columns

Asked by Ryan d'Eon on 7 Mar 2018
Latest activity Answered by Jeremy Hughes on 7 Mar 2018
I have datafiles like this:
header,...,header
data,...,data,
data,...,data,
...
Notice the extra trailing comma on each line of data. If I add a trailing comma to the header line by hand, then run readtable(filename), it all works as expected, with all the data under the header names and an extra column of blanks under the name "ExtraVar1".
I can remove that extra column using opts.ExtraColumnsRule='ignore';.
What I haven't figured out is how to use opts.ExtraColumnsRule (or anything else) to read the header lines and use them without manually adding in the trailing comma.
If I try reading the file as is, I get the baffling behaviour of all the data being read under the names "Var1, Var2, ..." with an "ExtraVar1" at the end. I can remove the blank "ExtraVar1" column again using opts.ExtraColumnsRule='ignore';.
So, it recognizes that there's no header value for the last column, and has logic to call it ExtraVar, but for some reason this breaks the rest of the header-to-variable-name conversion.
It also does all of this silently: there is no warning, error, or other indication of why it has not named the variables.
Is this all intended behaviour? What have I missed? Is the only way to read these in correctly for me to preprocess the files to fix the trailing commas?
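
For reference, the preprocessing fallback mentioned above might look something like the following. This is only a sketch: the sample file, its contents, and the names cleanedFile and trailing_comma_demo are made up for illustration, and it assumes that every trailing comma is unwanted (a row whose last field is legitimately empty would lose that field).

```matlab
% Hypothetical sample file reproducing the layout described above:
% the header has no trailing comma, the data lines do.
filename = fullfile(tempdir, 'trailing_comma_demo.csv');
fid = fopen(filename, 'w');
fprintf(fid, 'a,b\n1,2,\n3,4,\n');
fclose(fid);

% Read the whole file and strip one trailing comma from the end of
% each line (and from the end of the file), keeping the line ending.
txt = fileread(filename);
cleaned = regexprep(txt, ',(\r?\n|$)', '$1');

% Write the cleaned text to a separate file so the original is untouched.
cleanedFile = fullfile(tempdir, 'trailing_comma_demo_clean.csv');
fid = fopen(cleanedFile, 'w');
fwrite(fid, cleaned);
fclose(fid);

% With the same number of fields on every line, header detection
% has nothing unusual to deal with.
T = readtable(cleanedFile);
```

The cost is an extra copy of every file, which is why an option-based fix would be preferable.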


1 Answer

Answer by Jeremy Hughes on 7 Mar 2018
 Accepted Answer

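Try giving the import-options detection a hint that there are no header lines to skip, so the first line is treated as the variable names: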
opts = detectImportOptions(filename,'NumHeaderLines',0);
T = readtable(filename,opts)
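
Combined with the ExtraColumnsRule setting from the question, the whole workflow could be sketched as below. The sample file and its contents are hypothetical, and exactly which variables the detection reports for the trailing empty column may depend on the MATLAB release, so it is worth inspecting opts.VariableNames before reading.

```matlab
% Hypothetical sample file: header without a trailing comma,
% data lines with one.
filename = fullfile(tempdir, 'trailing_comma_hint_demo.csv');
fid = fopen(filename, 'w');
fprintf(fid, 'a,b\n1,2,\n3,4,\n');
fclose(fid);

% Tell the detection that there are no header lines to skip.
opts = detectImportOptions(filename, 'NumHeaderLines', 0);

% Inspect what was detected before reading anything.
disp(opts.VariableNames)

% Ignore any columns beyond the detected variables instead of
% adding them as ExtraVar1, ExtraVar2, ...
opts.ExtraColumnsRule = 'ignore';

T = readtable(filename, opts);
```

If the displayed names are still Var1, Var2, ..., the hint did not take effect for that file and the options need further adjustment before calling readtable.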
