Split csv on commas but prevent doing so for a string with a comma in it

My Excel csv file looks like this:
Data,test,04-12-2020 13:11,0,"8,2",1,2,3
Currently I use the following code to separate the columns:
[~,~,dataCGM] = xlsread('file.csv');
outCGM = regexp(dataCGM, ',', 'split');
outCGM = outCGM(2:end-1);
This does split the columns on commas but also does so for the string "8,2" which is not what I want. Does anyone know how to prevent this issue and keep the value as a string in a single column?

Answers (2)

Cris LaPierre on 13 Dec 2020
Perhaps one of the options given here is helpful.
  20 Comments
Stephen23 on 14 Dec 2020
Edited: Stephen23 on 14 Dec 2020
"Maybe someone like @Stephen Cobeldick, who is a regexp ninja, can improve on this."
Thank you for the unique commendation.
Although it is probably not the fastest approach, I would try importing the entire file as one string, apply some string manipulation to it to remove the line-end quotation marks (e.g. REGEXPREP), and then write a new file which can then be directly imported using READTABLE. That has the benefit of importing all the different data classes correctly without much overhead and all of the standard READTABLE options.
It is not trivial because of course valid quotes around a string should not be removed.
This issue pops up enough to indicate that it would be nice for it to be handled natively:
Perhaps it would be a useful addition for READTABLE et al to include an option named e.g. LINEQUOTE which can be set to the required character (by default empty).
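A minimal sketch of the read, clean, rewrite, READTABLE idea described above. The thread does not show the problem file, so the layout here is an assumption: every line is wrapped in one extra pair of quotes and the inner quotes are doubled. The sample text and the temporary file name are made up for illustration.

```matlab
% ASSUMPTION: each line is wrapped in one extra pair of quotes and inner
% quotes are doubled, e.g.  "Data,test,0,""8,2"",1"
% This char vector stands in for raw = fileread('file.csv').
raw = sprintf('"Data,test,0,""8,2"",1"\n"Data,demo,1,""9,5"",2"\n');

% Remove only the quote at the start and at the end of each line.
% 'lineanchors' makes ^ and $ match at line boundaries; \r? allows CRLF files.
clean = regexprep(raw, '^"|"(?=\r?$)', '', 'lineanchors');

% Collapse the doubled inner quotes back to ordinary field quotes.
clean = strrep(clean, '""', '"');

% Write the cleaned text to a temporary file and import it normally.
tmp = [tempname '.csv'];
fid = fopen(tmp, 'w');
fwrite(fid, clean);
fclose(fid);
T = readtable(tmp, 'Delimiter', ',', 'ReadVariableNames', false);
delete(tmp);
```

Quotes around a genuine field such as "8,2" survive because only the first and last character of each line are touched. A file with a different quoting scheme would need a different pattern.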
Cris LaPierre on 14 Dec 2020
I can only make it work for what I see.
You can look into what settings are available from detectImportOptions. I suspect the NumHeaderLines is what you are looking for.
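As a rough illustration of inspecting and adjusting the detected settings, the sketch below writes two lines in the style of the question to a temporary file. The second data line and the choice to force column 5 to text are assumptions for the example, not something taken from the original file.

```matlab
% Build a small stand-in file (the second line is invented for illustration).
tmp = [tempname '.csv'];
fid = fopen(tmp, 'w');
fprintf(fid, 'Data,test,04-12-2020 13:11,0,"8,2",1,2,3\n');
fprintf(fid, 'Data,test,04-12-2020 13:16,0,"8,4",1,2,3\n');
fclose(fid);

% Let MATLAB detect the layout, stating that there are no header lines.
opts = detectImportOptions(tmp, 'NumHeaderLines', 0);
opts.VariableNamesLine = 0;      % no line holds variable names
opts.DataLines = [1 Inf];        % data starts on the first line

% Keep the quoted column as text instead of letting it be converted.
opts = setvartype(opts, 5, 'char');

T = readtable(tmp, opts);
delete(tmp);
```

Displaying `opts` before calling `readtable` shows what was detected, which makes it easier to see which setting needs changing for a given file.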

Walter Roberson on 13 Dec 2020
readtable() with a format that is
'%s,%s,%{dd-MM-uuuu HH:mm}D,%f,%q,%f,%f,%f'
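To show what the %q specifier does, here is a small sketch that applies the same field types to the sample line with textscan instead of readtable. It writes the format with spaces and passes the comma as the 'Delimiter', and it assumes the date is day-month-year as in the format above.

```matlab
% Sample line from the question, held in a char vector instead of a file.
txt = 'Data,test,04-12-2020 13:11,0,"8,2",1,2,3';

% %q reads a double-quoted field as one piece and drops the enclosing quotes,
% so the comma inside "8,2" is not treated as a delimiter.
fmt = '%s %s %{dd-MM-uuuu HH:mm}D %f %q %f %f %f';
C = textscan(txt, fmt, 'Delimiter', ',');

quoted = C{5}{1};                               % text of the quoted field
value  = str2double(strrep(quoted, ',', '.'));  % optional: decimal comma to number
```

The fifth field stays a single text value, and the last two lines show one way to turn a decimal comma into a number afterwards if that is what the field represents.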
  2 Comments
Tycho Maas on 13 Dec 2020
Thanks but the code needs to work on itself without predefining what will be in which column.
Image Analyst on 13 Dec 2020
That makes no sense. A program will not "work on itself". You need to tell your code HOW to process the file. It won't magically figure it out. Attach your csv file if you need more help.
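If the column types really cannot be listed in advance, one possible middle ground is to tell the code only how quoting works. The sketch below splits a line on commas that are outside double quotes by using a regular expression lookahead. It is an illustration rather than a method proposed in this thread, and it assumes a field never contains a line break.

```matlab
% Sample line taken from the question.
txt = 'Data,test,04-12-2020 13:11,0,"8,2",1,2,3';

% Split only on commas that are followed by an even number of double quotes,
% i.e. commas that are not inside a quoted field.
pat = ',(?=(?:[^"]*"[^"]*")*[^"]*$)';
fields = regexp(txt, pat, 'split');

% Strip the enclosing quotes but keep the text between them.
fields = regexprep(fields, '^"|"$', '');

disp(fields{5})
```

Every field comes back as text, so any conversion to numbers or dates still has to be decided afterwards, column by column.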

Release

R2018a