Importing a .csv to matlab, delimiter problem
Show older comments
I need to import .csv file with data saved as lists. But Matlab does not recognize the comma as a delimiter. I am thinking the problem is because the list has "", but I do not know how to fix it. The same happens in Excel. I am using the matlab built-in import function, to create the script.
%%Initialize variables.
filename = 'C:\Users\Gert Lõhmus\Downloads\ShipData.csv';
delimiter = {',',' '};
%%Format string for each line of text:
% column1: text (%q)
% For more information, see the TEXTSCAN documentation.
formatSpec = '%q%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
% This call is based on the structure of the file used to generate this
% code. If an error occurs for a different file, try regenerating the code
% from the Import Tool.
dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Post processing for unimportable data.
% No unimportable data rules were applied during the import, so no post
% processing code is included. To generate code which works for
% unimportable data, select unimportable cells in a file and regenerate the
% script.
%%Create output variable
ShipData = table(dataArray{1:end-1}, 'VariableNames', {'RegisterNumberIMONumberShipNameCallSignTypeserviceSystemShipTyp'});
%%Clear temporary variables
clearvars filename delimiter formatSpec fileID dataArray ans;
5 Comments
Star Strider
on 29 Jan 2016
Upload your file here instead. Use the ‘paperclip’ icon, and complete both the ‘Choose file’ and ‘Attach file’ steps.
Stephen23
on 29 Jan 2016
Please upload your file here, as lots of us do not open external links. Click the paperclip button, then both the Choose file and Attach file buttons.
N/A
on 29 Jan 2016
@Gert Lõhmus: the file has some rather strange features, for example:
- double quotes at the beginning and end of each line. Normally this would indicate text fields that needs to be kept together, but in this case it appears to be a mistake, as the first line includes header strings, indicating that there should actually be multiple columns, not just one (string) column.
- significantly different numbers of commas per line, which seems to be because of erroneous encoding of the string fields without string delimiters (e.g. double quotes).
Judging by the column names and the content of the file it seems whatever tool encoded the file has completely messed up creating a useful CSV file. Fields that should have double quotes to keep them together are missing them entirely, and some fields seem to be missing but this is not indicated by two consecutive delimiter characters.
Here are a few lines of the file, to which I have added space character to indicate which fields seem to be containing the same information. However you will see that other fields are missing completely, and some fields contain commas (without double quotes).
"Register Number,IMO Number, Ship Name,Call Sign, Type & service,...,"
" 12536D, 9180401, CEMSOL, C4WH2, General cargo ship,...,"
" 14968W, 9376036, RUTH, 5BEG2, Container ship,...,"
...
" 961S32, 9063665,MURCIA CARRIER, HAINAN BAOSHA 011 (2011), LOUIS PASTEUR (1996), 3FJZ2, Refrigerated cargo ship,Other cargo ship,...,"
" 11930U, SAND CARRIER 101, Hopper barge,...,"
There is no sensible way to "recreate" missing fields, or identify which ones need to be aligned. The simplest solution would be to find the tool that defined that file and export the data again, this time using a standard CSV file (with double quote string delimiters, etc).
Stephen23
on 9 Feb 2016
Accepted Answer
More Answers (1)
John BG
on 30 Jan 2016
try to save the file as SHipData.xlsx and the following:
filename3='ShipData.xlsx'
[num,txt,raw]=xlsread(filename3)
the data you want is all in the cell txt
John
Categories
Find more on Text Files in Help Center and File Exchange
Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!