how to read complicated CSV files into matlab?

16 views (last 30 days)
My raw CSV file looks like the 1st pic. And I wants to use Matlab read it into the format as the 2rd pic. I have over 1000 the same kind of CSV files, it will be painful if I do it by copy/paste. How can I do this? Any examples? thanks.
I am a new user here, do not know how to post images, Please click the links to see. Thanks!
-------------------------------- raw data:
ID
Height
1 2 3
4 5 6
Weight
1 2 3
4 5 6
---------------------------
output data:
id height weight
1 1 1
1 2 2
1 3 3
...

Accepted Answer

Geoff
Geoff on 21 May 2012
Easiest is to read your file one line at a time, and separate your sections. If your files follow exactly the format you've listed with no missing values, it's simple... Maybe a bit clunky, but serves its purpose. If there's more than height and weight, I'd combine this into a single loop with section logic, rather than create loops for each.
Try this... might contain errors. I just hacked it up here to illustrate.
fid = fopen('data.csv', 'rt');
header = fgetl(fid);
id = regexp(header, ',(.*),', 'tokens');
id = id{1}{1};
% Read height section
fgetl(fid); fgetl(fid); % skip headers
height = [];
while ~feof(fid)
line = fgetl(fid);
vals = sscanf(line, '%f,%f,%f', 3);
if isempty(vals)
break;
end
height(end+1, 1:3) = vals;
end
% Do the same thing for weight.
fgetl(fid); fgetl(fid); % skip headers
weight = [];
while ~feof(fid)
line = fgetl(fid);
vals = sscanf(line, '%f,%f,%f', 3);
if isempty(vals)
break;
end
weight(end+1, 1:3) = vals;
end
fclose(fid);
[EDIT] To select appropriate action from arbitrary order - the dirty way =)
while ~feof(fid)
header = regexp(fgetl(fid), '^(.*),', 'tokens');
if ~isempty(header)
header = header{1}{1}
end
fgetl(fid); % skip second header line
switch lower(header)
case 'height'
% height-processing loop goes here
case 'weight'
% weight-processing loop goes here
otherwise
error('Unexpected section header');
end
end
There are nicer ways to do all this, but my philosophy is if the quick and dirty option is good enough, there's no need to be clever.
  3 Comments
Ken
Ken on 22 May 2012
Geoff, Just one more thing.
In some of my CSV files, height and weight are in a different order. Sometimes, weights are on top of heights.
How to write the code in that case?
Thanks again!
Geoff
Geoff on 22 May 2012
Yep, in that case, pull the value out the same way I pulled out the ID. See where I have done the two fgetl() calls to "skip headers"? I ignored those results on the assumption that order was consistent. I'll edit my answer accordingly, as putting code in these comments is a bit futile.

Sign in to comment.

More Answers (1)

Matt Kindig
Matt Kindig on 21 May 2012
Can you excerpt the actual text of your CSV file (not the file imported into Excel that you screenshot, but the actual original file with the commas shown and everything)? I ask because I don't think you'll be able to simply use csvread/dlmread, but will need to use a more sophisticated approach using something like fscanf or regexp. Please post an excerpt of your file.
Also, where does the ID come from? Are there other fields marked 'id' later in the file (such as A002)? Note that the raw/output data shown in your posting and in the screenshots is inconsistent--this makes it a bit confusing for me.
  1 Comment
Ken
Ken on 21 May 2012
Matt,
1. I use TEXT to open the csv file and have the following.
id,A001,
height,,
a1,a2,a3
3,4,5
2,4,5
6,7,5
,,
weight,,
a1,a2,a3
4,4,5
5,4,6
6,7,5
2. You are right, the data shown above is for "id A001" and the one next to it is "id A002". One csv file belongs to on ID only.
Thanks for your help!

Sign in to comment.

Categories

Find more on Large Files and Big Data in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!