Import Tab Delimited Data with Missing Values

Question

0 votes

Hello,

I am looking to import data (txt file) that is separated by tab, but has missing values. The file location is here to see what I mean <ftp://ftp.ncdc.noaa.gov/pub/data/paleo/contributions_by_author/zachos2001/zachos2001.txt>

Ideally I'd like to import this just to make simple x-y graphs, replacing the missing values as NaN; unfortunately, they just left the missing values blank and so Matlab throws in data from one column into another. Someone recommended excel but I'd prefer to avoid this since I am working on a Linux and am using the OpenOffice version of "excel" which I don't like using. Help appreciated.

Thanks

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

sco1 on 1 Nov 2011

0 votes

EDIT: Never mind what I had originally posted, I didn't look closely enough at what MATLAB gave me, I'll see if I can figure out a workaround.

EDIT2: I know you don't like using excel, but you can save the data as a text file, read it into excel, and then read the excel file into Matlab (I used fixed width 'delimiting' and deleted the 'Genus' column). Not the most elegant solution, but it seems to work.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 2

Chris on 1 Nov 2011

0 votes

Alright thanks, it's only a problem because I can't use excel on this computer, but I can probably get Matlab access on another mac (which has excel), unless someone knows how to do it more directly.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 3

sco1 on 1 Nov 2011

0 votes

Another less elegant solution would be to loop fgetl(), which will return each line of the file

For example: document = fopen(file); testline = fgetl(document);

If you put the fgetl into a loop, it will read through the entire document line by line and output the entire string of characters. For example, line 1 would be '607 0 3.28 0.88' and line 3 would be '607 0.002 3.16 0.96 3.252 0.72'. However, since there is no constant delimiter in the file you linked, you would need to have some sort of logic for figuring out where there are holes in the data, and I don't think it's a trivial task unless I'm using a methodology that's more complex than is necessary.

It seems like using excel to at least format the data would save you a lot of coding time. Then you can use the time you saved go whack the scientists at NASA in the head and tell them to use comma delimited files like normal people.

1 Comment
Show -1 older comments Hide -1 older comments

Walter Roberson on 1 Nov 2011

It _is_ a trivial task... in FORTRAN.

Sign in to comment.

Answer 4

Walter Roberson on 1 Nov 2011

Open in MATLAB Online

0 votes

When I fetch that file, including by using ftp directly, I do not get any tab characters. Instead, I see columns of fixed width. In order the widths are

3, 19, 16, 11, 11, 11, 11

Unfortunately, textscan() and fscanf() and sscanf() do not support field widths, so I suggest you proceed in a manner similar to this, where LinesToSkip is the number of header lines:

IN = char(textscan(fid, '%s', 'HeaderLines', LinesToSkip, 'Delimiter', '\n')) ;
close(fid);
N{1} = str2double(cellstr(IN(:,1:3)));
N{2} = str2double(cellstr(IN(:,4:22)));
N{3} = IN(:,23:38);
N{4} = str2double(cellstr(IN(:,39:50)));
N{5} = str2double(cellstr(IN(:,51:61)));
N{6} = str2double(cellstr(IN(:,62:72)));
N{7} = str2double(cellstr(IN(:,73:83)));
clear IN

Going via char() and then using cellstr() all over the place is a mechanism to work around the fact that the occasional line is shorted because one or more trailing fields is not present. The char() forces all the lines to be the same size, allowing you to split the text by column numbers. str2double() can then be used, but it will not work on character arrays, only on cell string arrays, so the cellstr() call.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Import Tab Delimited Data with Missing Values

0 Comments
Show -2 older comments Hide -2 older comments

Answers (4)

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

1 Comment
Show -1 older comments Hide -1 older comments

0 Comments
Show -2 older comments Hide -2 older comments

Categories

Tags

Community Treasure Hunt

Import Tab Delimited Data with Missing Values

0 Comments Show -2 older comments Hide -2 older comments

Answers (4)

0 Comments Show -2 older comments Hide -2 older comments

0 Comments Show -2 older comments Hide -2 older comments

1 Comment Show -1 older comments Hide -1 older comments

0 Comments Show -2 older comments Hide -2 older comments

Categories

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

1 Comment
Show -1 older comments Hide -1 older comments

0 Comments
Show -2 older comments Hide -2 older comments