Using Textscan on non-uniform data

Question

HS_full_18md_nam_outputs.txt

Hello all,

I am trying to convert the output of a Fortran code to CSV using the textscan function in MATLAB. The output has a semi-uniform layout, but it changes depending on the number of nodes the user requests.

In this example, the user has specified 12 nodes and the text file looks like the following:

Interpolated values at stations

168 12

867600.0000000000 % this is the time step for which the values at the following 12 nodes are given.

1 0.3170054495E+03

2 0.2983347787E+03

3 0.2857833907E+03

4 0.2825696256E+03

5 0.2795692154E+03

6 0.2806315572E+03

7 0.2811630597E+03

8 0.2814156663E+03

9 0.2814718273E+03

10 0.2811785316E+03

11 0.2807765370E+03

12 0.2798665405E+03

871200.0000000000

1 0.3042805523E+03

2 0.3033600277E+03

3 0.2913505094E+03

4 0.2790455081E+03

5 0.2709832029E+03

6 0.2680434294E+03

7 0.2677295494E+03

8 0.2684905990E+03

9 0.2690373464E+03

10 0.2696588011E+03

11 0.2699294457E+03

12 0.2697688946E+03

Currently, I have textscan skipping the first two lines. My final output goal would look something like the following:

timestep 1, node 1, node 2, ..., node 11, node 12

timestep 2, node 1, node 2, ..., node 11, node 12

I would like the code to be smart enough to read the number of nodes that the user supplied (given in the second line of the text above) and to distinguish between the timestep lines and the node lines.

Any suggestions?

I've attached an example of one of my text files.

per isakson on 18 Jun 2019

See Import Block of Mixed Data from Text File

Search Answers for "read text tag:block" using the search field in the upper right corner.

Answer 1

per isakson on 19 Jun 2019

Edited: per isakson on 19 Jun 2019

An exercise with fscanf()

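%% Read the whole file with two fscanf calls
ffs = "HS_full_18md_nam_outputs.txt";
fid = fopen( ffs, 'r' );
[~] = fgetl( fid );                      % skip the header line
num = fscanf( fid, '%d%d', [2,1] );      % num(2) is the number of nodes
% Each block is one time step (%f) followed by num(2) pairs of
% node index (skipped with %*d) and node value (%f)
buf = fscanf( fid, ['%f', repmat('%*d%f', 1,num(2) ) ], [num(2)+1,inf] );
[~] = fclose( fid );
out = permute( buf, [2,1] );             % one row per time step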

Peek at the result:

>> out(1:3,1:6)
ans =
    8.676e+05   1.9204e-05    3.981e-05   5.6839e-05   7.3688e-05   9.2944e-05
    8.712e+05   2.2396e-05   4.0073e-05   5.1601e-05   6.1428e-05    7.487e-05
    8.748e+05   1.9849e-05   3.1175e-05   4.2591e-05   5.3355e-05   6.5603e-05
>> 
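
To see how the repeated format lines up with the file layout, here is a minimal sketch. It assumes a three-node excerpt of the sample from the question (the real file has 12 nodes) and uses sscanf on a string as a stand-in for fscanf on a file; the variable names txt, n and fmt are illustrative only.

```matlab
% Three-node excerpt of the sample in the question (nodes 4-12 omitted)
txt = sprintf([ ...
    '867600.0000000000\n', ...
    ' 1 0.3170054495E+03\n 2 0.2983347787E+03\n 3 0.2857833907E+03\n', ...
    '871200.0000000000\n', ...
    ' 1 0.3042805523E+03\n 2 0.3033600277E+03\n 3 0.2913505094E+03\n']);

n   = 3;                                 % node count; in the real file it is read from line 2
fmt = ['%f', repmat('%*d%f', 1, n)];     % expands to '%f%*d%f%*d%f%*d%f'
buf = sscanf(txt, fmt, [n+1, inf]);      % format is recycled: one column per time step
out = buf.';                             % transpose to one row per time step

disp(fmt)
disp(size(out))                          % 2 rows (time steps) by 4 columns (time + 3 nodes)
```

The size argument [n+1, inf] is what turns the flat stream of values into one column per time step, which is why the result is transposed afterwards.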

Russell Nasrallah on 20 Jun 2019

Thanks for the response from both of you again.

Per,

When you say "In the code that will be used one month from now..." what do you mean? Is there an update coming that is going to modify the fopen function?

per isakson on 21 Jun 2019

Edited: per isakson on 21 Jun 2019

"In the code that will be used one month from now..."

What I mean is that there are two types of code with regard to error handling:

Small scripts/functions that you yourself use a few times over a short period. In this case it might be OK to skip error checking; MATLAB will show more or less relevant error messages, but often several lines "too late".
Scripts/functions that will be used over a longer period. In this case, error handling with good messages helps find the real cause of a problem quickly.
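
For the second kind of code, a minimal sketch of checking fopen right away could look like the following. The file path and the error identifier readStations:cannotOpen are hypothetical; the path is deliberately one that should not exist, so that the failure branch runs.

```matlab
% Hypothetical path, chosen so that fopen fails in this demonstration
ffs = fullfile(tempdir, 'file_that_does_not_exist.txt');

try
    [fid, msg] = fopen(ffs, 'r');          % msg holds the system error text on failure
    if fid == -1
        % Fail at the real cause instead of at a later fscanf call
        error('readStations:cannotOpen', 'Cannot open "%s": %s', ffs, msg);
    end
    closer = onCleanup(@() fclose(fid));   % close the file even if a later read errors
    % ... fgetl / fscanf calls would go here ...
catch ME
    caughtId = ME.identifier;
    fprintf('%s\n', ME.message);
end
```

Without the check, the invalid file identifier would only surface as an error in the first read call, which is the "too late" message described above.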

Answer 2

Walter Roberson on 18 Jun 2019

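fid = fopen('HS_full_18md_nam_outputs.txt');
fgets(fid);   %skip header
ctl = fscanf(fid, '%f%f', 2);   %two control values on the second line
Nt = ctl(1);   %number of time steps
Ns = ctl(2);   %number of nodes
data = zeros(Nt, Ns+1);   %one row per time step: time, then Ns node values
for ts = 1 : Nt
    timestep = fscanf(fid, '%f', 1);
    %skip each node index (%*f), keep each node value; repeat Ns times
    thisdata = cell2mat(textscan(fid, '%*f%f', Ns));
    data(ts, 1) = timestep;
    data(ts, 2:end) = thisdata;
end
fclose(fid);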
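
The question asks for CSV output, which the code above does not write. A minimal sketch of that last step follows, assuming a matrix laid out like data (time in column 1, node values after it). The two rows are the first three node values of the sample in the question, and the output path is hypothetical.

```matlab
% Two time steps and three nodes copied from the sample in the question
data = [867600, 317.0054495, 298.3347787, 285.7833907;
        871200, 304.2805523, 303.3600277, 291.3505094];

Ns = size(data, 2) - 1;                                   % number of node columns
header = strjoin(["timestep", "node " + (1:Ns)], ',');    % timestep,node 1,node 2,node 3

outFile = fullfile(tempdir, 'stations.csv');              % hypothetical output path
fid = fopen(outFile, 'w');
fprintf(fid, '%s\n', header);
% data.' hands fprintf the values row by row; the format covers one row
fprintf(fid, [repmat('%.10g,', 1, Ns), '%.10g\n'], data.');
fclose(fid);
```

The %.10g conversion keeps the ten significant digits present in the source file; a different precision may suit other data.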

Walter Roberson on 19 Jun 2019

fscanf and textscan both stop when the size inputs have been satisfied, leaving the input buffer position immediately after the last character that was consumed. That might be in the middle of a line.

Neither function specifically processes line by line. Instead, unless you use uncommon options, both ignore leading whitespace including line boundaries. If for example you ask for 3 numbers then neither function cares whether the input is

1 2 3

or

   1

   2 3

(note the empty line on input)

There is a difference between the two, though. For fscanf, the count you provide is the total number of values to read. For textscan, the count is the number of times to repeat the format. When the format describes an entire line, that count can typically be interpreted as the number of lines to read (this is not entirely accurate if the values are not in the expected format).
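
The difference in how the count is interpreted can be checked with a small sketch. It assumes a temporary three-line file of number pairs written only for this demonstration, and it uses sscanf as a string-input stand-in for fscanf in the last line.

```matlab
fname = [tempname, '.txt'];              % temporary file for the demonstration
fid = fopen(fname, 'w');
fprintf(fid, '1 10\n2 20\n3 30\n');
fclose(fid);

fid = fopen(fname, 'r');
a = fscanf(fid, '%f%f', 2);              % count = 2 values in total  -> [1; 10]
frewind(fid);                            % start again from the top of the file
c = textscan(fid, '%f%f', 2);            % count = 2 passes of '%f%f' -> {[1; 2], [10; 20]}
fclose(fid);
delete(fname);

% Line breaks, including an empty line, are ordinary whitespace to the scanf family
b = sscanf(sprintf('1\n\n2 3'), '%f', 3);   % [1; 2; 3]
```

With the same count of 2, fscanf consumes one line of this file while textscan consumes two, because each pass of the textscan format covers a whole line here.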

Russell Nasrallah on 20 Jun 2019

Thank you for this clear and concise answer, Walter! I totally understand these tools much better now.
