Read a text file with a varying number of columns

Question

Pankaj on 11 Jan 2015


https://au.mathworks.com/matlabcentral/answers/169557-read-a-text-file-with-varying-number-of-colums

Commented: Pankaj on 12 Jan 2015

I am trying to read a text file with varying number of columns, such as this:

    #5:v3.0_ST:1:2631.1301,N:140.0:081.000:12.5;
    #5:v3.0_ST:1:2631.1301,N:111.4:100.000:12.5:18.7:32.3;
    #5:v3.0_ST:1:2631.1299,N:111.5:101.000:12.5:18.7:32.3;
    #5:v3.0_ST:1:2631.1315,N:136.4:082.000:12.3;
    #5:v3.0_ST:1:2631.1334,N:132.8:083.000:12.4;

The data is delimited by ":" (colon). I understand there is some way of doing this using textscan, but I do not know how to do it for a varying number of columns. Can someone give me a hint?

Thanks


Pankaj on 12 Jan 2015

Oh, the comma separates the coordinate notation (N for North). Longitude coordinates were also there; I reduced the data to a simplified form.

Pankaj on 12 Jan 2015

Thank you all for giving your time for this question.


Answer 1

per isakson on 12 Jan 2015


https://au.mathworks.com/matlabcentral/answers/169557-read-a-text-file-with-varying-number-of-colums#answer_164626

Edited: per isakson on 12 Jan 2015


Try

    fid = fopen( 'cssm.txt' );
    cac = textscan( fid, '%s%s%s%s%s%s%s%s%s', 'CollectOutput'  ...
                ,   true, 'Delimiter', ':;'  );
    [~] = fclose( fid );

In R2013a this returns:

    >> cac{:}
    ans = 
      Columns 1 through 8
      '#5'  'v3.0_ST'  '1'  '2631.1301,N'  '140.0'   '081.000'    '12.5'    ''    
      '#5'  'v3.0_ST'  '1'  '2631.1301,N'  '111.4'   '100.000'    '12.5'    '18.7'
      '#5'  'v3.0_ST'  '1'  '2631.1299,N'  '111.5'   '101.000'    '12.5'    '18.7'
      '#5'  'v3.0_ST'  '1'  '2631.1315,N'  '136.4'   '082.000'    '12.3'    ''    
      '#5'  'v3.0_ST'  '1'  '2631.1334,N'  '132.8'   '083.000'    '12.4'        []
      Column 9
        ''    
        '32.3'
        '32.3'
        ''    
            []
    >>


Comments:

importdata also reads and parses this file in R2013a. However, textscan is significantly faster.
The empty cells, "[]", in the bottom-right corner must be handled separately.
The format string should be modified to account for the numerical columns.
I don't think textscan would have behaved this nicely some years ago.
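
One possible way to deal with the "[]" cells and the numeric columns after the all-%s read is sketched below. It is only an illustration: the cell array C is a hand-typed stand-in for the last two rows of cac{1}, and the variable names C and num are my own, not part of the original answer.

```matlab
% Stand-in for cac{1} from the all-%s call (last two rows only);
% the final row ends with [] rather than '' in columns 8 and 9.
C = { '#5','v3.0_ST','1','2631.1315,N','136.4','082.000','12.3','',''
      '#5','v3.0_ST','1','2631.1334,N','132.8','083.000','12.4',[],[] };

% Replace every empty cell ([] or '') with '' so that all cells hold char.
C( cellfun( @isempty, C ) ) = {''};

% Convert columns 5 to 9 to double; str2double returns NaN for ''.
num = str2double( C(:,5:9) );
```

After this, num is a 2-by-5 double array in which the missing trailing fields appear as NaN, similar to the numeric block that a %f format would produce.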


Addendum

This might better match what you are looking for:

    fid = fopen( 'cssm.txt' );
    cac = textscan( fid, '%s%s%f%f%s%f%f%f%f%f' ...
                ,   'CollectOutput' ,   true    ...
                ,   'Delimiter'     , ':;,'     );
    [~] = fclose( fid );

Output:

    >> cac{:}
    ans = 
        '#5'    'v3.0_ST'
        '#5'    'v3.0_ST'
        '#5'    'v3.0_ST'
        '#5'    'v3.0_ST'
        '#5'    'v3.0_ST'
    ans =
       1.0e+03 *
        0.0010    2.6311
        0.0010    2.6311
        0.0010    2.6311
        0.0010    2.6311
        0.0010    2.6311
    ans = 
        'N'
        'N'
        'N'
        'N'
        'N'
    ans =
      140.0000   81.0000   12.5000       NaN       NaN
      111.4000  100.0000   12.5000   18.7000   32.3000
      111.5000  101.0000   12.5000   18.7000   32.3000
      136.4000   82.0000   12.3000       NaN       NaN
      132.8000   83.0000   12.4000       NaN       NaN
    >>
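
If the maximum number of columns is not known in advance, a fixed format string cannot be written up front. A different technique from the one above is to read whole lines and split each one separately. The following is a minimal sketch under stated assumptions: the temporary file, its two sample rows, and the names fname, rows, vals, nTrailing and data are made up for illustration, and it assumes that everything after the 'N' field is numeric.

```matlab
% Hypothetical sample file mirroring two rows of the question's data.
fname = [ tempname, '.txt' ];
fid = fopen( fname, 'w' );
fprintf( fid, '#5:v3.0_ST:1:2631.1301,N:140.0:081.000:12.5;\n' );
fprintf( fid, '#5:v3.0_ST:1:2631.1301,N:111.4:100.000:12.5:18.7:32.3;\n' );
fclose( fid );

% Read whole lines first, then split each line on its own.
fid  = fopen( fname );
rows = textscan( fid, '%s', 'Delimiter', '\n' );
fclose( fid );
rows = rows{1};

vals      = cell( numel(rows), 1 );
nTrailing = zeros( numel(rows), 1 );
for k = 1:numel(rows)
    tok = regexp( rows{k}, '[:;,]', 'split' );   % split on ':', ';' and ','
    tok = tok( ~cellfun( @isempty, tok ) );      % drop the empty token after ';'
    vals{k}      = str2double( tok(6:end) );     % assumed: fields after 'N' are numeric
    nTrailing(k) = numel( vals{k} );
end

% Pad the ragged rows with NaN to obtain a rectangular matrix.
data = nan( numel(rows), max(nTrailing) );
for k = 1:numel(rows)
    data( k, 1:nTrailing(k) ) = vals{k};
end
delete( fname );
```

For the two sample rows, data is 2-by-5 and the short row is padded with NaN in its last two columns. This loop is likely slower than a single textscan call with a fixed format, so it is mainly worth considering when the column count really is unknown.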