Read a text file with a varying number of columns

I am trying to read a text file with varying number of columns, such as this:
#5:v3.0_ST:1:2631.1301,N:140.0:081.000:12.5;
#5:v3.0_ST:1:2631.1301,N:111.4:100.000:12.5:18.7:32.3;
#5:v3.0_ST:1:2631.1299,N:111.5:101.000:12.5:18.7:32.3;
#5:v3.0_ST:1:2631.1315,N:136.4:082.000:12.3;
#5:v3.0_ST:1:2631.1334,N:132.8:083.000:12.4;
The data is delimited by ":" (colon). I understand there is some way of doing this using textscan, but I do not know how to do it for a varying number of columns. Can someone give me a hint?
Thanks
  4 Comments
Pankaj on 12 Jan 2015
Oh, the comma separates the coordinate value from its notation (N for North). The original data also contained longitude coordinates; I reduced it to a simplified form.
Pankaj on 12 Jan 2015
Thank you all for giving your time for this question.


Accepted Answer

per isakson on 12 Jan 2015
Edited: per isakson on 12 Jan 2015
Try
fid = fopen( 'cssm.txt' );
% read all nine fields as strings; both ':' and ';' act as delimiters
cac = textscan( fid, '%s%s%s%s%s%s%s%s%s', 'CollectOutput' ...
, true, 'Delimiter', ':;' );
[~] = fclose( fid );
With R2013a it returns
>> cac{:}
ans =
Columns 1 through 8
'#5' 'v3.0_ST' '1' '2631.1301,N' '140.0' '081.000' '12.5' ''
'#5' 'v3.0_ST' '1' '2631.1301,N' '111.4' '100.000' '12.5' '18.7'
'#5' 'v3.0_ST' '1' '2631.1299,N' '111.5' '101.000' '12.5' '18.7'
'#5' 'v3.0_ST' '1' '2631.1315,N' '136.4' '082.000' '12.3' ''
'#5' 'v3.0_ST' '1' '2631.1334,N' '132.8' '083.000' '12.4' []
Column 9
''
'32.3'
'32.3'
''
[]
>>
Comments:
  • importdata reads and parses this file in R2013a too. However, textscan is significantly faster.
  • the empty cell, "[]", in the bottom-right corner must be handled separately
  • the format string should be modified to account for the numerical columns
  • I don't think textscan would have behaved this nicely some years ago.
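One possible way to handle the empty cells, given here as a minimal sketch rather than as part of the original answer: replace every empty cell with an empty string and let str2double turn the strings into numbers. The variable tail is a hypothetical stand-in for the last three columns of cac{1}, with values copied from the output above.

```matlab
% Hypothetical stand-in for cac{1}(:, 7:9); values copied from the output above
tail = { '12.5', ''    , ''
         '12.5', '18.7', '32.3'
         '12.4', []    , []     };

% Make the cell array uniform: every empty cell ('' or []) becomes ''
is_empty = cellfun( @isempty, tail );
tail( is_empty ) = {''};

% str2double converts numeric text to double and returns NaN for ''
num = str2double( tail );
```

After this step num is a 3-by-3 double matrix in which the missing fields show up as NaN.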
Addendum
This might better match what you are looking for
fid = fopen( 'cssm.txt' );
cac = textscan( fid, '%s%s%f%f%s%f%f%f%f%f' ...
, 'CollectOutput' , true ...
, 'Delimiter' , ':;,' );
[~] = fclose( fid );
Output:
>> cac{:}
ans =
'#5' 'v3.0_ST'
'#5' 'v3.0_ST'
'#5' 'v3.0_ST'
'#5' 'v3.0_ST'
'#5' 'v3.0_ST'
ans =
1.0e+03 *
0.0010 2.6311
0.0010 2.6311
0.0010 2.6311
0.0010 2.6311
0.0010 2.6311
ans =
'N'
'N'
'N'
'N'
'N'
ans =
140.0000 81.0000 12.5000 NaN NaN
111.4000 100.0000 12.5000 18.7000 32.3000
111.5000 101.0000 12.5000 18.7000 32.3000
136.4000 82.0000 12.3000 NaN NaN
132.8000 83.0000 12.4000 NaN NaN
>>
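If the number of fields varies more than a fixed format string can absorb, an alternative (not taken from the answer above) is to split each line on its own. The following is a minimal sketch, assuming the lines have already been read into a cell array, for example with fgetl in a loop; the variable names are illustrative, and the two sample lines are copied from the question.

```matlab
% Two sample lines copied from the question
lines = { '#5:v3.0_ST:1:2631.1301,N:140.0:081.000:12.5;'
          '#5:v3.0_ST:1:2631.1301,N:111.4:100.000:12.5:18.7:32.3;' };

n_lines = numel( lines );
fields  = cell( n_lines, 1 );
for k = 1 : n_lines
    txt = strtrim( lines{k} );
    txt( txt == ';' ) = [];             % drop the trailing semicolon
    fields{k} = strsplit( txt, ':' );   % one row of strings per line, any length
end

% Number of fields found on each line
n_fields = cellfun( @numel, fields );
```

Each fields{k} keeps exactly the fields present on line k, so no padding with empty cells or NaN is needed; the trade-off is that a loop like this is likely slower than a single textscan call on large files.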

More Answers (1)

Aditya Dua on 11 Jan 2015
I tried this on the file segment you posted and it worked:
inp = importdata('file.txt');
where inp(k) contains the k-th line of the text file.
I'm using MATLAB R2014b
Aditya
