Read values from a very complex txt file into matlab

Dear Matlab community,
I have a data file of wave heights that looks like this:
Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;1982-11-19;19:00;;361;cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;NVT,NVT,Niet van toepassing
Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;1982-11-19;22:00;;363;cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;NVT,NVT,Niet van toepassing
Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;1982-11-20;01:00;;379;cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;NVT,NVT,Niet van toepassing
Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;1982-11-20;04:00;;381;cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;NVT,NVT,Niet van toepassing
I am only interested in reading the date (column 3), the hour (column 4) and the wave height (column 6). The file has 18 columns and is semicolon (;) separated, so the commas inside the strings can be ignored. There is an empty column between the hour and the wave height, which is why the wave height is in column 6 and not 5. The file is very long, something like (28 years * 365 days * 24 h * 60 min) lines.
I am using the command:
[data(:,1),data(:,2)...data(:,18)] = textread('wave.txt','%q %q %q %q %q %q %q %q %q %q %q %q %q %q %q %q %q %q','delimiter',';');
This method works but is very slow, and it sometimes gives me problems with 'buffersize' memory. Do you know a better way to do this? Maybe read only the date and wave heights and discard all the other text?

Accepted Answer

Laura Proctor on 6 Apr 2011
You can make textread skip a field by putting an asterisk (*) after the percent sign in that field's format specifier. A skipped field is parsed but not returned.
For example, if you have 4 columns and only wish to read in the first and third columns of data (note that you need one output variable per field that is not skipped):
[col1, col3] = textread('myFile.txt','%s %*f %f %*f','delimiter',';');
Also, you may wish to look at the documentation for TEXTREAD to see whether a format other than %q, which is meant for reading double-quoted strings, suits your data better. Some of your columns, such as the wave height, are numeric and would be well suited to %f.
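Applied to the 18-column file in the question, the same skipping idea might look like the sketch below. It is only an illustration under a few assumptions: it uses textscan rather than textread (a swap on my part, not something the answer above prescribes), it writes two rows copied from the question to a temporary file so the snippet is self-contained, and the variable names (sampleFile, dateStr, timeStr, waveHeightCm, t) are made up for the example. The %*[^\n] specifier skips whatever is left on the line after column 6, so the twelve trailing columns do not need to be listed.

```matlab
% Build a tiny sample file from two rows shown in the question
% (sampleFile is a hypothetical temporary file, not the real wave.txt).
head = 'Euro platform;Golfhoogte, significante-, uit energiespectrum van 30-500 mHz in cm;';
tail = [';cm;NVT;Tijdreeks en frequentie analyse, methode CIC/MAREG;Nationaal;' ...
        'Stappenbaak - type Marine 300;NVT;NVT;4230;51.9986111;3.2763889;NVT;' ...
        'NVT,NVT,Niet van toepassing'];
sampleFile = [tempname '.txt'];
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s1982-11-19;19:00;;361%s\n', head, tail);
fprintf(fid, '%s1982-11-19;22:00;;363%s\n', head, tail);
fclose(fid);

% Columns 1-2: skip, 3: date, 4: time, 5: skip (empty), 6: wave height,
% then skip the remainder of the line.
fmt = '%*s %*s %s %s %*s %f %*[^\n]';

fid = fopen(sampleFile, 'r');
C = textscan(fid, fmt, 'Delimiter', ';');
fclose(fid);
delete(sampleFile);

dateStr      = C{1};   % cell array of date strings, e.g. '1982-11-19'
timeStr      = C{2};   % cell array of time strings, e.g. '19:00'
waveHeightCm = C{3};   % numeric column vector

% Optional: combine date and time into serial date numbers
t = datenum(strcat(dateStr, {' '}, timeStr), 'yyyy-mm-dd HH:MM');
```

For the real data, replace sampleFile with 'wave.txt' and drop the lines that create and delete the sample file. Reading the wave height with %f instead of %q also avoids a string-to-number conversion afterwards.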

More Answers (1)

Pedro Cavaco on 6 Apr 2011
Thanks Laura for the hint... Simple and efficient.
The %q was just a desperate way I found to get MATLAB to read the long strings into a position in the array.
I tried the * and it is much faster now.
Greetings
