Import text with headers as column values and append to traditional data set

Bus141 on 9 May 2015
Edited: Stephen23 on 11 May 2015
I would like to import a large number of text files from a simulation program, all of which follow a very similar pattern, into one matrix or dataset.
I am trying to pull the numbers from line three into columns, with a column for each value.
(:, Column 1)=78%
(:, Column 2)=3000
(:, Column 3)=1
(:, Column 4)=9
(:, Column 5)=5
(:, Column 6)=300
The heading of each value does not change from file to file; only the numbers do.
The remaining columns and rows would then hold the rest of the table.
mat1=(:, [date biomass yield no3() sowing_date surfacep_wt AccumRainfall])

Accepted Answer

Stephen23 on 9 May 2015
Edited: Stephen23 on 11 May 2015
This code reads the complete sample file, and converts those values to numeric:
fid = fopen('test import.txt','rt');
% Skip two lines, then split the "Title = ..." line on semicolons into six name=value strings
data_title = textscan(fid, 'Title = %s%s%s%s%s%s', 'HeaderLines',2, 'Delimiter',';');
% Next two rows of eight strings: the table header
data_header = textscan(fid, '%s%s%s%s%s%s%s%s', 2, 'MultipleDelimsAsOne',true);
% Remaining rows: date string, six numeric columns, trailing string column
data_values = textscan(fid, '%s%f%f%f%f%f%f%s', 'MultipleDelimsAsOne',true, 'CollectOutput',true);
fclose(fid);
[str,val] = cellfun(@(s)strtok(s,'='), [data_title{:}], 'UniformOutput',false); % split into name and '=value'
idx = cellfun(@(s)strcmp(s(end),'%'),val); % identify percentages
val = cellfun(@(s)sscanf(s,'=%f'),val); % convert to numeric
And we can check these values in the command window:
>> str
str =
'Water' 'Matter' 'Residues' 'fert_amount_sow' [1x20 char] 'tillage_depth'
>> val
val =
78 3000 1 9 5 300
>> idx
idx =
1 0 0 0 0 0
and of course the date and numeric data:
>> data_values
data_values =
{120x1 cell} [120x6 double] {120x1 cell}
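To get the layout asked for in the question, the six title values can be repeated on every row and placed next to the numeric block. This is only a minimal sketch: it assumes `val` is the 1x6 row vector shown above, and it uses a hypothetical placeholder matrix `nums` in place of `data_values{2}`.

```matlab
% Title-line values as shown above (the percentage is kept as the plain number 78).
val = [78 3000 1 9 5 300];

% Hypothetical stand-in for data_values{2}; the real block is 120x6.
nums = zeros(120, 6);

% Repeat the title values once per data row, then append the numeric columns.
nRows = size(nums, 1);
mat1  = [repmat(val, nRows, 1), nums];

size(mat1)   % 120 rows, 12 columns
```

The date and AccumRainfall columns (`data_values{1}` and `data_values{3}`) are strings, so they cannot go into the same numeric matrix; they would have to stay in their cell arrays or be converted to numbers first.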

More Answers (1)

Walter Roberson on 9 May 2015
Use textscan() telling it to skip 6 lines, and use a format of '%s%g%g%g%g%g%g%s' and the CollectOutput option. You will get as output a cell array, the first entry of which is the dates in string format, the second is an N x 6 array of numbers, and the third is the string for AccumRainfall.
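As a rough illustration of that call, the sketch below writes a small temporary file and reads it back. The file contents are hypothetical (six placeholder header lines and three made-up data rows), not the asker's data, and the format uses `%f` as in the accepted answer rather than `%g`.

```matlab
% Write a small hypothetical file: 6 placeholder header lines, then 3 data rows.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
for k = 1:6
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '01/01/2015 1 2 3 4 5 6 ?\n');
fprintf(fid, '01/02/2015 7 8 9 10 11 12 ?\n');
fprintf(fid, '01/03/2015 13 14 15 16 17 18 ?\n');
fclose(fid);

% Skip the 6 header lines and group adjacent same-type columns together.
fid = fopen(fname, 'rt');
C = textscan(fid, '%s%f%f%f%f%f%f%s', 'HeaderLines', 6, 'CollectOutput', true);
fclose(fid);
delete(fname);

dates = C{1};   % 3x1 cell array of date strings
nums  = C{2};   % 3x6 double
rain  = C{3};   % 3x1 cell array of strings
```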
You can use datenum() with the 'mm/dd/yyyy' format on the first cell array to get MATLAB serial date numbers.
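A short sketch of that conversion, using a hypothetical cell array `dateStr` in place of the dates read from the file:

```matlab
% Hypothetical date strings standing in for the first textscan output cell.
dateStr = {'05/09/2015'; '05/10/2015'; '05/11/2015'};

% Convert to serial date numbers (one double per date, in days).
dn = datenum(dateStr, 'mm/dd/yyyy');

% Consecutive days differ by exactly 1.
step = diff(dn);

% Convert back to text to check the round trip.
back = datestr(dn, 'mm/dd/yyyy');
```

Because the result is an ordinary double column, it can be concatenated with the other numeric columns.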
The proper processing for the AccumRainfall string is not obvious. Should '?' be interpreted as 0, or do you want it to come out as NaN or as some other value? Your sample only shows '?' in that column, so I do not know whether the field would say 'Y' if there was rainfall or whether it would show a numeric amount. If it is a numeric amount, then you can use str2double() on the cell array of strings: that will convert all of the '?' entries into NaN values and will convert the numeric strings to numbers. The NaN values can then be detected (if desired) by using isnan().
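For example, assuming the column does hold numeric amounts (the values in `rain` below are made up for illustration):

```matlab
% Hypothetical AccumRainfall strings: '?' marks an unknown entry.
rain = {'?'; '12.5'; '0'; '?'};

% Non-numeric strings become NaN; numeric strings become doubles.
rainNum = str2double(rain);      % [NaN; 12.5; 0; NaN]

% Logical mask of the entries that were '?'.
isUnknown = isnan(rainNum);

% Optional: treat unknown entries as zero instead of NaN.
rainZero = rainNum;
rainZero(isUnknown) = 0;
```

Whether NaN or 0 is the right choice depends on what '?' means in the simulation output, which the sample does not show.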
  2 Comments
Walter Roberson on 9 May 2015
You can call textscan() with a string format such as '%s%s%s%s%s%s%s%s', a header skip of 1 line, and a count of 1 line. That would leave you after line 2, so you would then do a second textscan() with a header skip of 4 lines and the format I gave above.
