How to read the data from a header file into single variables
I would appreciate some help reading the data contained in the attached header file into individual variables, which I will then use to access a file containing the actual numbers. Thank you in advance. Best regards, Maura
Accepted Answer
dpb
on 7 May 2016
Edited: dpb
on 14 May 2016
Should get ya' started...
>> c=textread('ma.txt','%s','delimiter','\n','whitespace','');
>> ix=~cellfun(@isempty,(strfind(c,'Number'))) | ~cellfun(@isempty,(strfind(c,'Energy of')));
>> c=c(ix)
c =
'Number of Original Histories: 100'
'Number of Original Histories that Reached Phase Space: 100'
'Number of Scored Particles: 107'
'Number of e-: 4'
'Number of gamma: 1'
'Number of neutron: 2'
'Number of proton: 100'
'Minimum Kinetic Energy of e-: 0.0454162 MeV'
'Minimum Kinetic Energy of gamma: 0.0175963 MeV'
'Minimum Kinetic Energy of neutron: 5.64233 MeV'
'Minimum Kinetic Energy of proton: 73.3641 MeV'
'Maximum Kinetic Energy of e-: 0.223425 MeV'
'Maximum Kinetic Energy of gamma: 0.0175963 MeV'
'Maximum Kinetic Energy of neutron: 49.4473 MeV'
'Maximum Kinetic Energy of proton: 159.678 MeV'
>>
[Elided previous partial solution and background discussion for brevity. dpb]
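For a single line, the conversion to a numeric value can be sketched as below. This is only a minimal illustration, not the elided original; the variable names `s`, `k`, `name` and `value` are made up for the example, and it assumes every line has the form 'name: value' with an optional unit after the number.

```matlab
% One header line, copied from the listing above
s = 'Minimum Kinetic Energy of neutron: 5.64233 MeV';

k = strfind(s, ':');                     % position of the colon between name and value
name  = strtrim(s(1:k(1)-1));            % text before the colon
value = sscanf(s(k(1)+1:end), '%f', 1);  % first number after the colon; the trailing 'MeV' is not read
```

sscanf skips the blank after the colon and stops after one number, so the unit does not need to be stripped first.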
ADDENDUM
OK, here's a working script in its entirety. While I don't normally recommend it, I used the assignin option here...if run from the command window, the variables will show up in that workspace; if called from a function in that function's workspace of course.
c=textread('ma.txt','%s','delimiter','\n','whitespace','');
c=c(~cellfun(@isempty,(strfind(c,'Number'))) | ...
~cellfun(@isempty,(strfind(c,'Energy of')))); % save only wanted rows
ic=cell2mat(strfind(c,':'))+1; % find the colons plus one char past
particles={'hist';'e-';'gamma';'neutron';'proton'}; % keyword for particles to process
vnames={'histories';'electron';'gamma';'neutron';'proton'}; % output variables for data
% the loop over vnames that extracts the values and calls assignin completes the script; its final form is in the second ADDENDUM
Test run...
>> maura % I named the script maura.m, rename as see fit
>> [histories electron gamma neutron proton] % the variables in vnames
ans =
100.0000 4.0000 1.0000 2.0000 100.0000
100.0000 0.0454 0.0176 5.6423 73.3641
107.0000 0.2234 0.0176 49.4473 159.6780
>>
ADDENDUM
OK, with Bruno's help with the regexp pattern, the above internal loop can be eliminated. It's still a little convoluted owing to regexp returning a cell of cells instead of simply a cell array of strings, so the dereferencing is more complex than I would like, but with it the final loop becomes --
for i=1:length(vnames) % get each of the variables values
  if i==1
    idx=[1:3].'; % histories are first three records
  else
    idx=find(~cellfun(@isempty,(strfind(c,char(particles(i))))));
  end
  assignin('caller', ...
           char(vnames(i)), ...
           cellfun(@(x) str2num(char(x{:})),regexp(c(idx),'(?:(:\s{0,}))(\S)*','tokens')))
end
PS: With respect to John's criticism about not completing the solution in its entirety on the first go, but instead providing the bread crumbs of the technique with an example of converting one line to a numeric value in the hope of inspiring completion on your own, I'd not have expected you to come up with this enhancement initially. :)
As an aside, this particular parsing intrigued me as it is a fairly common type of problem -- my goal was to figure out how to write an anonymous function that would eliminate the remaining loop by judicious cellfun and/or arrayfun and friends, but I've so far not succeeded.
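One possible direction, offered only as a sketch and not as the solution dpb was after: keep all values in one numeric vector and collect the per-particle vectors in a struct instead of calling assignin. The sample lines are copied from the listing above; `tok`, `vals`, `pick`, `out` and `result` are names invented for the example, and it assumes every line contains a colon followed by a number.

```matlab
% Sample lines in the same 'name: value [unit]' form as the filtered header
c = {'Number of e-: 4'
     'Number of gamma: 1'
     'Minimum Kinetic Energy of e-: 0.0454162 MeV'
     'Minimum Kinetic Energy of gamma: 0.0175963 MeV'
     'Maximum Kinetic Energy of e-: 0.223425 MeV'
     'Maximum Kinetic Energy of gamma: 0.0175963 MeV'};
particles = {'e-'; 'gamma'};          % keywords to search for
vnames    = {'electron'; 'gamma'};    % field names for the results

% With 'once', each line yields a 1x1 cell holding the text after the colon
tok  = regexp(c, ':\s*(\S+)', 'tokens', 'once');
vals = cellfun(@(t) str2double(t{1}), tok);      % numeric column vector

% For each keyword, pick the values of the lines that contain it
pick   = @(p) vals(~cellfun(@isempty, strfind(c, p)));
out    = cellfun(pick, particles, 'UniformOutput', false);
result = cell2struct(out, vnames, 1);            % result.electron, result.gamma
```

The loop over the variable names is still there, just hidden inside cellfun; the gain is that nothing is written into the caller's workspace.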
7 Comments
dpb
on 11 May 2016
No problem...you did note that I updated the previous ANSWER to include a fully-working script, I hope...
More Answers (1)
John BG
on 9 May 2016
Edited: John BG
on 9 May 2016
My answer supplies measures in numeric format, not char. dpb's answer is more compact, but you still have to extract the actual measurement and convert it from char to double.
A void header file with a list of all variables is needed to tell parameter names from measurements. Find it attached: data_void_header.txt
Place the raw data of the question and data_void_header.txt in the same folder as the following script:
% 1.- line 'TOPAS .. ' removed
% 2.- line '.. as follows:' removed
% 3.- void header file has to be prepared in advance with just the parameter names
% 4.- assuming the data, if present, to read is numerical only
% 5.- second parameter 'Number of Original Histories that Reached Phase Space'
% 6.- shortened to 'N of Original Histories that Reached Phase Space' to
% 7.- prevent one variable contains the string of another variable
% get the parameters
fid=fopen('data_void_header.txt');
A={};
tline = fgetl(fid);
while ischar(tline)
A=[A; tline];
tline = fgetl(fid);
end
fclose(fid)
% remove void lines from parameters input
void_lines=0;for k=1:length(A) if isequal(A(k,:),{' '}) void_lines=[void_lines k]; end; end
void_lines(1)=[];void_lines=uint8(void_lines);
for k=1:length(void_lines) A(void_lines(k)-(k-1),:)=[]; end
LA=length(A); % amount of parameters
C={} % answer stored in C
% get the raw input
fid=fopen('data1_raw.txt');
B={};
tline = fgetl(fid);
while ischar(tline)
B=[B; tline];
tline = fgetl(fid);
end
fclose(fid)
B2=cell2mat(B);
% allocate answer: C
C0={'empty'}
for k=1:LA-1
C0=[C0;{'empty'}]
end
C=[A C0]
% sweep all variables and find if there is numeric measurement in raw file
for k=1:LA
P=A{k,:}; % read one parameter from the void header
[s1 s2]=regexp(B2,P); % find out if the parameter name is in the raw data
D=B2([s2+1:s2+1+11]); % the longest parameter value reading are the Kinetic energies with 11 figure-dot
if regexp(D,':')
[cx ai aj]=intersect(D,char([48:57])) % get the figures
cl2=[]
for s=1:length(cx)
[rw cl]=find(D==cx(s))
cl2=[cl2 cl]
end
cl2=sort(cl2)
C(k,2)={cl2}
% C2=[A(k,:) {str2double(D(1,cl2))}]
% C={C;C2}
% no ':' means no value for this parameter
% else
% C2=[A(k,:) {'empty'}]
% C={C;C2}
end
end
If you find this answer of any help solving your question,
please click on the thumbs-up vote link,
thanks in advance
John
5 Comments
Walter Roberson
on 14 May 2016
John, you did not give a name for your code; Maura stored it in Read_PhaseSpaceHeader.m
You do use cat: the traceback shows it is being called inside cell2mat, which you do call. In other words, you put together data of incompatible size in the cells and cell2mat() could not convert it.
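The failure mode described above can be reproduced with two lines of different length. This is a small sketch with made-up variable names (`B`, `failed`, `padded`, `joined`), not the code from the attached script:

```matlab
B = {'Number of e-: 4'; 'Number of gamma: 1'};   % rows of different length

try
    B2 = cell2mat(B);       % vertical concatenation needs rows of equal length
    failed = false;
catch err
    failed = true;          % this branch runs: the two rows differ in length
    disp(err.message)
end

padded = char(B);           % pads the shorter row with blanks into a char matrix
joined = strjoin(B', ' ');  % or one long row vector, if a single string is wanted
```

Which alternative fits depends on what the later code expects: char keeps one line per row, strjoin gives a single string that regexp can search in one pass.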
John BG
on 15 May 2016
Walter and Maura,
my answer to this question does not crash:
1.- download the 3 files I attached in my previous comment: data1_read.m data_void_header.txt and data1_raw.txt
2.- put data1_read.m, data_void_header.txt and data1_raw.txt in MATLAB's current folder.
Note that, as repeatedly stated in this forum and by email to Maura, I modified the string of the second parameter to save time and to avoid having the name of one variable inside another. It may not be a great deal of work to add some lines that make the script more robust and allow variable strings that contain the string of other variables, but I just didn't want to spend time on this.
3.- open data1_read.m and run it
this is my command window answer, no cat crashing whatsoever:
>> data1_read
ans =
0
C =
{}
ans =
0
C0 =
'empty'
C0 =
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
'empty'
C0 =
'empty'
..
1.00
cl =
9.00
cl2 =
3.00 4.00 7.00 8.00 9.00
rw =
1.00
cl =
5.00
cl2 =
3.00 4.00 7.00 8.00 9.00 5.00
cl2 =
3.00 4.00 5.00 7.00 8.00 9.00
C =
'Number of Original Histories' [1x3 double]
'N of Original Histories that Reached Pha…' 'empty'
'Number of Scored Particles' [1x3 double]
'Position X (cm)' 'empty'
'Position Y (cm)' 'empty'
'Position Z (cm)' 'empty'
'Direction Cosine X' [ 2.00]
'Direction Cosine Y' [ 2.00]
'Energy (MeV)' 'empty'
'Weight' [ 2.00]
'Particle Type (in PDG Format)' 'empty'
'Flag to tell if Third Direction Cosine i…' 'empty'
'Flag to tell if this is the First Scored…' 'empty'
'TOPAS Time (seconds)' 'empty'
'Time of Flight (nanoseconds)' 'empty'
'Run ID' [1x2 double]
'Event ID' [1x2 double]
'Track ID' [1x2 double]
'Parent ID' [1x2 double]
'Initial Kinetic Energy (MeV)' 'empty'
'Vertex Position X (cm)' 'empty'
'Vertex Position Y (cm)' 'empty'
'Vertex Position Z (cm)' 'empty'
'Initial Direction Cosine X' [1x2 double]
'Initial Direction Cosine Y' [1x2 double]
'Initial Direction Cosine Z' [1x2 double]
'Seed Part 1' [1x2 double]
'Seed Part 2' [1x2 double]
'Seed Part 3' [1x2 double]
'Seed Part 4' 'empty'
'Number of e-' [ 3.00]
'Number of gamma' [ 3.00]
'Number of neutron' [ 3.00]
'Number of proton: 100' 'empty'
'Minimum Kinetic Energy of e-' [1x8 double]
'Minimum Kinetic Energy of gamma' [1x8 double]
'Minimum Kinetic Energy of neutron' [1x6 double]
'Minimum Kinetic Energy of proton' [1x6 double]
'Maximum Kinetic Energy of e-' [1x7 double]
'Maximum Kinetic Energy of gamma' [1x8 double]
'Maximum Kinetic Energy of neutron' [1x6 double]
'Maximum Kinetic Energy of proton' [1x6 double]
>>