This question is closed.

Load the numeric data of a cyclic text file into a matrix

Dear All,
I guess I have to rephrase my question since it has not received much attention.
I have a text file in the following format:
ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
200
ITEM: BOX BOUNDS pp pp pp
0 23.5
0 23.5
0 23.5
ITEM: ATOMS id type x y z
1 1 4.629738099 19.15100895 8.591289203
2 1 5.379313371 19.12269554 8.727806695
3 2 7.531762324 13.25286645 4.981542453
4 2 7.427444873 13.99400029 5.110889318
ITEM: TIMESTEP
5
ITEM: NUMBER OF ATOMS
200
ITEM: BOX BOUNDS pp pp pp
0 23.5
0 23.5
0 23.5
ITEM: ATOMS id type x y z
1 1 4.602855537 28 8.610593144
2 1 5.399314789 19.12299845 8.70663802
3 2 7.539913654 13.25759311 4.99833023
4 2 7.479249704 13.99259535 5.137606665
The file contains 6000000 of these cycles. I need to extract the numeric data in the last three columns of each cycle into a single matrix covering all of the cycles.
In other words, my desired output matrix should be in the following format:
4.629738099 19.15100895 8.591289203
5.379313371 19.12269554 8.727806695
7.531762324 13.25286645 4.981542453
7.427444873 13.99400029 5.110889318
4.602855537 28.00000000 8.610593144
5.399314789 19.12299845 8.70663802
7.539913654 13.25759311 4.99833023
7.479249704 13.99259535 5.137606665
As you can see, the first 9 lines of each cycle are ignored and the data rows of successive cycles are stacked to form the target matrix. I do not want to print this matrix; I just need it for further calculations. I hope you can help me. Thanks.

Answers (1)

dpb on 17 Sep 2015
No matter what you do it likely is going to take a while if the file is that large. But, reading it is pretty straightforward...
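fmt=[repmat('%*d',1,2) repmat('%f',1,3)]; % skip id and type, read x y z
N=4; % for the file as shown; I guess it would be 200 for the real file?
fid=fopen('yourfile');
c={}; i=0;
while ~feof(fid)
  i=i+1; % move to the next cell for each block read
  c(i,1)=textscan(fid,fmt,N,'headerlines',9,'collectoutput',1); % one Nx3 block
end
fid=fclose(fid);
c=cell2mat(c); % stack the blocks vertically into one 3-column array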
You may speed it up some by preallocating a large "ordinary" array of (N*M)x3, with N = # atoms and M = # time steps if known, and offsetting each portion read by N rows (200 here) on each pass. Here I would then wrap the textscan call inside cell2mat to convert directly.
N=200; % could open file and read this, too first...
M=6000000; % # time steps in file...
d=zeros(N*M,3); % preallocate the full output array
fid=fopen('yourfile');
i1=1; i2=N; % initial indices to array rows
for i=1:M
  d(i1:i2,:)=cell2mat(textscan(fid,fmt,N, ...
      'headerlines',9,'collectoutput',1));
  i1=i2+1; i2=i2+N; % increment
end
fid=fclose(fid);
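For checking the block-by-block idea on something small, here is a minimal self-contained sketch. It is not the textscan approach above: it swaps in fgetl and fscanf, and it finds each data block by looking for the 'ITEM: ATOMS' header line instead of counting 9 header lines. The temporary file, the two made-up frames, and the names nAtoms, nFrames, xyz and tline are all assumptions for illustration only.

```matlab
% Write a tiny two-frame sample file in the same layout as the question.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
for t = [0 5]
    fprintf(fid, 'ITEM: TIMESTEP\n%d\n', t);
    fprintf(fid, 'ITEM: NUMBER OF ATOMS\n2\n');
    fprintf(fid, 'ITEM: BOX BOUNDS pp pp pp\n0 23.5\n0 23.5\n0 23.5\n');
    fprintf(fid, 'ITEM: ATOMS id type x y z\n');
    % Two made-up atoms per frame; columns are id, type, x, y, z.
    atoms = [1 1 t+1 t+2 t+3; 2 1 t+4 t+5 t+6];
    fprintf(fid, '%d %d %.1f %.1f %.1f\n', atoms');
end
fclose(fid);

% Read it back into one preallocated (nAtoms*nFrames)-by-3 array.
nAtoms  = 2;   % assumed known, as in the answer above
nFrames = 2;   % assumed known, as in the answer above
xyz = zeros(nAtoms*nFrames, 3);
fid = fopen(fname, 'r');
for k = 1:nFrames
    % Skip lines until the header that precedes the data rows.
    tline = fgetl(fid);
    while ischar(tline) && ~strncmp(tline, 'ITEM: ATOMS', 11)
        tline = fgetl(fid);
    end
    % Read nAtoms rows of 5 numbers; fscanf fills column-wise, so transpose.
    block = fscanf(fid, '%f', [5 nAtoms])';
    rows = (k-1)*nAtoms + (1:nAtoms);
    xyz(rows, :) = block(:, 3:5);   % keep only x, y, z
end
fclose(fid);
delete(fname);
```

Searching for the header line costs a string comparison per header line, but it does not depend on exactly where the previous read left the file position.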
Of course,
>> 6000000 * 200 * 8/1024/1024/1024
ans =
8.9407
>>
That is about 9 GB for each column of doubles, so roughly 27 GB for all three, which may be more than you can hold in memory at once...
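As a rough sketch of the same arithmetic for the whole three-column result, the snippet below compares double and single storage. The variable names are only illustrative, and whether single is acceptable is an assumption to check: it keeps about 7 significant digits, fewer than the coordinates in the file carry.

```matlab
nAtoms  = 200;       % atoms per time step, from the question
nSteps  = 6000000;   % time steps, from the question
nCols   = 3;         % x, y, z
nValues = nAtoms * nSteps * nCols;

gibDouble = nValues * 8 / 1024^3;   % 8 bytes per double
gibSingle = nValues * 4 / 1024^3;   % 4 bytes per single
fprintf('double: %.1f GiB, single: %.1f GiB\n', gibDouble, gibSingle);
```

If neither fits, processing each block as it is read and keeping only the quantities needed for the later calculations avoids holding the full matrix at all.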
