Best practice for asynchronous data saving
I have a simulation code that exhibits the usual bursty I/O behavior of FDTD simulations. To wit:
    x = lotsOfData();
    for iter = 1:bigNumber
        x = takeTimestep(x);
        if mod(iter, stepsPerSave) == 0
            dumpDataToDisk(x, iter);
        end
    end
For context, x may be on the order of 50-100 GB. The long gaps during blocking calls to dumpDataToDisk(), in which no timesteps can be taken, represent FLOPS down the drain, and of course tempus fugit.
Granted that sufficient memory is available to make a copy of x, in C, for example, I would use $THREADING_MODEL to create a saver thread, hand that thread a copy of x, and let it call dumpDataToDisk() so the main thread can resume calculating. The Google intertube suggests (correct me if I'm wrong!) that
- This isn't an unusual performance problem with MATLAB and I/O
- There are several IPC options available: the PCT (Parallel Computing Toolbox), or various socket, mmap/shm, or file-based implementations
- There's no official solution to the asynchronous file I/O question (apparently there is async write to sockets?)
It appears that system("command & ") can provide the fork-and-exec-like operation I'd like at a high level, so I might then write, e.g.,
    x = lotsOfData();
    for iter = 1:bigNumber
        x = takeTimestep(x);
        if mod(iter, stepsPerSave) == 0
            while checkBusyFileFlag(); pause(.025); end
            dumpDataToBurstBuffer(x, iter);
            system("~/bufferDrainAssistant.sh & "); % writes "1" to busy file, does mv, writes "0"
        end
    end
This appears to provide pretty much what I'd want from the threaded model, provided a high-speed I/O device is available. Is there an official or best-practice way to handle this?
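For concreteness, here is a rough sketch of what the bufferDrainAssistant.sh helper mentioned above might look like. It only follows the comment in the loop (write "1" to the busy file, do the mv, write "0"). The function name drain_buffer and the three arguments (burst-buffer directory, destination directory, flag file) are assumptions made for illustration; the call shown in the loop passes no arguments.

```shell
#!/bin/sh
# Hypothetical sketch of bufferDrainAssistant.sh. The three arguments
# (buffer directory, destination directory, flag file) are assumptions.
# The flag file should live outside the buffer directory so it is not moved.

drain_buffer() {
    src_dir=$1
    dest_dir=$2
    flag_file=$3
    status=0

    # Mark the buffer busy so the MATLAB loop waits before reusing it.
    printf '1\n' > "$flag_file"

    # Move each dumped file from the burst buffer to long-term storage.
    for f in "$src_dir"/*; do
        [ -e "$f" ] || continue   # nothing to move: the glob did not match
        mv -- "$f" "$dest_dir"/ || status=1
    done

    # Always clear the flag, even after a failed mv, so the simulation
    # is never left waiting forever.
    printf '0\n' > "$flag_file"
    return "$status"
}

# Run only when called with all three paths.
if [ "$#" -eq 3 ]; then
    drain_buffer "$@"
fi
```

With this sketch the MATLAB call would become something like system("~/bufferDrainAssistant.sh /burst /archive /tmp/drain.busy &"), where all three paths are placeholders. One caveat: system(... &) returns before the script has written "1", so writing the flag from the MATLAB side just before launching the helper would probably be the safer ordering.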