Append and add lines in text/dat file

I have a .dat file from CFD postprocessing with variable data regularly listed. I am currently reading through the files fine, but I want to add variable data. I attached an image of the data structure below (this is only an excerpt; the full file is hundreds of thousands of lines, which is the reason that I can't do it manually).
Right now, I am just reading the data. But I will need to add a few things, among them:
  • Two new lines after line 14, but before line 15.
  • Two numbers appended to the end of each variable line; for example, adding " 0.000000000E+00 0.000000000E+00" to the end of line 21 (the numbers themselves will be calculated using a routine I've already completed)
  • Ignoring lines with two numbers on them.
I have the logic worked out, but I am having trouble with a) figuring out how to overwrite the current line or append data to the line itself, and b) formatting numbers the same way as the file does: scientific notation with 9 digits after the decimal point (easy), a capital E (easy), and a plus/minus sign followed by a two-digit exponent.
I have been using fgetl to read through the data, and the easiest way to do this would be (if appending itself is impossible) to rewrite the line in its entirety by simply setting the current line to a different string. There have been a few posts that I've seen that say this is impossible, but I find this hard to believe. Writing data algorithmically like this seems too easy not to be possible.
Thanks for any help.

 Accepted Answer

dpb on 29 Jul 2022
Edited: dpb on 29 Jul 2022
"..., and the easiest way to do this would be (if appending itself is impossible) to rewrite the line in its entirety by simply setting the current line to a different string. There have been a few posts that I've seen that say this is impossible, but I find this hard to believe."
Well, believe it in general.
Sequential text files are, well, "sequential" -- they're just a string of bytes. Unless the file is record oriented (which Fortran supports, but C and MATLAB do NOT), the only way to replace a line of text in place is if the new line has exactly the same number of characters as the line it replaces. Otherwise the write will overwrite the characters that follow if the new line is longer, or leave behind old data from the previous line if it is shorter.
The way to do something like this is either
  1. read the file into memory and operate on it in memory to make the changes desired, then rewrite the modified file in its entirety, or
  2. as you're doing read a record at a time, make any modifications needed, then write the original/modified record to a new file.
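As a rough sketch of option 1, assuming the file fits comfortably in memory: read everything with fileread, split it into a cell array of lines, splice in the new lines, and write the whole thing back out. The file names, the insertion point, and the inserted text below are all made up for the demonstration; in the question the insertion point would be 14.

```matlab
% Demo input (hypothetical content, written to the temp folder)
inFile  = fullfile(tempdir, 'demo_mem_in.dat');
outFile = fullfile(tempdir, 'demo_mem_out.dat');
fid = fopen(inFile, 'w');
fprintf(fid, 'line1\nline2\nline3\n');
fclose(fid);

% Read the whole file and split it into a 1-by-N cell array of lines
txt   = fileread(inFile);
lines = regexp(txt, '\r?\n', 'split');
if isempty(lines{end})
    lines(end) = [];               % drop the empty piece after the final newline
end

% Splice two new lines in after line number insertAfter
insertAfter = 2;                   % assumption for the demo; 14 in the question
newLines    = {'new A', 'new B'};  % placeholder text
lines = [lines(1:insertAfter), newLines, lines(insertAfter+1:end)];

% Rewrite the modified file in its entirety
fid = fopen(outFile, 'w');
fprintf(fid, '%s\n', lines{:});
fclose(fid);
```

Writing to a different file name, as here, keeps the original intact while the code is being debugged.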
With large files that may be too large for memory, the second is the obvious choice...I've written and posted several such filters on Answers in the past; unfortunately I don't have a link to any of them to point to a particular instance.
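On part b) of the question, the number format: as far as I know, sprintf with a '%.9E' conversion produces nine digits after the decimal point, a capital E, a sign, and at least two exponent digits. A small sketch (the values are arbitrary stand-ins for whatever the already-written routine computes):

```matlab
% Two placeholder values to append to a data line
extra  = [0, -1.5e-7];

% One leading space per value so the result can be glued onto an existing line
suffix = sprintf(' %.9E', extra);
disp(suffix)
%  0.000000000E+00 -1.500000000E-07

% Appending to a line that was read with fgetl (no trailing newline)
oldLine = ' 1.000000000E+00';
newLine = [oldLine, suffix];
```

Exponents of magnitude 100 or more will print with three digits, so it is worth checking the output on your own release and platform before relying on fixed-width lines.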

8 Comments

This is too bad to hear, I was fearing the second answer. Is it possible to simultaneously read and write different files in MATLAB? How would I go about doing this? I think the approach I'd like to take is to read the file, make any changes I need to the current line, and then essentially "paste" that line into the other file. But how to do this in practice doesn't seem clear to me.
Sure, just open one file handle for input, the second for output...
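fidi = fopen('inputfile.txt','r');    % handle for the file being read
fido = fopen('outputfile.txt','w');   % handle for the new file being written
% copyfile filter
while ~feof(fidi)
    l = fgets(fidi);                  % fgets keeps the trailing newline
    fprintf(fido,'%s',l);             % so the line is echoed unchanged
end
fclose(fidi);
fclose(fido);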
produces a copy of the input file to a new output file -- simply insert the logic to test for what lines are to be modified, make those modifications (or insert new lines*) before outputting the line.
(*) NB: the above uses fgets, which returns the \n character(s) from the original file, so fprintf spits them back out unchanged -- if you insert a new line or modify a line, be sure to write the newline as well when doing so.
It's nothing terribly complicated...just that you can't edit in place in the same file without much more grief/difficulty.
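A sketch of the same filter with the three modifications from the question added. The actual file layout isn't visible here, so the following are assumptions: a "variable line" is taken to be any line made up only of numbers with more than two values on it, lines with exactly two numbers are copied through untouched, and the two appended values are zero placeholders for the real routine. The demo file, the inserted text, and the insertion point are made up too (14 in the question).

```matlab
% Tiny demo input (hypothetical layout)
inFile  = fullfile(tempdir, 'demo_in.dat');
outFile = fullfile(tempdir, 'demo_out.dat');
fid = fopen(inFile, 'w');
fprintf(fid, 'TITLE = "demo"\n3 2\n 1.000000000E+00 2.000000000E+00 3.000000000E+00\n');
fclose(fid);

insertAfter = 1;                          % assumption for the demo; 14 in the question
newLines    = {'"NewVar1"', '"NewVar2"'}; % placeholder text for the two new lines

fidi = fopen(inFile, 'r');
fido = fopen(outFile, 'w');
n = 0;                                    % line counter for the input file
while true
    l = fgetl(fidi);                      % fgetl strips the newline
    if ~ischar(l), break; end             % -1 is returned at end of file
    n = n + 1;
    vals = str2double(strsplit(strtrim(l)));   % NaN for any non-numeric token
    if ~any(isnan(vals)) && numel(vals) > 2    % assumed test for a variable line
        extra = [0 0];                    % stand-in for the computed values
        l = [l, sprintf(' %.9E', extra)]; % append in the file's number format
    end
    fprintf(fido, '%s\n', l);             % newline written explicitly
    if n == insertAfter
        fprintf(fido, '%s\n', newLines{:});    % the two inserted lines
    end
end
fclose(fidi);
fclose(fido);
```

Because fgetl is used instead of fgets, every output line ends in a plain \n regardless of what the input used; switch back to fgets and handle the newline yourself if the original line endings must be preserved.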
Thanks, I will definitely give this a try! But I would be remiss if I didn't ask one more question.
I realized today that I think I can print lines of the proper length, so that it would be an edit of a line, not appending a line. It seems, from reading your answer, that that's possible. Would that be easier? If so, is there a way to overwrite the line itself? Or is this approach above basically what you'd have to do anyway?
dpb on 31 Jul 2022
Edited: dpb on 31 Jul 2022
It's possible, but more grief -- even if "just" changing a single character in a line, then you also have to reposition the file pointer correctly which is more coding/chance for error and breaks up the continuous reading position so is probably slower besides...
Writing into the original file also almost guarantees you'll destroy it while you're still writing and debugging the code, so you had best either be able to regenerate it easily or have secure backup(s) available.
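For completeness, a minimal sketch of what the in-place route looks like when the replacement really is the same length: open with 'r+', remember the position of the start of the line with ftell, then fseek back and write over it. The demo file and its contents are invented, and this runs on a throwaway file in the temp folder for exactly the reason given above.

```matlab
% Throwaway demo file: three lines of three characters each
fname = fullfile(tempdir, 'demo_inplace.dat');
fid = fopen(fname, 'w');
fprintf(fid, 'aaa\nbbb\nccc\n');
fclose(fid);

fid = fopen(fname, 'r+');          % read and write, no truncation
fgetl(fid);                        % skip line 1
pos     = ftell(fid);              % byte offset where line 2 starts
oldLine = fgetl(fid);              % read line 2 (newline stripped)
newLine = 'BBB';                   % replacement text

% Only safe if the byte counts match exactly
assert(numel(newLine) == numel(oldLine), 'Replacement must be the same length');

fseek(fid, pos, 'bof');            % reposition before switching from read to write
fprintf(fid, '%s', newLine);       % overwrite the text, leaving the newline alone
fclose(fid);
```

The extra bookkeeping (ftell, fseek, the length check) is the "more grief" part; the two-file filter needs none of it.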
Man, this is a nightmare! Thanks for the help. I'll probably just stick to reading one and writing to the other.
Still amazing to me that doing something like this is this difficult... it seems so simple in principle.
There are computer languages such as sed and perl that specialize in transforming files that can make changing text files easier. However, you will find that they work by building new versions of the file in memory and writing the new version out, or by using the two-file approach.
Making changes inside an existing file is sort of like doing a crossword puzzle in pen: if you get everything right the first time then it works fine, but if you don't then you have a difficult time recovering.
It is simple -- in principle. The problem is in the details and that sequential files are, well, "sequential". Remember they're just a stream of bytes, there's nothing in the file or the OS that discerns any one byte from any other; it's all up to the application code to do with those bytes what it wishes.
Besides the particular languages that have such facilities built in, there are other languages like Fortran, and some operating systems, that support files with record markers so that you can rewrite individual records. C is not one of them, and it is the C i/o library that MATLAB uses underneath; TMW has not chosen to date to incorporate record-oriented files into the language.
That and formatted i/o are two areas of Fortran that would have been nice to have carried over from MATLAB's original FORTRAN heritage.
MATLAB used to be supported on DEC VAX VMS, which is an operating system that really did support changing lines in the middle of text files. Text files were effectively databases with a sequence number for each record, so you could rewrite a line by changing the record associated with that sequence number, and you could insert lines by using an unused sequence number between the existing records.
However, VMS supported two kinds of files, one as described, and one which was just a stream of bytes. And MATLAB never supported operations on the one described above (it required calls to a different I/O library that was specific to VMS and the older DEC RSX operating system.)

Release: R2020b
Asked: 28 Jul 2022
Commented: 31 Jul 2022
