Removing double empty lines from a text file

If a file contains two or more consecutive empty lines, they should be replaced by a single empty line. This is my current code:
% reading file
fid = fopen(outFile,'rt');
Data = textscan(fid,'%s','Delimiter','\n');
fclose(fid); % close the file as soon as it has been read
Data = Data{1}; % get rid of nesting
newData = {}; % stays defined even if every line is empty
k = 1; emptylines_occured = 0;
for j = 1:numel(Data)
    if ~isempty(Data{j}) % not empty line
        if emptylines_occured
            newData{k} = ''; k = k+1; % emit one empty line for the whole run
            emptylines_occured = 0;
        end
        newData(k) = Data(j); k = k+1;
    else % empty line
        emptylines_occured = 1;
    end
end
% writing file
fid = fopen(outFile,'wt');
for j = 1:numel(newData)
    fprintf(fid,'%s\n',newData{j});
end
fclose(fid);
Is there a more concise way?

Accepted Answer

Stephen23 on 8 Feb 2018
Edited: Stephen23 on 9 Feb 2018
You can write the new file while you read the old one, line by line, which is faster and uses much less memory than loading the whole file first. Here is a simple version that creates the new file with at most one empty line between any two non-empty lines:
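[f1d,msg] = fopen('test_old.txt','rt');
assert(f1d>=3,msg) % file identifiers 0-2 are reserved, -1 means failure
[f2d,msg] = fopen('test_new.txt','wt');
assert(f2d>=3,msg)
prv = 'X'; % non-empty, so one leading empty line is kept
while ~feof(f1d)
    new = fgetl(f1d);
    % skip the line only if both it and the previous line are empty
    if numel(new) || numel(prv)
        fprintf(f2d,'%s\n',new);
    end
    prv = new;
end
fclose(f1d);
fclose(f2d);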
The test files are attached. To also drop any leading empty lines, define prv as an empty char ('') instead of 'X'.
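The effect of the initial value of prv can be checked without touching any files. The following is only a sketch of the same keep/skip rule applied to a cell array of lines; the sample lines and the names sample, kept and results are made up for illustration and are not part of the answer above.

```matlab
% Hypothetical sample: two leading empty lines, a double empty line
% between 'a' and 'b', and one trailing empty line.
sample = {'', '', 'a', '', '', 'b', ''};

inits   = {'X', ''};           % the two initial values of prv to compare
results = cell(size(inits));   % filtered lines for each initial value

for n = 1:numel(inits)
    prv  = inits{n};
    kept = {};
    for k = 1:numel(sample)
        new = sample{k};
        % same rule as the file loop: drop a line only when both
        % it and the previous line are empty
        if ~isempty(new) || ~isempty(prv)
            kept{end+1} = new; %#ok<SAGROW>
        end
        prv = new;
    end
    results{n} = kept;
end

% prv = 'X' keeps one leading empty line, prv = '' keeps none
disp(numel(results{1}))
disp(numel(results{2}))
```

With prv = 'X' the first empty line counts as following a non-empty line and is kept; with prv = '' every leading empty line is treated as a repeat and skipped.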
  3 Comments
Stephen23 on 8 Feb 2018
@Walter Roberson: the original question uses the t option for both reading and writing, so presumably this is not a problem.
bbb_bbb on 9 Feb 2018
Edited: Stephen23 on 9 Feb 2018
This works excellently. Thanks.


More Answers (1)

Walter Roberson on 8 Feb 2018
Edited: Walter Roberson on 8 Feb 2018
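% Read the file and do the work of deleting extra empty lines in one step.
% Note: this pattern collapses every run of line breaks into a single line
% break, so it removes all empty lines; see the corrected pattern in the
% comments below.
new_text = regexprep( fileread(outFile), '(\r?\n)(\r?\n)+', '$1');
% Write the result to a new file.
fid = fopen('text_new.txt', 'w');
fwrite(fid, new_text);
fclose(fid);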
  3 Comments
Walter Roberson on 8 Feb 2018
new_text = regexprep( fileread(outFile), '(\r?\n\r?\n)(\r?\n)+', '$1');
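The difference between the two patterns is easiest to see on a short string. This is a small sketch, assuming LF line endings; the sample text and the variable names txt, noEmpty and oneEmpty are invented for illustration.

```matlab
% Sample text: 'a', three empty lines, 'b' (LF line endings assumed).
txt = sprintf('a\n\n\n\nb\n');

% Pattern from the answer: any run of two or more line breaks is replaced
% by its first line break, so no empty line survives.
noEmpty = regexprep(txt, '(\r?\n)(\r?\n)+', '$1');

% Pattern from this comment: any run of three or more line breaks is
% replaced by its first two line breaks, so exactly one empty line survives.
oneEmpty = regexprep(txt, '(\r?\n\r?\n)(\r?\n)+', '$1');

% Both comparisons should display true (logical 1).
disp(isequal(noEmpty,  sprintf('a\nb\n')))
disp(isequal(oneEmpty, sprintf('a\n\nb\n')))
```

The first capture group decides how many line breaks are kept: one line break ends the text line, and the second one is what forms the single empty line.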
bbb_bbb on 8 Feb 2018
Edited: bbb_bbb on 8 Feb 2018
There is still a problem with non-English characters: they are turned into 0xFF.
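The 0xFF bytes are consistent with the default behavior of fwrite: called without a precision argument it writes uint8 values, and character codes above 255 saturate to 255. One possible way around this is to open both files with an explicit encoding and write with the 'char' precision. The sketch below assumes the file is UTF-8; the encoding, the temporary file names and the sample text are assumptions for illustration, not something stated in this thread.

```matlab
enc     = 'UTF-8';              % assumption: use the real encoding of your file
inFile  = [tempname '.txt'];    % hypothetical temporary input file
outFile = [tempname '.txt'];    % hypothetical temporary output file

% Two sample words with code points above 255 (Cyrillic letters).
w1 = char([1087 1088 1080 1074 1077 1090]);
w2 = char([1084 1080 1088]);

% Create a sample file with three empty lines between the two words.
fid = fopen(inFile, 'w', 'n', enc);
assert(fid >= 3, 'Could not create the sample file.')
fprintf(fid, '%s\n\n\n\n%s\n', w1, w2);
fclose(fid);

% Read the whole file as characters, decoded with the chosen encoding.
fid = fopen(inFile, 'r', 'n', enc);
assert(fid >= 3, 'Could not open the input file.')
txt = fread(fid, [1 Inf], '*char');
fclose(fid);

% Collapse runs of empty lines to a single empty line.
new_text = regexprep(txt, '(\r?\n\r?\n)(\r?\n)+', '$1');

% Write characters (not uint8 values) using the same encoding.
fid = fopen(outFile, 'w', 'n', enc);
assert(fid >= 3, 'Could not open the output file.')
fwrite(fid, new_text, 'char');
fclose(fid);

% Read the result back to check that the characters survived.
fid = fopen(outFile, 'r', 'n', enc);
result = fread(fid, [1 Inf], '*char');
fclose(fid);
disp(isequal(result, sprintf('%s\n\n%s\n', w1, w2)))

delete(inFile); delete(outFile);   % clean up the temporary files
```

If the original file uses a different encoding (for example a Windows code page), pass that name to fopen instead of 'UTF-8' for both reading and writing.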
