Text file manipulation: find specific lines, extract a string from them, and replace each line with new content
Hello Community,
Is there a simple way in MATLAB to perform the following task?
I have a text file that contains many datasets that look like this:
#sp|A00G945|CRER-FG Beta-morph B2 chain PS=Legro soma RB=10653 GN=CWERG PE=20 SV=1
CDMSXXXTGHTJKDLGLSKDJGHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFF
GHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFFCDMSXXXTGHTJKDLGLSKDJ
DMSXXXTGHTJKDLGLSKDJDMSXXXTGHTJKDLGLSKDJDMERUTUEZZUUFFF
#sp|P011|CRYAB_Ceta Beta-morph C chain PS=Legro alto RB=5456 GN=CWERF PE=60 SV=2
CDMSXXXTGHTJKDLGLSKDJGHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFF
GHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFFCDMSXXXTGHTJKDLGLSKDJ
DMSXXXTGHTJKDLGLSKDJDMSXXXTGHTJKDLGLSKDJDMERUTUEZZUUFFF
...
Each dataset begins with a header line that starts with "#".
In each header line, I'd like to replace the spaces with underscores,
then extract the string between "PS=" and "RB=" (for the first dataset this would give "Legro_soma"),
and replace the whole line with "#Legro_soma".
fid = fopen('test.txt');
C = textscan(fid,'%s','delimiter','\n');
fclose(fid);
for k = 1:numel(C{1,1})
tmp = regexp(C{1,1}{k},'\s'); % find whitespace characters
C{1,1}{k}(tmp) = '_'; % replace them with '_'
% Here comes the attempt to extract the string:
idx = strfind(C{1,1}{k},'PS=');
idend = strfind(C{1,1}{k},'RB=');
if ~isempty(idx) && ~isempty(idend)
remain = C{1,1}{k}(idx+3:idend-2); % text between 'PS=' and '_RB='
% How can line k now be replaced with the new content?
end
end
Thanks for your ideas and help!
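One possible way to do the whole task in a single pass is sketched below. It is only a sketch under stated assumptions: the lines are held in a cell array of character vectors (here a few sample lines from the example above stand in for the file, and the name fileLines is hypothetical), every header line starts with "#", and "PS=" always comes before "RB=" on a header line. The output file name test_out.txt is also just a placeholder.

```matlab
% Sample lines standing in for the file contents (copied from the example).
% With a real file they could come from:
%   fid = fopen('test.txt'); C = textscan(fid,'%s','delimiter','\n'); fclose(fid);
%   fileLines = C{1};
fileLines = {
    '#sp|A00G945|CRER-FG Beta-morph B2 chain PS=Legro soma RB=10653 GN=CWERG PE=20 SV=1'
    'CDMSXXXTGHTJKDLGLSKDJGHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFF'
    '#sp|P011|CRYAB_Ceta Beta-morph C chain PS=Legro alto RB=5456 GN=CWERF PE=60 SV=2'
    'DMSXXXTGHTJKDLGLSKDJDMSXXXTGHTJKDLGLSKDJDMERUTUEZZUUFFF'
    };

for k = 1:numel(fileLines)
    % Only header lines (starting with '#') are modified.
    if strncmp(fileLines{k}, '#', 1)
        % Capture the text between 'PS=' and 'RB=', without surrounding spaces.
        tok = regexp(fileLines{k}, 'PS=\s*(.*?)\s*RB=', 'tokens', 'once');
        if ~isempty(tok)
            % Replace inner spaces with underscores and overwrite the line.
            fileLines{k} = ['#' strrep(tok{1}, ' ', '_')];
        end
    end
end

% Write the modified lines to a new file (placeholder name).
fid = fopen('test_out.txt', 'w');
fprintf(fid, '%s\n', fileLines{:});
fclose(fid);
```

Because the replacement is done only on the captured name, the sequence lines and any header that lacks "PS=" or "RB=" are left unchanged.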