How to extract text between two specific words using fread operation

I have a data file from which I need to extract the text between two specific words:
<Edata
Line1
Line2
Edata>
I need to extract the text between two specific words, e.g. "<Edata" and "Edata>", using fread. The file has an unknown extension (*.kcs), so I can only read it using file operations. Any ideas?

Answers (2)

Read the whole thing in using fread, which I assume you know how to do. Then
word1Location = strfind(theString, '<Edata');
word2Location = strfind(theString, 'Edata>');
inBetweenText = theString(word1Location+6:word2Location-1);
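For completeness, here is a minimal sketch of the fread step combined with the three lines above. It writes a small sample file first so that it runs on its own; the name sample.kcs in the temp folder is only a stand-in for the real file, and the sketch assumes the file contains a single <Edata ... Edata> block.

```matlab
% Create a small sample file (hypothetical stand-in for the real *.kcs file).
fileName = fullfile(tempdir, 'sample.kcs');
fid = fopen(fileName, 'w');
fprintf(fid, '<Edata\nLine1\nLine2\nEdata>\n');
fclose(fid);

% Read the whole file as one character row vector.
fid = fopen(fileName, 'r');
theString = fread(fid, '*char')';   % '*char' returns chars; transpose to a row
fclose(fid);

% Locate the two markers and take what lies between them.
startMarker = '<Edata';
endMarker   = 'Edata>';
word1Location = strfind(theString, startMarker);
word2Location = strfind(theString, endMarker);
inBetweenText = theString(word1Location + numel(startMarker) : word2Location - 1);

% Drop the newlines that sit right next to the markers.
inBetweenText = strtrim(inBetweenText);
disp(inBetweenText)
```

Using numel(startMarker) instead of a hard-coded 6 keeps the offset correct if the marker word changes.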

6 Comments

I have a file like this:
<Edata
Line1
Line2
Edata>
<Edata
Line3
Line4
Edata>
I need to capture all the text between these specific words. There is not just a single block; several pieces of text have to be captured. How can I do that?
The code I gave did not work? It should work but you'll have to have the last line of code in a loop because the first two lines should return arrays with multiple starting and ending locations. Please show your code so we can debug it.
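A minimal sketch of the loop described in that comment, assuming the markers are well formed (every <Edata has a matching Edata> after it, and blocks are not nested). The sample string stands in for whatever was read from the file, and the variable names numBlocks, firstChar and lastChar are made up for this illustration.

```matlab
% Sample text with two blocks (stand-in for the contents read from the file).
theString = sprintf('<Edata\nLine1\nLine2\nEdata>\n<Edata\nLine3\nLine4\nEdata>\n');

startMarker = '<Edata';
endMarker   = 'Edata>';
word1Location = strfind(theString, startMarker);   % one entry per block
word2Location = strfind(theString, endMarker);     % one entry per block

% Pair the k-th start with the k-th end (assumes well-formed, non-nested blocks).
numBlocks = min(numel(word1Location), numel(word2Location));
inBetweenText = cell(1, numBlocks);
for k = 1:numBlocks
    firstChar = word1Location(k) + numel(startMarker);
    lastChar  = word2Location(k) - 1;
    % strtrim removes the newlines adjacent to the markers.
    inBetweenText{k} = strtrim(theString(firstChar:lastChar));
end

% Show each extracted block.
for k = 1:numBlocks
    fprintf('Block %d:\n%s\n', k, inBetweenText{k});
end
```

Indexing with the whole vectors at once, as in s(word1Location+4:word2Location-1), only uses the first element of each vector in the colon expression, which is why only the first block comes back without a loop.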
fid = fopen(OldfilePath,'r');
F = fread(fid);
s = char(F')
word1Location = strfind(s, '<Edata');
word2Location = strfind(s, 'Edata>');
inBetweenText = s(word1Location+4:word2Location-1);
This program returns only the first string, not the remaining ones.
I don't think you read my last comment at all. Did you? Where is your loop? And why are you searching for <FBD when you should be searching for <Edata?
Actually, I am not working with the original data files. I made test data files and am trying the code on those. Sorry about that; Edata and FBD play the same role, as I am testing on two different data files. I have not tried a loop, and I know that is why I am unable to get the entire array. I did try reading line by line, but that doesn't work either.
tline = fgets(tus8);
while ischar(tline)
    % Code
    tline = fgets(tus8);
end
Do you have any idea how to do this section by section? I have not tried that.
I was looking for the same answer on how to loop it. Did you solve it yet? If so, could you please share the code?


theString = fileread(OldfilePath);
parts = regexp(theString, '(?<=<Edata\s*\n).*?(?=\nEdata>)', 'match');
I took the liberty here of removing the end-of-line character immediately after <Edata and immediately before Edata>. The meaning of "between" was not well defined here: what should happen if there is other text on the same line, like
<Edata %first one
line 1
line 2
Edata>
then should the '%first one' be extracted?
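One way to leave that decision to the caller is to capture the rest of the <Edata line and the block body as two separate tokens. This is only a sketch under assumptions: lines end in \n, blocks are not nested, and the names sameLineText and bodyText are invented for the example. It swaps the lookaround pattern for a 'tokens' pattern, so it is a variation on the approach above rather than the same expression.

```matlab
% Sample text: first block has extra text on the <Edata line, second does not.
theString = sprintf('<Edata %%first one\nline 1\nline 2\nEdata>\n<Edata\nline 3\nEdata>\n');

% Token 1: whatever follows <Edata on the same line (may be empty).
% Token 2: the lines between the marker lines (lazy, so it stops at the first Edata>).
tokens = regexp(theString, '<Edata([^\n]*)\n(.*?)\nEdata>', 'tokens');

numBlocks = numel(tokens);
sameLineText = cell(1, numBlocks);
bodyText     = cell(1, numBlocks);
for k = 1:numBlocks
    sameLineText{k} = strtrim(tokens{k}{1});   % e.g. '%first one', or empty
    bodyText{k}     = tokens{k}{2};            % the block body without marker lines
end

% Print each block; the caller decides whether to keep the same-line text.
for k = 1:numBlocks
    fprintf('Block %d, same-line text: "%s"\n%s\n', k, sameLineText{k}, bodyText{k});
end
```

If the file uses Windows line endings, a trailing carriage return may remain at the end of each token; strtrim on bodyText{k} would remove it.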
