More efficient way to return a string's position in a cell string array
Given an array such as
>> whos endtxt
Name Size Bytes Class Attributes
endtxt 137x2 22466 cell
Surely there is a (much) less verbose way to return the location of a specific string within the array than
>> find(~cellfun('isempty',strfind(endtxt(:,2),'FULLER')))
ans =
46
>>
I've whiffed on an efficient way to do something useful with the cell array of a zillion empty cells, except for the one(s) of interest... the above does (finally!) work, but surely there is something simpler?
Answers (3)
Guillaume
on 8 Oct 2014
Assuming that the string you're looking for ('FULLER') is an exact match for one of the strings in the cell array (and not just a substring), then
find(ismember(endtxt(:, 2), 'FULLER'))
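To make the exact-match versus substring distinction concrete, here is a small sketch. The 4x2 `endtxt` below is a made-up stand-in for the 137x2 array in the question, and `strcmp` is shown only as another possible exact-match option, not as part of the original answer.

```matlab
% Hypothetical stand-in for the 137x2 cell array in the question
endtxt = {'a', 'ADAMS'; 'b', 'FULLER'; 'c', 'FULLERTON'; 'd', 'FULLER'};

% Exact match: only rows whose whole string equals 'FULLER'
rowsExact = find(strcmp(endtxt(:, 2), 'FULLER'));
rowsIsmem = find(ismember(endtxt(:, 2), 'FULLER'));

% Substring match (the approach from the question): also hits 'FULLERTON'
rowsSub = find(~cellfun('isempty', strfind(endtxt(:, 2), 'FULLER')));
```

With this sample data the two exact-match forms agree with each other, while the substring form also returns the 'FULLERTON' row, so which one is "right" depends on what the search is meant to find.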
matt dash
on 8 Oct 2014
If I'm understanding correctly: [junk,answer] = ismember('FULLER',endtxt(:,2))
Note that this form returns a single index, even if 'FULLER' appears in more than one row.
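A quick sketch of how the second output of `ismember` behaves when the string occurs more than once. The sample `endtxt` and the variable names `tf`, `loc`, and `allRows` are made up for illustration; which of the duplicate rows is reported may depend on the MATLAB release, so the example does not rely on it.

```matlab
% Hypothetical sample data with 'FULLER' in two rows
endtxt = {'a', 'ADAMS'; 'b', 'FULLER'; 'c', 'FULLERTON'; 'd', 'FULLER'};

% Second output: one location in endtxt(:,2) for the single query string
[tf, loc] = ismember('FULLER', endtxt(:, 2));

% For comparison, every exactly matching row
allRows = find(strcmp(endtxt(:, 2), 'FULLER'));
```

Here `tf` is true and `loc` is a scalar that is one of the entries of `allRows`. If the string is absent, `tf` is false and `loc` is 0, which is worth checking before using `loc` as an index.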
2 Comments
matt dash
on 8 Oct 2014
Well, here is an option with even more rigamarole, but it is faster if that matters. Basically make a copy of your text that is not a cell array, and cross reference it with a vector indicating where row breaks occur. For potentially even more speed you could remove the find entirely by keeping a vector of row indices for every character in the text (if memory is not an issue)
1) use cellfun(@length,<cells>) to get the length of each cell, then cumsum this to get the start index of each line (prepend a 0 at the beginning)
2) convert the cell arrays to one long string with [<cells>{:}]
3) now just use strfind on this one string to get the index
4) cross reference this with the index vector from (1) to see which line it begins in
Ridiculous, but on my computer it is 10-30x faster than find(~cellfun('isempty',strfind(lines,teststr)))
and seems to get faster for larger amounts of text.
code:
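fid = fopen('book.txt','r');  % some long text file
teststr = 'Lampsacus';        % some word in it
% read text file line by line:
tline = fgetl(fid);
lines = {};
while ischar(tline)
    lines{end+1} = tline;
    tline = fgetl(fid);
end
fclose(fid);
% method 1: search one long string, then map character positions back to lines
q = cellfun(@length,lines);
starts = [0 cumsum(q)];   % starts(k) = number of characters before line k
alltxt = [lines{:}];
tic
a = strfind(alltxt,teststr);
idx = zeros(size(a));
for i = 1:numel(a)
    % line k holds characters starts(k)+1 .. starts(k+1), hence the strict <
    idx(i) = find(starts<a(i),1,'last');
end
toc
% note: idx repeats a line that contains several matches, and a match that
% straddles two lines is reported here although method 2 never reports it
% method 2: strfind on the cell array directly
tic
idx2 = find(~cellfun('isempty',strfind(lines,teststr)));
toc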
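The comment above also mentions removing the find entirely by keeping a row index for every character. Below is one possible sketch of that idea, not the poster's own code. The sample `lines` cell array stands in for the contents of book.txt, and the names `marks` and `rowOfChar` are made up here; `accumarray` is used so that empty lines are handled.

```matlab
% Hypothetical sample data standing in for the lines read from book.txt
lines   = {'alpha beta', '', 'gamma Lampsacus', 'delta', 'Lampsacus again'};
teststr = 'Lampsacus';

q      = cellfun(@length, lines);   % characters per line
starts = [0 cumsum(q)];             % characters preceding each line
alltxt = [lines{:}];                % one long string
total  = numel(alltxt);

% Put a mark at the first character of each line, then cumsum to get a line
% number for every character. accumarray adds up the marks of empty lines,
% so the running count skips over them correctly.
marks     = accumarray(starts(1:end-1).' + 1, 1, [total + 1, 1]).';
rowOfChar = cumsum(marks(1:total));

a   = strfind(alltxt, teststr);     % match positions in the long string
idx = rowOfChar(a);                 % line number of each match, no find

% Keep only matches that end inside their own line (drop straddling matches)
idx = idx(a + numel(teststr) - 1 <= starts(idx + 1));
```

This trades memory for speed: `rowOfChar` holds one double per character of text, which is the "if memory is not an issue" caveat. A line with several matches still appears several times in `idx`, so wrap it in `unique` if only distinct line numbers are wanted.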