Find pattern in vector while ignoring/skipping certain indices
Hello,
Is there an efficient way to search for a specific pattern in a mat vector while ignoring some indices in the pattern?
For example, I need to search for a 9-element pattern [0 4 X 0 6 Y 0 8 Z] in a mat vector, where X, Y, Z can be any values.
I currently have a loop-based approach, but is there a faster vectorized one?
Thank you.
Answers (4)
Image Analyst
on 11 Jun 2022
I think this should work, but for your given pattern and a vector of 100 million random values, I never saw a match in several runs. So hopefully you have reason to believe your data contains a match and you're not just using random integers like I did.
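% Create sample data. Note that randi(8, ...) draws integers from 1 to 8, so this
% sample never contains the zeros in the pattern; randi([0 8], ...) would allow matches.
vec = randi(8, 1, 100000000);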
% Define the pattern. Nan = "don't care".
pattern = [0 4 nan 0 6 nan 0 8 nan]
% Define a mask for what values we want to check.
mask = ~isnan(pattern)
% Last starting index at which a full-length window still fits inside vec.
lastIndex = length(vec) - length(pattern) + 1;
% Scan along the vector looking for matches.
for k = 1 : lastIndex
    % Print out progress every 100 thousand window locations.
    if mod(k, 100000) == 0
        fprintf('k = %d of %d (%.1f%%)\n', k, lastIndex, 100*k/lastIndex);
    end
    % Extract the window.
    thisWindow = vec(k : k+length(pattern)-1);
    % Compare this window to our pattern but only at the mask = true locations.
    if isequal(pattern(mask), thisWindow(mask))
        % Found a match. Report where it was.
        fprintf('Match at k = %d where vec = [%d, %d, %d, %d, %d, %d, %d, %d, %d]\n', k, thisWindow);
    end
end
fprintf('Done!\n');
1 Comment
Image Analyst
on 11 Jun 2022
If there is a match, it will find it quickly, just like the other solutions since it's basically the same algorithm.
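One way to check that claim is to plant a known match in random data and confirm the loop reports it. The following is only a minimal sketch: the vector length, the planted position 501, the filler values 1, 2, 3, and the variable `found` are illustrative assumptions, not part of the original answer.

```matlab
% Sketch: plant a known match and confirm the masked window scan finds it.
pattern = [0 4 NaN 0 6 NaN 0 8 NaN];
mask = ~isnan(pattern);                  % true where the pattern is constrained
vec = randi([0 8], 1, 1000);             % range includes 0, so matches are possible
vec(501:509) = [0 4 1 0 6 2 0 8 3];      % planted match (hypothetical filler values)
n = numel(pattern);
found = [];                              % start indices of all matches
for k = 1 : numel(vec) - n + 1
    thisWindow = vec(k : k+n-1);
    if isequal(thisWindow(mask), pattern(mask))
        found(end+1) = k;                %#ok<AGROW>
    end
end
disp(found)
```

Because the data is random, `found` may list additional chance matches besides the planted one.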
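vec = [0 4 1 0 6 5 0 8 7, 3 3 3, 0 4 2 0 6 4 0 8 6]; % patterns start at i=1 and i=13
pat = [0 4 nan 0 6 nan 0 8 nan];
pat = pat(:); vec = vec(:)';        % pattern as a column, data as a row
m = numel(vec); n = numel(pat);
include = find(~isnan(pat));        % pattern positions that must match
idx = 0:m-n;                        % zero-based offset of every window start
% Row r holds, for each window, the data value at pattern position include(r).
sequences = cell2mat(arrayfun(@(i) vec(i+idx), include, 'uni', 0));
matchlocations = find(all(sequences == pat(include), 1))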
I assume it's a vector of integers.
Steve Amphlett showed this trick at comp.soft-sys.matlab twenty years ago.
%% Create sample data
pat = [0,4,nan,0,6,nan,0,8,nan];
msk = true(1,numel(pat));
msk(isnan(pat)) = false;
pat(not(msk)) = 0;
vec = randi([-8,8],1,1e6);
vec(101:109) = [0,4,11,0,6,12,0,8,13];
vec(701:709) = [0,4,14,0,6,15,0,8,16];
%
%% Search matches
tic
z = conv(vec,pat(end:-1:1));
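hit = find(z==sum(pat.^2))-numel(pat)+1;
% Drop partial-overlap positions at the edges; they would index outside vec below.
hit = hit(hit>=1 & hit<=numel(vec)-numel(pat)+1);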
%%
% hit may contain false hits.
for ix = hit
    v9 = vec(ix:ix+numel(pat)-1);
    if all( v9(msk) == pat(msk) )
        disp(ix)
    end
end
toc
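The verification loop is needed because the convolution test is necessary but not sufficient: every true match produces a correlation equal to sum(pat.^2), but other windows can reach the same value by chance. The sketch below illustrates this with two hand-picked windows; `good` and `bad` are illustrative values chosen for this example, not data from the original post.

```matlab
% Sketch: a window can hit the target correlation without matching the pattern.
pat = [0 4 0 0 6 0 0 8 0];               % pattern with don't-care slots set to 0
msk = logical([1 1 0 1 1 0 1 1 0]);      % true where the pattern is constrained
target = sum(pat.^2);                    % 16 + 36 + 64 = 116

good = [0 4 1 0 6 5 0 8 7];              % genuine match
bad  = [3 7 1 5 4 5 2 8 7];              % not a match, yet 4*7 + 6*4 + 8*8 = 116

scoreGood = sum(good .* pat);            % correlation at this alignment
scoreBad  = sum(bad  .* pat);

isGoodMatch = isequal(good(msk), pat(msk));
isBadMatch  = isequal(bad(msk),  pat(msk));
fprintf('good: score %d, match %d\n', scoreGood, isGoodMatch);
fprintf('bad:  score %d, match %d\n', scoreBad,  isBadMatch);
```

Both windows score 116, but only `good` passes the masked comparison. The zeros in the pattern contribute nothing to the correlation, so the values at those positions are not checked by the convolution at all.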
% the pattern:
pat = [0 4 NaN 0 6 NaN 0 8 NaN];
% create some data containing the pattern:
data = randn(1,10000);
idx = find(~isnan(pat));
for ii = 100:100:9900
data(ii+idx-1) = pat(idx);
end
% find the pattern in the data:
idx = find(~isnan(pat));
result = find(all(data((0:numel(data)-numel(pat)).'+idx) == pat(idx),2));
% display the result:
disp(result);
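The indexing approach above builds a matrix with one row per window, which can use a lot of memory for very long vectors. A possible alternative, offered only as a sketch, is to loop over the constrained pattern positions (six here) rather than over the data, AND-ing together shifted comparisons. The names `nStarts`, `ok`, and `matchStarts` are assumptions introduced for this example; the test vector is the small one used earlier in this thread.

```matlab
% Sketch: match by AND-ing one shifted comparison per constrained pattern position.
vec = [0 4 1 0 6 5 0 8 7, 3 3 3, 0 4 2 0 6 4 0 8 6];
pat = [0 4 NaN 0 6 NaN 0 8 NaN];

nStarts = numel(vec) - numel(pat) + 1;   % number of full-length windows
ok = true(1, nStarts);                   % ok(s) stays true while window s still matches
for j = find(~isnan(pat))                % loop over constrained positions only
    % Element j of the window starting at s is vec(s + j - 1).
    ok = ok & (vec(j : j + nStarts - 1) == pat(j));
end
matchStarts = find(ok);
disp(matchStarts)                        % displays 1 and 13 for this vector
```

Each pass works on a single logical vector the length of the data, so peak memory stays proportional to numel(vec) instead of numel(vec) times the number of constrained positions.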