# How can I align a sequence of three data points?

CM Sahu on 12 May 2023
Commented: Mathieu NOE on 15 May 2023
I have three data sequences:
data_A = [0 4 5 7 8 9]; data_B = [4 5 7 8 1 5 6]; data_C = [2 5 3 4 5 7 8];
I want the result in this form. Kindly help.
The red squares indicate the matched sequence.

Mathieu NOE on 12 May 2023
hello
try this
data_A = [0 4 5 7 8 9];
data_B = [4 5 7 8 1 5 6];
data_C = [ 2 5 3 4 5 7 8];
% check that all pairwise intersections are identical (makes the code robust)
% note: intersect returns sorted unique values, so this assumes the common
% sequence is strictly increasing
out1 = intersect(data_A,data_B);
out2 = intersect(data_B,data_C);
out3 = intersect(data_A,data_C);
if isequal(out1,out2) && isequal(out1,out3) % all 3 are identical
    % strfind(A,B) finds the starting indices where B is embedded in A
    % (assumes the common sequence occurs exactly once in each vector)
    s1 = strfind(data_A,out1);
    s2 = strfind(data_B,out1);
    s3 = strfind(data_C,out1);
    % how many leading zeros do we need?
    m = max([s1 s2 s3]);
    lza = m - s1; % number of leading zeros to pad to the data_A sequence
    lzb = m - s2; % number of leading zeros to pad to the data_B sequence
    lzc = m - s3; % number of leading zeros to pad to the data_C sequence
    data_A = [zeros(1,lza) data_A];
    data_B = [zeros(1,lzb) data_B];
    data_C = [zeros(1,lzc) data_C];
    % how many trailing zeros do we need?
    % find the max length of all (left-padded) sequences
    q = max([numel(data_A) numel(data_B) numel(data_C)]);
    tza = q - numel(data_A); % number of trailing zeros to pad to the data_A sequence
    tzb = q - numel(data_B); % number of trailing zeros to pad to the data_B sequence
    tzc = q - numel(data_C); % number of trailing zeros to pad to the data_C sequence
    data_A = [data_A zeros(1,tza)];
    data_B = [data_B zeros(1,tzb)];
    data_C = [data_C zeros(1,tzc)];
end
% finally concatenate all 3 sequences
out = [data_A;data_B;data_C]
out = 3×10
0 0 0 4 5 7 8 9 0 0
0 0 0 4 5 7 8 1 5 6
2 5 3 4 5 7 8 0 0 0
CM Sahu on 15 May 2023
Thank you
Mathieu NOE on 15 May 2023
my pleasure !

Dyuman Joshi on 12 May 2023
data_A = [0 4 5 7 8 9];
data_B = [4 5 7 8 1 5 6];
data_C = [2 5 3 4 5 7 8];
Note that this is not the best method to obtain the common sub-sequence; however, it works for this particular data set.
y = intersect(intersect(data_A,data_B),data_C)
y = 1×4
4 5 7 8
data = {data_A;data_B;data_C};
%number of elements in each vector
n = cellfun('length', data);
%the starting index of the sub-sequence in each vector
k = cellfun(@(x) strfind(x,y), data);
m = max(k);
nd = numel(data);
%preallocation
out = zeros(nd,max(n+m-k));
for idx = 1:nd
    out(idx,m-k(idx)+(1:n(idx))) = data{idx};
end
out
out = 3×10
0 0 0 4 5 7 8 9 0 0
0 0 0 4 5 7 8 1 5 6
2 5 3 4 5 7 8 0 0 0
CM Sahu on 12 May 2023
But yes, this method works only for a particular set of data.
If the data change to
data_A = [0 4 5 7 8 9 10];
data_B = [4 5 7 8 1 5 6 10];
data_C = [2 5 3 4 5 7 8 10];
then it will not work, because the intersection then also contains 10, which is not part of the [4 5 7 8] sequence.
Dyuman Joshi on 12 May 2023
Yes, the trickiest part of this question is obtaining the common sub-sequence.
I will update my answer, if I am able to find a solution to that.
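
One possible way to get the common sub-sequence without relying on intersect is a brute-force search for the longest contiguous run that appears in every vector. The following is only a minimal sketch, not a posted answer from this thread. It assumes row vectors, takes the candidate runs from the first vector, and uses strfind on numeric vectors as the answers above do. The names ref, cand, len and start are illustrative.

```matlab
% Modified data from the comment above: 10 is shared but not part of the run
data = {[0 4 5 7 8 9 10], [4 5 7 8 1 5 6 10], [2 5 3 4 5 7 8 10]};

ref = data{1};                 % candidate runs are taken from the first vector
y = [];                        % stays empty if no value is shared by all vectors
for len = numel(ref):-1:1      % try the longest candidates first
    for start = 1:numel(ref)-len+1
        cand = ref(start:start+len-1);
        % strfind returns the start indices of cand in x (empty if absent)
        if all(cellfun(@(x) ~isempty(strfind(x, cand)), data))
            y = cand;          % first hit at this length is a longest common run
            break
        end
    end
    if ~isempty(y)
        break
    end
end
y
```

For this data the search stops at length 4 with y = [4 5 7 8], so the trailing 10 no longer disturbs the result. The cost grows quickly with vector length, so this is meant for short sequences like the ones in the question. If several runs of the same maximal length exist, only the first one found in data{1} is returned.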

Shaik on 12 May 2023
Hi Sahu,
Hope it helps.
% define the data sequences
data_A = [0 4 5 7 8 9];
data_B = [4 5 7 8 1 5 6];
data_C = [2 5 3 4 5 7 8];
% find the overlapping elements between data_A, data_B, and data_C
overlap_AB = intersect(data_A, data_B);
overlap_AC = intersect(data_A, data_C);
overlap_BC = intersect(data_B, data_C);
overlap_ABC = intersect(overlap_AB, overlap_AC);
overlap_ABC = intersect(overlap_ABC, overlap_BC);
% find the indices of the overlapping elements in data_A, data_B, and data_C
idx_overlap_A = ismember(data_A, overlap_ABC);
idx_overlap_B = ismember(data_B, overlap_ABC);
idx_overlap_C = ismember(data_C, overlap_ABC);
% plot the data sequences with red squares indicating the overlapping elements
plot(1:length(data_A), data_A, '-o', 'Color', 'b');
hold on;
plot(1:length(data_B), data_B, '-o', 'Color', 'g');
plot(1:length(data_C), data_C, '-o', 'Color', 'r');
scatter(find(idx_overlap_A), data_A(idx_overlap_A), 'MarkerFaceColor', 'r', 'MarkerEdgeColor', 'r');
scatter(find(idx_overlap_B), data_B(idx_overlap_B), 'MarkerFaceColor', 'r', 'MarkerEdgeColor', 'r');
scatter(find(idx_overlap_C), data_C(idx_overlap_C), 'MarkerFaceColor', 'r', 'MarkerEdgeColor', 'r');
hold off;
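
Note that ismember flags every element whose value is shared, including isolated ones outside the matched run (for example the second 5 in data_B). If only the contiguous run should be marked, its positions can be derived from strfind instead. This is a small sketch that assumes the common run y is already known and occurs in the vector. The names run_idx and member_idx are illustrative.

```matlab
data_B = [4 5 7 8 1 5 6];
y = [4 5 7 8];                          % common run, assumed already known

k = strfind(data_B, y);                 % start index (or indices) of the run
run_idx = k(1) + (0:numel(y)-1);        % positions of the first occurrence of the run
member_idx = find(ismember(data_B, y)); % positions of every shared value

run_idx      % 1 2 3 4
member_idx   % 1 2 3 4 6 (also flags the isolated 5 at position 6)
```

Passing run_idx to scatter in place of find(idx_overlap_B), e.g. scatter(run_idx, data_B(run_idx), ...), would then highlight only the matched sequence.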
CM Sahu on 12 May 2023