How can I align a sequence of three data points?

I have three data sequences:
data_A = [0 4 5 7 8 9], data_B = [4 5 7 8 1 5 6], data_C = [2 5 3 4 5 7 8]
I want the result in the form below. Kindly help.
[Image: desired result; red squares indicate the matched sequence.]

Accepted Answer

Mathieu NOE on 12 May 2023
hello
try this
data_A = [0 4 5 7 8 9];
data_B = [4 5 7 8 1 5 6];
data_C = [ 2 5 3 4 5 7 8];
% check that all three pairwise intersections are identical (makes the code more robust)
out1 = intersect(data_A,data_B);
out2 = intersect(data_B,data_C);
out3 = intersect(data_A,data_C);
% isequal also handles intersections of different lengths, which a sum of differences cannot
if isequal(out1,out2) && isequal(out1,out3) % all three are identical
% find max length of all sequences
n = max([numel(data_A) numel(data_B) numel(data_C)]);
% strfind(A,B) % will find the starting indices where B is embedded in A.
s1 = strfind(data_A,out1);
s2 = strfind(data_B,out1);
s3 = strfind(data_C,out1);
% how many leading zeros do we need?
m = max([s1 s2 s3]);
lza = m - s1; % number of leading zeros to pad onto data_A
lzb = m - s2; % number of leading zeros to pad onto data_B
lzc = m - s3; % number of leading zeros to pad onto data_C
data_A = [zeros(1,lza) data_A];
data_B = [zeros(1,lzb) data_B];
data_C = [zeros(1,lzc) data_C];
% how many trailing zeros do we need?
% find the max length of all sequences
q = max([numel(data_A) numel(data_B) numel(data_C)]);
tza = q - numel(data_A); % number of trailing zeros to pad onto data_A
tzb = q - numel(data_B); % number of trailing zeros to pad onto data_B
tzc = q - numel(data_C); % number of trailing zeros to pad onto data_C
data_A = [data_A zeros(1,tza)];
data_B = [data_B zeros(1,tzb)];
data_C = [data_C zeros(1,tzc)];
end
% finally concatenate all 3 sequences
out = [data_A;data_B;data_C]
out = 3×10
     0     0     0     4     5     7     8     9     0     0
     0     0     0     4     5     7     8     1     5     6
     2     5     3     4     5     7     8     0     0     0

More Answers (2)

Dyuman Joshi on 12 May 2023
data_A = [0 4 5 7 8 9];
data_B = [4 5 7 8 1 5 6];
data_C = [2 5 3 4 5 7 8];
Note that this is not the best method to obtain the common sub-sequence; however, it works for this particular data set.
y = intersect(intersect(data_A,data_B),data_C)
y = 1×4
4 5 7 8
data = {data_A;data_B;data_C};
%number of elements in each vector
n = cellfun('length', data);
%the starting index of the sub-sequence in each vector
k = cellfun(@(x) strfind(x,y), data);
m = max(k);
nd = numel(data);
%preallocation
out = zeros(nd,max(n+m-k));
for idx = 1:nd
out(idx,m-k(idx)+(1:n(idx))) = data{idx};
end
out
out = 3×10
     0     0     0     4     5     7     8     9     0     0
     0     0     0     4     5     7     8     1     5     6
     2     5     3     4     5     7     8     0     0     0
  2 Comments
CM Sahu on 12 May 2023
Thank you for your answer.
But yes, this method works only for this particular set of data.
If the data changes to
data_A = [0 4 5 7 8 9 10];
data_B = [4 5 7 8 1 5 6 10];
data_C = [2 5 3 4 5 7 8 10];
then it will not work, because the intersection also contains 10, which is not part of the [4 5 7 8] sequence.
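The failure can be reproduced with a few lines. This is only an illustrative check, and it relies (as the answers in this thread do) on strfind accepting numeric row vectors. The variable names k_A, k_B and k_C are introduced here for illustration.

```matlab
data_A = [0 4 5 7 8 9 10];
data_B = [4 5 7 8 1 5 6 10];
data_C = [2 5 3 4 5 7 8 10];

% intersect returns the sorted set of shared values, not a contiguous run
y = intersect(intersect(data_A, data_B), data_C);   % [4 5 7 8 10]

% strfind searches for y as one contiguous block
k_A = strfind(data_A, y);   % empty: the 9 sits between 8 and 10
k_B = strfind(data_B, y);   % empty: 1 5 6 sit between 8 and 10
k_C = strfind(data_C, y);   % 4: data_C happens to contain the block
```

Because two of the three results are empty, a cellfun call that expects one scalar start index per vector (the default uniform output) raises an error instead of returning the indices.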
Dyuman Joshi on 12 May 2023
Yes, the trickiest part of this question is obtaining the common sub-sequence.
I will update my answer, if I am able to find a solution to that.
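One possible way to obtain the common sub-sequence is a brute-force search for the longest contiguous run shared by all vectors. The following is a minimal sketch, not a solution posted in this thread. It assumes row vectors, takes candidates from the shortest vector, keeps the first longest run it finds, and relies on strfind accepting numeric input as in the answers above. The names iref, ref, nref and cand are introduced here for illustration.

```matlab
data = {[0 4 5 7 8 9 10], [4 5 7 8 1 5 6 10], [2 5 3 4 5 7 8 10]};

% Use the shortest vector as the source of candidate runs
[~, iref] = min(cellfun(@numel, data));
ref  = data{iref};
nref = numel(ref);

y = [];                                  % longest common contiguous run
for len = nref:-1:1                      % try the longest candidates first
    for s = 1:(nref - len + 1)
        cand = ref(s:s+len-1);
        % accept cand only if it occurs as a contiguous block in every vector
        if all(cellfun(@(v) ~isempty(strfind(v, cand)), data))
            y = cand;
            break
        end
    end
    if ~isempty(y), break, end
end
if isempty(y)
    error('No common contiguous run found.');
end

% Align on the first occurrence of y in each vector, padding with zeros
k = cellfun(@(v) min(strfind(v, y)), data);   % start index of y per vector
n = cellfun(@numel, data);                    % length of each vector
m = max(k);
out = zeros(numel(data), max(n + m - k));
for idx = 1:numel(data)
    out(idx, m - k(idx) + (1:n(idx))) = data{idx};
end
```

For the modified data set this gives y = [4 5 7 8], and columns 4 to 7 of out hold that run in every row. The search is cubic in the vector length in the worst case, so it is only suitable for short sequences. Zero padding cannot be told apart from genuine zeros in the data (data_A starts with one); padding with NaN instead is an option if that matters.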


Shaik on 12 May 2023
Hi Sahu,
Hope it helps.
% define the data sequences
data_A = [0 4 5 7 8 9];
data_B = [4 5 7 8 1 5 6];
data_C = [2 5 3 4 5 7 8];
% find the overlapping elements between data_A, data_B, and data_C
overlap_AB = intersect(data_A, data_B);
overlap_AC = intersect(data_A, data_C);
overlap_BC = intersect(data_B, data_C);
overlap_ABC = intersect(overlap_AB, overlap_AC);
overlap_ABC = intersect(overlap_ABC, overlap_BC);
% find the indices of the overlapping elements in data_A, data_B, and data_C
idx_overlap_A = ismember(data_A, overlap_ABC);
idx_overlap_B = ismember(data_B, overlap_ABC);
idx_overlap_C = ismember(data_C, overlap_ABC);
% plot the data sequences with red squares indicating the overlapping elements
plot(1:length(data_A), data_A, '-o', 'Color', 'b');
hold on;
plot(1:length(data_B), data_B, '-o', 'Color', 'g');
plot(1:length(data_C), data_C, '-o', 'Color', 'r');
scatter(find(idx_overlap_A), data_A(idx_overlap_A), 'MarkerFaceColor', 'r', 'MarkerEdgeColor', 'r');
scatter(find(idx_overlap_B), data_B(idx_overlap_B), 'MarkerFaceColor', 'r', 'MarkerEdgeColor', 'r');
scatter(find(idx_overlap_C), data_C(idx_overlap_C), 'MarkerFaceColor', 'r', 'MarkerEdgeColor', 'r');
hold off;
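Note that ismember marks every element whose value belongs to the shared set, so the stray 5 near the end of data_B and the first 5 in data_C are highlighted as well. A hedged variant that marks only the contiguous matched run with red squares could look like the sketch below. It assumes the common run y is already known, and uses strfind on numeric vectors as elsewhere in this thread. The names runs and colors are introduced here for illustration.

```matlab
data = {[0 4 5 7 8 9], [4 5 7 8 1 5 6], [2 5 3 4 5 7 8]};
y = [4 5 7 8];                      % assumed: the common run is already known

colors = {'b', 'g', 'k'};
runs = cell(size(data));            % positions of the run in each vector
figure; hold on;
for idx = 1:numel(data)
    v = data{idx};
    k = min(strfind(v, y));                 % first start index of the run in v
    runs{idx} = k:(k + numel(y) - 1);       % indices occupied by the run
    plot(1:numel(v), v, '-o', 'Color', colors{idx});
    % red filled squares on the matched run only
    plot(runs{idx}, v(runs{idx}), 'rs', 'MarkerFaceColor', 'r', 'MarkerSize', 10);
end
hold off;
```

Here the highlighted positions are 2 to 5 in data_A, 1 to 4 in data_B and 4 to 7 in data_C.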
