How to match two different matrices

Question

0 votes

Hello, I have two matrices of different sizes, and this is the scenario:

x = [...];
y = [...];
% size(x) is 5800-by-16
% size(y) is 450-by-14
% x and y have dates and times in the first six columns, in this form:
% year, month, day, hour, minute, second
% Each column represents a variable
% Each row represents a data sample
% A model predicts a variable in x after some time
...
predicted_time = X_time + some_time; % in hours
% "X_time" is the time of x
% "Y_time" is the time of y
% Match that predicted time with the time of y within a range of +/- 11 hours
MATCHED = [];
for i = 1:size(x,1)
    for j = 1:size(y,1)
        if abs(predicted_time(i) - Y_time(j)) <= 11
            MATCHED = [MATCHED; x(i,:) y(j,:) predicted_time(i)];
        end
    end
end

I would like to know how to make this work. I have tried many times, but it does not work properly.

10 Comments

Mohamed Nedal on 17 Jul 2018

image.png

Please check the attached image. This is how the data is presented. The date and time are within the first six columns and the other columns are just variables for storms. Each row represents a data record for one storm with its date, time, and properties.

I have another data set in a similar style but with a different number of rows and columns.
The first data set represents the storms at some location, and the other data set represents the storms that arrived at another location (we're not concerned with the locations themselves here).
The issue is that not all the storms recorded in the first data set arrived at that location, which is why the second data set has far fewer records than the first.
So, I need to match each storm in the first data set with its record in the second data set.
I'm using a model to predict the arrival time of the storm, but that model isn't 100% accurate.
So, I allow an error window of + or - 11 hours (a value specific to that model, based on research). Let me know your thoughts.

jonas on 17 Jul 2018

Edited: jonas on 17 Jul 2018

So, what you need to do is:

Loop through each storm in SET2
Calculate the corresponding time until it reaches the location of SET1
Find the storm that is closest in time to this value

Right?

These are simple steps, and it seems to me that this is almost what Albert Fan proposed some comments ago. If you provide some sample data to work with, I'm sure someone will give you code now that the problem is clearly stated. I guess we also need your model though, unless you can provide the modelled time-slots.

After this discussion, the initial code actually makes some sense :)
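A minimal sketch of these three steps, assuming the arrival time is modelled for one set and the closest record is then looked up in the other. All values are made up for illustration, and the names tStart, transitHours, tArrive, and matchIdx are hypothetical, not taken from the poster's files:

```matlab
% Hypothetical sample data: departure times of the storms being modelled
tStart = datetime([2018 7 1 0 0 0; 2018 7 3 12 0 0; 2018 7 10 6 0 0]);
% Hypothetical modelled travel times in hours (stand-in for the real model)
transitHours = [60; 48; 72];
% Hypothetical recorded arrival times at the other location
tArrive = datetime([2018 7 3 15 0 0; 2018 7 5 9 0 0]);

tPredicted = tStart + hours(transitHours);          % predicted arrival times
matchIdx = nan(numel(tPredicted), 1);               % NaN marks "no match"
for i = 1:numel(tPredicted)                         % loop over the storms
    [gap, j] = min(abs(tArrive - tPredicted(i)));   % closest recorded arrival
    if gap <= hours(11)                             % accept only inside the window
        matchIdx(i) = j;
    end
end
disp(matchIdx)
```

With this sample data the first two storms fall within 3 hours of a recorded arrival and the third has no arrival within 11 hours, so its entry stays NaN.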

Mohamed Nedal on 17 Jul 2018

@jonas. Yes, I guess as you said I need to loop through each storm in SET1, calculate the corresponding time until it reaches the location of SET2, and finally find the storm in SET2 that is closest in time to that value.

Kindly find the attached files. The following is the code I have written so far, but it still doesn't work as it should: the "matched set" contains only zeros.

tic 
close all; clear; clc 
%%Read Data 
soho = xlsread('start.xlsx');       % initial storms data 
shocks = xlsread('end.xlsx', 1);    % final storms data 
%%Start Time 
yr1 = soho(4:end,1); 
M1 = soho(4:end,2); 
d1 = soho(4:end,3); 
hh1 = soho(4:end,4); 
mm1 = soho(4:end,5); 
ss1 = soho(4:end,6); 
%%End Time (start of Shocks) 
yr2 = shocks(4:end,1); 
M2 = shocks(4:end,2); 
d2 = shocks(4:end,3); 
hh2 = shocks(4:end,4); 
mm2 = shocks(4:end,5); 
ss2 = shocks(4:end,6); 
%%Parameters 
% CMEs 
CPA = soho(4:end,7); 
w = soho(4:end,8); 
vl = soho(4:end,9); 
vi = soho(4:end,10); 
vf1 = soho(4:end,11); 
v20Rs = soho(4:end,12); 
a1 = soho(4:end,13); 
mass = soho(4:end,14); 
KE = soho(4:end,15); 
MPA = soho(4:end,16); 
% Shocks  
vfinal = shocks(4:end,11); 
T =shocks(4:end,13); 
N = shocks(4:end,14); 
%%Initial Values 
AU = 149599999.99979659915;  % Sun-Earth distance in km 
d = 0.76 * AU;               % cessation distance in km 
%%Pre-Allocating Variables 
a_calc = zeros(size(vl)); 
squareRoot = zeros(size(vl)); 
A = zeros(size(vl)); 
B = zeros(size(vl)); 
ts = zeros(size(vl)); 
t_hrs = zeros(size(vl));     % predicted transit time in hours 
t_mn = zeros(size(vl));      % predicted transit time in minutes  
%%G2001 Model 
% calculations 
for i = 1:length(vl) 
    a_calc(i) = -1e-3 * ((0.0054*vl(i)) - 2.2);         % in km/s^2 
    squareRoot(i) = sqrt(power(vl(i),2) + (2*a_calc(i)*d)); 
    A(i) = (-vl(i) + squareRoot(i)) / a_calc(i); 
    B(i) = (AU - d) / squareRoot(i); 
    ts(i) = A(i) + B(i);                                % in seconds 
    t_mn(i) = ts(i) / 60;                               % in minutes 
end 
clear i; 
%%Show the predicted travel time 
% CME-ICME Matching 
matchedSet = zeros(length(shocks), 33); % the final set of CME-ICME pairs 
for n = 1:length(yr2) 
    if yr2(n) == yr1(n) 
        for m = 1:length(M2) 
            if M2(m) == M1(m) 
                for k = 1:length(d2) 
                    if d2(k) == d1(k) 
                        if (t_mn(k) >= (((hh2(k)*60)+mm2(k)+(ss2(k)/60))-11)) && ((((hh2(k)*60)+mm2(k)+(ss2(k)/60))+11) >= t_mn(k)) 
                            matchedSet(k, 1:16) = soho(k,:); 
                            matchedSet(k, 18:32) = shocks(k,:); 
                        end 
                    end 
                end 
            end 
        end 
    end 
end 
clear n; clear m; clear k; 
toc

I really appreciate that :)
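For reference, the travel-time loop in the code above can be written without a loop. This is only a sketch: the constants AU and d and the formula are copied from the posted code, the three speeds in vl are hypothetical sample values, and -1e-3 is used in place of power(-10,-3), which evaluates to the same number.

```matlab
% Vectorized form of the posted travel-time loop (constants from the post)
AU = 149599999.99979659915;   % Sun-Earth distance in km
d  = 0.76 * AU;               % cessation distance in km
vl = [400; 800; 1200];        % hypothetical initial speeds in km/s

a_calc     = -1e-3 * (0.0054*vl - 2.2);       % acceleration in km/s^2
squareRoot = sqrt(vl.^2 + 2*a_calc*d);        % speed at the cessation distance
ts         = (-vl + squareRoot) ./ a_calc ... % accelerated leg, in seconds
           + (AU - d) ./ squareRoot;          % constant-speed leg, in seconds
t_hrs      = ts / 3600;                       % transit time in hours
disp(t_hrs)
```

The element-wise operators (.^ and ./) make each line act on the whole column at once, so the pre-allocation step is no longer needed.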

Answer 1

jonas on 18 Jul 2018

Edited: jonas on 18 Jul 2018

1 vote

I've made an attempt to fix your code and match the two time-vectors. I've converted your time-vectors to datetime format and fixed your matching-algorithm. The matching works by looping through the modelled time-vector, which is based on the longer time-vector (SET1), and finding the closest match in the smaller time-vector (SET2). A match is only stored if the absolute difference is smaller than 11 hours.

The output, id, is a two-column matrix in which each row [id1 id2] holds a matched pair of indices, i.e. the row of SET1 and the corresponding row of SET2.

NOTE: id is longer than SET2, which indicates that some elements of SET2 are matched more than once.

%%Read Data 
soho = xlsread('start.xlsx');       % initial storms data 
shocks = xlsread('end.xlsx', 1);    % final storms data 
%%Start Time 
%%EDITED  %%
t1=datetime(soho(4:end,1:6));
t2=datetime(shocks(4:end,1:6));
%%ORIGINAL CODE %%
%%Parameters 
% CMEs 
CPA = soho(4:end,7); 
w = soho(4:end,8); 
vl = soho(4:end,9); 
vi = soho(4:end,10); 
vf1 = soho(4:end,11); 
v20Rs = soho(4:end,12); 
a1 = soho(4:end,13); 
mass = soho(4:end,14); 
KE = soho(4:end,15); 
MPA = soho(4:end,16); 
% Shocks  
vfinal = shocks(4:end,11); 
T =shocks(4:end,13); 
N = shocks(4:end,14); 
%%Initial Values 
AU = 149599999.99979659915;  % Sun-Earth distance in km 
d = 0.76 * AU;               % cessation distance in km 
%%Pre-Allocating Variables 
a_calc = zeros(size(vl)); 
squareRoot = zeros(size(vl)); 
A = zeros(size(vl)); 
B = zeros(size(vl)); 
ts = zeros(size(vl)); 
t_hrs = zeros(size(vl));     % predicted transit time in hours 
t_mn = zeros(size(vl));      % predicted transit time in minutes  
%%G2001 Model 
% calculations 
for i = 1:length(vl) 
    a_calc(i) = -1e-3 * ((0.0054*vl(i)) - 2.2);         % in km/s^2 
    squareRoot(i) = sqrt(power(vl(i),2) + (2*a_calc(i)*d)); 
    A(i) = (-vl(i) + squareRoot(i)) / a_calc(i); 
    B(i) = (AU - d) / squareRoot(i); 
    ts(i) = A(i) + B(i);                                % in seconds 
    t_mn(i) = ts(i) / 60;                               % in minutes 
end
clear i;
t_model=t1+minutes(t_mn);
%%Show the predicted travel time 
% CME-ICME Matching 
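%%EDITED FROM HERE AND ON %%
id=nan(numel(t_model),2);                    % one row per SET1 storm: [row in SET1, row in SET2]
for i=1:numel(t_model)
    [MinDiff, ind]=min(abs(t2-t_model(i)));  % closest SET2 time to the predicted arrival
    if MinDiff<hours(11)                     % accept only matches inside the 11-hour window
        id(i,1:2)=[i ind];
    else
        id(i,1:2)=[i NaN];
    end
end
id(isnan(id(:,2)),:)=[];                     % drop SET1 storms without a match
plot(id(:,1),id(:,2),'.')                    % matched SET2 row against SET1 row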
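The question asked for one combined row per matched pair. A minimal sketch of how the index pairs in id could be used for that, with small made-up matrices standing in for the data rows (sohoRows, shocksRows, and the id values below are hypothetical):

```matlab
% Hypothetical stand-ins for soho(4:end,:) and shocks(4:end,:)
sohoRows   = magic(4);            % 4 storms, 4 columns
shocksRows = [10 20; 30 40];      % 2 arrivals, 2 columns
% Hypothetical result of the matching loop: [row in SET1, row in SET2]
id = [1 1; 3 2];

% One row per matched pair: SET1 columns followed by SET2 columns
matched = [sohoRows(id(:,1),:), shocksRows(id(:,2),:)];
disp(size(matched))
```

Note that t1 and t2 are built from rows 4:end, so the indices in id refer to soho(4:end,:) and shocks(4:end,:), not to the full matrices.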

8 Comments

Mohamed Nedal on 19 Jul 2018

Okay, I'll do that.

Thank you so much for your help.

I'll use this code in the analysis phase of my paper and I was thinking of acknowledging you if you don't mind.

If it's okay with you, please send me your information such as the first name, the last name, the organization, the specialty, and the email address.

jonas on 19 Jul 2018

That's really kind of you, I'm flattered. However, this is fairly standard stuff, so there is absolutely no need for me to take up space in your paper. I'm a final-year PhD student myself and, although I have never asked a question here, this forum has helped me a ton throughout my work. I'm just happy to give something back.

Btw, as a final note. Since this is for a scientific paper, don't forget that if the error is larger than 11 hours, then it's not included as a match. Also double-check why the "matched" number of entries is larger than the total number of entries in SET2 (I think it's about 550 matches compared to 450 unique entries in SET2).

Good luck in your work and let me know if you need more help!
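One way to carry out that double-check is to count how often each SET2 row appears in the second column of id. The sketch below uses made-up index pairs and gaps (id and gapHrs here are hypothetical, not the real results). Keeping only the closest SET1 storm per SET2 row is just one possible rule, and whether it is appropriate is a decision for the analysis.

```matlab
% Hypothetical index pairs [row in SET1, row in SET2] and their time gaps
id     = [1 1; 2 1; 3 2; 5 3; 6 3];
gapHrs = [4; 9; 2; 10; 1];        % hypothetical |predicted - observed| in hours

% Count how often each SET2 row was chosen
counts   = accumarray(id(:,2), 1);
repeated = find(counts > 1);      % SET2 rows matched more than once

% One possible rule: keep only the closest SET1 storm for each SET2 row
[~, order] = sortrows([id(:,2) gapHrs]);       % sort by SET2 row, then by gap
sortedId   = id(order,:);
[~, first] = unique(sortedId(:,2), 'first');   % first (closest) entry per SET2 row
idUnique   = sortedId(first,:);
disp(idUnique)
```

In this sample, SET2 rows 1 and 3 are each matched twice, and the rule keeps SET1 storms 1, 3, and 6.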
