How to match two different matrices
Hello, I have two matrices of different sizes, and this is what the scenario looks like:
x = [...];
y = [...];
% size(x) is 5800-by-16
% size(y) is 450-by-14
% X & Y have dates & times in the first six columns in this form:
% year, month, day, hour, minute, second
% Each column represents a variable
% Each row represents a data sample
% A model to predict a variable in (X) after some time
...
X_time + some_time = predicted_time; % in hours
% "X_time" is the time of (X)
% "Y_time" is the time of (Y)
% Match that predicted time with the time of (Y) within a range of +/- 11 hours
for i = 1:length(x)
for j = 1:length(y)
if (predicted_time >= Y_time-11) && (Y_time+11 >= predicted_time) is True
MATCHED = [x(i,:) y(j,:) predicted_time];
end
end
end
I would like to know how to make this work. I have tried several approaches, but none of them worked properly.
10 Comments
jonas
on 17 Jul 2018
Edited: jonas
on 17 Jul 2018
So, what you need to do is:
- Loop through each storm in SET2
- Calculate the corresponding time until it reaches the location of SET1
- Find the storm that is closest in time to this value
Right?
These are simple steps, and it seems to me that this is almost what Albert Fan proposed some comments ago. If you provide some sample data to work with, I'm sure someone will give you code now that the problem is clearly stated. I guess we also need your model though, unless you can provide the modelled time-slots.
After this discussion, the initial code actually makes some sense :)
Accepted Answer
jonas
on 18 Jul 2018
Edited: jonas
on 18 Jul 2018
I've made an attempt to fix your code and match the two time-vectors. I've converted your time-vectors to datetime format and fixed your matching-algorithm. The matching works by looping through the modelled time-vector, which is based on the longer time-vector (SET1), and finding the closest match in the smaller time-vector (SET2). A match is only stored if the absolute difference is smaller than 11 hours.
The output, id, is a matrix with two columns, where each row [id1 id2] holds a pair of matched indices, i.e. the row of SET1 and the corresponding row of SET2.
NOTE: id has more rows than SET2, which indicates that some elements of SET2 are matched more than once.
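To illustrate the output format on a small scale, here is a minimal sketch of the same matching rule on toy data. The dates and transit times below are made up for illustration and are not taken from the spreadsheets; the variable names set1, set2, transit and tol are likewise only placeholders.

```matlab
% Toy data (hypothetical values, not from the original spreadsheets).
% Each row is a date vector: year, month, day, hour, minute, second.
set1 = [2018 7 1  0 0 0;
        2018 7 1  6 0 0;
        2018 7 5  0 0 0];
set2 = [2018 7 3  2 0 0;
        2018 7 9 12 0 0];

t1 = datetime(set1);             % 3x1 datetime
t2 = datetime(set2);             % 2x1 datetime
transit = hours([48; 40; 30]);   % assumed model output (transit times)
t_model = t1 + transit;          % predicted arrival times

tol = hours(11);                 % matching window
id = nan(numel(t_model), 2);
for i = 1:numel(t_model)
    % closest SET2 time to the i-th predicted time
    [minDiff, ind] = min(abs(t2 - t_model(i)));
    if minDiff < tol
        id(i, :) = [i ind];
    end
end
id(isnan(id(:, 2)), :) = [];     % drop rows without a match
disp(id)
```

With these numbers, rows 1 and 2 of set1 are both matched to row 1 of set2 (differences of 2 and 4 hours), while row 3 has no match within 11 hours. This is the kind of double match the NOTE above refers to.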
%%Read Data
soho = xlsread('start.xlsx'); % initial storms data
shocks = xlsread('end.xlsx', 1); % final storms data
%% Start Time
%% EDITED %%
t1 = datetime(soho(4:end,1:6));   % SET1 times built from the six date columns
t2 = datetime(shocks(4:end,1:6)); % SET2 times built from the six date columns
%% ORIGINAL CODE %%
%% Parameters
% CMEs
CPA = soho(4:end,7);
w = soho(4:end,8);
vl = soho(4:end,9);
vi = soho(4:end,10);
vf1 = soho(4:end,11);
v20Rs = soho(4:end,12);
a1 = soho(4:end,13);
mass = soho(4:end,14);
KE = soho(4:end,15);
MPA = soho(4:end,16);
% Shocks
vfinal = shocks(4:end,11);
T =shocks(4:end,13);
N = shocks(4:end,14);
%% Initial Values
AU = 149599999.99979659915; % Sun-Earth distance in km
d = 0.76 * AU; % cessation distance in km
%%Pre-Allocating Variables
a_calc = zeros(size(vl));
squareRoot = zeros(size(vl));
A = zeros(size(vl));
B = zeros(size(vl));
ts = zeros(size(vl));
t_hrs = zeros(size(vl)); % predicted transit time in hours
t_mn = zeros(size(vl)); % predicted transit time in minutes
%%G2001 Model
% calculations
for i = 1:length(vl)
    % -1e-3 is the same value as power(-10,-3), written more explicitly
    a_calc(i) = -1e-3 * ((0.0054*vl(i)) - 2.2); % in km/s^2
    squareRoot(i) = sqrt(power(vl(i),2) + (2*a_calc(i)*d));
    A(i) = (-vl(i) + squareRoot(i)) / a_calc(i);
    B(i) = (AU - d) / squareRoot(i);
    ts(i) = A(i) + B(i);     % in seconds
    t_mn(i) = ts(i) / 60;    % in minutes
    t_hrs(i) = ts(i) / 3600; % in hours
end
clear i;
t_model=t1+minutes(t_mn);
%%Show the predicted travel time
% CME-ICME Matching
%%EDITED FROM HERE AND ON %%
id = nan(numel(t_model),2);
for i = 1:numel(t_model)
    % closest SET2 time to the i-th modelled time
    [MinDiff, ind] = min(abs(t2 - t_model(i)));
    if MinDiff < hours(11)
        id(i,1:2) = [i ind];
    else
        id(i,1:2) = [i NaN];
    end
end
id(isnan(id(:,2)),:) = []; % remove rows without a match
plot(id(:,1),id(:,2),'.')
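The original question asked for a combined matrix of the form [x(i,:) y(j,:) predicted_time]. A rough sketch of how id could be used for that is shown below. The arrays x and y are toy stand-ins for soho(4:end,:) and shocks(4:end,:) with made-up values, and the names matched and rowsInRaw are only illustrative. Since a datetime cannot be concatenated with a numeric matrix directly, the predicted time is appended here as six date-vector columns; a table would be another option.

```matlab
% Toy stand-ins for the numeric data after dropping the first three rows,
% i.e. what soho(4:end,:) and shocks(4:end,:) would hold (values are made up).
x = [2018 7 1  0 0 0 450;
     2018 7 1  6 0 0 600;
     2018 7 5  0 0 0 380];
y = [2018 7 3  2 0 0 520;
     2018 7 9 12 0 0 410];
t_model = datetime(x(:, 1:6)) + hours([48; 40; 30]); % assumed predictions
id = [1 1; 2 1];                                     % assumed match list

% Side-by-side rows of both sets, with the predicted time appended
% as six date-vector columns so the result stays a numeric matrix.
matched = [x(id(:, 1), :), y(id(:, 2), :), datevec(t_model(id(:, 1)))];

% The indices in id refer to the trimmed arrays (4:end), so the rows
% in the untrimmed soho and shocks arrays are offset by 3.
rowsInRaw = id + 3;
```

Each row of matched then holds one SET1 sample, its matched SET2 sample, and the predicted time, in that order.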
8 Comments
jonas
on 19 Jul 2018
That's really kind of you, I'm flattered. However, this is fairly standard stuff, so there is absolutely no need for me to take up space in your paper. I'm a final-year PhD student myself and, although I have asked zero questions, this forum has helped me a ton throughout my work. I'm just happy to give something back.
One final note: since this is for a scientific paper, don't forget that a pair is not included as a match if the time difference is 11 hours or larger. Also double-check why the number of "matched" entries is larger than the total number of entries in SET2 (I think it's about 550 matches compared to 450 unique entries in SET2).
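One way to run that double-check is sketched below. The match list and the time differences are hypothetical numbers, and keeping only the closest SET1 row per SET2 row is just one possible policy, not necessarily the right one for the paper.

```matlab
% Hypothetical match list in the same [id1 id2] layout as above,
% plus the absolute time difference of each match in hours.
id      = [1 1; 2 1; 4 2; 7 3; 8 3];
diffHrs = [2; 4; 1; 9; 3];

% Count how many SET1 rows were matched to each SET2 row.
[set2Rows, ~, grp] = unique(id(:, 2));
counts   = accumarray(grp, 1);
repeated = set2Rows(counts > 1);  % SET2 rows matched more than once

% One possible policy: keep only the closest SET1 row per SET2 row.
[~, order] = sortrows([id(:, 2) diffHrs]); % sort by SET2 row, then by difference
sortedId   = id(order, :);
[~, firstIdx] = unique(sortedId(:, 2), 'first');
idUnique   = sortedId(firstIdx, :);
```

In this toy case SET2 rows 1 and 3 are each matched twice, and idUnique keeps the pairs [1 1], [4 2] and [8 3]. In the real script, the time differences could be collected inside the matching loop by storing MinDiff alongside ind.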
Good luck in your work and let me know if you need more help!