How to create a new parameter in one table based on multiple observations in a second table?

2 views (last 30 days)
Hi everyone,
I have two tables:
  1. Basic patient info, incluing a numericar identifier, in which each patient has one row.
  2. Daily patient data, in which each patient (with the same ID) has multiple rows, each one for each day of observation. Coloum 1 is the ID, 2 is the date and 3 and on are events as true/false.
I am trying to create a new parameter in table 1 that will ask if the patient had an event (for example any of the rows of patient 1 on column 3 = true), and a second parameter that will have the earliest date in which that event happened.
I tryed different approached including splitapply and for loop, but cannot get the correct input.
Would really appreciate the help!
Thanks.
  2 Comments
Athrey Ranjith Krishnanunni
Could you explain more about what you mean by "creating a parameter" in a table, in the context of "asking if any of the rows of patient 1 on column 3 = true"?
Do you mean a function that accepts the patient ID and column number of event as input, and returns the date as the output?
Ofer Sadan
Ofer Sadan on 7 Jan 2021
Not exactly, please see Ive's response below. The stucture of the output table is exactly as I meant, apart from filtering out the event=0 cases.

Sign in to comment.

Accepted Answer

Ive J
Ive J on 6 Jan 2021
Edited: Ive J on 6 Jan 2021
Let's call you first able tabc and the latter tabd. What you bascially need is to first select patients with true events, and then keep only the earliest event (with minimum event date); finally, you wanna merge your baseline table (tabc) with this filtered table:
% Patients' characteristics
tabc = table((1:4)', randi([10, 90], 4, 1), randi([23, 40], 4, 1),...
'VariableNames', {'id', 'age', 'bmi'});
tabc =
4×3 table
id age bmi
__ ___ ___
1 16 24
2 72 37
3 83 29
4 53 28
% prepare events table
Y = randi([2015, 2020], 10, 1);
M = randi([1, 12], 10, 1);
D = randi([1, 31], 10, 1);
tabd = table(randi([1, 4], 10, 1), datetime(Y, M, D), ...
randi([0, 1], 10, 1), 'VariableNames', {'id', 'date', 'event'});
tabd =
10×3 table
id date event
__ ___________ _____
3 18-Sep-2019 1
1 16-Jul-2015 0
3 28-May-2015 1
1 25-Jan-2019 1
1 23-Oct-2018 1
3 02-May-2018 0
2 03-Aug-2019 0
3 03-Sep-2019 0
3 25-Feb-2019 1
3 01-Mar-2016 1
% filter events table based on true events only
tabd_filtered = groupfilter(tabd, {'id', 'date'}, @(x) x > 0, 'event');
% keep only the minimum event date for each patient.
tabd_filtered = groupsummary(tabd_filtered, {'id', 'event'}, @min, 'date');
tabd_filtered.Properties.VariableNames = ...
replace(tabd_filtered.Properties.VariableNames, 'fun1_', '');
% merge filtered events table with patients' charactersitics.
final_tab = join(tabd_filtered, tabc);
final_tab.GroupCount = []; % useless column
final_tab =
2×5 table
id event date age bmi
__ _____ ___________ ___ ___
1 1 23-Oct-2018 16 24
3 1 28-May-2015 83 29
  3 Comments
Ive J
Ive J on 7 Jan 2021
Edited: Ive J on 7 Jan 2021
Yes, see groupfilter help for more info. For instance, the example above only keeps those with a true event; but if you wanna keep those only with false events, just change the filtering method:
tabd_filtered = groupfilter(tabd, {'id', 'date'}, @(x) x < 1, 'event'); % or @(x) x == 0

Sign in to comment.

More Answers (0)

Categories

Find more on Tables in Help Center and File Exchange

Products


Release

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!