Unable to Join Tables and Timetables
Show older comments
I am trying to join two tables and timetables, but I keep getting the following error:
Error using tabular/join (line 144)
The key variable for the right table must have unique values.
Error in JoinTables (line 21)
tabC = join(tabA,tabB); %Trying to join tables
This happens also for the time table line. I thought this was because it wasn't able to identify the proper key variable to index by and so I tried using:
tabC = join(tabA,tabB,'Time',tabA(1,'Time'));
To force it to use the time column, but apparently it's an "invalid parameter name" and so can't recognize it?
Error using tabular/join (line 114)
Invalid parameter name: Time.
So, I tried creating table A, without a time column to get rid of the uniqueness issue, but then it just says there is a common variable to use as a key variable:
Cannot find a common table variable to use as a key variable.
Therefore, it seems like this is a closed loop issue that keeps circling back on itself. This is very frustrating. What am I doing wrong here?
Code:
Input = '2023-01-01 20:14:58';
TimeIn = datetime(Input,'InputFormat','yyyy-MM-dd HH:mm:ss');
datA = [1; 2; 3;];
timA = [TimeIn; TimeIn; TimeIn];
datB = [4; 5; 6];
timB = [TimeIn; TimeIn; TimeIn];
tabA = table(timA,datA);
tabB = table(timB,datB);
tabA = renamevars(tabA,'timA','Time');
tabA = renamevars(tabA,'datA','Data_A');
tabB = renamevars(tabB,'timB','Time');
tabB = renamevars(tabB,'datB','Data_B');
ttabA = table2timetable(tabA);
ttabB = table2timetable(tabB);
%tabC = join(tabA,tabB,'TimA',tabA(1,'TimA')); %Trying to join tables
tabC = join(tabA,tabB); %Trying to join tables
ttabC = join(ttabA,ttabB); %Trying to join time tables
ttabC = synchronize(ttabA,tabB); %Trying to sync time tables
5 Comments
the cyclist
on 7 Mar 2023
It might be easier for you to upload your actual data (or at least a representative sample), so that we do not end up solving a toy example that does not actually help with your data.
I think you may be misunderstanding a few important aspects of joining tables, which mimics this capability in databases:
- The idea that a key variable (or set of variables), by definition, uniquely identifies rows. This is, at least partly, why you were getting the first error. (This requirement of using the simplest syntax for join is stated in the documentation.)
- The idea that to join tables, they must share variable names
- The difference between join and innerjoin, which allows for one-to-many joins, and I think may be what you actually intend (but I'm not sure).
I think uploading a snippet of your actual data, and what you expect the output of the join to be, would be helpful It looks like @Sulaymon Eshkabilov's answer might be helpful regarding my second point above.
John
on 8 Mar 2023
Note that your desired output table has repeated timestamps, which as the cyclist points out means that you do not have unique key values, which means that you are not doing a JOIN operation.
"I then tried explcitly specifying what the key variable names were when calling the join function"
The JOIN documentation currently lists these optional parameter names: "Keys", "LeftKeys", "RightKeys", "LeftVariables", "RightVariables", and "KeepOneCopy". I do not see "Time" in that list. If you want to specify the keys, then use the "Keys" parameter name together with the desired values (i.e. key names), just as the documentation explains and as the documentation examples show.
"When you say they must share variable names, does this mean that all the dimension names must align?"
It is unclear what you mean by "dimension names", I cannot find this term in the MATLAB table documentation.
"I was trying to essentially concatenate the two tables side by side using their "Time" dimension as the key variable since they share the same time indexing."
Horizontal concatenation is easy with your toy examples:
Time = datetime(2023,[1;1;1],1,20,14,58);
datA = [1;2;3];
datB = [4;5;6];
ttA = timetable(Time,datA)
ttB = timetable(Time,datB)
ttZ = [ttA,ttB]
If the tables have different numbers of rows then clearly you will need some kind of JOIN operation, but you would need to consider the fact that your timestamps are duplicated (and ergo are NOT unique, which is a prerequisite for the LHS table in a JOIN operation). This is not a MATLAB problem, but a conceptual problem that you need to understand and figure out first: your statement "they share the same time indexing" is something that does not help: given lots of repeated timestamps, how is MATLAB supposed to uniquely indentify the rows to match if they cannot be uniquely identified? Perhaps you would need e.g. to specify an extra column/variable, e.g. a incremental integer which resets for each timestamp.
Solve that first, then the JOIN/INNERJOIN/whatever is easy.
Lets start with a simple question: do you want to match the rows of the two tables by:
- location (i.e. row 1 matches with row1, row 2 matches with row 2, etc)
- data value (i.e. a unique set of key values in the LH table which correspond to key values in the RH table)
- Or a combination of these, e.g. a timestamps may be duplicated, but for any one timestamp you wish to keep the same relative row order?
"It definitely would be easier and more efficient to upload the actual data to work with, but unfortunately I cannot share that "
Replace your top-secret data with some random numbers and then upload it. Toy examples are rarely representative.
John
on 14 Mar 2023
Accepted Answer
More Answers (0)
Categories
Find more on Tables in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!
