Why does Matlab enforce uniqueness in join

36 views (last 30 days)
Alex
Alex on 6 Apr 2016
Commented: Walter Roberson on 23 May 2021
For example
table1 = table([1 1 2 2].', [0 0 0 0].');
table2 = table([1 1 2 2].', [1 1 1 1].');
table2.Properties.VariableNames{2} = 'Var3';
join(table1, table2)
we get:
>> join(table1, table2)
Error using table/join (line 111)
The key variable for B must have unique values.
I don't find this behaviour desirable as you are able to do a join when the key variables are not unique in SQL.
Is there a reason why MW decided to enforce this condition?

Answers (2)

the cyclist
the cyclist on 6 Apr 2016
I can't speak to the reasoning behind the behavior of the join command.
I can suggest that the innerjoin command probably does what you expect.
The reasoning might be in the documentation. (I did not look carefully, or give it any deep thought.)
  3 Comments
Alex
Alex on 6 Apr 2016
I guess you can use a combination of innerjoin and outerjoin to get around this problem.
Matias Andina
Matias Andina on 17 Jun 2018
I had a similar problem and solved it using
outerjoin(x, y, 'Type', 'Left', 'MergeKeys', true)

Sign in to comment.


r r
r r on 23 May 2021
The problem I used the same method and it appears to me NUN ???
T1 = readtable('R.txt','ReadVariableNames',false)
T1 = 300×1 table
Var1 ____ 628 983 1150 1614 2075 2221 2417 4220 4416 4536 4877 5299 5671 6011 6118 6563
T2 = readtable('ShellE.txt','ReadVariableNames',false)
T2 = 4276×7 table
Var1 Var2 Var3 Var4 Var5 Var6 Var7 ____ ____ ____ ______ _______ _______ _______ NaN 69 1206 4.9234 1196.1 -2111.8 431.3 NaN 69 1206 517.61 1259.6 -2185.8 469.27 NaN 143 1382 1329.2 -544.62 1452.1 1998.1 NaN 143 1382 117.42 -546.61 1446.1 2000.3 NaN 298 1283 67.872 2206.7 -285.53 1120.2 NaN 298 1283 427.68 2240.5 -347.8 1138.5 NaN 396 242 22.419 2005.5 257.67 1367.8 NaN 396 242 398.82 2082.5 228.77 1459.8 NaN 423 1019 20.747 273.18 -2041.5 -1333.9 NaN 423 1019 104.6 298.38 -2000.9 -1341.9 NaN 423 601 143.89 432.74 -1530.1 -1918.1 NaN 585 163 93.046 -1583.1 21.189 1918.9 NaN 628 1308 268.32 297.31 2062.7 1359.5 NaN 691 191 51.558 753.84 -1515 1748.4 NaN 691 191 89.18 771.4 -1515.9 1748.3 NaN 743 1217 268.83 -2166 1038.2 605.69
C=outerjoin(T1, T2, 'Type', 'Left', 'MergeKeys', true)
C = 300×7 table
Var1 Var2 Var3 Var4 Var5 Var6 Var7 ____ ____ ____ ____ ____ ____ ____ 628 NaN NaN NaN NaN NaN NaN 983 NaN NaN NaN NaN NaN NaN 1150 NaN NaN NaN NaN NaN NaN 1614 NaN NaN NaN NaN NaN NaN 2075 NaN NaN NaN NaN NaN NaN 2221 NaN NaN NaN NaN NaN NaN 2417 NaN NaN NaN NaN NaN NaN 4220 NaN NaN NaN NaN NaN NaN 4416 NaN NaN NaN NaN NaN NaN 4536 NaN NaN NaN NaN NaN NaN 4877 NaN NaN NaN NaN NaN NaN 5299 NaN NaN NaN NaN NaN NaN 5671 NaN NaN NaN NaN NaN NaN 6011 NaN NaN NaN NaN NaN NaN 6118 NaN NaN NaN NaN NaN NaN 6563 NaN NaN NaN NaN NaN NaN
  1 Comment
Walter Roberson
Walter Roberson on 23 May 2021
You need to use 'LeftKeys' and 'RightKeys' because you are not joining on the variable names that are shared between them (such as 'Var1')

Sign in to comment.

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!