Finding unique values in a table without losing the rest of the data
    57 views (last 30 days)
  
       Show older comments
    
    Philipp Henschel
 on 30 Nov 2017
  
    
    
    
    
    Commented: Elizabeth Reese
    
 on 24 Apr 2019
            I'm working on a table right now, where the rows are equal except for one column, like this (Departure column has "updated" daytimes):
 ID = [153; 153; 153; 153; 124; 124; 124; 211; 211];
Start = {'A';'A';'A';'A';'A';'A';'A';'A';'A'};
End = {'B';'B';'B';'B';'B';'B';'B';'B';'B'};
Departure = datetime({'2014-06-30 12:12'; '2014-06-30 12:15'; '2014-06-30 12:14'; '2014-06-30 12:16'; '2014-06-30 14:32'; '2014-06-30 14:30'; '2014-06-30 14:31'; '2014-06-30 17:31'; '2014-06-30 17:42'});
A =  table(ID, Start, End, Departure)
Now I'm trying to use the unique feature to reduce the table, so that I only have each ID once, but without losing the rest of the information of the table, like i would if i just enter
 C = unique(A.ID);
This way I receive a 3x1 table/vector and the rest of the information is lost.
The Documentation says for this case (I asume): C = unique(A,vars) returns a dataset that contains only one observation for each unique combination of values for the variables in A specified in vars. vars is a positive integer, a vector of positive integers, a variable name, a cell array containing one or more variable names, or a logical vector. C includes all variables from A. The values in C for the variables not specified in vars are taken from the last occurrence among observations in A with each unique combination of values for the variables specified in vars.
Using this explanation:
 C = unique(A, ID);
I received following Error Message: Unrecognized variable name 'Adaptor'.
Please help, if you have any ideas to solve my problem. Thanks
3 Comments
  Stephen23
      
      
 on 30 Nov 2017
				"Now I'm trying to use the unique feature to reduce the table, so that I only have each ID once, but without losing the rest of the information of the table, "
Please explain what you expect to happen with the rest of the data. How would you compress >1 rows into just one row?
Accepted Answer
  Elizabeth Reese
    
 on 5 Dec 2017
        If you want to keep the data from the rows with unique IDs, you can do that using the indexes that unique returns.
In your case,
 [C,IA,IC] = unique(A.ID);
 B = A(IA,:)
You can specify the setOrder or occurrence in unique to determine which Departure date is selected and the order of the IDs in the new table.
2 Comments
  QIAO WANG
 on 24 Apr 2019
				Hi, thank you so much. I benefit a lot from this answer. Besides, I've got another issue when I play with setOrder and occurrence. 
By default, [C,IA,IC] = unique(A.ID); will use 'sorted' and 'first'. However, I want to use 'stable' and 'last' at the same time so that I can get the last occurance of that element in the same order as in the original data. But I got an error, which makes me confused because I assume this is very basic. Sorry, I may explain my question unclearly. I'll show an example here.
>> A
A =
  9×4 table
    ID     Start    End         Departure      
    ___    _____    ___    ____________________
    153     'A'     'B'    30-Jun-2014 12:12:00
    153     'A'     'B'    30-Jun-2014 12:15:00
    153     'A'     'B'    30-Jun-2014 12:14:00
    153     'A'     'B'    30-Jun-2014 12:16:00
    124     'A'     'B'    30-Jun-2014 14:32:00
    124     'A'     'B'    30-Jun-2014 14:30:00
    124     'A'     'B'    30-Jun-2014 14:31:00
    211     'A'     'B'    30-Jun-2014 17:31:00
    211     'A'     'B'    30-Jun-2014 17:42:00
 What I want to obtain is like this (manually made this table). I assumed I could get this table by simply using [C,IA,IC] = unique(A.ID,'stable','last'); B = A(IA,:); However, it didn't work. "You cannot specify 'stable' and 'sorted' with 'first' and 'last'."
   B =
  3×4 table
    ID     Start    End         Departure      
    ___    _____    ___    ___________________
    153     'A'     'B'    30-Jun-2014 12:16:00
    124     'A'     'B'    30-Jun-2014 14:31:00
    211     'A'     'B'    30-Jun-2014 17:42:00
So, can I ask is there any way that can figure this out? I've got a very complicated method which I don't think is efficient. 
  Elizabeth Reese
    
 on 24 Apr 2019
				Please refer to this other MATLAB Answers post regarding "stable" and "last", as specifiying both the "setOrder" and "setOccurence" is not currently supported. There is a workaround on this post and we have recorded the enhancement request for future consideration.
More Answers (1)
  Peter Perkins
    
 on 19 Dec 2017
        It's pretty hard to tell what you mean by, "without losing the rest of the information of the table", but an alternative to Amy's suggestion would be to use rowfun, with ID as the grouping variable, and somehow combine all the data from each set of rows that share the same ID.
0 Comments
See Also
Categories
				Find more on Tables in Help Center and File Exchange
			
	Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!