Error using Unique for table to remove duplicate rows which consists of mixed data types

Dear Experts,
I have a table with mixed data types. I try to remove duplicate rows according to column 1 (string) and column 2 (integer).
ie Inputs: Table X:
Ticker Date Price
AAPL 20160201 100
IBM 20160202 20.1
AAPL 20160202 90
AAPL 20160201 200
Desired output
Ticker Date Price
AAPL 20160201 200
AAPL 20160202 90
IBM 20160202 20.1
My code
[~,b] = unique(X(:,[1 2]), 'rows');
tmp = X(b,:);
but I have this error.
Error using cell/unique (line 85) Input A must be a cell array of strings.
Error in pre_process (line 12)
Thanks

 Accepted Answer

Is your table really a table? Or is it a cell array? The error you're getting implies the latter. As the error message implies, the unique function only works on cell arrays of strings, not of mixed data types. But the table method unique should work fine:
tmp = {...
'Ticker' 'Date' 'Price'
'AAPL' 20160201 100
'IBM' 20160202 20.1
'AAPL' 20160202 90
'AAPL' 20160201 200}
unique(tmp(2:end,:);, 'rows');
Result:
Error using cell/unique (line 85)
Input A must be a cell array of strings.
But:
tbl = cell2table(tmp(2:end,:), 'variablenames', tmp(1,:));
[a,b] = unique(tbl(:,1:2), 'rows')
a =
Ticker Date
______ _________
'AAPL' 2.016e+07
'AAPL' 2.016e+07
'IBM' 2.016e+07
b =
1
3
2

4 Comments

Hi Kelly,
X = curs.Data;
X = array2table(X, 'VariableNames' ,{'ticker', 'date', 'price'});
X = sortrows(X, [1 2]);
[~,b] = unique(X(:,[1 2]), 'rows');
X(b,:)
Yr solution works. Mistake is I used array2table. When I use cell2table, it passed! Indeed expert
Do you know why array2table couldnt work but cell2table work?
In order to use the 'rows' option, all the columns in your table need to be either cell arrays of strings or numeric arrays.
When you use array2table, each of the numeric elements keep their cell array wrapper. So, in your example, tbl.Date would be a 4 x 1 cell array, with each cell holding a single numeric value. If instead you use cell2array, the contents of each cell are added to the table, meaning numeric values are now longer wrapped in a cell, and tbl.Date is a 4 x 1 double array. Which of these two functions is appropriate depends on how you plan to use your table data; in your case, the latter is better.

Sign in to comment.

More Answers (0)

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!