MATLAB Answers

Replacing values in a table

260 views (last 30 days)
I have a table created from a database query using T = select(conn,selectquery); Missing database entries for numeric integer types are read as -2147483648 for long integer types (int32) and -32768 (int16) for short integer types. I would like to replace these with NaN throughout the entire table. If I use I=ismissing(T,[-2147483648 -32768]) I returns logical array with all the table indices where there are missing values as 1=true, in other words, this works. However, if I try to use T = standardizeMissing(T,[-32768 -2147483648]) or any variation, the values are not replaced with NaN as the command would suggest should happen.

  3 Comments

dpb
dpb on 11 Apr 2018
I've yet to fully parse the input description and behavior of ismissing, standardizeMissing and friends, but it appears to me that the standardizeMissing list as shown in all the examples is a "one-at-a-time" operation for each variable type such that in the list of multiple values is not interpreted as two values in the table to substitute but two types with a value each.
See what happens if use
T = standardizeMissing(T,-32768)
T = standardizeMissing(T,-2147483648)
If those work, I'd revert to using isMissing output as the logical addressing vector and just do the assignment.
If they work also I think it's worthy of a support request asking for clarification on syntax and whether are supposed to be able to do in a single step.
Bruce MacWilliams
Bruce MacWilliams on 11 Apr 2018
Thanks for the reply. Your suggestion was my initial try, and what I was referring to when I mentioned "any variation" of the standardizeMissing() command. So, in short, using the "one-at-a-time" approach does not solve this issue.
dpb
dpb on 11 Apr 2018
OK, I see Peter already identified what was going to be my second thought (that didn't come to me until after the posting, I'll admit)

Sign in to comment.

Accepted Answer

Peter Perkins
Peter Perkins on 11 Apr 2018
Are these actual integer types, and not just doubles with integer values in them? Because you can't put NaNs into int16 and int32. standardizeMissing completely ignores integer types:
>> T = table([1;-32768;3;4],[5;6;-2147483648;8])
T =
4×2 table
Var1 Var2
______ ___________
1 5
-32768 6
3 -2.1475e+09
4 8
>> T = standardizeMissing(T,[-32768 -2147483648])
T =
4×2 table
Var1 Var2
____ ____
1 5
NaN 6
3 NaN
4 8
>> T = table(int16([1;-32768;3;4]),int32([5;6;-2147483648;8]))
T =
4×2 table
Var1 Var2
______ ___________
1 5
-32768 6
3 -2147483648
4 8
>> T = standardizeMissing(T,[-32768 -2147483648])
T =
4×2 table
Var1 Var2
______ ___________
1 5
-32768 6
3 -2147483648
4 8

  2 Comments

Bruce MacWilliams
Bruce MacWilliams on 11 Apr 2018
Yes, that is the case, they are int data types, and I was wondering about that but the documentation seems to suggest that int types are acceptable:
Data Types: double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char | string | cell | table | timetable | categorical | datetime | duration
Is there a suggested work around? Convert the columns to a different data type first?
Peter Perkins
Peter Perkins on 13 Apr 2018
I'll make a note to tweak the doc.
There is no "work-around" per se because integer types just don't have the notion of a NaN value. The thing to do is to convert to either 1) floating point, maybe single if memory is an issue 2) categorical, if these are really categorical data (categorical has undefined analogous to NaN)

Sign in to comment.

More Answers (1)

David Fletcher
David Fletcher on 11 Apr 2018
tab =
10×2 table
C1 C2
__ ___________
1 -32768
2 9
3 3
4 2
5 1
6 -2.1475e+09
7 7
8 9
9 1
10 2
standardizeMissing(tab,{-32768,-2147483648},'DataVariables',{'C2'})
ans =
10×2 table
C1 C2
__ ___
1 NaN
2 9
3 3
4 2
5 1
6 NaN
7 7
8 9
9 1
10 2

  2 Comments

David Fletcher
David Fletcher on 11 Apr 2018
See the answer below - these weren't expressly cast to ints so would have been doubles. I still have a nasty habit of typing what I assume is an int and forgetting that Matlab defaults to a double type.
Bruce MacWilliams
Bruce MacWilliams on 11 Apr 2018
David: Thanks for your assistance, when I first saw this your post I tried it, but it did not work and I'm pretty sure I had already attempted the 'DataVariables' argument. Conversion of the table column/variable from int to double solves the problem, so that is what I will do.

Sign in to comment.

Sign in to answer this question.

Products