sscanf not reading signed hex strings

It seems that sscanf is always returning saturated unsigned interpretations of hex strings.
text='0xFF884433';
a=sscanf(text,'%i',Inf)
b=sscanf(text,'%x',Inf)
a returns a saturated int32 value (2^31-1) and b returns a saturated uint32 value (2^32-1). Both are positive, and both are wrong: the result for the signed integer specifier %i with hex interpretation (0x at the beginning) should be -7846861.
Of course the problem can be solved with double(typecast(uint32(sscanf(text, '%x')),'int32')) but I guess this is just a workaround. Is there a possibility to get the right signed numbers from the start?
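The workaround above can be wrapped in a small helper. This is a minimal sketch, assuming the string holds exactly one 32-bit value and that sscanf with '%x' returns the unsaturated value 4287120435 for this input, which the workaround relies on. The name hex2int32 is hypothetical and not a MATLAB built-in.

```matlab
% hex2int32 is a hypothetical helper name, not a MATLAB built-in.
% Read the hex string as an unsigned value, then reinterpret the same
% 32 bits as a two's complement int32 and return it as a double.
hex2int32 = @(s) double(typecast(uint32(sscanf(s, '%x')), 'int32'));

text = '0xFF884433';
a = hex2int32(text)          % -7846861

% Equivalent arithmetic form: values at or above 2^31 wrap around by 2^32.
u = sscanf(text, '%x');
b = u - 2^32 * (u >= 2^31)   % -7846861
```

The arithmetic form makes the two's complement rule explicit, while the typecast form leaves the bit pattern untouched and only changes how it is interpreted.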

1 Comment

I don't know the C rules well enough to say, but like you, I read the description in the MATLAB documentation as saying that the '%i' form should return a signed value. I think the problem lies in what MATLAB decides the class of the return value is, whereas in C one can specify it. The doc for the output of fscanf has the following:
If the format includes:
Only numeric specifiers, A is numeric. If format includes only 64-bit signed integer specifiers, A is of class int64. Similarly, if format includes only 64-bit unsigned integer specifiers, A is of class uint64. Otherwise, A is of class double.
I think the problem is the size of int64: the input is positive there, and afaict there's no way to force a 32-bit assumption in the interpretation with sscanf.
Unfortunately, one still gets the same result if one extends to 64 bits and uses the '%li' form -- the class is returned as int64, but Matlab still interprets as unsigned.
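For 64-bit data, the same reinterpretation idea used for 32 bits may carry over to the unsigned 64-bit specifier. A minimal sketch, assuming '%lx' returns the full 64-bit pattern as uint64 without saturation or loss of precision. The input string is a made-up example: the 32-bit value from the question, sign-extended to 16 hex digits.

```matlab
% Hypothetical 64-bit input: 0xFF884433 sign-extended to 16 hex digits.
text64 = '0xFFFFFFFFFF884433';

% '%lx' returns class uint64 (assumed here to carry the exact bit pattern).
u64 = sscanf(text64, '%lx');

% Reinterpret the same 64 bits as a two's complement int64.
s64 = typecast(u64, 'int64')
```

This still requires the caller to know the intended width in advance, since sscanf itself has no specifier that states it.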
I've had similar difficulties in the past with signed integers and trying to beat Matlab into submission -- I have occasionally resorted to writing a Fortran mex function.
As stated initially, I don't know whether the behavior mimics C or not. If it doesn't, I'd venture to call it a bug; if it does, then that'll be the TMW position, too, I'd wager.
So, no answer, just my observations...


Answers (1)

MATLAB always treats hex input as non-negative.

8 Comments

dpb on 5 Mar 2014
Edited: dpb on 6 Mar 2014
Walter, do you happen to know if Matlab behavior for %[l]i is consistent w/ C, hence from whence TMW derived the behavior? It just seems, well, wrong that one can't read a negative integer trivially...
ADDENDUM:
I did finally find a copy of draft C89 -- it says
"x Matches an optionally signed hexadecimal integer, whose format is the same as expected for the subject sequence of the strtoul function with the value 16 for the base argument. The corresponding argument shall be a pointer to unsigned integer."
I guess that means that a "conforming" program isn't allowed to even try to return a signed integer, if C is like Fortran in that it's the programmer's responsibility not to violate such constraints.
ADDENDUM 2:
That's not the pertinent section...it's '%[l]i' w/ a 0x prefix that's the problem -- I'll have to go back to the Standard and dig some more.
"i Matches an optionally signed integer, whose format is the same as expected for the subject sequence of the strtol function with the value 0 for the base argument. The corresponding argument shall be a pointer to integer."
Note the base value of 0 is not a typo: with base 0, strtol infers the base from the prefix, so '0x' selects hexadecimal and a leading '0' selects octal. There's nothing I can find that specifically negates keeping the sign when converting a valid form of a constant, which is earlier specified as allowing the '0x' prefix. So, my conclusion is "it's broke". :)
In MATLAB, the optional sign can appear in the input, such as '-0x1234'
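For illustration, the signed form mentioned here would be read as follows. This is a sketch of the stated behavior, assuming '%i' applies the leading minus sign to the parsed hex magnitude.

```matlab
% A leading minus sign negates the parsed hex magnitude.
v = sscanf('-0x1234', '%i')   % 0x1234 is 4660, so this should give -4660

% The same value from the magnitude alone, negated explicitly.
w = -sscanf('0x1234', '%x')
```

This covers text written as a negated positive value. It does not cover a raw two's complement bit pattern such as '0xFF884433'.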
The optional sign is not an option, as it would require an input format that is not 2's complement.
I haven't seen the non-negative hex interpretation in the documentation. It seems the %i specifier for hex is in this case a copy of %x with added saturation. I would never have guessed this behaviour from the documentation. I'll wait some more before closing this question...
dpb on 6 Mar 2014
Edited: dpb on 6 Mar 2014
Hmmm, Walter...That's kewl for the form written as a negated positive value but doesn't help for the internal representation form.
I guess in the end I'm coming to the conclusion the C-folk just think there aren't supposed to be negative hex representations. :(
Once upon a time, many years ago, I had a serial A/D device which output a 4-byte hex string. At the time I was using Forth, so I hadn't come across this exactly until I started seeing questions like the OP's here.
There is also the %lx specifier for hex returning uint64
No help for the OP's problem, though, as it returns the same value as '%x' does for the given input, just as class uint64 instead of the default double.
>> class(sscanf('0xFF884433','%x'))
ans =
double
>> class(sscanf('0xFF884433','%lx'))
ans =
uint64
>> sscanf('0xFF884433','%x')==sscanf('0xFF884433','%lx')
ans =
1
>>
I think I said that ;-)
Oh, I thunk you were implying something different in evaluation rather than just class...sorry, my bad.


Asked: 5 Mar 2014
Commented: dpb on 6 Mar 2014
