Textscan won't read the dates with spaces

D on 15 Aug 2017
Edited: D on 23 Aug 2017
textscan fails when the date elements are separated by spaces, yet the same format works with the datetime function. Consider the following code:
textscan('1959 05 21','%{yyyy MM dd}D') % Doesn't work when there are spaces
datetime('now','Format','yyyy MM dd') % same format works with datetime function
textscan('1959-05-21','%{yyyy-MM-dd}D') % Works when the space is replaced with a non-letter character
textscan('1959 05 21','%{yyyy MM dd}D','whitespace','') % Works when whitespace is set to none
textscan('1959 05 21 567','%{yyyy MM dd}D%d','whitespace','') % Doesn't work
How can I make the last line of the code work?
Thanks
PS: See Walter Roberson's and per isakson's comments on the accepted answer.
  1 Comment
per isakson on 15 Aug 2017
Edited: per isakson on 17 Aug 2017
I don't think the last line of the code can be made to work. It seems that MATLAB cannot handle the double use of the space: as part of the date format and, at the same time, as the list separator before the integer.
I assume that your problem is not to parse the string, but to find the limits of textscan.

Accepted Answer

Jeremy Hughes on 16 Aug 2017
Hi Per,
The issue is that textscan's delimiter is space by default. Parsing happens first, then datatype conversion. In this case, you're getting
"1959 05 21" -> "1959","05","21"
and trying to convert each of these into its own datetime. This is a pretty common source of confusion.
The trick to parsing this correctly is to supply 'Delimiter' to textscan. Try:
textscan('1959 05 21','%{yyyy MM dd}D','Delimiter',',')
Hope this helps, Jeremy
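A minimal sketch of the "parse first, convert second" behaviour described above. It uses %s instead of %D purely to expose the raw fields that textscan produces before any datatype conversion; the variable names are made up for illustration.

```matlab
% Illustration only: %s shows the fields textscan splits out before conversion.
str = '1959 05 21';

byDefault = textscan(str, '%s');                    % default: fields split on whitespace
byComma   = textscan(str, '%s', 'Delimiter', ',');  % comma is the only field delimiter

nDefault = numel(byDefault{1});   % three fields: '1959', '05', '21'
nComma   = numel(byComma{1});     % one field: '1959 05 21'
```

With the comma as delimiter the whole date reaches the converter as a single field, which is why the '%{yyyy MM dd}D' specifier can then match it.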
  2 Comments
Walter Roberson on 16 Aug 2017
Note that the attempts
textscan('1959 05 21 567','%{yyyy MM dd}D%d','whitespace','')
or
textscan('1959 05 21 567','%{yyyy MM dd}D%d','Delimiter',',')
will not work, and
textscan('1959 05 21 567','%{yyyy MM dd}D%*[ ]%d','whitespace','')
will not work either. In each of those cases, the entire group 1959 05 21 567 is grabbed and passed to datetime for parsing. textscan parsing is greedy that way, just as it is for numeric fields:
>> textscan('123456', '%d4%d')
ans =
1×2 cell array
{[123456]} {0×1 int32}
>> textscan('123456', '%d%*[4]%d')
ans =
1×2 cell array
{[123456]} {0×1 int32}
>> textscan('123456', '%d%d','delimiter','4')
ans =
1×2 cell array
{[123456]} {0×1 int32}
>> textscan('123e56', '%de%d')
ans =
1×2 cell array
{[2147483647]} {0×1 int32}
>> textscan('123e56', '%d%*[e]%d')
ans =
1×2 cell array
{[2147483647]} {0×1 int32}
>> textscan('123e56', '%d%d','delimiter','e')
ans =
1×2 cell array
{[2147483647]} {0×1 int32}
>> textscan('123e56', '%d%d','whitespace','e')
ans =
1×2 cell array
{[2147483647]} {0×1 int32}
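One way to sidestep the greedy %D parsing, sketched here under the assumption that the date always consists of three numeric parts (year, month, day) followed by an integer: read every part as a plain number and assemble the datetime afterwards. This is not taken from the thread; the names c, dt and n are hypothetical.

```matlab
% Sketch: avoid %D entirely and build the datetime from numeric fields.
str = '1959 05 21 567';

c  = textscan(str, '%f %f %f %d');                        % year, month, day, trailing integer
dt = datetime(c{1}, c{2}, c{3}, 'Format', 'yyyy MM dd');  % assemble the date from its parts
n  = c{4};                                                % the integer that follows the date
```

The trade-off is that the date format is no longer validated by textscan itself; an out-of-range month or day is only handled when datetime assembles the value.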
per isakson on 16 Aug 2017
Edited: per isakson on 22 Aug 2017
Had the task been to read and parse the string, '1959 05 21 567', I would have tried
cac = textscan('1959 05 21 567','%10c%d');
dt = datetime( cac{1},'Format','yyyy MM dd');
Note: this doesn't work with (a varying number of) leading spaces.
>> textscan(' 1959 05 21 567','%12c%d')
ans =
'1959 05 21 5' [67]
The conversion specifier %c lets me decide how many characters to read (leading whitespace excepted), and these two lines are surprisingly efficient (%c is cheap).
However, with 'whitespace','' it works:
>> textscan(' 1959 05 21 567','%12c%d', 'Whitespace','')
ans =
' 1959 05 21' [567]
>> textscan(' 1959 05 21 567 890','%12c%d%d', 'whitespace','')
ans =
' 1959 05 21' [567] [890]
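If the number of leading spaces varies, one alternative to a fixed-width %c is to split the string with a regular expression before converting. This is only a sketch, assuming the date is always four digits, two digits, two digits separated by single spaces; the pattern and variable names are illustrative and not from the thread.

```matlab
% Sketch: split date and integer with regexp, tolerating leading whitespace.
str = '  1959 05 21 567';

tok = regexp(str, '^\s*(\d{4} \d{2} \d{2})\s+(\d+)', 'tokens', 'once');
dt  = datetime(tok{1}, 'InputFormat', 'yyyy MM dd', 'Format', 'yyyy MM dd');
n   = str2double(tok{2});   % trailing integer, returned as a double
```

Unlike the %c approach, this does not depend on the column position of the date, at the cost of a regexp call per string.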