MATLAB Answers


Extracting consecutive digits using regexp

Asked by Hau Kit Yong on 22 Jun 2019
Latest activity Commented on by per isakson
on 22 Jun 2019
I expected
regexp('ITEM 123', '.+(\d+)', 'tokens')
to return '123'. Why does it only return '3'? What would be the correct expression?


1 Answer

Answer by per isakson
on 22 Jun 2019
Edited by per isakson
on 22 Jun 2019
 Accepted Answer

Both of these return '123':
%%
cac = regexp('ITEM 123', '.+?(\d+)', 'tokens' )
%%
cac = regexp('ITEM 123', '[^\d]+(\d+)', 'tokens' )
In your expression, '.+' is greedy: it first matches everything up to the end of the text, then backtracks and gives back only as much as is needed for '(\d+)' to match, which is a single digit.
'.+?' matches as little as needed so that '(\d+)' is able to match the following text.
I prefer '[^\d]+(\d+)'
Or why not just
cac = regexp('ITEM 123', '\d+', 'match' )
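To compare the variants side by side, here is a minimal sketch that runs the original expression and the three alternatives on the same input. The variable names are only illustrative. Note that 'tokens' returns a cell array of cell arrays, while 'match' returns a cell array of character vectors, so the indexing differs.

```matlab
str = 'ITEM 123';

% Greedy: '.+' grabs the whole string, then backtracks one character
% at a time until '(\d+)' can match, leaving only the last digit.
greedy = regexp(str, '.+(\d+)', 'tokens');

% Lazy: '.+?' grows one character at a time and stops as soon as
% '(\d+)' can match, so the token captures all three digits.
lazy = regexp(str, '.+?(\d+)', 'tokens');

% Negated class: '[^\d]+' cannot consume digits at all.
negated = regexp(str, '[^\d]+(\d+)', 'tokens');

% No token needed: match the run of digits directly.
matched = regexp(str, '\d+', 'match');

disp(greedy{1}{1})    % 3
disp(lazy{1}{1})      % 123
disp(negated{1}{1})   % 123
disp(matched{1})      % 123
```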

  2 Comments

Many thanks! For the first expression, what does the '?' character do? I've only seen it in lookaround operations, but always in the form of '?=', '?<=' etc. and never by itself.
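Search for "Quantifiers" and "Lazy" on the Regular Expressions documentation page. A '?' placed directly after a quantifier such as '+' or '*' makes that quantifier lazy, so it matches as few characters as possible.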
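As a small illustration of the two roles of a bare '?', the sketch below contrasts a greedy and a lazy quantifier, and then uses '?' on its own as a "zero or one" quantifier. The test strings and variable names are made up for this example.

```matlab
% '?' after a quantifier makes it lazy (match as few as possible).
greedyMatch = regexp('123', '\d+',  'match', 'once');   % '123'
lazyMatch   = regexp('123', '\d+?', 'match', 'once');   % '1'

% '?' after a single element means "zero or one" of that element.
optional = regexp('color colour', 'colou?r', 'match');  % {'color','colour'}
```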
