MATLAB Answers

Regular expression: how to search for a sequence of one number alternated with 0s?

Serbring on 11 Aug 2021
Commented: Serbring on 17 Aug 2021
Hi all,
I need to craft a regular expression. I want to find any pattern in which runs of a single digit (e.g., 1 or 2) alternate with runs of 0s. So, I am searching for patterns like the following:
'10111'
'22200002'
'300333'
How can I do it?
Thank you.
Best regards.
MM
  1 Comment
Walter Roberson on 12 Aug 2021
Does the sequence always end with the same number? For example, would 300330000 be valid? Would 300334 be valid on the grounds that it starts with an alternating sequence, and then anything can come after that?
Is "no alternation" also valid? For example, 222 by itself, which is "a sequence of twos, followed by zero repetitions of (0's and 2's)"?
Is the pattern only ever at the beginning of the string, or could it be inside the string as well? If it is inside then that answers the first two questions.
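To make these questions concrete, here is a minimal sketch of one strict reading: the whole string must be an alternation that starts and ends with the same nonzero digit. The anchors `^` and `$`, the `[1-9]` class, and the variable names are assumptions chosen for illustration, not requirements stated by the asker.

```matlab
% Assumed strict reading: the entire string must alternate between runs of
% one nonzero digit and runs of zeros, and must end with that digit.
pat = '^([1-9])\1*(0+\1+)+$';
candidates = {'300333', '300330000', '300334', '222'};
isValid = false(size(candidates));
for k = 1:numel(candidates)
    % With 'once', regexp returns the start index of the first match,
    % or an empty array when there is no match.
    isValid(k) = ~isempty(regexp(candidates{k}, pat, 'once'));
    fprintf('%s -> %d\n', candidates{k}, isValid(k));
end
```

Under this reading only '300333' passes. Dropping the anchors allows the pattern to sit anywhere inside a longer string, which is the situation the last question asks about.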


Accepted Answer

DGM on 12 Aug 2021
Well, I'm certainly not the one to ask about regex. I thought I had this nailed down, but had to resort to this to get it to work in MATLAB. I bet it could be simpler.
vec = '5ad3515505546151g5454651333300342511324sgfb1565440444532152331450005534563asdf445341536404334400044453';
C = regexp(vec,'(\d)\1*0+\1+','match')
C = 1×6 cell array
{'55055'} {'3333003'} {'440444'} {'500055'} {'404'} {'44000444'}
(\d) matches and captures any numeric digit
\1* matches zero or more instances of the captured digit
0+ matches one or more zeros
\1+ matches one or more instances of the captured digit
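As a quick check, the same pattern can be run against the three example strings from the question. This is only a sketch; the variable names are chosen here for illustration.

```matlab
% Check the pattern against the three examples given in the question.
pat = '(\d)\1*0+\1+';
examples = {'10111', '22200002', '300333'};
matched = cell(size(examples));
for k = 1:numel(examples)
    % 'once' returns the first match as a char vector rather than a cell array
    matched{k} = regexp(examples{k}, pat, 'match', 'once');
    fprintf('%s -> %s\n', examples{k}, matched{k});
end
```

Each example is matched in full, since every one of them is a single digit run, a zero run, and then the same digit again.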
  1 Comment
Serbring on 17 Aug 2021
Thank you so much. Is it possible to use tokens in lookahead/lookbehind expressions? I tried the following, but it does not work. I have read on the web that certain regular expression engines cannot handle tokens in such expressions, and I was not able to do it.
C = regexp(S,'(?<=(\d)\1*)0+(?=(\1)+)','match')
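If the goal of the lookarounds is to return only the zero runs, one way to sidestep the question is to capture the zeros in a second token and request 'tokens' instead of 'match'. This is a sketch under that assumption; it says nothing about whether tokens are supported inside lookarounds, and the test string and variable names are hypothetical.

```matlab
% Capture the digit in token 1 and the zero run in token 2, with no lookarounds.
S = '55055x3333003';                        % hypothetical test string
tok = regexp(S, '([1-9])\1*(0+)\1+', 'tokens');
% tok is a cell array with one cell per match, each holding {digit, zeros}
zeroRuns = cellfun(@(t) t{2}, tok, 'UniformOutput', false);
disp(zeroRuns)
```

Here `zeroRuns` holds '0' and '00'. Note that this version captures only one zero run per match, so a string such as '4404440044' would need the repeated form of the pattern and further processing.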


More Answers (1)

Stephen on 12 Aug 2021
Edited: Stephen on 12 Aug 2021
S = '5ad3515505546151g545460000051333300342511324sgfb15654404440044532152331450005534563asdf4453415364043344004044453';
C = regexp(S,'(\d)\1*(0+\1+)+','match')
C = 1×7 cell array
{'55055'} {'0000'} {'3333003'} {'4404440044'} {'500055'} {'404'} {'440040444'}
Note that \d also matches the zero character. If your definition of "one number" excludes zero, then use this instead:
C = regexp(S,'([1-9])\1*(0+\1+)+','match')
C = 1×6 cell array
{'55055'} {'3333003'} {'4404440044'} {'500055'} {'404'} {'440040444'}
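The effect of the repeated group and of excluding zero can be seen on two short strings. This is a minimal sketch; the test strings and variable names are chosen here for illustration.

```matlab
% Single alternation versus repeated alternation on the same input.
patSingle = '([1-9])\1*0+\1+';      % digit run, zero run, digit run
patRepeat = '([1-9])\1*(0+\1+)+';   % zero run + digit run may repeat
s = '4404440044';
a = regexp(s, patSingle, 'match');  % stops after the first alternation: '440444'
b = regexp(s, patRepeat, 'match');  % follows the alternation to the end: '4404440044'

% Effect of \d versus [1-9] on a string whose only repeated digit is zero.
z = '1000';
withZero = regexp(z, '(\d)\1*(0+\1+)+', 'match');   % '000': zero acts as "the digit"
noZero = regexp(z, patRepeat, 'match');             % empty: no nonzero alternation
disp(a), disp(b), disp(withZero), disp(isempty(noZero))
```

The `\d` version matches '000' because the captured digit is itself 0, so a run of zeros satisfies both the zero part and the back-reference.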
  1 Comment
DGM on 12 Aug 2021
See, excluding zero is the kind of obvious thing I expected I'd miss.
I didn't even think about continuing the alternation.


Release

R2021a
