Help to solve a regexp problem

I'm trying to capture comment blocks in a string (char array). The regular expression '%.*(\n|$)' captures the text from % to the end of the line.
regexp(sprintf(' %%this is a comment'),'%.*(\n|$)','match')
ans =
cell
'% this is a comment'
However, what I want to do is capture multiple occurrences of comment blocks in a char array. The expression above returns only one match and fails to return '% and this' as a separate match.
regexp(sprintf(' %%this is a comment\n %%and this\n'),'%.*(\n|$)','match')
ans =
cell
'% this is a comment…'
Could somebody help me with this?
  2 Comments
Stephen23 on 8 Jul 2016
Edited: Stephen23 on 8 Jul 2016
To make writing regular expressions easier, you might like to try using my FEX submission that lets you interactively write and develop regular expressions:
Simply run it to open the interactive figure, then enter your data string, and start playing around with the regular expression. The figure will update and show you the regexp outputs as you change the regular expression. This makes it fast and easy to try different regular expressions, and to check how changes affect the outputs.
PS: This page is very useful too. Lots of information, but worth reading and referring to:
Kouichi C. Nakamura on 8 Jul 2016
This looks really interesting. I'll surely give it a try. It could help me a lot. Thanks!


Accepted Answer

per isakson on 8 Jul 2016
Try this:
>> str = sprintf(' %%this is a comment\n %%and this\n');
>> cac = regexp( str, '[ ]*%[^\n]+', 'match' )
cac =
' % this is a comment' ' % and this'
>> whos cac
Name Size Bytes Class Attributes
cac 1x2 294 cell
  2 Comments
Kouichi C. Nakamura on 8 Jul 2016
Thanks a lot. Your expression also includes the leading spaces (maybe I was not clear enough about what I wanted to capture), but '%[^\n]+' did the job!
>> cac = regexp(str, '%[^\n]+', 'match' )
cac =
1×2 cell array
'% this is a comment' '% and this'
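As a rough sketch of how '%[^\n]+' behaves on a slightly larger input: the sample text src and the variable cacAll below are made up for illustration and are not from the thread. One thing that may be worth knowing is that '+' requires at least one character after the %, so a line holding only a bare % is skipped, while '*' would include it.

```matlab
% Hypothetical sample text (not from the thread):
% a comment line, a trailing comment after code, and a bare '%' line
src = sprintf('%%first\nx = 1; %%trailing\n%%\ny = 2;\n');

% '[^\n]+' needs at least one non-newline character after '%',
% so the bare '%' line is not matched
cac = regexp(src, '%[^\n]+', 'match');
% cac is {'%first', '%trailing'}

% With '*' instead of '+', the bare '%' line is matched as well
cacAll = regexp(src, '%[^\n]*', 'match');
% cacAll is {'%first', '%trailing', '%'}
```

Neither pattern knows about MATLAB syntax, so a % inside a character literal such as a sprintf format would presumably be picked up as well.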
Guillaume on 8 Jul 2016
Note that your regular expression would have worked if you'd used the non-greedy *?:
regexp(sprintf(' %%this is a comment\n %%and this\n'),'%.*?(\n|$)','match')
Nonetheless, per's expression is probably more efficient.
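A minimal sketch of why the greedy version returns a single match, assuming MATLAB's default behaviour in which '.' also matches newline characters. The variable names greedy, lazy and noNewline are only illustrative, and the 'dotexceptnewline' option is an alternative that was not suggested in the thread.

```matlab
str = sprintf(' %%this is a comment\n %%and this\n');

% Greedy: by default '.' matches newlines too, so '.*' runs to the end of str
greedy = regexp(str, '%.*(\n|$)', 'match');
% numel(greedy) is 1: one match spanning both lines

% Lazy: '.*?' stops at the first newline, but that newline is part of the match
lazy = regexp(str, '%.*?(\n|$)', 'match');
% numel(lazy) is 2, and each match ends with a newline character

% Alternative: stop '.' from matching newlines with 'dotexceptnewline'
noNewline = regexp(str, '%.*', 'match', 'dotexceptnewline');
% noNewline is {'%this is a comment', '%and this'}
```

If the trailing newline is unwanted, the last form or the '%[^\n]+' pattern avoids it without any post-processing.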
