How to extract info from string using regexp?

I have a text file with a number of lines such as:
[DEBUG][HoraCorrelationMatrixNode] Polygon 2303 id 857befored
[DEBUG][HoraCorrelationMatrixNode] Polygon 2304 id 88befored
[DEBUG][HoraCorrelationMatrixNode] Polygon 2305 id 930befored
[DEBUG][HoraCorrelationMatrixNode] Polygon 2306 id 1000d
[DEBUG][HoraCorrelationMatrixNode] Polygon 2307 id 1001d
I need to extract both the polygon number (e.g. 2303) and its id (e.g. 857befored) from each line. How can I write a regular expression pattern that captures both pieces of information?
  1 Comment
Peter Valent on 22 May 2019
@Stephen Cobeldick: It has to be a regular expression because the text file also contains lines that are formatted differently. I think that a regular expression is the best solution for this task; I just didn't know how to write the pattern. But thank you for the suggestions.


Accepted Answer

Guillaume on 22 May 2019
This should work:
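% Read the whole file into a single character vector
filecontent = fileread('c:\somewhere\somefile');
% Capture the polygon number (digits) and the id (run of non-whitespace) on every matching line
polyid = regexp(filecontent, 'Polygon (\d+) id (\S+)', 'tokens');
% Stack the 1x2 token cells into an N-by-2 cell array
polyid = vertcat(polyid{:});
% Convert the polygon numbers to doubles; keep the ids as text
result = table(str2double(polyid(:, 1)), polyid(:, 2), 'VariableNames', {'Polygon', 'ID'})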
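
To try the pattern without a file on disk, here is a minimal sketch that applies the same regexp to a few of the sample lines from the question. The variable names `sampleLines` and `tokens` are only illustrative, and the last sample line is a made-up example of a differently formatted line, which the pattern simply skips.

```matlab
% Sample lines from the question, plus one hypothetical line in another format.
sampleLines = { ...
    '[DEBUG][HoraCorrelationMatrixNode] Polygon 2303 id 857befored', ...
    '[DEBUG][HoraCorrelationMatrixNode] Polygon 2304 id 88befored', ...
    '[DEBUG][HoraCorrelationMatrixNode] Polygon 2306 id 1000d', ...
    '[INFO] some other message'};

% Join the lines with newlines to mimic what fileread would return.
filecontent = strjoin(sampleLines, sprintf('\n'));

% Each match yields a 1x2 cell: {polygon digits, id text}.
tokens = regexp(filecontent, 'Polygon (\d+) id (\S+)', 'tokens');
tokens = vertcat(tokens{:});            % N-by-2 cell array of char vectors

polygon = str2double(tokens(:, 1));     % numeric column
id = tokens(:, 2);                      % ids stay text, e.g. '857befored'
result = table(polygon, id, 'VariableNames', {'Polygon', 'ID'})
```

Because `(\S+)` stops at the first whitespace character, the id is captured in full whatever its length, and lines without the `Polygon ... id ...` structure produce no match.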

More Answers (1)

Raghunandan V on 22 May 2019
Since all the lines are of the same format, I would recommend doing something like this:
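str = '[DEBUG][HoraCorrelationMatrixNode] Polygon 2303 id 857befored';
Polygon_num = str(44:47)   % characters 44 to 47 hold the polygon number, '2303'
id_num = str(52:end)       % characters 52 onward hold the id, '857befored'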
This would be easier to implement, provided every line has exactly the same layout.
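
Fixed character positions only hold while every field keeps the same width. As a rough sketch of the same indexing idea without hard-coded positions, the keywords can be located with `strfind` first. The helper variables `pStart` and `idPos` are illustrative names, and the sketch assumes `'Polygon '` and `' id '` each occur exactly once in the line.

```matlab
str = '[DEBUG][HoraCorrelationMatrixNode] Polygon 2303 id 857befored';

% Position of the first character after 'Polygon ' (assumes a single occurrence).
pStart = strfind(str, 'Polygon ') + numel('Polygon ');
% Position of the space that precedes 'id' (assumes a single occurrence).
idPos = strfind(str, ' id ');

Polygon_num = str(pStart:idPos - 1)          % text between 'Polygon ' and ' id '
id_num = str(idPos + numel(' id '):end)      % everything after ' id '
```

This still breaks on lines that lack either keyword, which is one reason a regular expression suits a file with mixed line formats.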

