Split string in two strings

Dion Theunissen on 10 Aug 2022
Commented: Stephen23 on 10 Aug 2022
I have the following string array, and I want to split it into two separate string arrays, as shown below:
STR = ["van Donk","Gerritsen","kooijman","Verliefde","Floré","Pengel","aan de Wiel","van der Hoeven","Hop","Boer","van Ewijk"]
What I want to create is:
STR1 = ["van","","","","","","aan de","van der","","","van"]
STR2 = ["Donk","Gerritsen","kooijman","Verliefde","Floré","Pengel","Wiel","Hoeven","Hop","Boer","Ewijk"]
Can anyone help me?
  2 Comments
Walter Roberson on 10 Aug 2022
Ummm... why? "van der Hoeven" is a complete surname. The surname is not "Hoeven" with "van der" being some kind of middle name. "van der Hoeven" should be sorted under v or V, not under H
Stephen23 on 10 Aug 2022
"The surname is not "Hoeven" with "van der" being some kind of middle name."
The "van der" is not part of the main name; it is a tussenvoegsel, which in Dutch is ignored when sorting, just like "von" and "zu" are ignored in German.
""van der Hoeven" should be sorted under v or V, not under H"
There are differing opinions on this, so the required sort order depends mostly on where your users are from.
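To make the two sort orders concrete, here is a minimal sketch. It assumes the main name is simply the last whitespace-separated word, which will not hold for every surname; the variable names (names, mainName, byMainName, byFullName) are only for illustration and the sample data is a subset of the names in the question.

```matlab
% Sample data: a subset of the names from the question.
names = ["van Donk","Gerritsen","kooijman","aan de Wiel","van der Hoeven","Hop"];

% Assumption: the main name is the last word, so delete everything
% up to and including the last run of whitespace.
mainName = regexprep(names, '^.*\s+', '');

% Sort on the lower-cased main name, ignoring the tussenvoegsel.
[~, idx] = sort(lower(mainName));
byMainName = names(idx);

% For comparison: sort on the lower-cased full name, prefix included.
[~, idy] = sort(lower(names));
byFullName = names(idy);
```

With this data, byMainName lists "van der Hoeven" among the names starting with H, while byFullName lists it among those starting with v.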

Answers (1)

Stephen23 on 10 Aug 2022
Edited: Stephen23 on 10 Aug 2022
str = ["van Donk","Gerritsen","kooijman","Verliefde","Floré","Pengel","aan de Wiel","van der Hoeven","Hop","Boer","van Ewijk"]
str = 1×11 string array
"van Donk" "Gerritsen" "kooijman" "Verliefde" "Floré" "Pengel" "aan de Wiel" "van der Hoeven" "Hop" "Boer" "van Ewijk"
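tkn = regexp(str,'^(\w+\s+)*(\w+)$','tokens','once'); % token 1: any leading words, token 2: the last word
tkn = vertcat(tkn{:}); % stack the per-name tokens into an 11×2 string array
st1 = strtrim(tkn(:,1)) % remove the whitespace captured together with the prefix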
st1 = 11×1 string array
"van" "" "" "" "" "" "aan de" "van der" "" "" "van"
st2 = tkn(:,2)
st2 = 11×1 string array
"Donk" "Gerritsen" "kooijman" "Verliefde" "Floré" "Pengel" "Wiel" "Hoeven" "Hop" "Boer" "Ewijk"
  3 Comments
Walter Roberson on 10 Aug 2022
(.*)\s+(\S+)
What do you want to do if there are spaces after the last word?
Stephen23 on 10 Aug 2022
str = ["van Donk","Gerritsen","kooijman","Verliefde","Floré","Pengel","aan de Wiel","van der Hoeven","Hop","Boer","van Ewijk","in 't veld"];
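tkn = regexp(str,'^(.*?)\s*(\S+)$','tokens','once'); % token 1: lazy prefix (may contain an apostrophe), token 2: the last run of non-space characters
tkn = vertcat(tkn{:}) % one row per name: [prefix, main name]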
tkn = 12×2 string array
    "van"        "Donk"
    ""           "Gerritsen"
    ""           "kooijman"
    ""           "Verliefde"
    ""           "Floré"
    ""           "Pengel"
    "aan de"     "Wiel"
    "van der"    "Hoeven"
    ""           "Hop"
    ""           "Boer"
    "van"        "Ewijk"
    "in 't"      "veld"
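On the question about spaces after the last word: one possible way to handle that, sketched here under the assumption that leading and trailing whitespace carries no meaning, is to strip each name before applying the same pattern. The padded input below is made up for illustration.

```matlab
% Hypothetical input with stray leading/trailing whitespace.
str = ["van Donk  ","  Gerritsen","aan de Wiel ","in 't veld"];

str = strip(str);                                     % drop leading/trailing whitespace
tkn = regexp(str,'^(.*?)\s*(\S+)$','tokens','once');  % same pattern as above
tkn = vertcat(tkn{:});                                % 4×2: [prefix, main name]
st1 = tkn(:,1);                                       % prefixes, "" where there is none
st2 = tkn(:,2);                                       % main names
```

Without the strip call, the `$` anchor would require the last non-space run to end the string, so a name with trailing spaces would not match at all.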

Release

R2022a