split

Split strings at delimiters

Description

newStr = split(str) divides the input text str at whitespace characters and returns the separated text in array newStr. The output newStr does not include the whitespace characters from str.

newStr = split(str,delimiter) divides the input text str at the delimiters specified by delimiter. The output newStr does not include the delimiters from str.

newStr = split(str,delimiter,dim) divides the input text str at the specified delimiters and places the separated text along the dimension specified by dim.

[newStr,match] = split(___) additionally returns an array, match, that contains all occurrences of delimiters at which the split function splits str. You can use this syntax with any of the input arguments of the previous syntaxes.

Examples

Split names in a string array at whitespace characters. Then reorder the strings and join them so that the last names precede the first names.

Create a 3-by-1 string array containing names.

names = ["Mary Butler";
         "Santiago Marquez";
         "Diana Lee"]
names = 3×1 string
    "Mary Butler"
    "Santiago Marquez"
    "Diana Lee"

Split names at whitespace characters, making it a 3-by-2 string array.

names = split(names)
names = 3×2 string
    "Mary"        "Butler" 
    "Santiago"    "Marquez"
    "Diana"       "Lee"    

Switch the columns of names so that the last names are in the first column. Add a comma after each last name.

names = [names(:,2) names(:,1)];
names(:,1) = names(:,1) + ','
names = 3×2 string
    "Butler,"     "Mary"    
    "Marquez,"    "Santiago"
    "Lee,"        "Diana"   

Join the last and first names. The join function places a space character between the strings it joins. After the join, names is a 3-by-1 string array.

names = join(names)
names = 3×1 string
    "Butler, Mary"
    "Marquez, Santiago"
    "Lee, Diana"

Create a string that contains the path to a folder.

myPath = "/Users/jdoe/My Documents/Examples"
myPath = 
"/Users/jdoe/My Documents/Examples"

Split the path at the / character. split returns myFolders as a 5-by-1 string array. The first string is "" because myPath starts with the / character.

myFolders = split(myPath,"/")
myFolders = 5×1 string
    ""
    "Users"
    "jdoe"
    "My Documents"
    "Examples"

Join myFolders into a new path with \ as the delimiter. Add C: as the beginning of the path.

myNewPath = join(myFolders,"\");
myNewPath = 'C:' + myNewPath
myNewPath = 
"C:\Users\jdoe\My Documents\Examples"

Since R2020b

Get the numbers from a string by treating text as a delimiter. Use a pattern to match the text. Then add up the numbers.

First, create a string that has numbers in it.

str = "10 apples 3 bananas and 5 oranges"
str = 
"10 apples 3 bananas and 5 oranges"

Then, create a pattern that matches a space character or letters.

pat = " " | lettersPattern
pat = pattern
  Matching:

    " " | lettersPattern

Split the string using pat as the delimiter. The empty strings represent splits between spaces and sequences of letters that had nothing else between them. For example, in "10 apples", there is a split before the delimiter " ", and then between " " and "apples". Since there is nothing between the delimiters " " and "apples", the split function returns an empty string to indicate there is nothing between them.

N = split(str,pat)
N = 11×1 string
    "10"
    ""
    ""
    "3"
    ""
    ""
    ""
    ""
    "5"
    ""
    ""

Discard the empty strings and keep the substrings that represent numbers.

N = N(strlength(N) > 0)
N = 3×1 string
    "10"
    "3"
    "5"

Finally, convert N to a numeric array and sum over it.

N = str2double(N);
sum(N)
ans = 
18

For a list of functions that create pattern objects, see pattern.

Create a string.

str = "A horse! A horse! My kingdom for a horse!"
str = 
"A horse! A horse! My kingdom for a horse!"

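Split str at exclamation points and at whitespace characters. newStr is a 12-by-1 string array. Each "!" that is immediately followed by a space produces an empty string, "", because there is no text between the two delimiters. The last string is also an empty string because the last character in str is a delimiter.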

newStr = split(str,[" " "!"])
newStr = 12×1 string
    "A"
    "horse"
    ""
    "A"
    "horse"
    ""
    "My"
    "kingdom"
    "for"
    "a"
    "horse"
    ""

Create a string array in which each element contains comma-delimited data about a patient.

patients = ["LastName,Age,Gender,Height,Weight";
            "Adams,47,F,64,123";
            "Jones,,,68,175";
            "King,,M,66,180";
            "Smith,38,F,63,118"]
patients = 5×1 string
    "LastName,Age,Gender,Height,Weight"
    "Adams,47,F,64,123"
    "Jones,,,68,175"
    "King,,M,66,180"
    "Smith,38,F,63,118"

Split the string array. A pair of commas with nothing between them indicates missing data. When split divides on repeated delimiters, it returns empty strings as corresponding elements of the output array.

patients = split(patients,",")
patients = 5×5 string
    "LastName"    "Age"    "Gender"    "Height"    "Weight"
    "Adams"       "47"     "F"         "64"        "123"   
    "Jones"       ""       ""          "68"        "175"   
    "King"        ""       "M"         "66"        "180"   
    "Smith"       "38"     "F"         "63"        "118"   
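Because missing fields come back as empty strings, they convert cleanly to NaN when a column is turned into numbers. The following is a minimal sketch of that follow-on step, not part of the original example; the variable names fields, ages, and meanAge are illustrative only.

```matlab
% Same comma-delimited patient data as in the example above.
patients = ["LastName,Age,Gender,Height,Weight";
            "Adams,47,F,64,123";
            "Jones,,,68,175";
            "King,,M,66,180";
            "Smith,38,F,63,118"];

% Split every row at commas; each row has five fields, so the result is 5-by-5.
fields = split(patients, ",");

% Take the Age column without the header row and convert it to numbers.
% str2double returns NaN for the empty strings that mark missing data.
ages = str2double(fields(2:end, 2));

% Average only the ages that are present.
meanAge = mean(ages, "omitnan");
disp(ages')
disp(meanAge)
```

Here the two missing ages become NaN, so the mean is taken over the two known values only.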

Create a 3-by-1 string array containing names.

names = ["Mary Butler";
         "Santiago Marquez";
         "Diana Lee"]
names = 3×1 string
    "Mary Butler"
    "Santiago Marquez"
    "Diana Lee"

Split the array at whitespace characters. By default, split orients the output substrings along the first trailing dimension with a size of 1. Because names is a 3-by-1 string array, split orients the substrings along the second dimension of splitNames, that is, the columns.

splitNames = split(names)
splitNames = 3×2 string
    "Mary"        "Butler" 
    "Santiago"    "Marquez"
    "Diana"       "Lee"    

To orient the substrings along the rows, or first dimension, specify the dimension after you specify the delimiter. splitNames is now a 2-by-3 string array, with the first names in the first row and the last names in the second row.

splitNames = split(names," ",1)
splitNames = 2×3 string
    "Mary"      "Santiago"    "Diana"
    "Butler"    "Marquez"     "Lee"  

Create a string.

str = "bacon, lettuce, and tomato"
str = 
"bacon, lettuce, and tomato"

Split str on delimiters. Return the results of the split in a string array, and the delimiters in a second string array. When there is no text between consecutive delimiters, split returns an empty string.

[newStr,match] = split(str,["and" "," " "])
newStr = 7×1 string
    "bacon"
    ""
    "lettuce"
    ""
    ""
    ""
    "tomato"

match = 6×1 string
    ","
    " "
    ","
    " "
    "and"
    " "

Join newStr and match back together with the join function.

originalStr = join(newStr,match)
originalStr = 
"bacon, lettuce, and tomato"

Input Arguments

str: Input text, specified as a string array, character vector, or cell array of character vectors. If str is an array with multiple elements, then each element must contain the same number of substrings.

delimiter: Delimiting substrings, specified as one of the following:

  • String array

  • Character vector

  • Cell array of character vectors

  • pattern array (since R2020b)

The substrings specified in delimiter do not appear in the output newStr.

Specify multiple delimiters in a string array, cell array of character vectors, or pattern array. The split function splits str on the elements of delimiter. The order in which delimiters appear in delimiter does not matter unless multiple delimiters begin a match at the same character in str. In that case, the split function splits on the first matching delimiter in delimiter.

Example: split(str,{' ',',','--'}) splits str on spaces, commas, and pairs of consecutive dashes.
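The ordering rule above only matters when two delimiters can start at the same position, as "-" and "--" do. The sketch below uses a made-up input string to contrast the two orderings. The results described after the code are what the stated rule implies, so treat them as an expectation to check rather than as guaranteed output.

```matlab
% Hypothetical input in which "-" and "--" both begin a match at the same character.
str = "a--b-c";

% "-" is listed first, so it wins wherever both delimiters match.
singleFirst = split(str, ["-" "--"]);

% "--" is listed first, so the double dash is consumed as one delimiter.
doubleFirst = split(str, ["--" "-"]);

disp(numel(singleFirst))
disp(numel(doubleFirst))
```

Under that rule, singleFirst should contain an empty string between "a" and "b" (the text between the two single dashes), while doubleFirst should contain only "a", "b", and "c".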

dim: Dimension along which to place the output substrings, specified as a positive integer. If you do not specify dim, then the default is the first trailing dimension (of the input array str) that has a size of 1.

Output Arguments

newStr: Substrings split out of the original array, returned as a string array or cell array of character vectors. If the input str is a string array, then so is newStr. Otherwise, newStr is a cell array of character vectors.

The size of newStr depends on the size of str and the dimension argument dim. If str is an array with multiple elements, then each element must contain the same number of substrings (N). With the default dim, the substrings split from each element of the input str are placed along the first trailing dimension of str that has a size of 1:

  • If str is a string scalar or character vector, then newStr is an N-by-1 string array or cell array of character vectors, where N is the number of substrings.

  • If str is an M-by-1 string array or cell array, then newStr is an M-by-N array.

  • If str is a 1-by-M string array or cell array, then newStr is a 1-by-M-by-N array.

  • If str is an M1-by-M2 string array or cell array, then newStr is an M1-by-M2-by-N array.
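The four cases in the list above can be checked directly by inspecting the size of each result. This is a small sketch with made-up inputs; every element contains the same number of whitespace-separated substrings, as split requires.

```matlab
% String scalar with N = 3 substrings: output is N-by-1.
scalarOut = split("a b c");

% M-by-1 input (M = 2, N = 3): output is M-by-N.
colOut = split(["a b c"; "d e f"]);

% 1-by-M input (M = 2, N = 3): output is 1-by-M-by-N.
rowOut = split(["a b c", "d e f"]);

% M1-by-M2 input (2-by-2, N = 2): output is M1-by-M2-by-N.
matOut = split(["a b", "c d"; "e f", "g h"]);

disp(size(scalarOut))
disp(size(colOut))
disp(size(rowOut))
disp(size(matOut))
```

In the 1-by-M and M1-by-M2 cases, the substrings of one input element lie along the third dimension, so rowOut(1,2,:) holds the pieces of "d e f".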

match: Identified delimiters, returned as a string array or cell array of character vectors. If the input array str is a string array, then so is match. Otherwise, match is a cell array of character vectors.

match always contains one fewer element than output newStr contains.
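As a quick illustration of the output types and of the element-count relationship, the sketch below splits a made-up character vector and a made-up string scalar at the same delimiters. The variable names are illustrative only.

```matlab
% Character vector input: both outputs are cell arrays of character vectors.
[cellParts, cellDelims] = split('red,green;blue', {',', ';'});

% String input: both outputs are string arrays.
[strParts, strDelims] = split("red,green;blue", [",", ";"]);

disp(class(cellParts))    % cell
disp(class(strParts))     % string

% Three substrings are separated by two delimiters in each case.
disp(numel(cellParts) - numel(cellDelims))
disp(numel(strParts) - numel(strDelims))
```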

Tips

  • If str is a string array or cell array of character vectors, and each array element does not contain the same number of substrings, process the elements one at a time in a for-loop, calling split on each element individually.
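A minimal sketch of the loop described in this tip, assuming a made-up string array whose elements contain different numbers of words. A cell array holds the results because the pieces differ in length from one element to the next; the names phrases, parts, and counts are illustrative only.

```matlab
% Elements contain 3, 2, and 1 whitespace-separated substrings, so
% calling split on the whole array at once would raise an error.
phrases = ["one two three"; "four five"; "six"];

% Preallocate one cell per element to hold results of differing lengths.
parts = cell(numel(phrases), 1);
for k = 1:numel(phrases)
    parts{k} = split(phrases(k));   % each result is an N-by-1 string array
end

% Number of substrings found in each element.
counts = cellfun(@numel, parts);
disp(counts')
```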

Version History

Introduced in R2016b