separate -- split a string into substrings

Description

We illustrate several different ways we can separate the following string into substrings.

i1 : s = "This is an example of a string.\nIt contains some letters, spaces, and punctuation.\r\nIt also contains some new line characters.\r\nIn fact, for some reason, both Unix-style\nand Windows-style\r\nnew line characters are present."

o1 = This is an example of a string.
     It contains some letters, spaces, and punctuation.
     It also contains some new line characters.
     In fact, for some reason, both Unix-style
     and Windows-style
     new line characters are present.

The command separate(s) breaks s at every occurrence of "\r\n" or "\n".

i2 : separate(s)

o2 = {This is an example of a string., It contains some letters, spaces, and punctuation., It also contains some new line
     ----------------------------------------------------------------------------------------------------------------------------
     characters., In fact, for some reason, both Unix-style, and Windows-style, new line characters are present.}

o2 : List

This is equivalent to using the lines function.

i3 : lines s

o3 = {This is an example of a string., It contains some letters, spaces, and punctuation., It also contains some new line
     ----------------------------------------------------------------------------------------------------------------------------
     characters., In fact, for some reason, both Unix-style, and Windows-style, new line characters are present.}

o3 : List

Instead of breaking at new line characters, we can specify which character to break at. For instance, we can separate at every comma:

i4 : separate(",", s)

o4 = {This is an example of a string.,  spaces,  and punctuation.                         ,  for some reason,  both Unix-style   
      It contains some letters                  It also contains some new line characters.                    and Windows-style
                                                In fact                                                       new line characters
     ----------------------------------------------------------------------------------------------------------------------------
                 }

     are present.

o4 : List

or at every space:

i5 : separate(" ", s)

o5 = {This, is, an, example, of, a, string., contains, some, letters,, spaces,, and, punctuation., also, contains, some, new,
                                    It                                               It                                      
     ----------------------------------------------------------------------------------------------------------------------------
     line, characters., fact,, for, some, reason,, both, Unix-style, Windows-style, line, characters, are, present.}
           In                                            and         new

o5 : List

In the last two examples we can see line breaks appear in the output substrings, since we are no longer separating at them. (They are printed in the console as actual new lines, not using escape characters.)
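One way to make these embedded line breaks visible is to convert each substring back to its quoted, escaped form. The following is a minimal sketch, assuming toExternalString renders a string with its escape sequences; the short string u and the name w are made up for illustration and are not part of the session above.

```macaulay2
-- u is a hypothetical short string mixing both new line styles
u = "a,b\nc,d\r\ne"
-- separating at commas leaves the line breaks inside the substrings
w = separate(",", u)
-- show each substring in quoted form, so line breaks appear as escape sequences
apply(w, toExternalString)
```

Printing the quoted forms is only a display aid; the substrings in w themselves are unchanged.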

Now let’s try breaking at the string "om". This occurs three times in our string (in three uses of the word "some"), so s is separated into four substrings. The separating characters "om" do not appear in any of the substrings.

i6 : t = separate("om", s)

o6 = {This is an example of a string., e letters, spaces, and punctuation., e new line characters.,
      It contains s                    It also contains s                   In fact, for s         
                                                                                                   
     ----------------------------------------------------------------------------------------------------------------------------
     e reason, both Unix-style       }
     and Windows-style
     new line characters are present.

o6 : List

We can recover the original string using the demark function.

i7 : demark("om", t)

o7 = This is an example of a string.
     It contains some letters, spaces, and punctuation.
     It also contains some new line characters.
     In fact, for some reason, both Unix-style
     and Windows-style
     new line characters are present.

In general, s == demark(x, separate(x, s)). The exception is demark("\n", separate(s)), which is not necessarily equal to s: because separate(s) also breaks at "\r\n", every "\r\n" line break in s comes back as a "\n" character.
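The round trip and its exception can be checked on a small input. This is a sketch under stated assumptions: the string u and the separator "t" are hypothetical choices, and the comments describe the results expected from the behavior documented above rather than captured output.

```macaulay2
-- u is a hypothetical string with one Windows-style and one Unix-style line break
u = "one\r\ntwo\nthree"
-- round trip through an explicit separator: expected to give back u
u == demark("t", separate("t", u))
-- round trip through separate(u): the "\r\n" is expected to come back as "\n"
v = demark("\n", separate(u))
v == u
v == "one\ntwo\nthree"
```

If the original line endings matter, separating at an explicit string keeps them, since that round trip does not normalize anything.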

To separate at a string longer than 2 characters, or for much greater flexibility and control in specifying separation rules, see separateRegexp.
