Documentation for: split.r
Created by: greggirwin
on: 6-May-2012
Format: html
Downloaded on: 28-Mar-2024

REBOL

SPLIT

1. SPLIT

Given an integer as the dlm parameter, SPLIT will break the series up into pieces of that size.

print mold split "1234567812345678" 4
;== ["1234" "5678" "1234" "5678"]

If the series can't be evenly split, the last value will be shorter.

print mold split "1234567812345678" 3
;== ["123" "456" "781" "234" "567" "8"]
print mold split "1234567812345678" 5
;== ["12345" "67812" "34567" "8"]
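The same result can be produced by hand with copy/part and skip, which may help show why the last piece comes out shorter. This is only a sketch: CHUNK is a hypothetical helper name, not part of split.r, and it assumes the size is greater than zero.

```rebol
; Hypothetical helper, not taken from split.r.
; Assumes size > 0; a zero or negative size would loop forever.
chunk: func [series [series!] size [integer!] /local result] [
    result: copy []
    while [not tail? series] [
        ; copy/part stops at the tail, so the final piece may be short
        append/only result copy/part series size
        series: skip series size
    ]
    result
]

print mold chunk "1234567812345678" 5
;== ["12345" "67812" "34567" "8"]
```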

Given an integer as dlm, and using the /INTO refinement, it breaks the series into n pieces, rather than pieces of length n.

print mold split/into [1 2 3 4 5 6] 2
;== [[1 2 3] [4 5 6]]
print mold split/into "1234567812345678" 2
;== ["12345678" "12345678"]

If the series can't be evenly split, the last value will be longer.

print mold split/into "1234567812345678" 3
;== ["12345" "67812" "345678"]
print mold split/into "1234567812345678" 5
;== ["123" "456" "781" "234" "5678"]
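The piece sizes in these examples are consistent with integer division, with the remainder added to the last piece. The arithmetic below is inferred from the examples above, not taken from the split.r source, and the word names are chosen only for illustration.

```rebol
text: "1234567812345678"
count: 3

len: length? text                               ; 16
piece-size: to integer! (len / count)           ; 5, the fraction is dropped
last-size: piece-size + (remainder len count)   ; 5 + 1 = 6

print [piece-size last-size]
; prints: 5 6
```

With a count of 5 the same arithmetic gives pieces of 3 and a last piece of 4, which matches the second example.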

If dlm is a block containing only integer values, those values determine the size of each piece returned. That is, each piece can be a different size.

print mold split [1 2 3 4 5 6] [2 1 3]
;== [[1 2] [3] [4 5 6]]
print mold split "1234567812345678" [4 4 2 2 1 1 1 1]
;== ["1234" "5678" "12" "34" "5" "6" "7" "8"]
print mold split first [(1 2 3 4 5 6 7 8 9)] 3
;== [(1 2 3) (4 5 6) (7 8 9)]
print mold split #{0102030405060708090A} [4 3 1 2]
;== [#{01020304} #{050607} #{08} #{090A}]

If the total of the dlm sizes is less than the length of the series, the extra data will be ignored.

print mold split [1 2 3 4 5 6] [2 1]
;== [[1 2] [3]]

If you have extra dlm sizes after the series data is exhausted, you will get empty values.

print mold split [1 2 3 4 5 6] [2 1 3 5]
;== [[1 2] [3] [4 5 6] []]

If the last dlm size would return more data than the series contains, it returns all the remaining series data, and no more.

print mold split [1 2 3 4 5 6] [2 1 6]
;== [[1 2] [3] [4 5 6]]

Negative values can be used to skip in the series without returning that part:

print mold split [1 2 3 4 5 6] [2 -2 2]
;== [[1 2] [5 6]]
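The size-block rules above can be modeled with a short loop. This is a sketch under assumptions, not the split.r implementation: CUT-BY-SIZES is a hypothetical name, and a size of zero is simply skipped here because the text above does not say how SPLIT treats it.

```rebol
; Hypothetical helper, not taken from split.r.
cut-by-sizes: func [series [series!] sizes [block!] /local result] [
    result: copy []
    foreach n sizes [
        ; positive sizes return a piece; copy/part never reads past the tail
        if n > 0 [append/only result copy/part series n]
        ; positive and negative sizes both advance the position
        series: skip series absolute n
    ]
    result
]

print mold cut-by-sizes [1 2 3 4 5 6] [2 -2 2]
;== [[1 2] [5 6]]
print mold cut-by-sizes [1 2 3 4 5 6] [2 1 3 5]
;== [[1 2] [3] [4 5 6] []]
print mold cut-by-sizes [1 2 3 4 5 6] [2 1 6]
;== [[1 2] [3] [4 5 6]]
```

Because copy/part and skip both stop at the tail, the empty trailing value and the shortened last piece fall out of the loop without special cases.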

Char or any-string values can be used for simple splitting, much as you would with parse/all, but with different behavior for strings that have embedded quotes.

print mold split "abc,de,fghi,jk" #","
;== ["abc" "de" "fghi" "jk"]
print mold split "abc<br>de<br>fghi<br>jk" <br>
;== ["abc" "de" "fghi" "jk"]
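For comparison, this is the parse/all form that the paragraph above refers to, as it behaves under R2. The second line is included only as something to try: R2's simple string parsing is known to treat double-quoted sections specially, so its result can differ from SPLIT on input like this. No result is shown for it here.

```rebol
; R2 only: simple delimiter parsing with parse/all
print mold parse/all "abc,de,fghi,jk" ","
;== ["abc" "de" "fghi" "jk"]

; Input with embedded quotes; compare the result with SPLIT yourself
print mold parse/all {a,"b,c",d} ","
```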

The following are not supported under R2 yet. Ladislav's PARSE enhancements may be used to support them in the future.

http://www.rebol.org/view-archive-script.r?script=parseen.r&version=2

If you want to split at more than one character value, you can use a charset/bitset.

print mold split "abc|de/fghi:jk" charset "|/:"
;== ["abc" "de" "fghi" "jk"]
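Until bitset delimiters are supported under R2, a similar result can be had with an ordinary parse rule. This is a sketch, not part of split.r; the words delims, non-delims, pieces, and piece are illustrative. Note that it drops empty fields between adjacent delimiters, which may or may not match what SPLIT does.

```rebol
delims: charset "|/:"
non-delims: complement delims
pieces: copy []

parse/all "abc|de/fghi:jk" [
    any [
        ; collect a run of non-delimiter characters as one piece
        copy piece some non-delims (append pieces piece)
        ; or step over a single delimiter character
        | delims
    ]
]

print mold pieces
;== ["abc" "de" "fghi" "jk"]
```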

And for even more control, you can use simple parse rules.

print mold split "abc^M^Jde^Mfghi^Jjk" [crlf | #"^M" | newline]
;== ["abc" "de" "fghi" "jk"]
print mold split "abc     de fghi  jk" [some #" "]
;== ["abc" "de" "fghi" "jk"]