[REBOL] Re: Splitting string based on substring separator
From: andreas:bolka:gmx at: 23-Dec-2002 17:38
Sunday, December 22, 2002, 11:33:07 AM, Gabriele wrote:
> If I was to use FIND, I'd do it this way:
> split: func [string delim /local tokens pos] [
> tokens: make block! 32
> while [pos: find string delim] [
> append tokens copy/part string pos
> string: skip pos length? delim
> ]
> append tokens copy string
> ]
Huh, thanks a lot for opening my eyes to even trickier
find/copy/skip interactions :)
> but I'd prefer a PARSE version anyway:
> split: func [string delim /local tokens token] [
> tokens: make block! 32
> parse/all string [
> any [copy token to delim (append tokens token) delim]
> copy token to end (append tokens token)
> ]
> tokens
> ]
My benchmarks showed that this 'parse based version is faster than the
'find based one. However, a small omission makes the two versions
behave differently: the 'parse version inserts 'none tokens wherever
nothing lies between two delimiters (or before the first or after the
last one):
split ":1::2:" ":"
; == [ none "1" none "2" none ]
while the 'find based version inserts empty strings instead (the
latter behaviour matching my original intentions).
So here it is, the slightly improved (and still _very_ fast) 'parse
based split, which handles empty tokens nicely:
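split: func [ string delim /local tokens token ] [
    tokens: make block! 32
    parse/all string [
        ; copy yields none on an empty match; substitute a fresh empty
        ; string (copy "" keeps the empty tokens from sharing one literal)
        any [ copy token to delim (append tokens any [ token copy "" ]) delim ]
        copy token to end (append tokens any [ token copy "" ])
    ]
    tokens
]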
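
A quick check of the empty-token handling, as a sketch that assumes REBOL 2 semantics (where a 'copy inside 'parse sets the word to 'none on an empty match). The definition is repeated so the snippet can be pasted into the console on its own; copying the empty string keeps the empty tokens from all pointing at one shared literal.

```rebol
; definition repeated so the snippet runs on its own
split: func [ string delim /local tokens token ] [
    tokens: make block! 32
    parse/all string [
        any [ copy token to delim (append tokens any [ token copy "" ]) delim ]
        copy token to end (append tokens any [ token copy "" ])
    ]
    tokens
]

; leading, doubled and trailing delimiters all yield empty strings
probe split ":1::2:" ":"     ; ["" "1" "" "2" ""]

; multi-character delimiters work the same way
probe split "a, b, c" ", "   ; ["a" "b" "c"]
```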
> Anyway, I agree with Frantisek Fuka that this should probably be
> done natively in PARSE (the real problem is, finding the right
> name for the refinement!)
I agree with this; /split would look like a straightforward refinement
name to me. I also like Gregg's idea, although the resulting
breakage of existing scripts may outweigh the benefits.
And yes, I'd also like to see a rejoin/with ... I currently work with
something I called 'expand:
expand: func [ tokens delim /local res token ] [
    res: make block! (2 * length? tokens)
    repeat token tokens [
        repend res [ (token) delim ]
    ]
    remove back tail res
    rejoin res
]
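
For anyone who wants to try the joining side on its own, here is a minimal sketch of the same idea written with 'foreach. The name join-with is made up for illustration, and I have only reasoned it through for REBOL 2 and non-empty token blocks.

```rebol
; join-with is a hypothetical name, not an existing REBOL function
join-with: func [ tokens delim /local res ] [
    res: make block! (2 * length? tokens)
    foreach token tokens [
        ; interleave: token, delimiter, token, delimiter, ...
        repend res [ token delim ]
    ]
    remove back tail res    ; drop the trailing delimiter
    rejoin res
]

probe join-with ["a" "b" "c"] ", "        ; "a, b, c"

; empty strings survive, so this undoes the split of ":1::2:"
probe join-with ["" "1" "" "2" ""] ":"    ; ":1::2:"
```

Building the interleaved block first and calling 'rejoin once keeps the loop body trivial; the cost is one temporary block of roughly twice the token count.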
--
Best regards,
Andreas mailto:[andreas--bolka--gmx--net]