Mailing List Archive: 49091 messages

[REBOL] Re: Splitting string based on substring separator

From: andreas:bolka:gmx at: 23-Dec-2002 17:38

Sunday, December 22, 2002, 11:33:07 AM, Gabriele wrote:
> If I was to use FIND, I'd do it this way:
>
>     split: func [string delim /local tokens pos] [
>         tokens: make block! 32
>         while [pos: find string delim] [
>             append tokens copy/part string pos
>             string: skip pos length? delim
>         ]
>         append tokens copy string
>     ]

huh - thanks a lot for opening my eyes to even more tricky find/copy/skip interactions :)

> but I'd prefer a PARSE version anyway:
>
>     split: func [string delim /local tokens token] [
>         tokens: make block! 32
>         parse/all string [
>             any [copy token to delim (append tokens token) delim]
>             copy token to end (append tokens token)
>         ]
>         tokens
>     ]

my benchmarks showed that this 'parse based version is faster than the 'find based one. however, a small omission makes the two versions behave differently - the 'parse version inserts 'none tokens when nothing is between two delimiters

    split ":1::2:" ":"
    ; == [ none "1" none "2" none ]

while the 'find based version inserts empty strings instead (the latter behaviour matching my original intentions).

So here it is, the slightly improved (and still _very_ fast) 'parse based split, which handles empty tokens nicely:

    split: func [ string delim /local tokens token ] [
        tokens: make block! 32
        parse/all string [
            any [
                copy token to delim
                (append tokens any [ token "" ])
                delim
            ]
            copy token to end
            (append tokens any [ token "" ])
        ]
        tokens
    ]

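For illustration, a quick check of the improved version might look like the sketch below. It assumes REBOL/Core 2.x, where `parse/all` is available and `copy` yields `none` for an empty match; the script header title and the test strings are made up for this example, and the function is repeated only so the snippet is self-contained.

```rebol
REBOL [Title: "split check (illustrative only)"]

; Same 'parse based split as above, repeated so the snippet runs on its own.
split: func [string delim /local tokens token] [
    tokens: make block! 32
    parse/all string [
        ; one token per delimiter found; none (empty match) becomes ""
        any [copy token to delim (append tokens any [token ""]) delim]
        ; whatever remains after the last delimiter is the final token
        copy token to end (append tokens any [token ""])
    ]
    tokens
]

probe split ":1::2:" ":"      ; expected: ["" "1" "" "2" ""]
probe split "a, b, c" ", "    ; multi-character delimiter, expected: ["a" "b" "c"]
probe split "abc" ":"         ; no delimiter present, expected: ["abc"]
```

Note that every empty token here is the same literal `""` series from the function body; if the tokens are going to be modified in place afterwards, `copy ""` would be the safer choice.
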
> Anyway, I agree with Frantisek Fuka that this should probably be
> done natively in PARSE (the real problem is, finding the right
> name for the refinement!)

I'd agree with this, /split would look like a straightforward refinement name to me - and I'd also like Gregg's idea, although the resulting breakage of existing scripts may outweigh the benefits.

And yes, I'd also like to see a rejoin/with ... I currently work with something I called 'expand:

    expand: func [ tokens delim /local res token ] [
        res: make block! (2 * length? tokens)
        repeat token tokens [ repend res [ (token) delim ] ]
        remove back tail res
        rejoin res
    ]

--
Best regards,
 Andreas  mailto:[andreas--bolka--gmx--net]