Mailing List Archive: Re: Printf

[REBOL] Re: Printf

From: tim:johnsons-web at: 23-Dec-2003 8:13


Hello List:
    This was sent back to me OTL. Since I think it is a valuable
    contribution on the subject of 'parse and offers advice regarding
    some pitfalls, I'm posting this back to the list for archival
    purposes and adding some comments.

    Really good contribution.
    Thanks Tom
    tj

* Tom Conlin <[tomc--darkwing--uoregon--edu]> [031222 20:38]:
> Hi Tim
>
> I just looked at your parses
> and tho I do enjoy parse I do not consider myself guru.
>
> but I will go ahead and flap my lips and send untested code in hopes
> it helps cause I am apt to become too caught up in festivities soon
> to be of much help
>
> from your first post here are some thoughts of how I would do it
> differently ... if I was going to do it the way you did...
>
> ; Replace "%s" with members of a block. In the spirit
> ; of C, LISP, python format strings
> printf: function [ str[string!] subs[block!] ]
>         [delims non-delims blk txt] ; drop ndx,  let your data index itself
> [
>     delims: charset "%s"  ;;; DANGER sets do not respect ordering
>     non-delims: complement delims

   Not pre-allocating introduces the possibility of high overhead
   in iterative blocks..... Very good point. In my non-parse function,
   I wrap the function in an anonymous object, initialize an 'alloc
   word to a value of 128, and make algorithmic adjustments to that
   size based on the length of the return value (which is a string).
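
   A minimal sketch of that allocation idea, assuming REBOL 2. The name
   printf-alloc, the use of 'find instead of 'parse, the doubling rule and
   the choice to drop surplus delimiters are all illustrative assumptions,
   not the actual function described above:

```rebol
; Hypothetical sketch: an anonymous object keeps 'alloc between calls.
context [
    alloc: 128    ; initial guess for the size of the result string
    set 'printf-alloc func [
        str [string!] subs [block!] /local out pos mark
    ][
        out: make string! alloc
        pos: str
        while [mark: find pos "%s"] [
            insert tail out copy/part pos mark   ; text before the delimiter
            if not tail? subs [                  ; assumption: surplus %s are dropped
                insert tail out form first subs
                subs: next subs
            ]
            pos: skip mark 2                     ; step over "%s"
        ]
        insert tail out pos                      ; text after the last delimiter
        ; adjust the guess when a result outgrew it (doubling is an assumption)
        if alloc < length? out [alloc: 2 * length? out]
        out
    ]
]

print printf-alloc "a %s b %s" ["x" "y"]
```

   Because 'alloc lives in the object rather than in the function, later
   calls start with a buffer sized from earlier results.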

>     ;;;it starts a string and ends a string why a block in the middle?
>     ;;; pre-allocating the correct length is golden
>     ;;; and this result can't get any longer
>     blk: make string! add length? str length? to string! subs

Also, it is arguably not good programming practice to use the 'circular
approach to the 'subs. After all, I've been programming with fixed
replacement lists in C and python for years and even GCC gets grumpy
about tokens and replacement lists being of a different length.
And since REBOL will cheerfully insert a none into a string,
I might end up error trapping any overruns and dumping
the 'circular approach.

That would eliminate the 'guessing'....
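
One way the error-trapping idea could look, as a sketch only and assuming
REBOL 2: count the delimiters first and refuse to run when the count does
not match the length of 'subs. The name printf-strict and the wording of
the error message are hypothetical.

```rebol
; Hypothetical strict variant: no wrapping, a mismatch raises an error.
printf-strict: func [
    str [string!] subs [block!]
    /local delim out x count
][
    delim: "%s"
    ; First pass: count the delimiters so a mismatch is caught up front.
    count: 0
    parse/all str [any [thru delim (count: count + 1)]]
    if count <> length? subs [
        make error! rejoin [
            "printf-strict: " count " delimiter(s), "
            length? subs " substitution(s)"
        ]
    ]
    ; The result cannot be longer than the template plus all substitutions.
    out: make string! add length? str length? to string! subs
    parse/all str [
        any [
            delim (insert tail out form first subs  subs: next subs)
            | copy x to delim (insert tail out x)
        ]
        copy x to end (if x [insert tail out x])  ; trailing text, if any
    ]
    out
]

print printf-strict "Hello %s, you are %s" ["Tim" 42]
```

With matching counts the pre-allocated length is a true upper bound again,
so nothing has to be guessed.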

>     parse/all str[
>             some [
>                     copy txt any non-delims (
> ;;;  use 'any in place of 'some  because  what if str had
> ;;; ...%s%s... ? 'some would fail with nothing between them
>                             insert tail blk txt
> ;;; use "insert tail" instead of 'append; it's a standard speed-up
>                             if not tail? subs[
> ;;; see maw no dangling indexes
>                             	insert tail blk pick subs 1
> ;;; 'pick o n  or 'first o are faster than o/:ndx
>                             	subs: next subs ; iterate over subs
>                             ]) |
>                             delims ; NOT GOOD ... eats a single #"%" or #"s".
>                             ; which in combo with the enclosing 'some will
>                             ; eat the entire string "sssssssssssssssss%%%%%%%%%%%%%%%%"
>                             ; fatal error
>                     ] ; end some
>             ]
>         rejoin blk
> ];end func
>
> ;;; your second function has caught some of the problems from the first.
>
> printf2: func[str [string!] subs [block!] /local delim blk ndx x][
>     delim: "%s"      ; keeps the structure of the delimiter-- way good
>     blk: copy []
>     ndx: 1
>     parse/all str [
>         some [
>             delim
>                 (
>                 append blk subs/:ndx
>                 either ndx < (length? subs)[
>                     ndx: ndx + 1
>                     ][ ; more delims than subs
>                     ndx: 1 ; 'wrap' subs
>                     ]
>                 )
>                 |
> ;;; again not good  ... what is in x?  put a print x  in there and see
> ;;; you are copying the non-delim chars to your blk one char at a time ...
> ;;; because copy (in parse) uses its second arg to determine how much to
> ;;; put in x and 'skip does one char by default...
>
>                 copy x skip (append blk x)
>
>                 ] ; end some
>             ]; end parse
>         rejoin blk
>         ];end func
>
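
  The point about 'copy and 'skip can be checked with a small experiment.
  This is only a sketch, assuming REBOL 2; the words 'chars and 'runs and
  the sample string are made up for illustration.

```rebol
; 'copy takes whatever the rule after it matches; 'skip matches one char.
chars: copy []
parse/all "ab%scd" [any ["%s" | copy x skip (append chars x)]]
probe chars

; 'to "%s" matches everything up to the next delimiter in one step.
runs: copy []
parse/all "ab%scd" [
    any ["%s" | copy x to "%s" (append runs x)]
    copy x to end (if x [append runs x])   ; text after the last delimiter
]
probe runs
```

  The first block collects four one-character strings, the second collects
  the two runs "ab" and "cd".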
> ;;; an improved version,
> ;;; keeps the flavor you had going but should be free of structural errors

  Tom's revisions below:

> printf22: func[str [string!] subs [block!] /local delim blk x][
>     delim: "%s"
>
> ;;; this is not quite as good anymore
> ;;;  with subs being circular final length is not exactly known.
> ;;; (without more work)
>     blk: make string! add length? str length? to string! subs
>
>     parse/all str [
>         any [     ;;; 'any, so a str with no %s at all still reaches "to end"
>             delim
>             (   insert tail blk first subs
>                 if tail? subs: next subs[subs: head subs]; 'wrap' subs
>             )
>             |
>             copy x [to delim] ;;; specify everything to the next delim
>             (insert tail blk x)
>          ] ; end any
>
>          copy x [to end]   (if x [insert tail blk x])
>          ;;; otherwise you lose potential text after the last %s
>          ;;; if %s was last, x comes back as none, so guard it;
>          ;;; otherwise the text "none" gets appended
>    ]; end parse
>    blk
> ];end func
>

  Different approach here: :-) Since I started programming on a
  7.5 MHz 8086 with 640k of RAM, I'd be one to keep this out
  of big loops....

> if I were writing it to begin with I would probably try to go
> more in the direction of replace in place ...
> has the disadvantage of having to change the string's size for each delim
> has the advantage of only needing one copy in memory
>
> ;;; if the input string can be changed
> ;;; or is huge and you don't want two copies...
>
> printf-inplace-change: func [
> 			str[string!] subs[block!]
> 			/local delims len-dem mark
> ][
>         delims: "%s"
>         len-dem: length? delims
>         parse/all str[
>                 some [
>            	to delims
>            	mark:   ; parse at beginning of replacement text
>            	(	change/part :mark first subs len-dem
>            		mark: skip :mark length? to string! first subs
> 			;unless yer subs have embedded delims ...
>            		if tail? subs: next subs[subs: head subs]
> 			; circular replacement block
>            	)
>            	:mark   ; parse at end of replaced text
>            ]
>        ]
>        str
> ]
>
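
  The trade-off Tom describes comes down to how change/part behaves. A
  small sketch, assuming REBOL 2; the words 'template, 'work and 'mark and
  the sample text are illustrative only.

```rebol
template: "Dear %s, your %s is ready"
work: copy template               ; copy first if the original must survive
mark: find work "%s"
mark: change/part mark "Tom" 2    ; replace the 2 chars of "%s" with "Tom"
probe work                        ; the string itself was modified and resized
probe mark                        ; 'change returns the position after the new text
probe template                    ; untouched, because we worked on a copy
```

  Skipping the 'copy is what saves memory, at the price of the caller's
  string being altered on every replacement.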

--
Tim Johnson <[tim--johnsons-web--com]>
      http://www.alaska-internet-solutions.com