[REBOL] [block-access//slicing] [COLLECT] [BOUNDS] [RANGE] [SLICE] [EXCERPT] (was: Nested blocks...)

From: greggirwin::mindspring::com at: 28-Sep-2004 12:36

Hi All,
>>> array[1..$]
>>> array[3..$-5]
>>> etc, where '$' represents the end of the array, and '..' is, in case you
>>> haven't guessed, "all values between the specified values, inclusive".

LM> this is a different cause. Maxim wanted to have such a syntax too, but a
LM> Rebol dialect:
LM> copy-array [-3 - tail head - 2]

I don't have much time for discussion at the moment, but this is an
important topic to me because:

  1) I do think we can come up with some very friendly mechanisms for
     slicing/block-access,
  2) I'd like to see a good REBOLish solution, and
  3) it can improve how we express certain solutions dramatically (IMO).

That said, I've tinkered with some ideas, but felt they weren't ready for
public consumption. They still aren't, but I'm going to post them here
anyway. :) I was going to present them as part of my dialect talk at
DevCon, but there wasn't time.

=== COLLECT

We start with COLLECT. There was some discussion on this type of function
before; mine is just a different approach. A similar function could be a
valuable addition to REBOL (IMO), but we need to decide on the notation it
should use.

Mine works *sort of* like REMOVE-EACH in that you tell it what word you
want to collect in the block you pass it. The difference is that the word
is collected at each point that it occurs as a set-word! in the block. That
is, setting the value of that word collects it into the result. The
examples should make that clear.

Note that this function uses a large number of locals; that's intentional.
It's building up a small vocabulary inside the function to make it easier
to express the behavior. For example, the action that reads:

    if at-marker? [replace-marker]

is like pseudo-code. The goal is to define little pieces that are easy to
reason about instantly and leverage shared information in the operating
context.

    collect: func [ ; a.k.a. gather ?
        [throw]
        {Collects block evaluations.}
        'word "Word to collect (as a set-word! in the block)"
        block [block!] "Block to evaluate"
        /into dest [any-block!] "Where to append results"
        /only "Insert series results as series"
        /local code marker at-marker?
            marker* mark replace-marker rules
    ] [
        block: copy/deep block
        dest: any [dest copy []]
        ; insert+tail pays off here over append.
        ; FIRST BACK allows pass-thru assignment of value. Speed hit though.
        code: compose [first back (pick [insert insert/only] not only) tail dest]
        marker: to set-word! word
        at-marker?: does [mark/1 = marker]
        ; We have to use change/part since we want to replace only one
        ; item (the marker), but our code is more than one item long.
        replace-marker: does [change/part mark code 1]
        marker*: [mark: set-word! (if at-marker? [replace-marker])]
        parse block rules: [any [marker* | into rules | skip]]
        do block
        head :dest
    ]

tests:

    collect zz []
    collect zz [repeat i 10 [if (zz: i) >= 3 [break]]]
    collect zz [repeat i 10 [zz: i if i >= 3 [break]]]
    collect zz [repeat i 10 [either i <= 3 [zz: i][break]]]
    dest: copy []
    collect/into zz [repeat n 10 [zz: n * 100]] dest
    collect zz [for i 1 10 2 [zz: i * 10]]
    collect zz [for x 1 10 1 [zz: x]]
    collect zz [foreach [a b] [1 2 3 4] [zz: a + b]]
    collect zz [foreach w [a b c d] [zz: w]]
    collect zz [repeat e [a b c %.txt] [zz: file? e]]
    iota: func [n [integer!]][collect zz [repeat i n [zz: i]]]
    iota 10
    collect zz [foreach x first system [zz: to-set-word x]]
    x: first system
    collect zz [forall x [zz: length? x]]
    x: first system
    collect zz [forskip x 2 [zz: length? x]]
    collect zz [forskip x 2 [zz: (length? x) / 0]]
    collect/only zz [foreach [a b] [1 2 3 4] [zz: a zz: b zz: reduce [a b a + b]]]
    collect/only zz [
        foreach [a b] [1 2 3 4] [
            zz: a
            zz: b
            zz: reduce [a b a + b]
            foreach n reduce [a b a + b] [zz: n * 10]
        ]
    ]

=== BOUNDS

BOUNDS is kind of ugly right now. I have a number of experimental versions
where I've tried different approaches and dialects. This is the basis for
how we specify the boundaries of a range of values. I broke it out as a
separate function since it can be applied in different contexts (e.g. RANGE
and SLICE).

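To make the idea of a shared boundary pair concrete, here is a minimal sketch. It assumes BOUNDS hands back a two-item block of the form [start end]; the literal PAIR below only stands in for that result, and DATA is an illustrative name that does not appear in the post.

```rebol
; Assumption: a boundary pair looks like [start end].
; The literal below stands in for what BOUNDS would return.
pair: [4 8]
set [a b] pair          ; a is now 4, b is now 8

data: [10 20 30 40 50 60 70 80 90 100]

; A SLICE-style consumer copies the items at positions a through b,
; inclusive, so the length passed to COPY/PART is b - a + 1.
probe copy/part at data a b - a + 1      ; [40 50 60 70 80]
```

A RANGE-style consumer would take the same pair and count from a to b instead of indexing into a series, which is why keeping the boundary parsing in one place looks attractive.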
GETing word values, among other features, isn't in this version of BOUNDS;
part of the design process will be figuring out what the dialect should
allow.

    ; TBD - how to handle an unspecified end value? What to return for it?
    ;   4..  12-
    ; Should time! values be taken as time ranges or should 3:5 be
    ; the same as [3 to 5]?
    ; Should we take a series value so we can get the length of it,
    ; or should we return a special token for the caller to apply?
    ; What about multiple range specs? [1..4 6..9 12..15]
    ; What about other separators
    ;   - (dash)
    ;   (0173 soft hyphen)
    ;   (0150 en dash)
    ;   (0151 em dash)
    bounds: func [ ; a.k.a. bounds
        [catch throw]
        bound [block! number! char! money! time! tuple!] ; a.k.a. spec
        /local a b val val= a-val= b-val= ab-val=
    ] [
        val=: [set val [number! | char! | money! | time! | date!]]
        a-val=: ['from val= (a: val)]
        b-val=: [['to | '.. | '... | '- | '--] val= (b: val)]
        ab-val=: [val= (either a [b: val] [a: val])]
        switch/default type?/word bound [
            block! [
                if not parse bound [some [a-val= | b-val= | ab-val=]] [
                    throw make error! "Boundary values must be one of: number! char! money! time! date!"
                ]
                a: any [a make b either date? b [now] [1]]
                if not b [throw make error! "An ending boundary value must be set"]
            ]
            tuple! [
                if not all [3 = length? bound  0 = bound/2] [
                    throw make error! "tuple! range values must be of the form m..n"
                ]
                a: bound/1
                b: bound/3
            ]
        ] [a: make bound 1  b: bound]
        reduce [a b]
    ]

tests:

    bounds 1..255
    bounds 255..1
    bounds 10
    bounds [3 10]
    bounds [4 to 11]
    bounds [.. 4]
    bounds [12 ..]
    bounds [6 .. 12]
    bounds [7 ... 2]
    bounds [8 - 14]
    bounds [9 -- 5]
    bounds [to 28 from 14]
    bounds compose [.. (now + 4)]
    bounds compose [(now/time) - (now/time + 1:0:0)]
    bounds compose [(now) - (now + 10)]

=== RANGE

RANGE generates a block containing a range of values. Internally it uses
BOUNDS, so you use the same dialect as you would there. It also uses
COLLECT to gather results.

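For comparison, the core behavior (an inclusive range that can run in either direction, with an optional step) can be sketched with a plain loop and no dialect at all. NAIVE-RANGE is a hypothetical name used only for this illustration; it handles integers only and is not the function from this post.

```rebol
; Hypothetical stand-in, integers only: no BOUNDS dialect, no COLLECT.
naive-range: func [
    "Returns a block of integers from start to end, inclusive."
    start [integer!]
    end [integer!]
    /by step [integer!] "Step size; the sign is ignored"
    /local result
] [
    result: copy []
    step: abs any [step 1]
    ; Count downward when the end value is below the start value.
    if end < start [step: negate step]
    for i start end step [append result i]
    result
]

probe naive-range 3 6         ; [3 4 5 6]
probe naive-range 6 3         ; [6 5 4 3]
probe naive-range/by 1 10 3   ; [1 4 7 10]
```

The sketch does not guard against a step of zero, and it leaves out everything the dialect is meant to add, such as char!, time! and date! values and the different ways of spelling the boundaries.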
I posted this version of RANGE, with its dependencies, because, while I
think standalone functions are a terrific goal, we also need to look at the
system as a whole and leverage all we can, building on lower level
functions. That helps shrink the amount of code we need to write. It also
means that we have to think long and hard about how those lower level
functions should work, so they make it easier to express higher level
solutions.

    range: func [
        "Returns a block containing a range of values"
        [catch]
        bound [block! number! char! money! time! tuple!]
            {IMPORTANT: Don't use an upper char! bound of #"ÿ" (255) -- for now}
        /by step ; /skip instead of /by?
        /local a b val
        ; a/b instead of low/high because they may range high to low.
        ; could use start/end as well.
    ] [
        set [a b] bounds bound
        ; If a > b, and they specify a negative step, we don't catch that
        ; and do anything smart, they just get an empty block back.
        step: any [step 1]
        ;make a either date? a [now] [1]
        if b < a [step: negate abs step]
        collect val [for v a b step [val: :v]]
    ]

tests:

    range 1..255
    range 255..1
    range 10
    range [3 10]
    range [4 to 11]
    range [6 .. 12]
    range [8 - 14]
    range [to 28 from 14]
    range compose [.. (now + 4)]
    range compose [(now/time) - (now/time + 1:0:0)]
    range/by compose [(now/time) - (now/time + 1:0:0)] 0:1
    range compose [(now) - (now + 10)]
    range/by compose [(now) - (now + 10)] 3

=== SLICE

Why not just use COPY/PART? In some cases that will be a better choice. The
big difference is what makes the intent clear. With COPY/PART, you're
defining the number of values to copy; with SLICE, you're defining absolute
end-points. As we think about how we want these kinds of functions to work,
we need to consider what it is we need to express.

    ; TBD - consider how to handle 2D ranges
    ; TBD - consider smart evaluation, where dialect terms don't have
    ;       to be escaped.
    ; TBD - /find refinement to allow slicing-by-value
    ; What about multiple range specs? [1..4 6..9 12..15]
    slice: func [
        "Returns part of a series."
        [catch]
        series
        bound [block! number! tuple!]
        /local a b
        ; a/b instead of low/high because they may range high to low.
        ; could use start/end as well.
    ] [
        set [a b] bounds bound
        either a <= b [
            copy/part at series a b - a + 1
        ][
            head reverse copy/part at series a + 1 b - a - 1
        ]
    ]

tests:

    b: [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
    slice b 1..12
    slice b 12..1
    slice b 10
    slice b [3 10]
    slice b [4 to 11]
    slice b [12 .. 6]
    slice b [4 - 8]
    slice b [to 8 from 4]

=== EXCERPT

EXCERPT is an older function that copies values from a block using a
different dialect. I included it here for comparison.

    ; Could also change EXTRACT to accept a block value for WIDTH
    ; (renamed to, e.g., SPEC).
    ; The dialect allows you to use commas in the block, but how they
    ; are interpreted is not how you might think. Coming after a number,
    ; they are a valid lexical form, but they denote a decimal! rather
    ; than being seen as a separator, which means you can't use them too
    ; flexibly.
    excerpt: func [
        {Returns the specified items and/or ranges from the series.}
        series [series!]
        offsets [block!] {Offsets of the items to extract; dialected.}
        /only "return sub-block ranges as blocks"
        /local
            emit emit-range rules
            from* to* index* ; parse vars
            result
    ][
        emit: func [value] [
            either only [append/only result value][append result value]
        ]
        emit-range: func [start end] [
            start: to integer! start
            if number? end [end: to integer! end - start + 1]
            emit either end = 'end [copy at series start][
                copy/part at series start end
            ]
        ]
        rules: [
            some [
                opt 'from set from* number! 'to set to* number! (
                    emit-range from* to*
                )
                | opt 'from set from* number! 'to 'end (emit-range from* 'end)
                | 'to set to* number! (emit-range 1 to*)
                | set index* number! (emit pick series index*)
                | into rules
            ]
        ]
        ; Return a block. Easy enough for them to REJOIN if they want.
        result: make block! length? series
        parse offsets rules
        result
    ]

tests:

    b: [1 2 3 4 5 6 7 8 9 10 11 12 13 14]
    excerpt b [1 3 5]
    excerpt b [1 3 to 6 8]
    excerpt/only b [1, 3 to 6, 8]
    excerpt b [1 [5 to 7] 8]
    excerpt/only b [1 (from 5 to 7) 8]
    excerpt b [(to 2) [4 to 6] 8, 10, from 12 to end]
    excerpt/only b [to 2, 4 to 6, 8, 10, (12 to end)] ; Can't use a comma after 'end
    excerpt/only b [to 2 to 6 8 10 to end 12 to end]
    excerpt/only b [to 2, to 6, 8 [10 to end] 12 to end]
    excerpt/only trim { REBOL is my favorite language } [
        to 5, 10 to 11, 13, 14, 15, 22 to end
    ]
    excerpt/only to binary! {REBOL is my favorite language} [
        to 5, 10 to 11, 13, 14, 15, 22 to end
    ]

=== Thoughts

What about standard functions like COPY and REMOVE; should they support
dialected commands like this?

Given the potential for paren! values in paths, would range values be
useful there as well, or are they problematic because they hide
computational complexity?

What should the BOUNDS dialect look like? What are some examples of how we
would want to use it?

Do we need to consider the concept of generators (e.g. in Icon) and lazy
evaluation?

Let me know where I wasn't clear in this message; I wrote it quickly and
likely didn't explain some things very well.

-- Gregg