coding the closing parentheses correctly

[1/8] from: dougedmunds::gmail at: 30-Apr-2008 16:10

Hello,

I am trying to code output from a string where:

1. 'aN': 'a' represents a container, with 'N' items in it.
2. A container can be an item of another container.

I am using parentheses to show the container. In my example d, e, f, p, q, r, and s are separate items, not atoms 'def and 'pqrs.

For example, this input:

    x: "a2a3defa4pqrs"

should produce this output:

    ((def)(pqrs))

But what I get is this:

    ((def(pqrs)))

All the closing parentheses are at the end. Similarly,

    x: "a4a4pqrsdef"

should produce this:

    ((pqrs)def)

but I get this:

    ((pqrsdef))

Any suggestions? (Note, I can't do anything to hard-code the closing parenthesis of a container. It has to be based on the length of the container. My actual input is binary, and has a similar structure.)

== Doug Edmunds

;------------
REBOL[]

start: func [x] [main x]

main: func [x] [
    print "looping"
    prin x prin " of len x (1) " print length? x
    while [(length? x) > 0] [
        prin x prin " of len x (2) " print length? x
        fx: to-string first x
        x: skip x 1
        switch/default fx [
            "a" [
                append output "("
                len: to-integer first x
                x: skip x 1
                y: copy/part x length? x ; what's left
                x: tail x ; done with this iteration
                prin "y " print y
                main y
                append output ")"
            ]
        ] [append output fx]
    ]
]

; test run
output: []
; should produce ((def)(pqrs))
x: "a2a3defa4pqrs"
start x
print output

[2/8] from: moliad::gmail::com at: 30-Apr-2008 20:27

here you go Doug,

I did a little recursive parse algorithm.

;-----------------------------------------------------------------------------------------------
digits: charset "0123456789"
item: complement digits

load-container: func [
    str
    /local entity cursor output count itm
][
    output: copy ""

    ; recursive parse rule
    entity: [
        [
            "a" copy count some digits cursor: ; where do we start parsing next depth
            (
                count: to-integer count
                append output "("
                parse cursor reduce [count entity]
                append output ")"
            )
            :cursor ; continue where recursive parses left-off
        ]
        | [
            copy itm item cursor: ; where do we continue parsing after these items
            (
                append output itm
            )
        ]
    ]

    parse str entity
    return output
]
;---------------------------------------------------------------------

probe load-container "a2a3defa4pqrs"
== "((def)(pqrs))"

probe load-container "a4a4pqrsdef"
== "((pqrs)def)"

-MAx
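For comparison, the same fix can be sketched without parse. In the original script the count is read into `len` but never used: the recursive call receives everything that remains (`y: copy/part x length? x`), so every ")" is appended only after the whole input has been consumed. The sketch below is an illustration, not code from the thread; `emit-entity` is a hypothetical name, it assumes single-digit counts as in the sample inputs, and it does no input checking.

```rebol
REBOL []

; Hypothetical helper (not from the thread): emit the entity that starts
; at pos into out, and return the position just past that entity.
emit-entity: func [pos [string!] out [string!] /local n] [
    either #"a" = first pos [
        n: to-integer to-string second pos   ; assumes a single-digit count
        pos: skip pos 2
        append out "("
        loop n [pos: emit-entity pos out]    ; consume exactly n children
        append out ")"                       ; close as soon as they are done
    ][
        append out first pos                 ; plain item
        pos: next pos
    ]
    pos
]

result: copy ""
emit-entity "a2a3defa4pqrs" result
print result
```

Because each call returns the position where it stopped, the caller knows where the next sibling starts, and the closing parenthesis lands right after the last child instead of at the end of the string.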

[3/8] from: dougedmunds:gma:il at: 1-May-2008 10:23

Thanks! I am a total noob as to parsing, so I will have to read up on it.

How flexible is parsing if I have alternative types of containers (besides 'a', also 'z'), where an a can be in a or in z, and z can be in a or z, etc?

Here's an example:

String input:

    [C, DE, [C, DE ],{C, DE, [C, DE], {C, DE}}]

That more closely approximates my situation: I am feeding a binary string to REBOL. It has to be parsed from left to right. There are lots of types, 2 of which are containers: lists, the square brackets [ ], and tuples, the curly brackets { }.

As a binary the input looks like this:

    #{
    6C000000046B0001436B000244456C000000026B0001436B000244456A68046B
    0001436B000244456C000000026B0001436B000244456A68026B0001436B0002
    44456A
    }

The data-type/containers all start with '6':

    68 -> tuple (container), followed by length
    6C -> list (container), followed by length
    6A -> null (used to show end of a 'proper' list)
    6B -> string, followed by length, followed by value

The datatypes use different numbers of bytes to represent length: list - 4, tuple - 1, string - 2.

Expansion of binary example:

    #{
    6C00000004 6B000143 6B00024445
    6C00000002 6B000143 6B00024445 6A
    6804 6B000143 6B00024445
    6C00000002 6B000143 6B00024445 6A
    6802 6B000143 6B00024445
    6A
    }

Details:

    6C 00000004           list of 4
      6B 0001 43            first, a string, length 1 (value x43 = "C")
      6B 0002 4445          second, a string, length 2 (value x4445 = "DE")
      6C 00000002           third, a list of 2
        6B 0001 43            first, a string
        6B 0002 4445          second, a string
      6A                    end list of 2
      68 04                 fourth, a tuple of 4
        6B 0001 43            first, a string
        6B 0002 4445          second, a string
        6C 00000002           third, a list of 2
          6B 0001 43            first, a string
          6B 0002 4445          second, a string
        6A                    end list of 2
        68 02                 fourth, a tuple of 2
          6B 0001 43            first, a string
          6B 0002 4445          second, a string
    6A                    end main list of 4

[4/8] from: moliad:gm:ail at: 1-May-2008 16:06

parse is very flexible.

in your example, I used parse recursively (calling parse within parse rules); because the counts are given within the data, it's easier to do it this way. We could have done it differently using just one parse call, a stack, and words for the counts, but then it's pretty complicated to make.

one thing to know is that parse only works on string and block types, so you have to convert your binary into a string first. are you parsing EDI data?

another detail is that rebol only supports signed 32 bit ints, so you are effectively limited to a length of 31 bits for your lists.

here is the code to solve your specific needs. if you look at the difference between my previous example and this one, I think you'll be able to see how to expand on this one.

note that the code below expects input to be valid and does no kind of error checking whatsoever. if input data is bad, it's possible that the parser will stop; it's also possible that a deliberately malicious construction would send the parser into an infinite loop. This is not to say that I see a place where such an infinite loop is possible, only that the parser sometimes ends up doing so when no time is spent making sure the rule is bullet-proof.

if you are acting on unverified user input, you might want to verify things like the count values, or other anomalies, and throw an error with make error!

HTH :-)

-MAx

;---------------------------------8<--------------------------------
TUPLE: to-string #{68}
LIST: to-string #{6C}
NULL: to-string #{6A}
STRING: to-string #{6B}

load-container: func [
    str
    /local entity cursor output count data
][
    output: copy ""

    ; recursive parse rule
    entity: [
        [
            LIST copy count 4 skip cursor: ; where do we start parsing next depth
            (
                count: to-integer to-binary count
                append output "["
            )
            ; the count for the entity rule was set dynamically
            count entity
            (
                remove/part skip tail output -2 2 ; remove trailing ", " we are at end of a container
                append output "], "
            )
            :cursor ; continue where recursive parses left-off
            NULL cursor:
        ]
        | [
            TUPLE copy count skip cursor: ; where do we start parsing next depth
            (
                count: to-integer to-binary count
                append output "{"
            )
            ; the count for the entity rule was set dynamically
            count entity
            (
                remove/part skip tail output -2 2 ; remove trailing ", " we are at end of a container
                append output "}, "
            )
            :cursor ; continue where recursive parses left-off
        ]
        | [
            STRING copy count 2 skip
            (
                ; here we change the count for the rule dynamically
                count: to-integer to-binary count
            )
            copy data count skip cursor: ; where do we continue parsing after these items
            (
                append output join data ", "
            )
        ]
    ]

    parse/all str entity
    remove/part skip tail output -2 2 ; remove trailing ", " we are at end of a container
    return output
]

load-container to-string #{
6C000000046B0001436B000244456C000000026B0001436B000244456A68046B
0001436B000244456C000000026B0001436B000244456A68026B0001436B0002
44456A}
== "[C, DE, [C, DE], {C, DE, [C, DE], {C, DE}}]"
;---------------------------------8<--------------------------------
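The count check suggested above could look roughly like the following. This is only a sketch under assumptions: `checked-count` is a hypothetical helper that does not appear in the thread, and it assumes a length field with its top bit set converts to a negative integer, which is why a negative result is rejected. Since every item occupies at least one byte, a count larger than the number of bytes left cannot be valid either.

```rebol
REBOL []

; Hypothetical guard (not part of the thread's code).
; raw       - the bytes of the length field, as a binary!
; remaining - how many input bytes are left after the length field
checked-count: func [raw [binary!] remaining [integer!] /local n] [
    n: to-integer raw
    if any [n < 0  n > remaining] [
        make error! rejoin ["bad container count: " n]
    ]
    n
]

; A caller that prefers not to halt can trap the error:
if error? try [checked-count #{000000FF} 10] [
    print "rejected: count exceeds remaining input"
]
```

Inside the parse rules, a call like this would replace the bare `to-integer to-binary count` conversions, with `remaining` taken from `length? cursor` or a similar position word.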

[5/8] from: moliad::gmail::com at: 1-May-2008 16:33

hi Doug,

note that in my last example, I tried something different and it worked. instead of calling parse recursively, I just dynamically set the entity count within the rule and let parse's internal recursion do the rest. By calling the entity rule within the entity rule, parse effectively continues looking for an entity then and there.

I don't know why I didn't do this in the first example. this is the correct way of using parse. The reason is simple: it's faster.

Care must be taken though, because parse doesn't reallocate variables or push them to a stack when using recursive rules. You must do so yourself, when it's needed, by adding a bit of code to the start and end of rules. This is usually necessary when building data trees made up of blocks, for example, where every layer must push the "current" block to a stack, allocate a new one, and use this new one as the current block for children (the recursive rule) at that point. when children return, you pop the block and continue using "your" current block.

this said, it's advanced usage and seldom needed. other ways to build data trees are simpler, like in your example, where we could have replaced the tuples with a set of parentheses "( )", dropped the commas ",", and then used load on the string, giving us a native rebol block with block!, word! and paren! values within.

-MAx
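The push/pop bookkeeping described above might be sketched like this for the simpler "aN" string format from the start of the thread. It is an untested illustration, not code from the thread: `load-tree`, `stack`, `current` and `done` are hypothetical names, counts are assumed to be a single digit, and, like the second example, it relies on parse reading the count word once before the nested rules overwrite it.

```rebol
REBOL []

digit: charset "0123456789"

; Hypothetical function: build nested blocks instead of a string.
load-tree: func [
    str [string!]
    /local stack current entity n ch done
][
    stack: copy []      ; parent blocks waiting for their children
    current: copy []    ; block currently being filled

    entity: [
        "a" copy n digit (
            insert/only stack current   ; push the parent
            current: copy []            ; fresh block for this container
            n: to-integer n
        )
        n entity                        ; count is read here, then children run
        (
            done: current               ; the finished container
            current: first stack        ; pop the parent
            remove stack
            append/only current done    ; attach the container to its parent
        )
        | copy ch skip (append current to-word ch)
    ]

    parse/all str entity
    first current
]

probe load-tree "a2a3defa4pqrs"
```

For the thread's first sample the intent is a block shaped like `[[d e f] [p q r s]]`. If the first alternative fails part-way on malformed input, the stack is left unbalanced, so real code would need to reset it or validate the input first.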

[6/8] from: santilli:gabriele:gm:ail at: 1-May-2008 23:27

On Thu, May 1, 2008 at 10:06 PM, Maxim Olivier-Adlhoch <moliad-gmail.com> wrote:

> one thing to know is that parse only works on string and block types,
> so you have to convert your binary into a string first.

PARSE works with binary! too - in fact, in R2 string! and binary! are almost indistinguishable. Regards, Gabriele.

[7/8] from: moliad:g:mail at: 1-May-2008 17:47

hi,

per my usage, parse converts the binary on entry... and expects strings or block rules... so trying to use binaries directly is quite troublesome, and prone to errors.

ex:

1)
>> a: #{aabbcc}
== #{AABBCC}
>> parse a ""
== ["ª»Ì"]

2)
>> a: #{aabbcc}
== #{AABBCC}
>> parse a #{BB}
** Script Error: parse expected rules argument of type: block string none
** Near: parse a #{BB}

3)
>> parse/all a [#{AA} copy aa #{BB} (probe aa) #{CC}]
»

!!! anyone would expect #{BB}

When I was doing my parse-driven binary servers, nothing worked when trying to keep the input as a binary; rules would not react as expected, so I ended up converting everything to strings, using labels (words) for control characters, and then all worked fine.

but yes, I agree, strings and binary are practically identical otherwise.

-MAx

[8/8] from: santilli::gabriele::gmail::com at: 2-May-2008 1:58

On Thu, May 1, 2008 at 11:47 PM, Maxim Olivier-Adlhoch <moliad-gmail.com> wrote:

> per my usage, parse converts the binary on entry...

"converts" can be a confusing word. "interprets" may be more correct. PARSE does not make a distinction in R2, so binary! is just the same as string! for it. Regards, Gabriele.