coding the closing parentheses correctly
[1/8] from: dougedmunds::gmail at: 30-Apr-2008 16:10
Hello,
I am trying to code output from a string where:
1. 'aN': 'a' represents a container with 'N' items in it.
2. A container can be an item of another container.
I am using parentheses to show the container.
In my example d, e, f, p, q, r, and s are
separate items, not the atoms 'def and 'pqrs.
For example, this input:
x: "a2a3defa4pqrs"
should produce this output:
((def)(pqrs))
But what I get is this:
((def(pqrs)))
All the closing parentheses are at the end.
Similarly, x: "a4a4pqrsdef"
should produce this: ((pqrs)def)
but I get this: ((pqrsdef))
Any suggestions?
(Note, I can't do anything to hard-code the closing parenthesis of a
container. It has to be based on the length of the container. My
actual input is binary, and has a similar structure).
== Doug Edmunds
;------------
REBOL[]
start: func [x] [main x]
main: func [x] [
    print "looping"
    prin x prin " of len x (1) " print length? x
    while [(length? x) > 0] [
        prin x prin " of len x (2) " print length? x
        fx: to-string first x
        x: skip x 1
        switch/default fx [
            "a" [
                append output "("
                len: to-integer first x
                x: skip x 1
                y: copy/part x length? x ; what's left
                x: tail x ; done with this iteration
                prin "y " print y
                main y
                append output ")"
            ]
        ][
            append output fx
        ]
    ]
]
; test run
output: []
; should produce ((def)(pqrs))
x: "a2a3defa4pqrs"
start x
print output
[2/8] from: moliad::gmail::com at: 30-Apr-2008 20:27
here you go Doug,
I did a little recursive parse algorithm.
;-----------------------------------------------------------------------------------------------
digits: charset "0123456789"
item: complement digits
load-container: func [
    str
    /local entity cursor output count itm
][
    output: copy ""
    ; recursive parse rule
    entity: [
        [
            "a" copy count some digits
            cursor: ; where do we start parsing next depth
            (
                count: to-integer count
                append output "("
                parse cursor reduce [count entity]
                append output ")"
            )
            :cursor ; continue where recursive parses left off
        ] |
        [
            copy itm item
            cursor: ; where do we continue parsing after these items
            (
                append output itm
            )
        ]
    ]
    parse str entity
    return output
]
;---------------------------------------------------------------------
probe load-container "a2a3defa4pqrs"
== "((def)(pqrs))"
probe load-container "a4a4pqrsdef"
== "((pqrs)def)"
-MAx
[3/8] from: dougedmunds:gma:il at: 1-May-2008 10:23
Thanks! I am a total noob as to parsing, so I will have to read
up on it.
How flexible is parsing if I have alternative types of
containers (besides 'a', also 'z'), where an a can be
in a or in z, and z can be in a or z, etc.?
Here's an example :
String input:
[C, DE, [C, DE ],{C, DE, [C, DE], {C, DE}}]
That more closely approximates my situation: I am feeding a binary
string to REBOL. It has to be parsed from left to right. There are
lots of types, 2 of which are containers:
lists, the square brackets [ ]
and tuples, the curly brackets { } .
As a binary the input looks like this:
#{
6C000000046B0001436B000244456C000000026B0001436B000244456A68046B
0001436B000244456C000000026B0001436B000244456A68026B0001436B0002
44456A
}
The data-type/containers all start with '6'
68 -> tuple (container), followed by length
6C -> list (container), followed by length
6A -> null (used to show end of a 'proper' lists)
6B -> string, followed by length, followed by value
The datatypes use different number of bytes to represent
length: list - 4, tuple - 1, string - 2
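As a rough illustration of the header layout described above, here is a minimal REBOL 2 sketch that reads the tag byte and then the length field whose width depends on the tag. The words `bin`, `tag` and `len` are made up for this example, and it assumes the length bytes are big-endian, which matches the sample data.

```rebol
REBOL []
bin: #{6C000000046B000143}      ; a list header followed by one string
tag: first bin                  ; in REBOL 2 this is an integer: 108 = hex 6C
len: switch tag [
    108 [to-integer copy/part skip bin 1 4]     ; 6C list: 4-byte length
    104 [to-integer copy/part skip bin 1 1]     ; 68 tuple: 1-byte length
    107 [to-integer copy/part skip bin 1 2]     ; 6B string: 2-byte length
]
print [tag len]                 ; prints 108 4
```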
Expansion of binary example:
#{
6C00000004
    6B000143
    6B00024445
    6C00000002 6B000143 6B00024445 6A
    6804
        6B000143
        6B00024445
        6C00000002 6B000143 6B00024445 6A
        6802
            6B000143 6B00024445
6A
}
Details:
6C 00000004              list of 4
   6B 0001 43            first, a string, length 1 (value x43 = "C")
   6B 0002 4445          second, a string, length 2 (value x4445 = "DE")
   6C 00000002           third, a list of 2
      6B 0001 43         first, a string
      6B 0002 4445       second, a string
   6A                    end list of 2
   68 04                 fourth, a tuple of 4
      6B 0001 43         first, a string
      6B 0002 4445       second, a string
      6C 00000002        third, a list of 2
         6B 0001 43      first, a string
         6B 0002 4445    second, a string
      6A                 end list of 2
      68 02              fourth, a tuple of 2
         6B 0001 43      first, a string
         6B 0002 4445    second, a string
6A                       end main list of 4
[4/8] from: moliad:gm:ail at: 1-May-2008 16:06
Parse is very flexible.
In your example, I used parse recursively (calling parse within parse
rules) because the counts are given within the data, so it's easier to do
it this way. We could have done it differently using just one parse
call, a stack, and words for the counts, but then it's pretty
complicated to write.
one thing to know is that parse only works on string and block types,
so you have to convert your binary into a string first.
Are you parsing EDI data?
Another detail is that REBOL only supports signed 32-bit ints, so you
are effectively limited to a length of 31 bits for your lists.
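A small sketch of what that limit means in practice, assuming REBOL 2's `to-integer` treats a 4-byte binary as a signed 32-bit value: a length field with its high bit set would come back negative, so a sign test is one cheap way to spot a count that does not fit. The word `n` is only for illustration.

```rebol
REBOL []
print to-integer #{00000004}    ; an ordinary small count
print to-integer #{7FFFFFFF}    ; the largest count that fits in 31 bits
n: to-integer #{80000000}       ; high bit set
print negative? n               ; true here would mean the count overflowed
```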
Here is the code to solve your specific needs. If you look at the
difference between my previous example and this one, I think you'll be
able to see how to expand on this one.
Note that the code below expects input to be valid and does no
error checking whatsoever. If the input data is bad, it's possible that the
parser will stop; it's also possible that a deliberately malicious
construction would send the parser into an infinite loop. This is not to
say that I see a place where such an infinite loop is possible, only
that the parser sometimes ends up doing so when no time is spent
making sure the rule is bullet-proof.
If you are acting on unverified user input, you might want to verify
things like the count values, or other anomalies, and throw an error
with make error!
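A minimal sketch of that kind of check, under the assumption that every item takes at least one byte, so a count larger than the remaining input cannot be valid. `check-count` is a hypothetical helper, not part of the code in this thread.

```rebol
REBOL []
; check-count is a hypothetical helper, not part of the code in this thread
check-count: func [
    count [integer!]        ; count decoded from the input
    remaining [integer!]    ; bytes left in the input at this point
][
    if count < 0 [make error! "container count is negative"]
    if count > remaining [
        make error! reform ["container count" count "exceeds remaining input" remaining]
    ]
    count
]
print check-count 2 10                  ; a sane count is passed through
print error? try [check-count 50 10]    ; a bad count raises an error
```

Inside a parse rule it could be called from the paren that converts the count, for example with `length? cursor` as the second argument.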
HTH :-)
-MAx
;---------------------------------8<--------------------------------
TUPLE: to-string #{68}
LIST: to-string #{6C}
NULL: to-string #{6A}
STRING: to-string #{6B}
load-container: func [
    str
    /local entity cursor output count data
][
    output: copy ""
    ; recursive parse rule
    entity: [
        [
            LIST copy count 4 skip
            cursor: ; where do we start parsing next depth
            (
                count: to-integer to-binary count
                append output "["
            )
            ; the count for the entity rule was set dynamically
            count entity
            (
                ; remove trailing ", " we are at end of a container
                remove/part skip tail output -2 2
                append output "], "
            )
            :cursor ; continue where recursive parses left off
            NULL
            cursor:
        ] |
        [
            TUPLE copy count skip
            cursor: ; where do we start parsing next depth
            (
                count: to-integer to-binary count
                append output "{"
            )
            ; the count for the entity rule was set dynamically
            count entity
            (
                ; remove trailing ", " we are at end of a container
                remove/part skip tail output -2 2
                append output "}, "
            )
            :cursor ; continue where recursive parses left off
        ] |
        [
            STRING copy count 2 skip (
                ; here we change the count for the rule dynamically
                count: to-integer to-binary count
            )
            copy data count skip
            cursor: ; where do we continue parsing after these items
            (
                append output join data ", "
            )
        ]
    ]
    parse/all str entity
    ; remove trailing ", " we are at end of a container
    remove/part skip tail output -2 2
    return output
]
load-container to-string #{
6C000000046B0001436B000244456C000000026B0001436B000244456A68046B
0001436B000244456C000000026B0001436B000244456A68026B0001436B0002
44456A}
== "[C, DE, [C, DE], {C, DE, [C, DE], {C, DE}}]"
;---------------------------------8<--------------------------------
[5/8] from: moliad::gmail::com at: 1-May-2008 16:33
hi Doug,
Note that in my last example, I tried something different and it worked.
Instead of calling parse recursively, I just dynamically set the
entity count within the rule and let parse's internal recursion do
the rest. By calling the entity rule within the entity rule, parse
effectively continues looking for an entity then and there.
I don't know why I didn't do this in the first example.
This is the correct way of using parse. The reason is simple: it's faster.
Care must be taken, though, because parse doesn't reallocate variables
or push them to a stack when using recursive rules. You must do so
yourself, when it's needed, by adding a bit of code to the start and
end of rules. This is usually necessary when building data trees made
up of blocks, for example, where every layer must push the "current"
block to a stack, allocate a new one, and use this new one as the
current block for children (the recursive rule) at that point. When the
children return, you pop the block and continue using "your" current
block. That said, it's advanced usage and seldom needed.
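The push/pop pattern described here can be sketched for the simple 'aN' format from the start of the thread. This is only an illustration under stated assumptions: `load-tree`, `stack`, `current` and `child` are made-up names, the input is assumed valid (a failed match would leave the stack unbalanced), and it relies on parse reading the repeat count once when it reaches `count entity`, as the second load-container does.

```rebol
REBOL []
digits: charset "0123456789"
item: complement digits
load-tree: func [
    str [string!]
    /local stack current child entity count itm
][
    stack: copy []      ; parent blocks waiting for their children
    current: copy []    ; block being filled right now
    entity: [
        "a" copy count some digits (
            count: to-integer count
            append/only stack current   ; push the parent
            current: copy []            ; fresh block for this container
        )
        count entity (
            child: current
            current: last stack         ; pop the parent
            remove back tail stack
            append/only current child   ; attach the finished container
        )
        | copy itm item (append current itm)
    ]
    parse/all str entity
    current
]
probe load-tree "a2a3defa4pqrs"
```

The intent is that each container becomes a nested block and each item a one-character string inside it.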
Other ways to build data trees are simpler. In your example,
we could have replaced the tuples with a set of parentheses "( )",
dropped the commas ",", and then used load on the string,
giving us a native REBOL block with block!, word! and paren! values within.
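A short sketch of that idea, using a hand-written string in place of real parser output (the word names `text` and `data` are only for illustration):

```rebol
REBOL []
; tuples written as parens, commas dropped
text: "[C DE [C DE] (C DE [C DE] (C DE))]"
data: load text
print type? data            ; block!
print type? third data      ; block!, the inner list
print type? fourth data     ; paren!, the former tuple
print type? first data      ; word!
```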
-MAx
[6/8] from: santilli:gabriele:gm:ail at: 1-May-2008 23:27
On Thu, May 1, 2008 at 10:06 PM, Maxim Olivier-Adlhoch <moliad-gmail.com> wrote:
> one thing to know is that parse only works on string and block types,
> so you have to convert your binary into a string first.
PARSE works with binary! too - in fact, in R2 string! and binary! are
almost indistinguishable.
Regards,
Gabriele.
[7/8] from: moliad:g:mail at: 1-May-2008 17:47
hi,
per my usage, parse converts the binary on entry... and expects
strings or block rules... so trying to use binaries directly is quite
troublesome, and prone to errors
ex:
1)
>> a: #{aabbcc}
== #{AABBCC}
>> parse a ""
== ["=AA=BB=CC"]
2)
>> a: #{aabbcc}
== #{AABBCC}
>> parse a #{BB}
** Script Error: parse expected rules argument of type: block string none
** Near: parse a #{BB}
3)
>> parse/all a [#{AA} copy aa #{BB} (probe aa) #{CC}]
=BB
!!! anyone would expect #{BB}
When I was doing my parse-driven binary servers, nothing worked when
trying to keep the input as a binary. Rules would not react as
expected, so I ended up converting everything to strings and using labels
(words) for control characters, and then all worked fine.
But yes, I agree, strings and binaries are practically identical otherwise.
-MAx
[8/8] from: santilli::gabriele::gmail::com at: 2-May-2008 1:58
On Thu, May 1, 2008 at 11:47 PM, Maxim Olivier-Adlhoch <moliad-gmail.com> wrote:
> per my usage, parse converts the binary on entry...
"converts" can be a confusing word; "interprets" may be more correct.
PARSE does not make a distinction in R2, so binary! is just the same as
string! for it.
Regards,
Gabriele.