
[REBOL] Re: Parse and and recursion local variables?

From: moliad:g:mail at: 17-Mar-2007 19:09

hi Pekr,

I have not completely followed the code part, as it's complex, but the recursion issue is simply the very nature of parse... parse rules are not stacked function calls. They are branches of execution with automatic series-pointer rollback on error. So as you traverse a series, you really only jump and come back... no stack push... since there are no variables to push within parse.

If REBOL did an explicit copy of the parse rules (thus localizing each rule at each instance), I can tell you the memory consumption and speed drop would not only be dramatic, it would render the parser unusable for any large dataset. The current parser is a stream analyser... where you decide what to do next... the fact that the stream has a graph, tree, or recursive data organisation is not parse's fault. With the current implementation we are able to parse 700MB files using 15000 lines of parse rules (I know someone who has such a setup) and it screams!... If we added any kind of copy, any real usage would just crawl, crash REBOL, and need GBs of RAM. As we speak, I have a 1300-line parse rule which handles several KB of string in 0.02 seconds (or less).

So, this being said, I know parse is a bitch to use at first... hell, I gave up trying each year for the last 7 years... but for some reason I gave it another try (again) in the last few months... and well, I finally "GOT" it. It's a strange process, but suddenly it becomes so clear, and everything is obvious. The only thing I can say (from my own experience)... don't give up... really do go to the end of your implementation and eventually you might GET it too. ;-)

The only other thing I can say about parse is that it's usually MUCH easier (and faster too) to parse the original string data and construct your data set AS a loadable string. That way, you just breeze through the data linearly (VERY FAST, no stack issues) and append all nesting into the loadable string as you go.
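To make the rollback idea concrete, here is a minimal sketch (the rule and names are just for illustration, not from Pekr's code): when an alternative fails partway through, parse rewinds the input pointer to where that alternative started and tries the next one, with no call stack involved.

```rebol
rebol []

digits: charset "0123456789"

; first alternative expects a decimal point; second accepts plain digits
number: [some digits "." some digits | some digits]

probe parse/all "3.14" number  ; first alternative matches
probe parse/all "314" number   ; "." fails mid-rule, the series pointer
                               ; rolls back to the head, and the second
                               ; alternative matches instead
```

Both calls return true; the second one only succeeds because of that automatic rewind.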
simple generic html loading example: hope this helps

-MAx

;---------------------------------------------------
rebol []

;-
;- RULES
html: context [
    output: "" ;protect the global space
    data: none
    attr: none
    val: none
    attrstr: none

    alphabet: "abcdefghijklmnopqrstuvwxyz"
    not-quotes: complement charset {"}
    alpha: union charset alphabet charset "1234567890abcdefghijklmnopqrstuvwxyz_ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    nalpha: complement alpha
    path: not-quotes ;union alpha charset "%-+:&=./\"
    space: charset [#" " #"^/" #"^-"]
    spaces: [any space]

    attribute: [
        spaces copy attr some alpha {=} spaces
        [copy val some alpha | {"} copy val any not-quotes {"}]
        (append output rejoin [attr " {" val "}"])
    ]
    in-tag: [
        "<" copy data some alpha (append output join data "[^/")
        spaces any attribute spaces ">"
    ]
    out-tag: [
        ["</" thru ">"] (append output "^/]")
    ]
    content: [
        copy data to "<" (
            if all [data not empty? trim copy data] [
                append output rejoin ["{" data "}"]
            ]
        )
    ]
    ;href: [spaces {href="} copy attrs any path {"}]
    ;link-tag: ["<A" spaces some [href (append parsed-links attrs) | attribute] ">"]
    ;    [[copy ref-url href (print ref-url)] | attribute]]

    rule: [some [content [out-tag | in-tag]]]
]

parse/all {<html>
<body>
<h3 >tada</h3><p><FONT color="#000000" > there you go :-)</FONT></p>
</body>
</html>} html/rule

probe load html/output
html-blk: load html/output

; XPATH anyone ;-)
probe html-blk/html/body/p/font/color

ask "..."

On 3/17/07, Petr Krenzelok <> wrote: