Mailing List Archive: 49091 messages

Parsing optional tags

 [1/6] from: geza67:freestart:hu at: 4-Nov-2001 16:00


Hello REBOLers,

Beyond View-related stuff, IMHO parsing is the most underdocumented but most powerful feature of REBOL. Now I am stuck on a rather trivial problem: I want to collect texts embedded in tags:

    <b>some_text</b>

or

    <b><i>some_text</i></b>

Finding the beginning of the text is obvious, but matching the closing tags is NOT symmetrical :-( - as I thought:

    parse html [any [
        thru <b> [<i> | none] copy Collect to [</i> | none] </b>
        (append Collects Collect)
    ]]

... does not work as expected: Collects is full of 'none values. Could you help me, please?

--
Best regards, Geza Lakner MD mailto:[geza67--freestart--hu]

 [2/6] from: lmecir:mbox:vol:cz at: 4-Nov-2001 17:06


Hi Geza,

how about:

    i-tag: [<i> copy collect to </i>]
    not-i-tag: [copy collect to </b>]
    b-tag: [[i-tag | not-i-tag] </b> (append collects collect)]
    htmlrule: [[to <b> | to end] [<b> b-tag htmlrule | none]]
    parse html htmlrule

Look out! not tested

Cheers
Ladislav

 [3/6] from: lmecir:mbox:vol:cz at: 4-Nov-2001 18:05


Hi myself,

I should have tested it, here is an improved version:

    i-tag: [copy collect to </i> </i>]
    not-i-tag: [copy collect to </b>]
    b-tag: [[<i> (following: i-tag) | (following: not-i-tag)] following (append collects collect) </b>]
    htmlrule: [[thru <b> (follow: [b-tag htmlrule]) | to end (follow: [none])] follow]
    parse html htmlrule

Cheers
Ladislav
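To try the rules from this message, something like the following harness might do. It is only a sketch: the sample `html` string and the empty `collects` block are assumptions added here (the thread never shows them), and REBOL 2 parse behaviour is assumed.

```rebol
REBOL [Title: "Usage sketch for the rules in message 3/6"]

; Hypothetical sample input and result block, not taken from the thread.
html: {<b>first</b> filler <b><i>second</i></b>}
collects: copy []

; Rules as posted in the message above.
i-tag: [copy collect to </i> </i>]
not-i-tag: [copy collect to </b>]
b-tag: [
    ; pick the sub-rule depending on whether <i> follows <b>
    [<i> (following: i-tag) | (following: not-i-tag)]
    following (append collects collect) </b>
]
htmlrule: [
    ; recurse after each <b>...</b> pair, stop at the end of input
    [thru <b> (follow: [b-tag htmlrule]) | to end (follow: [none])]
    follow
]

probe parse html htmlrule   ; parse's result: did the rules match the whole input?
probe collects              ; the collected texts
```

Because `b-tag` requires the matching `</i>` and `</b>`, unbalanced input makes the whole parse fail rather than collecting a partial result.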

 [4/6] from: office::thousand-hills::net at: 4-Nov-2001 12:03


I have used this reading by lines:

    parse [ <B><I> ]
    parse [ </B></> ]

John

At 04:00 PM 11/4/2001 +0100, you wrote:

 [5/6] from: geza67:freestart:hu at: 4-Nov-2001 23:53


Hello Ladislav,

> i-tag: [copy collect to </i> </i>]
> not-i-tag: [copy collect to </b>]
<<quoted lines omitted: 3>>
> follow]
> parse html htmlrule

Thank you for the working version, but I just can't believe it has to be so complicated - building a (in a way) complete syntax to strip off two lousy HTML tags ... :-( If this is the more-or-less only "obvious" way to do it in the REBOL parse dialect, I'd better stick to the old-fashioned way and build a regex and feed it through sed/awk/vim or something like that.

--
Best regards, Geza mailto:[geza67--freestart--hu]

 [6/6] from: lmecir:mbox:vol:cz at: 5-Nov-2001 0:35


Hi Geza,

it doesn't have to be so complicated; my version probably checks more things than you needed (whether the tags are balanced, etc.). There obviously is a simpler way to do this.
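The thread does not show the simpler way. One possible shape, offered as a hedged sketch rather than as Ladislav's intended answer, is to give up the balance check and copy up to the next closing tag of any kind. The sample `html` string and the word `text` are assumptions made for illustration, and REBOL 2 parse behaviour is assumed.

```rebol
REBOL [Title: "Simpler tag-text collector (sketch)"]

; Hypothetical sample input, not taken from the thread.
html: {<b>one</b> and <b><i>two</i></b>}
collects: copy []

parse html [
    any [
        thru <b>            ; move just past the next opening <b>
        opt <i>             ; step over an <i> if one follows immediately
        copy text to "</"   ; take everything up to the next closing tag
        (append collects text)
    ]
    to end                  ; accept whatever is left of the input
]

probe collects
```

This version never verifies that `</i>` and `</b>` actually close what was opened, so it trades the checking done by the longer rules for brevity. In REBOL 2, `copy` sets its word to none when nothing is matched, so an empty element such as `<b></b>` would add a none value to `collects`.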

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted