[REBOL] Re: How to extract content of HTML table?

From: mike::yaunish::shaw::ca at: 29-Aug-2006 17:12

I have run into the same issue. My solution has been to use a small function called delim-extract, shown below. I have found that if I first break the pages I am parsing into chunks, then when something changes I can quite easily modify the delimiters I am using. It's not pretty, but it works. It's probably not terribly fast either.

delim-extract: func [
    "Returns a block of every string found that is surrounded by the defined delimiters"
    source-str [string!] "Text string to extract from."
    left-delim [string!] "Text string delimiting the left side of the desired string."
    right-delim [string!] "Text string delimiting the right side of the desired string."
    /include-delimiters "Returned extractions will include the delimiters"
    /use-head "Head of string is used as left delimiter"
    /first "Return the first match found only"
    /local tags tag
] [
    tag: make string! 0
    tags: make block! []
    ; With /use-head, everything from the head up to the first
    ; right delimiter counts as the first match.
    if use-head [
        either include-delimiters [
            parse source-str [copy tag thru right-delim]
            insert head tag left-delim
        ][
            parse source-str [copy tag to right-delim]
        ]
        append tags tag
    ]
    ; Collect every string found between left-delim and right-delim.
    either include-delimiters [
        parse source-str [some [
            [thru left-delim copy tag to right-delim]
            (append tags rejoin [left-delim tag right-delim])
        ]]
    ][
        parse source-str [some [
            [thru left-delim copy tag to right-delim]
            (append tags tag)
        ]]
    ]
    either first [
        either ((length? tags) = 0) [
            return none
        ][
            return tags/1
        ]
    ][
        return tags
    ]
]

test-extract: func [] [
    page: read    ; the URL argument is missing from the archived message
    title: delim-extract/first page "<title>" "</title>"
    print ["The title is = " title]
    cgi-str: "Query=REBOL&SearchView=VERBOSE&MaxResults=10&Sort=1"
    cgi-variable-names: delim-extract/use-head cgi-str "&" "="
    print ["cgi-variable-names = " cgi-variable-names]
]
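For the table question in this thread, the chunking idea could look roughly like the sketch below: one pass to cut the page into rows, then a second pass to cut each row into cells. It inlines the same thru / copy / to parse rule that delim-extract is built on so that it runs on its own. The sample HTML and the words html, rows, row, cells and cell are made up for illustration. It assumes bare <tr> and <td> tags with no attributes and no empty cells.

```rebol
REBOL []

; Hypothetical sample table, not taken from the original question.
html: {<table><tr><td>a</td><td>b</td></tr><tr><td>c</td><td>d</td></tr></table>}

; First pass: break the page into one chunk per table row.
rows: copy []
parse html [any [thru "<tr>" copy row to "</tr>" (append rows row)]]

; Second pass: break each row chunk into its cells.
foreach row rows [
    cells: copy []
    parse row [any [thru "<td>" copy cell to "</td>" (append cells cell)]]
    probe cells
]
```

If the page layout changes later, only the delimiter strings in the two parse rules should need adjusting. Real pages usually carry attributes on their tags, so the delimiters would have to be chosen to match the actual markup.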