[REBOL] Re: How to extract content of HTML table?
From: tim-johnsons:web at: 28-Aug-2006 14:21
* José Antonio <joseantoniorocha-gmail.com> [060828 12:18]:
> No prize for me.
>
> load/markup just made a huge block with the HTML page content. Not useful,
> as table elements are not nested in blocks.
It 'should' be *very* useful :-) because I use it all the time.
What you want to do is test the datatype of each element (tag or string).
What is a bit counterintuitive is that you can search a tag
as if it were a string, as in
  find <table> "table"    ;; beginning of table, set some boolean
                          ;; processing-table: true
OR
  find </table> "/table"  ;; end of table, set some boolean
                          ;; processing-table: false
Note that "table" is also found inside </table>, so test for "/table" first
and only test for "table" when that fails.
And then:
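;; untested, incomplete code!
processing-table: false   ;; must be set before the first string is seen
foreach element load/markup some-document [
    if tag? element [
        either find element "/table" [
            processing-table: false   ;; closing </table> tag
        ][
            ;; only reached when "/table" was not found, so </table>
            ;; cannot switch the flag back on
            if find element "table" [processing-table: true]
        ]
    ]
    if all [
        processing-table
        string? element
    ][
        do-something-with-string   ;; placeholder for your own code
    ]
]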
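
If you want nested blocks rather than a flat run of strings, the same flag idea can be extended to rows and cells. The following is only an untested sketch under some assumptions: the table is simple (no nested tables, plain text in each cell), and the names html, rows, row, in-cell and tag-name are made up here for illustration.

```rebol
REBOL [Title: "Table cells to nested blocks (sketch)"]

;; hypothetical sample input; in practice this would be your page
html: {<table><tr><td>a</td><td>b</td></tr><tr><td>c</td><td>d</td></tr></table>}

;; helper (made up for this sketch): first whitespace-separated word
;; of a tag, e.g. <td class="x"> gives "td" and </td> gives "/td"
tag-name: func [tag [tag!]] [
    first parse to-string tag none
]

rows: copy []      ;; result: one inner block per <tr>
row: none          ;; block for the row being filled, none outside a row
in-cell: false     ;; true between <td>/<th> and the matching close tag

foreach element load/markup html [
    either tag? element [
        switch tag-name element [
            "tr"  [row: copy [] append/only rows row]
            "td"  [in-cell: true]
            "th"  [in-cell: true]
            "/td" [in-cell: false]
            "/th" [in-cell: false]
            "/tr" [row: none]
        ]
    ][
        ;; a string: keep it only when we are inside a cell of a row
        if all [in-cell row] [append row trim copy element]
    ]
]

probe rows
```

Each inner block of rows then holds the cell strings of one table row. A cell that contains further markup (bold text, links) would show up as several strings in this sketch, so real pages may need extra handling.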
> I guess this can be accomplished with the parse function, but parse is very
> complex.
>
> Has anyone already made a function that does this task, extracting table
> content as, say, comma-separated values or nested blocks?
>
> On 8/28/06, Tim Johnson <tim-johnsons-web.com> wrote:
> >
> > * José Antonio <joseantoniorocha-gmail.com> [060828 11:27]:
> > >
> > > Is there a fast and easy way to do that?
> >
> > To start:
> > How about load/markup?
> > That will convert HTML to a block of tags and strings
> >
> > HTH
> > tim
> >
> > > --
> > > nome: "José Antonio Meira da Rocha" tratamento: "Prof. MS."
> > > atividade: "Consultoria e treinamento em jornalismo impresso e online"
> > > googletalk: MSN: email: joseantoniorocha-gmail.com
> > > site: http://meiradarocha.jor.br
> > > ICQ: 658222 AIM: "meiradarochajor"
> > > Skype: yahoo: "meiradarocha_jor"
> > > --
> > > To unsubscribe from the list, just send an email to
> > > lists at rebol.com with unsubscribe as the subject.
> >
> > --
> > Tim Johnson <tim-johnsons-web.com>
> > http://www.alaska-internet-solutions.com
>
--
Tim Johnson <tim-johnsons-web.com>
http://www.alaska-internet-solutions.com