[REBOL] Parse Question: html-to-text conversion help.
From: reboler::programmer::net at: 1-May-2002 8:19
I have a working html-to-text converter, but I would also like to include the link
URLs in the extracted text.
The following parse rule works well to extract only the links...
link: [some [thru "<a href=" copy lnk to ">" (append text lnk)]]
... but is there any way to add this to the converter below?
I'm having trouble because html-rules already contains ["<" thru ">"], which skips
over every tag, including the <a href=...> tags I want to capture.
*** html-to-text converter ***
The following code is modified from the Core/Parse docs and the %texthtml.r text-to-html
converter...
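html-text-extractor: context [
    text: make string! 256

    ; Skip to the first tag, then alternate between skipping a tag
    ; and copying the text that runs up to the next tag.
    html-rules: [
        to "<" some [["<" thru ">"] | copy txt to "<" (append text txt)]
    ]

    ; HTML entity / replacement character pairs.
    ; "&amp;" comes last so that "&amp;lt;" decodes to "&lt;", not "<".
    symbols: [
        "&lt;"   "<"
        "&gt;"   ">"
        "&quot;" {"}
        "&amp;"  "&"
    ]

    extract-text: func [
        {Extracts text from an HTML web page.
        Usage: extract-text read http://www.rebol.com/index.html
               extract-text read %license.html
        }
        page [string!]
    ][
        clear text
        parse/all page [html-rules]
        foreach [symbol char] symbols [
            replace/all text :symbol :char
        ]
    ]
]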
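
One possible direction, offered only as a sketch: since the alternatives inside a parse block are tried left to right, the link rule could be placed as an extra alternative ahead of the generic ["<" thru ">"] rule, so anchor tags are matched before the generic rule skips them. The sample page string and the " [" ... "] " wrapping around each link are assumptions made for illustration; as in the link rule above, lnk holds everything between "<a href=" and ">", including any quotes or extra attributes.

```rebol
REBOL [Title: "html-to-text with links (sketch)"]

text: make string! 256

html-rules: [
    to "<" some [
        ; anchor tags first: copy the href value, then skip the closing ">"
        "<a href=" copy lnk to ">" thru ">" (append text rejoin [" [" lnk "] "])
        ; any other tag is skipped, as in the original rule
        | ["<" thru ">"]
        ; text between tags is copied, as in the original rule
        | copy txt to "<" (append text txt)
    ]
]

; hypothetical sample input, not taken from a real page
page: {<p>See <a href="http://www.rebol.com">REBOL</a> now</p>}

parse/all page html-rules
print text
```

The same extra alternative could be dropped into html-rules inside the context above without changing extract-text.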