[REBOL] Re: looking for a function...
From: al:bri:xtra at: 14-Nov-2000 23:42
Donald wrote:
> Andrew: the result is:
>
> >> do striptags.r WRITE %test.txt striptags READ %test.html
> ** Script Error: striptags.r has no value.
> ** Where: do striptags.r WRITE %test.txt striptags
Sorry, I forgot to write the "%". Without it, REBOL reads striptags.r as a word with no value rather than as a file name. It should have read:
do %striptags.r WRITE %test.txt striptags READ %test.html
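To see the difference for yourself, here is a small sketch you can paste into the console. It only uses the file name from above and doesn't need the file to exist:

```rebol
REBOL []

; With the % prefix, REBOL sees a file! value that DO can load.
print type? %striptags.r

; Without it, striptags.r is just a word. Quoting it inside a block
; lets us inspect it without evaluating it (evaluating it is what
; raised the "has no value" error).
print word? first [striptags.r]
print file? %striptags.r
```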
Donald wrote:
> ...still have not removed the two odd lines of HTML, though.
Try this version. Note that I've added in:
"//-->" ""
"-->" ""
in multi-replace.
If there's any other strange stuff in there, just do as I did and extend
striptags yourself by adding more search and replace pairs to the
multi-replace block. Why not give it a go?
[
REBOL [
    Title: "HTML Tag Stripper"
    Date: 20-Jul-1999
    Author: "Bohdan Lechnowsky"
    Email: [bo--rebol--com]
    Purpose: {
        To strip off HTML tags leaving only text behind
    }
]
striptags: func [page /local text end] [
    multi-replace: func [
        {Replaces multiple items in a file}
        pg [series!] {The series to replace items in}
        blk [block!] {A block of search and replace elements}
    ][
        foreach [srch rplc] blk [
            replace/all pg srch rplc
        ]
    ]
    ;table of tags and more suitable ASCII characters
    page: multi-replace trim/lines page [
        "<TITLE>" "TITLE: "
        "</TITLE>" "^/"
        " " " "
        "<TD>" "^-|^-"
        "</TD>" "^-|^-"
        "^-|^-^-|^-" "^-|^-"
        "<TR>" " "
        "</TR>" "^/"
        "<TABLE" "^/<"
        "</TABLE>" "^/"
        "<P>" "^/"
        "<LI>" "^/^-· "
        "<BR>" "^/"
        "&nbsp;" " "
        "&gt;" ">"
        "&lt;" "<"
        "&copy;" "(c)"
        "&amp;" "&"
        "&quot;" {"}
        "</H1>" "^/"
        "</H2>" "^/"
        "</H3>" "^/"
        "</H4>" "^/"
        "</H5>" "^/"
        "</H6>" "^/"
        "<HR" "^/----------^/<"
        "//-->" ""
        "-->" ""
    ]
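    text: copy ""
    ;sentinel so that every search for "<" below finds something
    append page "<"
    ;keep any text that comes before the first tag
    append text copy/part page find page "<"
    ;skip past each ">" and collect the text up to the next "<"
    while [page: find/tail page ">"] [
        if (first page) <> #"<" [
            if found? end: find page "<" [
                append text copy/part page end
            ]
        ]
    ]
    return append text "^/"
]
]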
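If you want to try extending the table before editing the script itself, here is a rough stand-alone sketch. It repeats multi-replace outside of striptags so it can be run on its own; the sample string and the "&reg;" pair are made up for illustration and are not in the script above:

```rebol
REBOL []

; Stand-alone copy of the helper, returning the modified series.
multi-replace: func [
    pg [series!] {The series to replace items in}
    blk [block!] {A block of search and replace elements}
][
    foreach [srch rplc] blk [
        replace/all pg srch rplc
    ]
    pg
]

; Hypothetical leftover from a page: a comment closer and an entity.
sample: copy {<!-- hide //--> text &reg; more}

multi-replace sample [
    "//-->" ""          ; longer pattern first, so "-->" doesn't split it
    "-->" ""
    "&reg;" "(R)"       ; assumed extra pair, add your own here
]

print sample
```

The order of the pairs matters: if "-->" came before "//-->", it would match first and leave a stray "//" behind.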
I hope that helps!
Andrew Martin
ICQ: 26227169
http://members.nbci.com/AndrewMartin/