[REBOL] Newlines
From: hijim:pronet at: 8-Nov-2001 19:40
Hello Anton and Joel,
Thanks for the help. The code below works perfectly on the pages I've tested, though
it takes a second or two on long pages. It strips all the HTML along with extraneous
newlines and whitespace.
Joel was right about my original problem. I had a space between two newlines, so now
I'm removing those too.
I hope this gets through. The last two emails I've sent came back as undeliverable. I'm
sending this one via Rebol instead of Netscape.
Jim
btn silver / 1.4 "Strip html" [
    ; keep a copy so the "Restore" button can undo the stripping
    marked-up: copy my-area/text
    ; remove <style> and <script> blocks, contents included
    parse my-area/text [
        any [to "<style" begin: thru "</style>" ending: (remove/part begin ending) :begin]
    ]
    parse my-area/text [
        any [to "<script" begin: thru "</script>" ending: (remove/part begin ending) :begin]
    ]
    ; turn a few tags and entities into plain-text equivalents
    foreach [search-string replace-string] [
        "<li>" "* " "</" "<" "<p>" "^/"
        "<h1>" "^/^/" "<h2>" "^/^/" "<h3>" "^/^/" "<h4>" "^/^/"
        "^-" " " "<hr>" "^/----------------------------------^/"
        "&nbsp;" " " "&copy;" "(C) " "&quot;" {"} "&amp;" "&"
    ][
        replace/all my-area/text search-string replace-string
    ]
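    ; remove every remaining tag
    parse my-area/text [
        any [to "<" begin: thru ">" ending: (remove/part begin ending) :begin]
    ]
    ; collapse double spaces, drop tabs and spaces before newlines,
    ; and limit consecutive newlines to two
    foreach [search-string] ["  " "^-" " ^/" "^/^/^/"] [
        while [p: find my-area/text search-string] [remove p]
    ]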
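    ; decode &gt; and &lt; only after the tags are gone, so the decoded
    ; characters are not mistaken for markup
    replace/all my-area/text "&gt;" ">"
    replace/all my-area/text "&lt;" "<"
    trim my-area/text
    show my-area get-lines
]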
btn silver / 1.4 "Restore" [
    if marked-up = "" [return]
    my-area/text: marked-up
    show my-area get-lines
]
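
The whitespace cleanup can also be tried on its own, outside the button. This is only a minimal sketch: the word `text` and the sample string are made up for illustration and are not part of the script above. The sample has a space sitting between two newlines, which is the case Joel pointed out.

```rebol
REBOL [Title: "Whitespace cleanup sketch"]

; Hypothetical sample: a double space, then a space stranded before a run of newlines.
text: copy "one  two ^/^/^/^/three"

; Same idea as the loop in the "Strip html" button: each pass removes
; one character at the match until the pattern no longer occurs.
foreach search-string ["  " "^-" " ^/" "^/^/^/"] [
    while [p: find text search-string] [remove p]
]

probe text
```

The order of the patterns matters. The space before a newline has to go before the `"^/^/^/"` pass, otherwise newlines separated by a stray space would never be seen as a run of three.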