looking for a function...

[1/23] from: gchiu::compkarori::co::nz at: 7-Nov-2000 16:45

Has anyone got a function that strips out all the html from a page leaving just the text behind? -- Graham Chiu

[2/23] from: gchiu:compkarori at: 7-Nov-2000 19:03

On Tue, 07 Nov 2000 16:45:49 +1300 "Graham Chiu" <[gchiu--compkarori--co--nz]> wrote:

> Has anyone got a function that strips out all the html from
> a page leaving just the text behind?

Never mind. Found one.

[3/23] from: joel:neely:fedex at: 7-Nov-2000 9:59

Hi, Graham, [rebol-bounce--rebol--com] wrote:

> Has anyone got a function that strips out all the html from
> a page leaving just the text behind?

Given the following:

    load-text-only: func [where [file! url!] /local text] [
        text: make string! 10000
        foreach item load/markup where [
            if string? item [
                append text item
            ]
        ]
        text
    ]

and a %test.html file containing:

    <html>
    <head>
    <title>Test Page</title>
    </head>
    <body>
    <h1>Test Page</h1>
    <p>Here is a paragraph.</p>
    <p>Here is another one</p>
    <blockquote>Common sense is seldom both.</blockquote>
    </body>
    </html>

you can say:

>> load-text-only %test.html

== { Test Page Test Page Here is a paragraph. Here is another one Common sense is seldom both. }

Dealing with the surplus whitespace is "left as an exercise for the reader" ;-)

-jn-

--
; Joel Neely [joel--neely--fedex--com] 901-263-4460 38017/HKA/9677
REBOL [] foreach [order string] sort/skip reduce [ true "!" false head reverse "rekcah" none "REBOL " prin "Just " "another " ] 2 [prin string] print ""

[4/23] from: petr:krenzelok:trz:cz at: 7-Nov-2000 18:09

----- Original Message ----- From: Joel Neely <[joel--neely--fedex--com]> To: <[rebol-list--rebol--com]> Sent: Tuesday, November 07, 2000 4:59 PM Subject: [REBOL] Re: looking for a function...

> Hi, Graham, > [rebol-bounce--rebol--com] wrote:

<<quoted lines omitted: 35>>

> Dealing with the surplus whitespace is "left as an exercise for > the reader" ;-)

I'd better show you a way cooler approach ;-))

    text: copy ""
    parse load/markup http://Www.rebol.com [
        some [
            tag!
            | set str string! (
                insert tail text join trim/lines str either empty? str [""] [newline]
            )
        ]
    ]

-pekr-
'da rebolman ;-)
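The one-liner above can be wrapped into a reusable function. What follows is only a minimal sketch, assuming REBOL 2 semantics for `load/markup`, block `parse`, and `trim/lines`; the name `text-of` and its argument are hypothetical and do not come from the thread. It uses `any` rather than `some` so that an empty page is accepted, and it skips chunks that are empty after trimming.

```rebol
REBOL []

; Hypothetical helper: returns the text of a page, one chunk per line.
text-of: func [
    source [file! url! string!] "Where the HTML comes from"
    /local text str
][
    text: copy ""
    parse load/markup source [
        any [
            tag!                         ; tags are simply skipped
            | set str string! (
                trim/lines str           ; collapse whitespace in place
                if not empty? str [append text join str newline]
            )
        ]
    ]
    text
]

print text-of {<h1>Test Page</h1> <p>Here is   a paragraph.</p>}
```

Keeping `text` and `str` local avoids leaking them into the global context, which the interactive version does.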

[5/23] from: gchiu:compkarori at: 8-Nov-2000 9:08

On Tue, 7 Nov 2000 18:09:04 +0100 "Petr Krenzelok" <[petr--krenzelok--trz--cz]> wrote:

> text: copy ""
> parse load/markup http://Www.rebol.com [some [tag! | set str string! (insert
> tail text join trim/lines str either empty? str [""][newline])]]

Thanks guys! Is load/markup a new refinement to load? Hadn't noticed it before. -- Graham Chiu
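For readers who have not met `load/markup` before, here is a small sketch of what it hands back. It assumes REBOL 2 behaviour, where the result is a block in which markup arrives as tag! values and everything else as string! values; the sample HTML is made up for illustration.

```rebol
REBOL []

; Illustrative input only; a file! or url! works the same way.
items: load/markup {<h1>Title</h1><p>Some <b>bold</b> text.</p>}

; Show the datatype of each element next to its molded value.
foreach item items [
    print [mold type? item  mold item]
]
```

This split into tag! and string! values is what lets both functions above drop the tags with a simple type test.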

[6/23] from: joel:neely:fedex at: 7-Nov-2000 15:55

[rebol-bounce--rebol--com] wrote:

> text: copy ""
> parse load/markup http://Www.rebol.com [some [tag! | set str string! (insert
> tail text join trim/lines str either empty? str [""][newline])]]

Outstanding!

-jn-

--
; Joel Neely [joel--neely--fedex--com] 901-263-4460 38017/HKA/9677
REBOL [] foreach [order string] sort/skip reduce [ true "!" false head reverse "rekcah" none "REBOL " prin "Just " "another " ] 2 [prin string] print ""

[7/23] from: rishi:picostar at: 9-Nov-2000 21:23

someone told me next quarter...but I think he was joking! BTW, next assignment is up... rishi Previously, you (Graham Chiu) wrote:

[8/23] from: bo:rebol at: 10-Nov-2000 10:19

Graham,

Here is one I hacked together over a year ago. It tries to keep some of the formatting features of the HTML, but only on a very basic level.

EXAMPLE USAGE:

    text: striptags read http://www.rebol.com

Have fun!

-Bo

On 9-Nov-2000/21:23:39, [rishi--picostar--com] wrote:

>someone told me next quarter...but I think he was joking! BTW, next assignment is up... >rishi

<<quoted lines omitted: 14>>


-- Bohdan "Bo" Lechnowsky REBOL Adventure Guide REBOL Technologies 707-467-8000 (http://www.rebol.com) The Official Source for REBOL Books (http://www.REBOLpress.com) -- Binary/unsupported file stripped by Listar -- -- Type: application/octet-stream -- File: striptags.r

[9/23] from: bo:rebol at: 10-Nov-2000 10:57

Bit by our own list...here it is in plain text!

-Bo

--Striptags--
REBOL [
    Title: "HTML Tag Stripper"
    Date: 20-Jul-1999
    Author: "Bohdan Lechnowsky"
    Email: [bo--rebol--com]
    Purpose: { To strip off HTML tags leaving only text behind }
]

striptags: func [page /local text end] [
    multi-replace: func [
        {Replaces multiple items in a file}
        pg [series!] {The series to replace items in}
        blk [block!] {A block of search and replace elements}
    ][foreach [srch rplc] blk [replace/all pg srch rplc]]
    ;table of tags and more suitable ASCII characters
    page: multi-replace trim/lines page [ "<TITLE>" "TITLE: " "</TITLE>" " " " " "<TD>" " | " "</TD>" " | " " | | " " | " "<TR>" " " "</TR>" " <TABLE" " <" "</TABLE>" " <P>" " <LI>" " � " "<BR>" "  " " " "&gt;" ">" "&lt;" "<" "&copy;" "(c)" "&amp;" "&" "&quot;" {"} "</H1>" " </H2>" " </H3>" " </H4>" " </H5>" " </H6>" " <HR" " ---------- <" ]
    text: copy ""
    append page "<"
    append text copy/part page find page "<"
    while [page: find/tail page ">"] [
        if (first page) <> #"<" [
            if found? end: find page "<" [
                append text copy/part page end
            ]
        ]
    ]
    return append text "
]
--End Striptags--

On 10-Nov-2000/10:19:20-7:00, [bo--rebol--com] wrote:

>Graham, >Here is one I hacked together over a year ago. It tries to keep some of

<<quoted lines omitted: 17>>


<<quoted lines omitted: 17>>


-- Bohdan "Bo" Lechnowsky REBOL Adventure Guide REBOL Technologies 707-467-8000 (http://www.rebol.com) The Official Source for REBOL Books (http://www.REBOLpress.com)

[10/23] from: ddalley:idirect at: 10-Nov-2000 23:34

Hi, Bo: I have a question about this e-mail. As you can see below, the line breaks are not right. The lines that have a single quote (") on it probably had a NEWLINE of some sort on the previous line. This causes an error when run, so what is really supposed to be between the two quote marks? I figured out how to get it to run OK, but, for my purposes, the retained text needs much better formatting - close to how the author wanted to format it. Using an optional arg for the max-linelength would also help. If someone knows how and is willing to do this, I would dearly like to see an improved version. Anyone who has ever used a good HTML stripper, such as the excellent HTTX (Amiga), knows how useful they can be, when called by other programs. Thanks for the seed, Bo! -- ---===///||| Donald Dalley |||\\\===--- The World of AmiBroker Support http://webhome.idirect.com/~ddalley UIN/ICQ#: 65203020 On 10-Nov-00, [bo--rebol--com] wrote:

[11/23] from: ingo:2b1 at: 11-Nov-2000 10:15

Hi Donald, attached is the html-to-text engine I used in my %browser.r, it's a little rough at some edges, but might keep you going ... kind regards, Ingo Once upon a time Donald Dalley spoketh thus:

> Hi, Bo: > I have a question about this e-mail. As you can see below, the line breaks are

<<quoted lines omitted: 8>>

> (Amiga), knows how useful they can be, when called by other programs. > Thanks for the seed, Bo!

-- Attached file included as plaintext by Listar --

REBOL [
    Title: "HTML to Text Converter"
    File: %html-to-text.r
    Date: 2000-06-10
    Author: "Ingo Hohmann"
    Email: [ingo--2b1--de]
    Site: http://www.2b1.de/
    Rights: "(c) Ingo Hohmann"
    Purpose: {Create text from html}
    Comments: { extracted from my browser.r (which should be updated to current /View, btw.) }
]

html: make object! [
    help: {A html parser}
    evaluate: false
    read-error: none
    skip: false
    spaces: charset " ^-^/"
    non-spaces: complement spaces
    delimiters: charset { ^-^/="}
    non-delimiters: complement delimiters
    html-source: copy ""

    get-html: func [][ return html-source ]

    find-base: func [ url [url! file!] /local u2][
        if #"/" = last url [ return url ]
        if exists? u2: to-url rejoin [ url "/" ] [ return u2 ]
        first split-path url
    ]

    conv-list: [
        "&amp;" "&" "&lt;" "<" "&gt;" ">" "&quot;" {"}
        "&auml;" "ä" "&Auml;" "Ä" "&ouml;" "ö" "&Ouml;" "Ö"
        "&uuml;" "ü" "&Uuml" "Ü" "&szlig;" "ß" "&nbsp;" " "
    ]

    clean: func [
        {Converts html-entities to special-characters}
        text [string!] /local special entity
    ] [
        foreach [special entity] conv-list [
            replace/all text special entity
        ]
        text
    ]

    parse-tag: func [
        {parses a tag, returns block of tag-name arguments}
        tag /local tag-name tag-params
    ] [
        name-rule: [ some non-delimiters ]
        param-rule: [
            any spaces [
                copy param-name name-rule (append tag-params param-name)
                any spaces [
                    "=" any spaces [
                        {"} copy param-val to {"} skip
                        | {'} copy param-val to {'} skip
                        | copy param-val some non-delimiters skip
                    ] (append tag-params param-val)
                    | (append tag-params true)
                ]
            ]
        ]
        tag-params: copy []
        parse/all tag [ copy tag-name name-rule any param-rule ]
        compose [ (tag-name) (tag-params) ]
    ]

    read: func [
        {read url and return the page as ...}
        url [url! file!]
        /html "html source"
        /text "text"
        /links "link-list"
        /local data txt lnk return-block
    ] [
        return-block: copy []
        either error? err: try [data: read url] [
            read-error: disarm err
        ] [
            read-error: none
            if html [ append return-block data ]
            if any [ txt links ] [
                set [txt lnk] to-text url data
                if text [ append return-block txt ]
                if links [ append/only return-block lnk ]
            ]
            print dir? url
        ]
    ]

    to-text: func [
        {Convert html to text, url is needed for handling of relative urls}
        url [url! file!] html [string!]
        /local elem txt links link lfd link-blk pos end-pos the-script script-funcs
    ] [
        script-funcs: make object! [
            print: func [val][ insert pos load/markup form join val newline ]
            prin: func [val][ insert pos load/markup form val ]
        ]
        url: find-base url
        links: copy []
        link-blk: copy []
        html-source: copy html
        html: load/markup html
        txt: make string! 500
        lfd: 0
        parse html [
            some [
                set elem string! (
                    if not skip [
                        if 0 < length? trim/lines elem [
                            append txt rejoin [ elem " "]
                        ]
                    ]
                )
                | pos: set elem tag! (
                    elem: parse-tag elem
                    switch first elem [
                        "a" [
                            lfd: lfd + 1
                            append txt rejoin [ "(" lfd ")" ]
                            elem: select elem "href"
                            if elem [
                                if all [ not find elem "://" not find elem "mailto:"] [
                                    ; <a name="top"> ???
                                    elem: rejoin [ url elem ]
                                ]
                                append links compose [ (lfd) (elem)]
                            ]
                        ]
                        "img" [
                            either elem: select elem "alt" [
                                append txt rejoin [ "[" elem "]" ]
                            ] [
                                append txt "[graphic]"
                            ]
                        ]
                        "p" [append txt "^/^/"]
                        "br" [append txt newline]
                        "hr" [append txt "^/------------------------------------^/" ]
                        "li" [append txt "^/* "]
                        "ul" [append txt newline]
                        "/ul" [append txt newline]
                        "ol" [append txt newline]
                        "/ol" [append txt newline]
                        "div" [append txt newline]
                        "/div" [append txt newline]
                        "blockquote" [append txt newline]
                        "/blockquote" [append txt newline]
                        "style" [skip: true]
                        "/style" [skip: false]
                        "pre" [
                            end-pos: find pos </pre>
                            append txt rejoin [ newline copy/part next pos end-pos ]
                            pos: end-pos
                        ]
                        "script" [
                            either all [ evaluate "rebol" = select elem "language" ] [
                                end-pos: find pos </script>
                                the-script: copy/part next pos end-pos
                                remove/part pos next end-pos
                                if error? err: try [
                                    do bind load rejoin the-script in script-funcs 'print
                                ] [
                                    ;inform layout [
                                    ;    subtitle red "Error in script !"
                                    ;    text mold disarm err
                                    ;]
                                    print ["Error in Script: ^/" mold disarm err]
                                ]
                                pos: back pos
                            ][
                                skip: true
                            ]
                        ]
                        ; do bind load rejoin n in t 'print
                        "/script" [skip: false]
                        "/title" [append txt "^/^/"]
                    ]
                    pos: next pos
                ) :pos
            ]
        ]
        txt: clean txt
        append txt "^/^/^/The links:^/-------^/^/"
        foreach [lfd link] links [ append txt rejoin [ lfd " " link newline]]
        foreach [lfd link] links [
            if not find link "mailto:" [append link-blk rejoin [ lfd " " link]]
        ]
        return compose/deep [ (txt) [(link-blk)]]
    ]
]

[12/23] from: ddalley:idirect at: 11-Nov-2000 11:11

On 11-Nov-00, Ingo Hohmann wrote:

> attached is the html-to-text engine I used in my %browser.r, > it's a little rough at some edges, but might keep you going ...

Thanks, Ingo, but I need some help with usage, especially for locating the HTML source. -- ---===///||| Donald Dalley |||\\\===--- The World of AmiBroker Support http://webhome.idirect.com/~ddalley UIN/ICQ#: 65203020

[13/23] from: ingo:2b1 at: 12-Nov-2000 11:15

Hi Donald, well, yes, thought that, too ... _after_ I sent it, and when I didn't have the time to send a followup ... I originally wanted a stand-alone html-to-text engine, but when I created the browser.r script I got a little distracted. These are the two main functions:

>> help html/read

Valid subpath "html/read" is function:
USAGE:
    OBJ url /html /text /links
DESCRIPTION:
    read url and return the page as ...
    OBJ is a function value.
ARGUMENTS:
    url -- (Type: url file)
REFINEMENTS:
    /html -- html source
    /text -- text
    /links -- link-list

'Read will read a file/url, and return nothing, because I now see that I don't really return the assembled return-value ... it _should_ have been a block containing the string values for

    /html   source
    /text   converted from html
    /links  found in the source

>> help html/to-text

Valid subpath "html/to-text" is function:
USAGE:
    OBJ url html
DESCRIPTION:
    Convert html to text, url is needed for handling of relative urls
    OBJ is a function value.
ARGUMENTS:
    url -- (Type: url file)
    html -- (Type: string)

'to-text returns a block containing the converted text, and a block of strings like "cnt http://www.2b1.de/" where cnt is a reference number inserted in the text, where the link has been. (The 'url argument is really only needed to convert relative urls.)

I hope that helps

kind regards,

Ingo

Once upon a time Donald Dalley spoketh thus:

[14/23] from: brett:codeconscious at: 13-Nov-2000 3:07

> Anyone who has ever used a good HTML stripper, such as the excellent HTTX > (Amiga), knows how useful they can be, when called by other programs.

How does such a thing deal with tables and other anti-text HTML format elements? Brett.

[15/23] from: al:bri:xtra at: 13-Nov-2000 7:56

> > Anyone who has ever used a good HTML stripper, such as the excellent HTTX (Amiga), knows how useful they can be, when called by other programs. > How does such a thing deal with tables and other anti-text HTML format elements?

Convert to preformatted using tabs or lots of spaces? Andrew Martin

[16/23] from: brett:codeconscious at: 13-Nov-2000 10:50

> > > Anyone who has ever used a good HTML stripper, such as the excellent HTTX (Amiga), knows how useful they can be, when called by other programs.

> > How does such a thing deal with tables and other anti-text HTML format elements?

> Convert to preformatted using tabs or lots of spaces?

I was thinking more from the input side rather than the output side. Tables for example, are quite often used for layout reasons as opposed to delivering tabular data. Throw in the use of colspan and the fact that some cells will contain navigation and other cells will contain content or even nested tables and it becomes somewhat confusing. I was curious as to how these things are treated by the HTML stripper. Or is it a GUI thing, that you highlight certain areas that you want extracted? Brett.

[17/23] from: ddalley:idirect at: 13-Nov-2000 1:51

Hi, Brett: On 12-Nov-00, Brett Handley wrote:

> > Anyone who has ever used a good HTML stripper, such as the excellent HTTX > > (Amiga), knows how useful they can be, when called by other programs. > How does such a thing deal with tables and other anti-text HTML format > elements?

For my needs, HTTX strips tables just fine - all I need is the contents; I don't care about its final format. The stripped format depends upon the author's talents (or lack thereof). Sometimes the stripped format clearly resembles the original table; sometimes it's a jumbled mess. So long as the format of each page is at least somewhat consistent, I can find what I want. Parsing does all of the hard work. Normally, non-text elements (HTML format controls) are of no value to me and are fully eliminated. -- ---===///||| Donald Dalley |||\\\===--- The World of AmiBroker Support http://webhome.idirect.com/~ddalley UIN/ICQ#: 65203020

[18/23] from: ddalley:idirect at: 13-Nov-2000 2:02

On 12-Nov-00, Brett Handley wrote:

> > > How does such a thing deal with tables and other anti-text HTML format elements?

<<quoted lines omitted: 5>>

> these things are treated by the HTML stripper. Or is it a GUI thing, that > you highlight certain areas that you want extracted?

Try some and find out, Brett. It depends upon the author of the source and the talents of the stripper program, too. You need to answer (to yourself): how is the stripped text going to be used, by you or another program? If you need to be able to easily read a complicated table, you may be out of luck, but, for the most part, it all works just fine. -- ---===///||| Donald Dalley |||\\\===--- The World of AmiBroker Support http://webhome.idirect.com/~ddalley UIN/ICQ#: 65203020

[19/23] from: bo:rebol at: 13-Nov-2000 10:16

Donald, Using the old autoextract function found in the script library, I am trying this one more time. This time, the file should retain everything it is supposed to! Just DO this email. Self-extracting REBOL-compressed file REBOL [ Title: "Self-extracting compressed file" Date: 13-Nov-2000/10:06:12-7:00 File: %striptags.r Author: "Autoextract function by Bohdan Lechnowsky" Comment: { Simply run this script and it will decompress and save the file for you } ] if exists? %striptags.r [ print {striptags.r already exists, please rename existing file and run again.} halt ] write %striptags.r decompress #{ 789C8594DF6E9B3014C6AFC3539C31A96A2F18EBFE49A551B6648D944C9956A5 DC21221962C0AAB1996DD655599F6BAF3763023169A32184B07FDF777C7CC067 3D9FFD5841E480BE42A2280E00DC45F87D0521CAE14E09525558B886DF20D560 78F7D6FB5653EFF2EAEACACC4F6B557011803BE3C5163158E1B460FC41DE3FB6 BE798908D5C6847F1138E1F44DCA4B036E6B5171A963EECCD0E4C041368B02CF 32307928944BA018FD222C07CEE82328FC5B41820BC2B6C6F7E4C48E635C8D36 80AC66294415CA31F894A788B60ECCB6B1DEA9332A6BAA88277045518A3BB933 1AEDD6ED9404A3A82806A2702981304090118A9FB4AACA4791C48260F92A865D 58606847A038EC63F6B6469FD07B88129DC77DA39F8279D5DBD33624D20210DB F63E4C71899992DA17471917186941241B99A8681A8309B657FB8852A8723860 5D86A61ED70A253A77BD84A95DB340A96381AC494BA6775F974B480B2450AAB0 90C6D5D42B80416D40D7B4F429617A77A69C51FF9DDC71B80C57F3896B06E65D FF0016F60FDCDDF81631773FB02DE1CDA443EEC6FBB3F186F17AFA0C9A713BF9 82335C1FC21EA5B83E84F40796E96C3577F7603CF434E8856D8D6FFB558EC96A 69EDCBDF787F0759CCD636B5C8194B6475ED3E4FFC2C57D7BD6562036A013BED B394578F5DA8F3F4C246A8EC16D1031BFCAC79176EE73ED935585C9EA89BBF78 7792BC3F493E9C241F4F924F27C8626D7F05AFBFFACF189B67D3100268AA02AE DB9E1AA41B9D3E29E63FEF8AB79F33EDA311FB1512AA9564E458FC50E80ED1F6 9DC0605FE9BEB7D74CDCD83A3D2483F38C08D9C6BA80F1045EEB309662AFCA78 CDB69F9BD6150C573C92FE2759BCEF94DD153BC3B7F629B0AA051BC431D58D9D 7F453C7D5B26060000 } quit On 10-Nov-2000/23:34:19, [ddalley--idirect--com] wrote:

>Hi, Bo: >I have a question about this e-mail. As you can see below, the line breaks are

<<quoted lines omitted: 13>>

> http://webhome.idirect.com/~ddalley > UIN/ICQ#: 65203020

-- Bohdan "Bo" Lechnowsky REBOL Adventure Guide REBOL Technologies 707-467-8000 (http://www.rebol.com) The Official Source for REBOL Books (http://www.REBOLpress.com)

[20/23] from: ddalley:idirect at: 13-Nov-2000 19:51

Hi, Bo: On 13-Nov-00, [bo--rebol--com] wrote:

> Using the old autoextract function found in the script library, I am trying > this one more time. This time, the file should retain everything it is > supposed to! Just DO this email.

That worked. The script failed on some odd/poor HTML code that I couldn't figure out how to trap:

>> WRITE %test.txt DO striptags READ %test.html

** Syntax Error: Invalid word -- -->.
** Where: (line 4) //--> -->

The only place that "//--> --> " appears is in the 4th line of the written file, not the original. The original has "//-->" & "-->" on separate lines, so these two are being joined together. Any ideas?

Otherwise, your program does its job well. Thanks!

--
---===///||| Donald Dalley |||\\\===---
The World of AmiBroker Support
http://webhome.idirect.com/~ddalley
UIN/ICQ#: 65203020

[21/23] from: al:bri:xtra at: 14-Nov-2000 16:32

> The script failed on some odd/poor HTML code that I couldn't figure out how to trap:
>
> >> WRITE %test.txt DO striptags READ %test.html
> ** Syntax Error: Invalid word -- -->.
> ** Where: (line 4) //--> -->
>
> The only place that "//--> --> " appears is in the 4th line of the written file, not the original. The original has "//-->" & "-->" on separate lines, so these two are being joined together.
>
> Any ideas?

Try this line instead:

    do striptags.r WRITE %test.txt striptags READ %test.html

'do-ing the text returned from 'striptags caused this error:

> ** Syntax Error: Invalid word -- -->.
> ** Where: (line 4) //--> -->

I hope that helps! Andrew Martin ICQ: 26227169 http://members.nbci.com/AndrewMartin/

[22/23] from: ddalley:idirect at: 14-Nov-2000 4:10

On 14-Nov-00, Andrew Martin wrote:

> Try this line instead:
> do striptags.r WRITE %test.txt striptags READ %test.html

Andrew: the result is:

>> do striptags.r WRITE %test.txt striptags READ %test.html

** Script Error: striptags.r has no value.
** Where: do striptags.r WRITE %test.txt striptags

> 'do-ing the text returned from 'striptags caused this error:
> ** Syntax Error: Invalid word -- -->.
> ** Where: (line 4) //--> -->

Ah, OK. When I invoked the strip operation, then did a write on a separate line, there was no error. I was trying to think of a way to run striptags.r at the same time, for times when it has not already been run. Obviously, I did it wrongly. I still have not removed the two odd lines of HTML, though. -- ---===///||| Donald Dalley |||\\\===--- The World of AmiBroker Support http://webhome.idirect.com/~ddalley UIN/ICQ#: 65203020
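One possible way to load the script only when it has not been run yet is to test the word with `value?` first. This is just a sketch, assuming that %striptags.r (Bo's script) and %test.html sit in the current directory and that the script defines the global word `striptags`.

```rebol
REBOL []

; Load Bo's script only if the striptags word has no value yet.
; Assumption: %striptags.r is in the current directory.
if not value? 'striptags [do %striptags.r]

; Strip the page and save the plain text.
write %test.txt striptags read %test.html
```

Note the `%` in front of the file names: without it REBOL treats `striptags.r` as an ordinary word and looks for its value.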

[23/23] from: al:bri:xtra at: 14-Nov-2000 23:42

Donald wrote:

> Andrew: the result is:
>
> >> do striptags.r WRITE %test.txt striptags READ %test.html
> ** Script Error: striptags.r has no value.
> ** Where: do striptags.r WRITE %test.txt striptags

I forgot to write the "%". It should have read:

    do %striptags.r WRITE %test.txt striptags READ %test.html

Donald wrote:

> ...still have not removed the two odd lines of HTML, though.

Try this version. Note that I've added in:

    "//-->" "" "-->" ""

in multi-replace. If there's any other strange stuff in there, just do as I did and extend Striptags yourself, to remove those characters. Why not give it a go?

[
REBOL [
    Title: "HTML Tag Stripper"
    Date: 20-Jul-1999
    Author: "Bohdan Lechnowsky"
    Email: [bo--rebol--com]
    Purpose: { To strip off HTML tags leaving only text behind }
]

striptags: func [page /local text end] [
    multi-replace: func [
        {Replaces multiple items in a file}
        pg [series!] {The series to replace items in}
        blk [block!] {A block of search and replace elements}
    ][
        foreach [srch rplc] blk [
            replace/all pg srch rplc
        ]
    ]
    ;table of tags and more suitable ASCII characters
    page: multi-replace trim/lines page [
        "<TITLE>" "TITLE: "
        "</TITLE>" "^/"
        " " " "
        "<TD>" "^-|^-"
        "</TD>" "^-|^-"
        "^-|^-^-|^-" "^-|^-"
        "<TR>" " "
        "</TR>" "^/"
        "<TABLE" "^/<"
        "</TABLE>" "^/"
        "<P>" "^/"
        "<LI>" "^/^-� "
        "<BR>" "^/"
        "&nbsp;" " "
        "&gt;" ">"
        "&lt;" "<"
        "&copy;" "(c)"
        "&amp;" "&"
        "&quot;" {"}
        "</H1>" "^/"
        "</H2>" "^/"
        "</H3>" "^/"
        "</H4>" "^/"
        "</H5>" "^/"
        "</H6>" "^/"
        "<HR" "^/----------^/<"
        "//-->" ""
        "-->" ""
    ]
    text: copy ""
    append page "<"
    append text copy/part page find page "<"
    while [page: find/tail page ">"] [
        if (first page) <> #"<" [
            if found? end: find page "<" [
                append text copy/part page end
            ]
        ]
    ]
    return append text "^/"
]
]

I hope that helps!

Andrew Martin
ICQ: 26227169
http://members.nbci.com/AndrewMartin/
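A different way to deal with leftover comment markers is to cut whole HTML comments out of the page before it reaches striptags. The function below is only a sketch and not part of Bo's or Andrew's script; the name `strip-comments` is hypothetical. It removes everything from each `<!--` up to and including the next `-->`, and drops the rest of the page if a comment is never closed.

```rebol
REBOL []

; Hypothetical helper: removes <!-- ... --> comments from a string in place.
strip-comments: func [
    page [string!] "HTML source (modified)"
    /local start finish
][
    while [start: find page "<!--"] [
        either finish: find/tail start "-->" [
            remove/part start finish    ; cut the whole comment
        ][
            clear start                 ; unterminated comment: drop the rest
        ]
    ]
    page
]

print strip-comments copy "before<!-- hidden //-->after"
```

It could be combined with the script above as `striptags strip-comments read %test.html`. A stray `-->` that has no matching `<!--` is not touched by this sketch, so the extra entries in the multi-replace table are still useful for pages like Donald's.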

Notes

Quoted lines have been omitted from some messages.