Mailing List Archive: 49091 messages

Parsing question

 [1/5] from: hopeless:eircom at: 4-Oct-2000 12:52


Is it possible to parse a page with the following pseudo-rule:

    copy from the 5th occurrence of "something" to the 9th occurrence of "whatever"

I have found lots of references to thru and to, but I can't quite get my head around how to use them to perform this.

Many thanks,
Jamie

 [2/5] from: brett:codeconscious at: 5-Oct-2000 0:47


Short answer: it depends (doesn't it always :) ).

Long answer: if I assume there is no relationship between the 5th occurrence of "something" (A) and the 9th occurrence of "whatever" (B), then the answer is no, you probably cannot do it in one rule. The reason is that you want parse to go back and process the input a second time in order to find B. So you need to call parse once to find A, again to find B, and then just do a copy/part A B.

If, however, there is a relationship, such as all 9 occurrences of "whatever" always following A, then yes, you can probably do it in a single parse rule.

So what you need to do to use parse is identify the rules that the page structure follows, or at least a pattern that you can exploit. If you have a concrete example, it might help.

Brett.
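A minimal sketch of the two-pass idea described above, assuming REBOL 2. It uses find rather than two parse calls to locate each position, which is a swapped-in simplification. The helper name nth-occurrence and the sample string are hypothetical, chosen only for illustration, and the counts are smaller than 5 and 9 to keep the sample short.

```rebol
REBOL [Title: "Two-pass copy between n-th occurrences (sketch)"]

; Hypothetical helper: return the string positioned at the n-th
; occurrence of target, or none if there are fewer than n. Assumes n >= 1.
nth-occurrence: func [
    text [string!] target [string!] n [integer!]
    /local pos
][
    pos: text
    loop n [
        pos: find pos target
        if none? pos [return none]
        pos: next pos          ; step past the start of this match
    ]
    back pos                   ; back up to the start of the n-th match
]

sample: "aa--bb--aa--bb--aa--bb"

start:  nth-occurrence sample "aa" 2    ; position A
finish: nth-occurrence sample "bb" 3    ; position B

; Only copy when both were found and A comes before B.
if all [start finish (index? start) < (index? finish)] [
    print copy/part start finish        ; should print aa--bb--aa--
]
```

The copy stops just before B. To include the text of B itself, advance finish by the length of the target first (skip finish length? "bb").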

 [3/5] from: hopeless:eircom at: 4-Oct-2000 15:31


Okay, maybe I was being too generic.

Example: parsing a web page to locate the fifth table. This basically involves looking for the fifth occurrence of "<table>" and then the second occurrence of "</table>" after it (as there is an embedded table in one of the cells).

    <html>
    <body>
    <table></table>
    <table></table>
    <table></table>
    <table></table>
    <table>
    <tr>
    <td><table></table></td>
    </tr>
    </table>
    </body>
    </html>

The content I'd like to be copied is "<tr><td><table></table></td></tr>" (don't ask why I'd want this).

Hope that clears up what I'm trying to do.

Cheers,
Jamie

> -----Original Message-----
> From: [brett--codeconscious--com] [mailto:[brett--codeconscious--com]]
<<quoted lines omitted: 35>>

 [4/5] from: rebol:svendx:dk at: 4-Oct-2000 18:08


Hello [hopeless--eircom--net],

Here's a parse rule that does the trick:

    my-table-rule: [
        4 [thru "<table>" thru "</table>"]
        1 [thru "<table>" copy my-text [thru "</table>" to "</table>"]]
        to end
    ]

"to end" isn't strictly necessary, but will make parse return true if the text was successfully copied.

Best regards
Thomas Jensen
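To try this rule against the sample page from message 3, something like the following should work. It is a sketch assuming REBOL 2, where the /all refinement stops parse from treating whitespace specially. The word page and the "no match" message are illustrative names, not part of the original posts.

```rebol
REBOL [Title: "Fifth-table rule applied to the sample page (sketch)"]

; The sample page from the thread, flattened onto one line.
page: {<html> <body> <table></table> <table></table> <table></table> <table></table> <table> <tr> <td><table></table></td> </tr> </table> </body> </html>}

my-table-rule: [
    ; skip the first four complete tables
    4 [thru "<table>" thru "</table>"]
    ; step inside the fifth table, then copy past the nested table's
    ; closing tag and up to (not including) the outer closing tag
    1 [thru "<table>" copy my-text [thru "</table>" to "</table>"]]
    ; consume the rest so that parse returns true on success
    to end
]

either parse/all page my-table-rule [
    print my-text
][
    print "no match"
]
```

The rule relies on the page layout being known in advance: the first four tables must contain no nested tables, and the fifth must contain exactly one. A page with a different nesting pattern would need a different rule.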

 [5/5] from: hopeless::eircom::net at: 4-Oct-2000 17:24


Aha! All so simple when you realise you can put numbers in there! I missed that in the new manual.

Cheers,
Jamie

> -----Original Message-----
> From: [rebol--svendx--dk] [mailto:[rebol--svendx--dk]]
<<quoted lines omitted: 42>>
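For reference, a small sketch of the "numbers" in question, assuming REBOL 2: an integer in front of a rule repeats it exactly that many times, and two integers give a minimum and maximum count. The input strings are made up for illustration.

```rebol
REBOL [Title: "Repeat counts in parse rules (sketch)"]

print parse/all "aaaa" [4 "a"]      ; exactly four "a": true
print parse/all "aaa"  [2 4 "a"]    ; between two and four "a": true
print parse/all "a"    [2 4 "a"]    ; fewer than two "a": false
```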

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted