Mailing List Archive: 49091 messages

[REBOL] Parse's current index

From: hallvard::ystad::helpinhand::com at: 17-Jan-2002 10:47

Hello everyone,

I've got a decode-url function from somewhere. I did a search to find out where, but didn't succeed. I have searched the escribe site as well, but with no luck. (Did I write it myself?) Here's the code:

    decode-url: func [to-decode /local hex] [
        hex: charset "0123456789ABCDEFabcdef"
        parse/all to-decode [some [
            copy entity insert-point: ["%" 2 hex] (
                insert-point: remove/part insert-point 3
                insert insert-point to-char to-integer to-issue next entity
            )
            | skip
        ]]
        to-decode
    ]

Now I discovered that the code has a problem: once it finds an entity, it replaces three characters with one. As the parse continues, of two adjacent entities only the first will be replaced, since parse suddenly finds itself in the middle of the next one after the replace:

>> decode-url "http%3A%2F%2Fwww.rebol.com%2F"
== "http:%2F/www.rebol.com/"

I looked at different parse tutorials, including yours, Brett, to manipulate parse's index. But look at this:

    decode-url: func [to-decode /local hex] [
        hex: charset "0123456789ABCDEFabcdef"
        parse/all to-decode [some [
            copy entity insert-point: ["%" 2 hex] (
                insert-point: remove/part insert-point 3
                insert insert-point to-char to-integer to-issue next entity
                print join "entity: " entity
                print join "instert-point after replace: " insert-point
            )
            | (print join "not %: " insert-point) skip
        ]]
        to-decode
    ]

>> print decode-url "http%3A%2F%2Fwww.rebol.com%2F"
not %: http%3A%2F%2Fwww.rebol.com%2F
not %: ttp%3A%2F%2Fwww.rebol.com%2F
not %: tp%3A%2F%2Fwww.rebol.com%2F
not %: p%3A%2F%2Fwww.rebol.com%2F
entity: %3A
instert-point after replace: :%2F%2Fwww.rebol.com%2F
not %: F%2Fwww.rebol.com%2F
entity: %2F
instert-point after replace: /www.rebol.com%2F
not %: w.rebol.com%2F
not %: .rebol.com%2F
not %: rebol.com%2F
not %: ebol.com%2F
not %: bol.com%2F
not %: ol.com%2F
not %: l.com%2F
not %: .com%2F
not %: com%2F
not %: om%2F
not %: m%2F
entity: %2F
instert-point after replace: /
not %: /
not %:
http:%2F/www.rebol.com/

So the insert-point is perfectly well situated to continue, but it seems that once an entity is evaluated and replaced, 'parse continues at the index where it left off *in*the*original*string*. I suppose this is only natural and as it should be, but I haven't had enough coffee to find a workaround this morning (except this:

    replace/all the_url "%3A" ":"
    replace/all the_url "%2F" "/"
    replace/all the_url "\" "/"

but I'd prefer my decode-url method to work). Do I have to rewrite the rule to look only for "%", so that the next two characters are untouched?

~H
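
One possible workaround, offered as a minimal sketch rather than a definitive fix: since parse appears to carry on from a numeric index that no longer matches the shortened string, the rule can hand the corrected position back to parse. In REBOL 2 a get-word inside a parse rule resets the input position to the series that word refers to. The word `mark` and the trailing `:mark` below are illustrative additions that do not appear in the original function; `entity` is also declared local here, which the original did not do.

```rebol
REBOL [Title: "decode-url sketch"]

; Hypothetical rewrite of decode-url. The word `mark` and the
; `:mark` reset are assumptions added for illustration.
decode-url: func [to-decode /local hex entity mark] [
    hex: charset "0123456789ABCDEFabcdef"
    parse/all to-decode [
        some [
            ; remember where the candidate entity starts
            mark: copy entity ["%" 2 hex] (
                ; remove the three-character entity, insert the decoded
                ; character, and keep the position just past that character
                mark: insert remove/part mark 3
                    to-char to-integer to-issue next entity
            )
            ; a get-word in a rule moves parse's input position to `mark`
            :mark
            | skip
        ]
    ]
    to-decode
]

print decode-url "http%3A%2F%2Fwww.rebol.com%2F"
```

Because parse resumes immediately after the inserted character, an entity that directly follows another one is examined from its own "%" instead of from its middle. If the only goal is to decode %xx sequences, REBOL's built-in `dehex` function may already be sufficient, though it does not answer the question about parse's index.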