Mailing List Archive: 49091 messages

Another question

 [1/15] from: stopm::mediaone::net at: 14-Feb-2002 12:46


I'm trying to figure out parse... how, for instance, would I use a parse expression to break some text into words, which are grouped with quotation marks (so you could make "foo bar" one word), but quotation marks preceded by a backslash are treated as normal text?

Hello world, I said -> ["Hello world", "I", "said"]

Also, how would I match on a left bracket, any member of a block, and then a right bracket? Here's a pseudocode example:

[ FOLLOWED BY ("red" OR "green" OR "blue") FOLLOWED BY "]"

Alex

 [2/15] from: greggirwin:mindspring at: 14-Feb-2002 12:43


Hi Alex,

<< I'm trying to figure out parse... how, for instance, would I use a parse expression to break some text into words, which are grouped with quotation marks (so you could make "foo bar" one word), but quotation marks preceded by a backslash are treated as normal text? Hello world, I said -> ["Hello world", "I", "said"] >>

Do you need to escape the quotes (i.e. is there source data that contains them)? If not, parse's behavior should be pretty darn close:

>> parse {"Hello world," I said} none
== ["Hello world," "I" "said"]

Hmmm. If the escapes exist, you might also just replace them:

>> parse replace/all {\"Hello world,\" I said} {\"} {"} none
== ["Hello world," "I" "said"]

Though that still removes them from the text. You'll probably have to write some rules if that doesn't work for you.

<< Also, how would I match on a left bracket, any member of a block, and then a right bracket? Here's a pseudocode example: [ FOLLOWED BY ("red" OR "green" OR "blue") FOLLOWED BY "]" >>

Can you use block parsing? REBOL can recognize things for you in many cases.

>> parse [[blue]] [set b block!]
== true
>> b
== [blue]

--Gregg

 [3/15] from: stopm:mediaone at: 14-Feb-2002 16:44


> Do you need to escape the quotes (i.e. is there source data that contains
> them)? If not, parse's behavior should be pretty darn close:

No, I am reading in this data from a port.

> Hmmm. If the escapes exist, you might also just replace them:
>
> >> parse replace/all {\"Hello world,\" I said} {\"} {"} none
> == ["Hello world," "I" "said"]

The point is, these escaped quotes should not be treated as grouping constructs, but as normal text. So this:

\"Hello world,\" I said

should result in [{\"Hello} {world,\"} {I} {Said}].

> Can you use block parsing? REBOL can recognize things for you in many
> cases.

I don't know, I'm trying to find "[" + "red" or "blue" or "green" + "]" within a string, *find* it, not just see if a string matches it.

Alex

 [4/15] from: carl:cybercraft at: 15-Feb-2002 11:48


On 15-Feb-02, Gregg Irwin wrote:
> Hi Alex,
> << I'm trying to figure out parse... how, for instance, would I use
<<quoted lines omitted: 12>>
> Though that still removes them from the text. You'll probably have
> to write some rules if that doesn't work for you.

I took his question to mean that he wanted to transform the likes of this...

{"Hello world", he said. \"Indeed you did" I replied.}

into this...

["Hello world" , "he" "said" . "Indeed" "you" "did" "I" "replied" .]

I can't work out the parsing off the top of my head (sorry - this probably isn't a very useful post), but I'd like to point out that I don't think you can include commas and some other punctuation in blocks like he seemed to want. ie...

>> a: ["a" , "b"]
** Syntax Error: Invalid word -- ,
** Near: (line 1) a: ["a" , "b"]

doesn't work. Full stops are okay though...

>> a: ["a" . "b"]
== ["a" . "b"]

> << Also, how would I match on a left bracket, any member of a block,
> and then a right bracket? Here's a pseudocode example:
<<quoted lines omitted: 6>>
> == [blue]
> --Gregg
-- Carl Read

 [5/15] from: greggirwin:mindspring at: 14-Feb-2002 16:00


Hi Alex,

<< The point is, these escaped quotes should not be treated as grouping constructs, but as normal text. So this: \"Hello world,\" I said should result in [{\"Hello} {world,\"} {I} {Said}]. >>

OK, forgive me if I go off-track as I'm trying to understand what you actually want to accomplish. Do you want the escaped chars to remain unchanged, still as escaped chars, or do you want them to be translated into a more normal REBOL form (i.e. unescaped :)? Using the /all refinement with parse, and specifying only a space for the rule, will give you the former.

>> parse/all {\"Hello world,\" I said} " "
== [{\"Hello} {world,\"} "I" "said"]
>> parse/all {\"Hello world,\" "I said"} " "
== [{\"Hello} {world,\"} "I said"]

<< I'm trying to find "[" + "red" or "blue" or "green" + "]" within a string, *find* it, not just see if a string matches it. >>

Sorry, I'm still confused. You can use FIND to find it but my previous example, using block parsing, returns what it found as well.

>> find "this is my [red] car" "[red]"
== "[red] car"
>> parse [this is my [red] car] [some word! set b block! to end]
== true
>> b
== [red]

Parse, in this case, finds the block for you and sets the word B to the block it found.

Am I getting warmer?

--Gregg
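
For text held in a string rather than a block, the same "find it and hand it back" idea can be sketched with a string rule. This is only an illustration under the assumption of REBOL/Core 2.x parse behaviour; the words `color-name`, `found` and `text` are made up for the example.

```rebol
REBOL []

; Hypothetical names, used for illustration only.
color-name: ["red" | "green" | "blue"]
found: none
text: "this is my [red] car"

; Step through the string one character at a time until
; "[" + a color name + "]" matches. COPY stores the matched
; name in FOUND, and TO END then lets the parse finish.
parse/all text [
    any [
        "[" copy found color-name "]" to end
        | skip
    ]
]

probe found
```

If the rule behaves as intended, `found` holds "red" afterwards, and stays `none` when no tag is present.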

 [6/15] from: stopm:mediaone at: 14-Feb-2002 21:04


----- Original Message -----
From: "Gregg Irwin" <[greggirwin--mindspring--com]>
To: <[rebol-list--rebol--com]>
Sent: Thursday, February 14, 2002 6:00 PM
Subject: [REBOL] Re: Another question

> Hi Alex,
> << The point is, these escaped quotes should not be treated as grouping
<<quoted lines omitted: 4>>
> actually want to accomplish. Do you want the escaped chars to remain
> unchanged, still as escaped chars, or do you want them to be translated into
> a more normal REBOL form (i.e. unescaped :)? Using the /all refinement with
> parse, and specifying only a space for the rule, will give you the former.

Ok, never mind, it looks like parse does exactly what I want... thanks. I did want to turn \" into " after breaking it up, however. Odd that parse recognizes backslashes for escaping quotes where the rest of REBOL uses ^ for escaping purposes....

> Sorry, I'm still confused. You can use FIND to find it but my previous
> example, using block parsing, returns what it found as well.
<<quoted lines omitted: 6>>
> Parse, in this case, finds the block for you and sets the word B to the
> block it found.

Okay, that's nice, but I want to parse real text, not a block. I'm just going to use repeated replace statements here (since the original goal was to replace [red] with an escape character followed by "[31m" for example), but I was hoping to go through the various color tags ([red], [green], [blue] etc.) and search through the remainder of the text after finding a tag regexp-style for the next tag. That way, no time would be wasted re-scanning through the entire string to find new codes... but a repeated replace is okay as well. C'est la vie.

> Am I getting warmer?

More or less. I've found enough of a solution. <g>

Alex

 [7/15] from: al:bri:xtra at: 15-Feb-2002 17:34


Alex wrote:
> the original goal was to replace [red] with an escape character followed
> by "[31m" for example), but I was hoping to go through the various color
> tags ([red], [green], [blue] etc.) and search through the remainder of the
> text after finding a tag regexp-style for the next tag. That way, no time
> would be wasted re-scanning through the entire string to find new codes

That's possible with parse. It's just a little trickier to write.

[
rebol []
Text: "Sample Text with [red]RED, [green]GREEN and [blue]BLUE."
parse/all Text [
    any [
        [
            Mark: "[red]" After: (
                After: change/part Mark "ESC[31m" After
            ) :After
        ]
        | [
            Mark: "[green]" After: (
                After: change/part Mark "ESC[31Gm" After
            ) :After
        ]
        | [
            Mark: "[blue]" After: (
                After: change/part Mark "ESC[31Bm" After
            ) :After
        ]
        | skip
    ]
]
probe Text
halt
]

And the result:

{Sample Text with ESC[31mRED, ESC[31GmGREEN and ESC[31BmBLUE.}

I hope that helps!

Andrew Martin
ICQ: 26227169
http://valley.150m.com/

 [8/15] from: g:santilli:tiscalinet:it at: 15-Feb-2002 11:04


At 03.04 15/02/02, Alex wrote:
>Okay, that's nice, but I want to parse real text, not a block. I'm just
>going to use repeated replace statements here (since the original goal was
<<quoted lines omitted: 4>>
>re-scanning through the entire string to find new codes... but a repeated
>replace is okay as well. C'est la vie.

Does this help? (Escape sequences are not the real ones I think, I just used them as an example.)

>> normal-text: [to "["]
>> escape-seq: [mark1: "[" copy color ["red" | "blue" | "green"] "]" mark2:]
>> colors: ["red" "^(1B)[31m" "green" "^(1B)[32m" "blue" "^(1B)[33m"]
>> text-to-parse: {
{    This is some [red]text followed by some [blue]text, and so on[green]...
{    }
>> parse text-to-parse [some [normal-text escape-seq (mark2: change/part mark1 select colors color mark2) :mark2]]
>> text-to-parse
== {
This is some ^[[31mtext followed by some ^[[33mtext, and so on^[[32m...
}

Regards,
   Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r

 [9/15] from: joel:neely:fedex at: 15-Feb-2002 7:18


Hi, Andrew, Alex, and all,

I made the assumption that the list of tokens might be largish and unsuitable for explicit inclusion in the parse rule itself...

Andrew Martin wrote:
> Alex wrote:
> > the original goal was to replace [red] with an escape character
<<quoted lines omitted: 3>>
> > for the next tag. That way, no time would be wasted re-scanning
> > through the entire string to find new codes

Here's my attempt...

8<------------------------------------------------------------
srcrep: make object! [
    pairs: [
        "red"    "pink"
        "purple" "lavendar"
        "brown"  "yellow"
        "navy"   "periwinkle"
        "black"  "gray"
    ]
    token: other: ""
    rule: [
        any [[
            began: "[" copy token to "]" "]" ended: (
                if found? other: select/skip pairs token 2 [
                    change/part began first other ended
                    :ended
                ]
            )] | skip
        ]
    ]
    run: func [s [string!]] [
        parse/all s rule
    ]
]
8<------------------------------------------------------------

So that (obviously) one could create more of these by saying something like:

    color-fixer: make srcrep [
        pairs: reduce [
            "red" "^[[31m"
            ...
        ]
    ]

It does this (with my original demo pairs):
>> foo: {
{    Once upon a time, I had:
{    some [black] pants,
{    some [brown] shoes,
{    a [navy] coat,
{    a [purple] shirt, and
{    a [red] tie.
{    Then I left them
{    too close to the window
{    on a sunny day!
{    }
== {
Once upon a time, I had:
some [black] pants,
some [brown] shoes,
a [navy] coat,
a [purple] shirt, and
a [red] tie.
Then I left...
>> srcrep/run foo
== true
>> print foo

Once upon a time, I had:
some gray pants,
some yellow shoes,
a periwinkle coat,
a lavendar shirt, and
a pink tie.
Then I left them
too close to the window
on a sunny day!

But there's still one puzzle I haven't figured out (probably I'm just overlooking something silly, as I've only had one cup of coffee so far this morning!)
>> baz: {[red] [purple] [brown] [navy] [black] [this is a block] [red] [23] [navy]}
== {[red] [purple] [brown] [navy] [black] [this is a block] [red] [23] [navy]}

Just a string with a mixture of expected and non-expected values inside the square brackets...

>> mw: make srcrep [pairs: ["this is a block" "gone!" "23" "42"]]
>> mw/run baz
== true
>> baz
== {[red] [purple] [brown] [navy] [black] gone! [red] [23] [navy]}
>> mw/run baz
== true
>> baz
== {[red] [purple] [brown] [navy] [black] gone! [red] 42 [navy]}

Notice that I had to apply the modified wordlist twice to get all of the tokens found and replaced!?!? I expected it to do all of them, as in:

>> srcrep/run baz
== true
>> baz
== {pink lavendar yellow periwinkle gray gone! pink 42 periwinkle}

with the remainder, or

>> baz: {[red] [purple] [brown] [navy] [black] [this is a block] [red] [23] [navy]}
== {[red] [purple] [brown] [navy] [black] [this is a block] [red] [23] [navy]}
>> srcrep/run baz
== true
>> baz
== {pink lavendar yellow periwinkle gray [this is a block] pink [23] periwinkle}

with the original mixed list.

Suggestions anyone? What am I overlooking?

-jn-
--
; sub REBOL {}; sub head ($) {@_[0]}
REBOL []
# despam: func [e] [replace replace/all e ":" "." "#" "@"]
; sub despam {my ($e) = @_; $e =~ tr/:#/.@/; return "\n$e"}
print head reverse despam "moc:xedef#yleen:leoj" ;

 [10/15] from: al:bri:xtra at: 16-Feb-2002 8:50


Joel wrote:
> Suggestions anyone? What am I overlooking?
> began: "[" copy token to "]" "]" ended: (
<<quoted lines omitted: 3>>
> ]
> )]

The ":ended" has to be in the parse rule and outside the action part. Also you have to move the 'ended point as well. Like:

    began: "[" copy token to "]" "]" ended: (
        if found? other: select/skip pairs token 2 [
            ended: change/part began first other ended
        ]
    ) :ended]

That way the 'parse position is set to just after the token or the replacement, consistently.

I hope that helps!

Andrew Martin
ICQ: 26227169
http://valley.150m.com/
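
Folded back into the whole object, the change might look roughly like the following. This is a sketch, assuming REBOL/Core 2.x, and it differs from the posted code in two labelled ways: `began` and `ended` are declared inside the object (a tidy-up that the original did not have), and `pairs` holds only the short test list from the second experiment.

```rebol
REBOL []

srcrep: make object! [
    pairs: ["this is a block" "gone!" "23" "42"]
    ; began/ended are declared here (an addition for this sketch)
    ; so that they stay local to the object
    token: other: began: ended: none
    rule: [
        any [
            [
                began: "[" copy token to "]" "]" ended: (
                    if found? other: select/skip pairs token 2 [
                        ; CHANGE/PART returns the position just past the
                        ; inserted text; keep it so PARSE resumes there
                        ended: change/part began first other ended
                    ]
                ) :ended
            ]
            | skip
        ]
    ]
    run: func [s [string!]] [parse/all s rule]
]

baz: copy {[red] [23] [this is a block] [23]}
srcrep/run baz
print baz
```

With the position reset after every match, a single run should replace every known token and leave unknown ones such as [red] alone, so this is expected to print `[red] 42 gone! 42`.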

 [11/15] from: tomc:darkwing:uoregon at: 15-Feb-2002 18:24


>> here: copy ""
>> rule: [to "[" mark: "[" [{"red"} | {"green"} | {"blue"}] "]" kram: to end]
>> parse {erkjhwerjh rjgsradjh e ["green"] riweajf pqwejkfp'okwef} rule
== true
>> print copy/part mark kram
["green"]

seems found

On Thu, 14 Feb 2002, Alex Liebowitz wrote:

 [12/15] from: joel:neely:fedex at: 15-Feb-2002 21:01


Hi, Andrew,

Andrew Martin wrote:
> The ":ended" has to be in the parse rule and outside the action part. Also
> you have to move the 'ended point as well. Like:
<<quoted lines omitted: 6>>
> consistently.
> I hope that helps!

Absolutely! Thanks!

Neeeeeed morrrrrrrre cooooooofffffffeeeeeeee...

-jn-
--
; sub REBOL {}; sub head ($) {@_[0]}
REBOL []
# despam: func [e] [replace replace/all e ":" "." "#" "@"]
; sub despam {my ($e) = @_; $e =~ tr/:#/.@/; return "\n$e"}
print head reverse despam "moc:xedef#yleen:leoj" ;

 [13/15] from: al:bri:xtra at: 16-Feb-2002 22:41


Joel wrote:
> Absolutely! Thanks!
>
> Neeeeeed morrrrrrrre cooooooofffffffeeeeeeee...

:)

PS Hi, Joel!

Andrew Martin
ICQ: 26227169
http://valley.150m.com/

 [14/15] from: g:santilli:tiscalinet:it at: 16-Feb-2002 12:03


At 14.18 15/02/02, you wrote:
>But there's still one puzzle I haven't figured out (probably I'm
>just overlooking something silly, as I've only had one cup of
>coffee so far this morning!)

You need to do something like:

    ended: change/part ...

and then *after* the paren

    :ended

to let PARSE continue at the correct position. (My code does that, have a look at it.)

HTH,
   Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r
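
Why the saved position goes stale can be seen outside of PARSE altogether. A small sketch, assuming REBOL/Core 2.x series semantics; the words `s`, `began`, `ended` and `new-pos` are just illustrative:

```rebol
REBOL []

s: copy "ab[23]cdefgh"
began: find s "["          ; position at "[23]cdefgh"
ended: find/tail s "]"     ; position just past "]", i.e. at "cdefgh"

; Replace the four characters "[23]" with the two characters "42".
; CHANGE/PART returns the position just after the new text.
new-pos: change/part began "42" ended

probe s          ; "ab42cdefgh"
probe new-pos    ; "cdefgh" -- where a parse rule should resume
probe ended      ; "efgh"   -- same index as before, now two characters too far
```

Because `ended` keeps its old index while the string shrinks underneath it, resuming there skips text whenever the replacement is shorter than the tag, which matches the missed [23] in the second experiment.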

 [15/15] from: joel:neely:fedex at: 16-Feb-2002 7:53


Hi, Gabriele,

Gabriele Santilli wrote:
> (My code does that, have a look at it.)
>

Reading and typing too fast, dropped a stitch. Thanks!

-jn-
--
; sub REBOL {}; sub head ($) {@_[0]}
REBOL []
# despam: func [e] [replace replace/all e ":" "." "#" "@"]
; sub despam {my ($e) = @_; $e =~ tr/:#/.@/; return "\n$e"}
print head reverse despam "moc:xedef#yleen:leoj" ;

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted