More Parsing questions

[1/6] from: greggirwin::starband::net at: 7-Sep-2001 9:58

Thanks for jumping in Ladislav. I'll be adding that to my notebook! --Gregg

[2/6] from: lmecir:mbox:vol:cz at: 7-Sep-2001 10:24

You can try this:

to-rule: function [
    {generate a to A parse rule}
    a [block!]
] [cont finish] [
    reduce [
        'any reduce [
            reduce [
                a to paren! [cont: [to end skip] finish: []] '|
                to paren! [cont: [] finish: [to end skip]] 'skip
            ] 'cont
        ] 'finish
    ]
]

comment {
    Example #1:
    to-space-or-br: to-rule [" " | "<br>"]
    result: ""
    parse/all "aa" [to-space-or-br copy result to end]
    probe result
    parse/all "a a<br>" [to-space-or-br copy result to end]
    probe result
    parse/all "ab<br> " [to-space-or-br copy result to end]
    probe result

    Example #2:
    digit: charset [#"0" - #"9"]
    four-digit: [4 digit]
    to-four-digit: to-rule four-digit
    tfd: [to-four-digit copy fd four-digit to end]
    parse/all "abcd 1234" tfd
    probe fd ; == "1234"
}

Regards
Ladislav

P.S. Other functions that may be of some use are:

a-b: function [
    {Generate an A-B parse rule}
    a [block!]
    b [block!]
] [finish] [
    reduce [
        reduce [
            b to paren! [finish: [to end skip]] '|
            to paren! reduce [first [finish:] a]
        ] 'finish
    ]
]

comment {
    Example:
    a: [any "a" "b"]
    b: ["aa"]
    parse "ab" a-b a b
    parse "aab" a-b a b
}

not-rule: function [
    {Generate a not A parse rule}
    a [block!]
] [finish] [
    reduce [
        reduce [
            a to paren! [finish: [to end skip]] '|
            to paren! [finish: []]
        ] 'finish
    ]
]

comment {
    Example:
    a: [any "a" "b"]
    parse "ab" not-rule a
    parse "b" not-rule a
    parse "" not-rule a
}

[3/6] from: pablohar:ho:tmail at: 6-Sep-2001 11:06

Hi Stefan.. I'm not a parse guru but try this...

digit: charset [#"0" - #"9"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]
alphanum: union alpha digit
Rule: [any alpha 4 digit any alpha]

It parses my examples only when there are exactly 4 digits (no more, no less), and it works.

[4/6] from: greggirwin:starband at: 6-Sep-2001 11:54

Hi Stefan,

In writing my little RegEx function (posted a few days ago) I found that you can use a bitset as a target for 'to or 'thru. I played around with a few things here and didn't find a quick solution but I'll try to find some time later to investigate. If we can't find a parse solution, it shouldn't be too tough just to tokenize and parse the line yourself in this case.

--Gregg

[5/6] from: syke:amigaextreme at: 6-Sep-2001 7:04

Hi list,

as usual, I'm having parsing troubles. Here's what I'm trying to do: I have a regular text file containing names with phone numbers, like this:

John Doe 5694
Jane Doe 0445-38352
etc..

(I read it in using read/lines.) What I want to do is find all the lines that have only four digits in them; the other ones should be ignored. So I tried this:

>> data: "Ebba Grön 2345"

== "Ebba Grön 2345"

>> parse data [to "2345" to end]

== true

And after that, I tried this:

>> digit: charset "0123456789"

== make bitset! #{ 000000000000FF03000000000000000000000000000000000000000000000000 }

>> parse data [to 4 digit to end]

== false

Why does this return false? What I want to do is this: if parse returns true, copy the entire line, add a newline to it, and then save the file. I realise that this parse rule will also find lines containing, for example, 6 digits in a row, but I guess it's a start. This is what I'd like to achieve:

foreach line data [
    if (parse line [to 4 digit to end]) [append new-data (rejoin [line newline])]
]

Or something similar. The trick is to get the lines that contain only 4 digits, not the ones containing 5 or more digits (no matter what form they take: xxx-xxxxxxx, xxxxxxx, or anything else).

/Regards
Stefan Falk
www.amigaextreme.com
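[Editor's sketch, not part of the original post.] One way to express "exactly four digits on the line" is to match everything around the digits explicitly instead of using `to`. This assumes REBOL 2 `parse/all` semantics, as used elsewhere in this thread; the names `non-digit`, `four-only`, and `lines` are hypothetical, and the sample lines are the two from the question rather than a real file.

```rebol
REBOL [Title: "Four-digit line filter (sketch)"]

digit: charset "0123456789"
non-digit: complement digit

; Hypothetical rule: any non-digits, one run of exactly four digits,
; then only non-digits until the end of the line.
four-only: [any non-digit 4 digit any non-digit]

; Sample data from the question; a real script would use read/lines here.
lines: ["John Doe 5694" "Jane Doe 0445-38352"]

new-data: copy ""
foreach line lines [
    ; parse/all is true only when the rule consumes the whole line
    if parse/all line four-only [
        append new-data rejoin [line newline]
    ]
]
probe new-data
```

With these two sample lines, only "John Doe 5694" should be kept: on the second line the rule stops after "0445-" and cannot reach the end of the input, so `parse/all` returns false. A line with five or more consecutive digits fails for the same reason.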

[6/6] from: petr:krenzelok:trz:cz at: 6-Sep-2001 7:27

Stefan Falk wrote:

> Hi list,
> as usual, I'm having parsing troubles. Here's what I'm trying to do:

<<quoted lines omitted: 16>>

> >> parse data [to 4 digit to end]
> == false

I don't use parse too much, so I am no parse guru, but:

data: "Ebba Grön 2345"
digit: charset [#"0" - #"9"] ; but your solution is equivalent :-) ...
alpha: complement digit
parse/all data [some [alpha | digit]]

but it will return true even for e.g. "1234-123454656", so you could turn it into a sequence:

parse/all data [some alpha some digit end]

---------------
->> data: "Ebba Grön 2345"
== "Ebba Grön 2345"
->> parse/all data [some alpha some digit end]
== true
---------------
->> data: "Ebba Grön 2345-1234556"
== "Ebba Grön 2345-1234556"
->> parse/all data [some alpha some digit end]
== false

IMO you can't use "to digit", as the parser seems to me to be a matching engine, so you can skip only to a known string, not to some bitset, but I could be wrong.

I don't know what your exact data structure is, but you could also use parse/all data " ". If your data are space separated, you will end up with the following block:

->> parse/all data " "
== ["Ebba" "Grön" "2345"]

Then, instead of using parse rules, a simple condition will work for you:

if not found? find last parse/all data " " "-" [print "data OK"]

hope-this-helps,
-pekr-
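[Editor's sketch, not part of the original post.] The split-on-space idea can be tightened so that it checks the last token directly instead of only looking for a "-". This assumes REBOL 2, where `parse/all` with a string argument splits on the given delimiter, and it assumes the number is always the last space-separated field on the line. The names `lines` and `tokens` are hypothetical.

```rebol
REBOL [Title: "Split-and-check filter (sketch)"]

digit: charset "0123456789"

; Sample lines from the thread; a real script would use read/lines here.
lines: ["John Doe 5694" "Jane Doe 0445-38352"]

foreach line lines [
    ; split the line on spaces, e.g. ["John" "Doe" "5694"]
    tokens: parse/all line " "
    ; keep the line only if its last token is exactly four digits
    if all [
        not empty? tokens
        parse/all last tokens [4 digit]
    ] [
        print line
    ]
]
```

With the sample data this should print only the "John Doe 5694" line, since "0445-38352" is not four digits followed by the end of the token.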
