Mailing List Archive: 49091 messages

PARSE question

 [1/23] from: dvydra2::yahoo::com at: 30-Mar-2001 9:56


We are parsing large files as blocks. If there is an error somewhere, how can we find out where the error was? Can we get a count of items parsed correctly? Thanks dv ===== please reply to: [david--vydra--net]

 [2/23] from: petr:krenzelok:trz:cz at: 30-Mar-2001 20:32


----- Original Message ----- From: "David Vydra" <[dvydra2--yahoo--com]> To: <[rebol-list--rebol--com]> Sent: Friday, March 30, 2001 7:56 PM Subject: [REBOL] PARSE question
> We are parsing large files as blocks. If there is an > error somewhere, how can we find out where the error > was? Can we get a count of items parsed correctly?
I am not sure I correctly understand what you are actually asking for, but you can always "mark" your input with a word, which is then available in the context outside the parse block too, e.g.:

    parse something [some [pos1: some-stuff | pos2: other-stuff] exit: to end]
    print [index? pos1 index? pos2 index? exit]

So I think that if an error occurs, then by printing your marking words you can find out where in the string you got lost ...

-pekr-

 [3/23] from: agem:crosswinds at: 31-Mar-2001 17:35


>We are parsing large files as blocks. If there is an >error somewhere, how can we find out where the error >was? Can we get a count of items parsed correctly? >
yes

    parse something [any [before-the-try: some parsing (counter: counter + 1)]]

In 'before-the-try you get the parse position before the failed try; in counter, the count of successful tries (counter has to be set to 0 before the parse).
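The marking-word and counter ideas from the replies above can be combined into a small sketch. It assumes REBOL 2 PARSE semantics and block input; `data`, `mark`, `counter` and `ok?` are hypothetical names chosen for illustration, and `integer!` merely stands in for whatever rule describes one valid item.

```rebol
REBOL [Title: "Locating a failed parse in a block (sketch)"]

; Hypothetical input: integers are valid items, the string is the bad one.
data: [1 2 3 "oops" 4]

counter: 0      ; must be initialised before PARSE runs
mark: data      ; moved to the start of every attempt by the rule below

ok?: parse data [
    any [
        mark: integer!              ; remember where this attempt starts
        (counter: counter + 1)      ; evaluated only after the item matched
    ]
]

either ok? [
    print ["all" counter "items parsed"]
][
    print ["parsed" counter "items; stopped at index" index? mark]
    print ["offending item:" mold first mark]
]
```

With the sample block above, the parse should fail, leaving `counter` at 3 and `mark` at index 4 (the string). Because the set-word comes before the item rule, `mark` always records the start of the last attempt, whether or not that attempt succeeded.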

 [4/23] from: gchiu:compkarori at: 27-Jun-2001 13:29


If parse always returns a string, e.g.:

    parse {a="b"} [ thru {a="} copy test to {"} ]

and 'test is now a string containing "b", shouldn't

    parse {a=""} [ thru {a="} copy test to {"} ]

'test now be an empty string rather than type none! ?

--
Graham Chiu

 [5/23] from: brett:codeconscious at: 27-Jun-2001 12:18


Hi Graham,
> If parse always returns a string
    USAGE:
        PARSE input rules /all /case

If rules is a block, parse returns true or false. Just being pedantic :)
> eg: parse {a="b"} [ thru {a="} copy test to {"} ] > > and 'test is now a string containing "b" > > shouldn't > > parse {a=""} [ thru {a="} copy test to {"} ] > > 'test now be an empty string rather than type none! ? >
I guess it depends on how you see it: either as "nothing to copy" (none!) or as having copied a string of zero length (empty). Also, I don't see that it makes a lot of difference, except perhaps for some more if statements in some circumstances.

Brett

 [6/23] from: joel:neely:fedex at: 26-Jun-2001 18:42


Hi, Brett, I must agree with Graham on this one. A zero-length string is a completely legitimate value, but most definitely is *not* the same thing as NONE! Brett Handley wrote:
> Hi Graham, > > > If parse always returns a string >
I think he meant "sets variables to string! values using COPY" instead of "returns"...
> > shouldn't > >
<<quoted lines omitted: 6>>
> Also, I don't see that it makes a lot of difference - except > for perhaps some more if statements in some circumstances.
Any time you wish to build a new string from the parsed substrings, or print their content, you have a problem with NONE! instead of an empty string. For example, given:

    addr1: "John Doe/123 Lonely St/Suite 16/Los Angeles/CA/90210"
    addr2: "Joe Doaks/321 Hilltop Ln//Green Mtn/MA/02187"

    print-address-label: function [
        s [string!]
    ][
        addr-name addr-line1 addr-line2
        addr-city addr-state addr-ZIP
    ][
        parse/all s [
            copy addr-name  to "/" skip
            copy addr-line1 to "/" skip
            copy addr-line2 to "/" skip
            copy addr-city  to "/" skip
            copy addr-state to "/" skip
            copy addr-ZIP   to end
        ]
        print [
            addr-name  newline
            addr-line1 newline
            addr-line2 newline
            addr-city " " addr-state " " addr-ZIP newline
        ]
    ]

... we get ...

>> print-address-label addr1
John Doe
123 Lonely St
Suite 16
Los Angeles CA 90210

>> print-address-label addr2
Joe Doaks
321 Hilltop Ln
none
Green Mtn MA 02187

Clearly the address label should contain a blank line between "321 Hilltop Ln" and "Green Mtn MA 02187".

This is only a trivial example; I'm not suggesting this is the best way to represent addresses or print labels! If the example is too trite, just pretend that the syntax of the address were much more complex, with various punctuation, etc.

I've done a tremendous amount of text-flogging over the years, converting address lists, reformatting data files among a variety of representations, etc. In such applications, it's very common to parse out the pieces of one format and immediately construct a new string or write/print using some combination of the just-parsed text fields.

Yes, I know it's possible to follow the PARSE statement with a list of "fix-the-empty-strings" statements, as in

    parse/all s [
        ;...
    ]
    addr-name:  any [addr-name ""]
    addr-line1: any [addr-line1 ""]
    addr-line2: any [addr-line2 ""]
    ;...
    print [
        ;...
    ]

(or even to make a fix-up function and call it on all of the fields), but all that bother hardly seems in keeping with "make easy things easy and hard things possible" IMHO.

-jn-

------------------------------------------------------------
Programming languages: compact, powerful, simple ...
Pick any two!
joel'dot'neely'at'fedex'dot'com

 [7/23] from: brett:codeconscious at: 27-Jun-2001 16:20


Hi Joel,
> I must agree with Graham on this one. A zero-length string > is a completely legitimate value, but most definitely is *not* > the same thing as NONE!
Your phrasing runs the risk of implying I stated otherwise, which is not the case.

Though now you've piqued my interest: what is a value that is not completely legitimate? Illegitimate, or partially legitimate? Are they something like complex numbers, with a real part and some other part? Just kidding.
> I think he meant "sets variables to string! values using > COPY" instead of "returns"...
Yeah...
> > > > > shouldn't
<<quoted lines omitted: 12>>
> substrings, or print their content, you have a problem with > NONE! instead of an empty string. For example, given:
.. <snipped useful parse tutorial material> ..
> I've done a tremendous amount of text-flogging over the years, > converting address lists, reformatting data files among a
<<quoted lines omitted: 4>>
> Yes, I know it's possible to follow the PARSE statement with > a list of "fix-the-empty-strings" statements, as in
<snipped more useful tutorial code>
> (or even to make a fix-up function and call it on all of > the fields), but all that bother hardly seems in keeping > with "make easy things easy and hard things possible" IMHO.
Well, you've fleshed out the circumstances I referred to. I agree that if parse behaved the other way, your example and similar code would be far simpler. It would be more elegant in the sense that when parsing strings one would only need to think of strings "containing" substrings and strings of zero length.

So yes, I guess it does make a difference, because currently parse sort of "implies" none! is part of the string being parsed, which is inconsistent with the knowledge that strings should only contain character values.

It needs to be said now that this is only relevant to parsing string! values. When parse is applied to a block! value, an empty string is a completely legitimate value of a block. So are empty blocks. So parse still needs to set a variable to NONE! during a COPY when parsing a block, otherwise we lose information.

So how many scripts would such a change break? Maybe not many, but it would be a bit like y2k: you have to do a lot of hunting to determine the impact.

Would I support the change? Yep.

Brett.

 [8/23] from: gchiu:compkarori at: 27-Jun-2001 18:19


On Tue, 26 Jun 2001 18:42:47 -0500 Joel Neely <[joel--neely--fedex--com]> wrote:
> I must agree with Graham on this one. A zero-length > string > is a completely legitimate value, but most definitely is > *not* > the same thing as NONE!
Trouble is I guess we are stuck with this behaviour as changing it might cause problems with legacy code :-( -- Graham Chiu

 [9/23] from: gjones05:mail:orion at: 27-Jun-2001 5:52


Hi, Graham, Brett, Joel,

I don't seem to be able to muster the skills to articulate an argument one way or the other, which is just as well, because you would present formidable opponents. :-)

So, when I don't feel up to a good argument, I simply present a different way of thinking about the "problems" at hand and will leave it to each individual user to decide whether the approach is useful.
From: "Graham Chiu"
...
> parse {a="b"} [ thru {a="} copy test to {"} ]
... and ...
> parse {a=""} [ thru {a="} copy test to {"} ]
Pardon me if I am not using the terminology correctly, but to most simply coerce the result to a specific type, here was how I saw the problem:

    parse to-block {a="b"} ['a= set t string! to end]
    ; == true
    t
    ; == "b"

    parse to-block {a=""} ['a= set t string! to end]
    ; == true
    t
    ; == ""

From: "Joel Neely"
...
> Any time you wish to build a new string from the parsed > substrings, or print their content, you have a problem with > NONE! instead of an empty string. For example, given:
...

I understand that your example was presented as a piece of an argument meant to demonstrate a concrete side effect of the behavior in question. Here was how I saw the problem:

    addr1: "John Doe/123 Lonely St/Suite 16/Los Angeles/CA/90210"
    addr2: "Joe Doaks/321 Hilltop Ln//Green Mtn/MA/02187"

    print-address-label: function [
        s [string!]
    ][
        addr-name addr-line1 addr-line2
        addr-city addr-state addr-ZIP
    ][
        foreach [
            addr-name addr-line1 addr-line2 addr-city addr-state addr-zip
        ] parse/all s "/" [
            print [
                addr-name  newline
                addr-line1 newline
                addr-line2 newline
                addr-city addr-state addr-ZIP newline
            ]
        ]
    ]

>> print-address-label addr1
John Doe
123 Lonely St
Suite 16
Los Angeles CA 90210

>> print-address-label addr2
Joe Doaks
321 Hilltop Ln

Green Mtn MA 02187

I guess a "blank" line is better than a "none" line for this trivial ("no-additional-logic-added") example.

Remember, Joel, (et al,) "It's turtles all the way down!" (I had to look that one up, BTW. If I had ever heard it before, I had certainly forgotten it. :-)

--Scott Jones

 [10/23] from: agem:crosswinds at: 27-Jun-2001 14:37


RE: [REBOL] Re: parse question [gchiu--compkarori--co--nz] wrote:
> On Tue, 26 Jun 2001 18:42:47 -0500 > Joel Neely <[joel--neely--fedex--com]> wrote:
<<quoted lines omitted: 6>>
> Trouble is I guess we are stuck with this behaviour as > changing it might cause problems with legacy code :-(
could be some option like parse/empties ? -Volker

 [11/23] from: joel:neely:fedex at: 27-Jun-2001 2:55


Brett Handley wrote:
> Hi Joel, > > > I must agree with Graham on this one. A zero-length > > string is a completely legitimate value, but most > > definitely is *not* the same thing as NONE! > > Your phrasing runs the risk of implying I stated > otherwise - which is not the case. >
Ooops! Sorry! I intended it as a part of my "thinking out loud" about "" vs. NONE! and not as an attempt to describe your views.
> Though now you've piqued my interest - what is a value that > is not completely legitimate - illegitimate or partially > legitimate? Are they something like complex numbers - a real > part and some other part? Just kidding. >
Hmmm. I gotta stop writing emails late at night! ;-) By analogy, -1 is a legitimate integer, but one would likely have problems trying to use it to PICK data from a series. However, most uses one makes of string data accept "" without complaint. That was the sense I intended.
> Well you've fleshed out the circumstances I referred to. I > agree that if parse behaved the other way your example and
<<quoted lines omitted: 6>>
> inconsistent with the knowledge that strings should only > contain character values.
Agreed. And my view is that the second point you made has a much higher priority than the first. It addresses both consistency and learnability. (Not that you've ever heard anything from me on those topics before... ;-) Especially since PARSE really does know how to extract strings of zero length in other settings:
>> addr2: "Joe Doaks/321 Hilltop Ln//Green Mtn/MA/02187"
== "Joe Doaks/321 Hilltop Ln//Green Mtn/MA/02187"
>> parse/all addr2 "/"
== ["Joe Doaks" "321 Hilltop Ln" "" "Green Mtn" "MA" "02187"]
> It needs to be said now, that this is only relevent to parsing > string! values... >
It seems that parse needs to be *able* to supply NONE! values when appropriate, but I'm not sure I completely understand what the phrase "when appropriate" might mean.

Given that phone data may or may not be available for all people, we could design several possible representations of a data structure which allows for the data-not-available case. For example:

When phone data is available (the easy case)

    demo: [1234 "Ferd Burfel" #901-555-1212 127.0.0.1]

When it is not available, we can choose to

a) omit it

    omed: [2345 "Joe Doaks" 127.255.255.255]

b) mark it with a special word

    medo: [3456 "Jane Doe" none 255.255.255.255]

c) or mark it with a specific not-available value

    odem: reduce medo

Assuming we're handling the simple case in the obvious way:

    parse demo [
        copy xID    integer!
        copy xName  string!
        copy xPhone issue!
        copy xIP    tuple!
    ]

... each of the not-available choices implies a different strategy for block parsing.

    parse omed [
        copy xID    integer!
        copy xName  string!
        copy xPhone [issue! | none]
        copy xIP    tuple!
    ]

    parse medo [
        copy xID    integer!
        copy xName  string!
        copy xPhone [issue! | word!]
        copy xIP    tuple!
    ]

    parse odem [
        copy xID    integer!
        copy xName  string!
        copy xPhone [issue! | none!]
        copy xIP    tuple!
    ]

The programmer can choose which representation/parsing strategy to use, because the block is probably a constructed artifact.

I think we're agreeing here; I'm just thinking out loud to confirm that fact...
> So how many scripts would such a change break? Maybe not many, > but it would be a bit like y2k, you have to do a lot of > hunting to determine the impact. > > Would I support the change? - yep. >
Agreed. Especially if the change would help make the language more attractive/understandable/usable to a larger audience. -jn- ------------------------------------------------------------ Programming languages: compact, powerful, simple ... Pick any two! joel'dot'neely'at'fedex'dot'com

 [12/23] from: joel:neely:fedex at: 27-Jun-2001 3:16


Hi, Scott, GS Jones wrote:
> Hi, Graham, Brett, Joel, > > I don't seem to be able to muster the skills to articulate an > argument one way or the other, which is just as well, because > you would present formidable opponents. :-) >
No "opponents" here... We're all just turtles on this bus! ;-) -jn- ------------------------------------------------------------ Programming languages: compact, powerful, simple ... Pick any two! joel'dot'neely'at'fedex'dot'com

 [13/23] from: robbo1mark:aol at: 27-Jun-2001 10:13


Joel,

this is off-topic & offbeat, but I'm just curious: you seem to mention "turtles" quite a lot! Any reason? Were you a turtle in a previous life? Are you a turtle now? Maybe you programmed in LOGO years ago, I don't know? Just curious & having fun.

Mark Dickson

In a message dated Wed, 27 Jun 2001 9:51:21 AM Eastern Daylight Time, Joel Neely <[joel--neely--fedex--com]> writes: << Hi, Scott, GS Jones wrote:
> Hi, Graham, Brett, Joel, > > I don't seem to be able to muster the skills to articulate an > argument one way or the other, which is just as well, because > you would present formidable opponents. :-) >
No "opponents" here... We're all just turtles on this bus! ;-) -jn- ------------------------------------------------------------ Programming languages: compact, powerful, simple ... Pick any two! joel'dot'neely'at'fedex'dot'com

 [14/23] from: brett:codeconscious at: 28-Jun-2001 1:09


> Hi, Graham, Brett, Joel,
>
> I don't seem to be able to muster the skills to articulate an argument one way
> or the other, which is just as well, because you would present formidable
> opponents. :-)
I'm the one in the corner looking dazed after thumping myself in the head a few times... Brett.

 [15/23] from: joel:neely:fedex at: 27-Jun-2001 10:16


Hi, Mark, [Robbo1Mark--aol--com] wrote:
> Joel, > > this is offtopic & offbeat, but I'm just curious, > you seem to mention "turtles" quite a lot! any reason? >
Old joke:

Prominent scientist gives a lecture on the space program. Little old lady comes up afterwards to speak with him.

LOL: That was all very interesting, but it's nonsense. Everybody knows that the earth is flat and rests on the back of a giant turtle.

PS [humoring her]: Well, what does that turtle stand on?

LOL: Each of his feet is on the back of another giant turtle.

PS: Well, what do THEY stand on?

LOL: Now, young man, you can't trick me. From there, it's turtles all the way down!

Somehow it seemed appropriate in the context of

    Language X implemented in
    Language Y compiled to
    Assembler implemented in
    Microcode which manipulates
    Function blocks implemented in
    Solid state circuitry dependent on
    Quantum mechanics ...

;-)

-jn-

It's turtles all the way down!
joel'dot'neely'at'fedex'dot'com

 [16/23] from: lmecir:mbox:vol:cz at: 27-Jun-2001 17:55


Hi Brett, there are some "partially legitimate" values in Rebol, IMO :-) I would say that about the values of ERROR! and UNSET! datatypes.

 [17/23] from: robbo1mark:aol at: 27-Jun-2001 12:39


Joel,

Aha! Now I see & understand! I prefer terrapins myself, much more cute IMHO.

I seem to remember a thread on this list ages & ages ago about which animal O'Reilly should use for some future O'Reilly Safari book, which would formally make REBOL "recognized officially" as a language. Would it be a turtle? I can't remember which animal, if any, was selected. Anybody else got a better memory?

cheers,

Mark Dickson

PS If this debate wasn't settled then by all means let's start it all up again!

In a message dated Wed, 27 Jun 2001 11:31:14 AM Eastern Daylight Time, Joel Neely <[joel--neely--fedex--com]> writes: << Hi, Mark, [Robbo1Mark--aol--com] wrote:
> Joel, > > this is offtopic & offbeat, but I'm just curious, > you seem to mention "turtles" quite a lot! any reason? >
Old joke: Prominent scientist gives a lecture on the space program. Little old lady comes up afterwards to speak with him. LOL: That was all very interesting, but it's nonsense. Everybody knows that the earth is flat and rests on the back of a giant turtle. PS [humoring her]: Well, what does that turtle stand on? LOL: Each of his feet is on the back of another giant turtle. PS: Well, what do THEY stand on? LOL: Now, young man, you can't trick me. From there, it's turtles all the way down! Somehow it seemed appropriate in the context of Language X implemented in Language Y compiled to Assembler implemented in Microcode which manipulates Function blocks implemented in Solid state circuitry dependent on Quantum mechanics .. ;-) -jn- It's turtles all the way down! joel'dot'neely'at'fedex'dot'com

 [18/23] from: gchiu:compkarori at: 28-Jun-2001 8:48


On Wed, 27 Jun 2001 10:16:20 -0500 Joel Neely <[joel--neely--fedex--com]> wrote:
> Old joke: > Prominent scientist gives a lecture on the space
<<quoted lines omitted: 12>>
> it's > turtles all the way down!
In "A Brief History of Time" by Stephen Hawking, 1988, page 1, the scientist in this tale is said to be Bertrand Russell giving a public lecture on astronomy. -- Graham Chiu

 [19/23] from: joel:neely:fedex at: 27-Jun-2001 16:15


Hi, Graham, You are a gentleman and a scholar! Graham Chiu wrote:
> > Old joke: > >
<<quoted lines omitted: 3>>
> 1, the scientist in this tale is said to be Bertrand Russell > giving a public lecture on astronomy.
Since you are clearly so well-read, perhaps you'd know a hint as to the origin of one of my favorite (but, alas, unattributed) quotations: "You can only learn that which you already *almost* know." Thanks! -jn- -- It's turtles all the way down! joel'dot'neely'at'fedex'dot'com

 [20/23] from: gchiu:compkarori at: 28-Jun-2001 16:06


On Wed, 27 Jun 2001 16:15:38 -0500 Joel Neely <[joel--neely--fedex--com]> wrote:
> Since you are clearly so well-read, perhaps you'd know a > hint
<<quoted lines omitted: 3>>
> "You can only learn that which you already *almost* > know."
Sorry Joel, can't help! As for well read, it just so happened that my sister mentioned last week that she had borrowed this book from the library. I got my copy out in case she needed it longer, and happened to re-read the first page wherein lay the tale. Now, if it had been on page 2, I wouldn't have seen it! -- Graham Chiu

 [21/23] from: robert:muench:robertmuench at: 29-Jun-2001 13:17


> -----Original Message----- > From: [rebol-bounce--rebol--com] [mailto:[rebol-bounce--rebol--com]]On Behalf Of
<<quoted lines omitted: 8>>
> parse {a=""} [ thru {a="} copy test to {"} ] > 'test now be an empty string rather than type none! ?
Hi, I see the whole string "" as the definition for the empty string and there is nothing between " and ", therefore the none! return value is OK as you read thru the first " and up-to the next ". Robert

 [22/23] from: ingo:2b1 at: 3-Jul-2001 12:38


Hi Robert, Once upon a time Robert M. Muench spoketh thus:
> > -----Original Message----- > > From: [rebol-bounce--rebol--com] [mailto:[rebol-bounce--rebol--com]]On Behalf Of
<<quoted lines omitted: 10>>
> is nothing between " and ", therefore the none! return value is OK as you read > thru the first " and up-to the next ". Robert
We might discuss the "validity" of this approach at length, but from a practical viewpoint returning "" instead of none is much preferable.

Returning none:
- If it doesn't matter to you whether the string is empty or not, you have to manually check all values and change them to ""
- If it matters, you may check for none? values

Returning "":
- If it doesn't matter to you whether the string is empty or not, you don't have to do anything
- If it matters, you may check for empty? values

=> all in all returning empty strings requires less programming efforts

kind regards,

Ingo
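The "manually check all values" cost in the comparison above can be made concrete with a minimal sketch of the fix-up function mentioned earlier in the thread. It assumes the REBOL 2 behaviour under discussion, where COPY sets the word to NONE when nothing is matched; `blank-nones` is a hypothetical helper, not something shipped with REBOL.

```rebol
REBOL [Title: "Normalising NONE to empty strings after PARSE (sketch)"]

; Hypothetical helper: for each word in WORDS that currently holds
; NONE, set that word to a fresh empty string. Other values are kept.
blank-nones: func [words [block!]] [
    foreach w words [
        if none? get w [set w copy ""]
    ]
]

test: none
parse {a=""} [thru {a="} copy test to {"} to end]

blank-nones [test]      ; pass the words themselves, unevaluated
print mold test         ; TEST is a string at this point, never NONE
```

If COPY already yields an empty string for a zero-length match, the helper simply does nothing, so code written this way should keep working if the behaviour ever changed. Code that needs to tell the two cases apart would still test with `none?` before calling the helper.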

 [23/23] from: robert:muench:robertmuench at: 4-Jul-2001 8:56


> -----Original Message----- > From: [rebol-bounce--rebol--com] [mailto:[rebol-bounce--rebol--com]]On Behalf Of
<<quoted lines omitted: 6>>
> ... > => all in all returning empty strings requires less programming efforts
Hi, from the pragmatic POV I agree absolutely ;-)). Robert

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted