Mailing List Archive: 49091 messages

Need help from parsing professionals

 [1/6] from: moeller_thorsten:gmx at: 27-Jul-2001 12:36


Hi,
it is obviously not a big thing.

I have a big string. from this string i need to make a readable file with lots of lines. So every line starts with AA......
The Line ends when the next Line with AA comes up.

What i tried now was following:

    a: read %/c/temp/test.txt
    parse a [any [to "AA201" mark: to "AA" (write/append %/c/temp/new.txt join mark "^/")]]

This seems to run in an endless loop and looking in the new.txt it shows, that the parser didn't recognise the "AA201", because he always adds the whole big string in one part, not the parts of the string as lines.

Here are some sample lines for testing.

Any ideas???

Thorsten

 [2/6] from: sqlab:gmx at: 27-Jul-2001 12:48


Hi Thorsten
> Hi,
> it is obviously not a big thing.
<<quoted lines omitted: 8>>
> that the parser didn't recognise the "AA201", because he always adds the
> whole big string in one part, not the parts of the string as lines.

At first you should advance the focus as in

    to "AA201" mark: skip 2 to "AA"

mark sets only a pointer in the string, so your next write/append appends always from this point until the end of the string.

Of course you could use a copy to or a copy/part with a second pointer.
But why do you not just replace/all next a "AA201" "^/AA201" ?

AR
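The difference between a bare marker and a copy/part with a second marker can be tried at the console. This is only a minimal sketch: S is a short made-up string standing in for the file contents, and START and FINISH are names picked for illustration.

```rebol
; S is a made-up stand-in for the real data
s: "xxAA201aaaAA201bbb"

; a set-word inside a rule only remembers a position in the series
parse/all s [to "AA201" mark: to end]
print mark                      ; everything from the marker to the tail

; two markers plus COPY/PART cut out just the piece in between
parse/all s [to "AA201" start: 2 skip to "AA" finish: to end]
print copy/part start finish    ; only the first record
```

As far as I can tell this also hints at why the original rule never finishes: after to "AA201" the input already sits on "AA", so to "AA" succeeds without moving forward and any keeps repeating at the same position.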

 [3/6] from: chris::starforge::demon::co::uk at: 27-Jul-2001 12:58


Thorsten Moeller wrote:
> Any ideas???
Just out of interest ('cos I'm a nosey person) what was that? Looked almost like protein sequence information or something...

Chris the Wildly Inaccurate

 [4/6] from: joel:neely:fedex at: 27-Jul-2001 2:02


Hi, Thorsten,

I'm not a parsing professional (and I don't play one on TV ;-) but maybe I can make a couple of suggestions...

Thorsten Moeller wrote:
> I have a big string. from this string i need to make a
> readable file with lots of lines. So every line starts
> with AA......
> The Line ends when the next Line with AA comes up.
>
Here's one way, using the sample data from your post
>> foo
== {AA2012001070107120006600000300002DIV 1015940000000000000000000000000001106725000010711010711AR 000 BUCHUNG TS1000000001301,84...
>> recs: []
== []
>> parse foo [any [
[        "AA" copy rec to "AA" (append recs join "AA" rec)
[      | "AA" copy rec to end  (append recs join "AA" rec)
[    ]]
== true
>> length? recs
== 14
>> recs
== [{AA2012001070107120006600000300002DIV 1015940000000000000000000000000001106725000010711010711AR 000 BUCHUNG TS1000000001301,8...

after which all of your "records" are individual strings in the block RECS.

While I think that's the closest to your original request, I can't help adding a couple of others, just for fun.

This version requires the least knowledge of PARSE, but doesn't show off its power either:
>> bletch: parse/all foo "A"
== ["" "" {2012001070107120006600000300002DIV 1015940000000000000000000000000001106725000010711010711} {R 000 BUCHUNG TS100000000...

Discard the first element of the block (it's whatever precedes the first "A"). After that, every empty string represents a record boundary (the 0-length gap between the #"A"s in "AA").

Why would anyone write it this way? I had a file some time back where two-character sequences marked both record *and* field boundaries. This approach would allow you to process the individual fields within the records in order.

Another option is
>> parse foo ["AA" copy rec to "AA" copy rest to end]
== true
>> rec
== {2012001070107120006600000300002DIV 1015940000000000000000000000000001106725000010711010711AR 000 BUCHUNG TS1000000001301,84+0...
>> rest
== {AA2012001070107120006600000200001SAB 0000081080000000000000000000000001106724000010711010711AR 000 GEGENB. AU2000000012419,49...

which (as long as PARSE returns TRUE) lets you slurp the first record from the string, leaving the rest in ... REST.

Why would anyone write it this way? You can put the above in a loop if you only need to process the first few records (e.g. find the first record with a specified set of criteria).

HTH!

-jn-

--
Programming languages: compact, powerful, simple ... Pick any two!
joel'dot'neely'at'fedex'dot'com
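The loop idea might look roughly like the sketch below. It is not taken from the post: FOO here is a tiny made-up stand-in for the real data, and the test for "SAB" is just a hypothetical criterion.

```rebol
; FOO is a small hypothetical stand-in for the real data
foo: "AA201 first DIV AA201 second SAB AA201 third SAB "

rest: foo
; each pass slurps one record into REC and leaves the remainder in REST
while [parse/all rest ["AA" copy rec to "AA" copy rest to end]] [
    if find rec "SAB" [
        print ["found:" rec]
        break                   ; stop at the first record that qualifies
    ]
]
```

Note that the very last record has no following "AA", so the rule fails there and the loop ends without looking at it. That is fine when only the first few records matter, but it would need an extra alternative (such as the to end branch in the first version) to cover the whole string.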

 [5/6] from: sqlab:gmx at: 27-Jul-2001 14:30


> Hi Thorsten
>
> > Hi,
<<quoted lines omitted: 20>>
> At first you should advance the focus as in
> to "AA201" mark: skip 2 to "AA"

As I misplaced the parameter to skip, I will deliver a correct version

    b: a: read %/c/temp/test.txt
    parse next a [
        some [to "AA201" a: (write/append/lines %/c/temp/new.txt copy/part b a) b: 4 skip]
    ]

AR
> mark sets only a pointer in the string, so your next write/append appends
> always from this point until the end of the string.
>
> Of course you could use a copy to or a copy/part with a second pointer.
> But why do you not just replace/all next a "AA201" "^/AA201" ?
>
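One thing to watch with a loop of this shape: as far as I can tell, the text after the last "AA201" is never handed to the action, because the action only runs when a further "AA201" is found. The sketch below shows the same two-marker pattern with the tail added afterwards. It collects into a block instead of writing a file, and DATA, LINES, START and STOP are made-up names, not part of the version above.

```rebol
; DATA is a made-up stand-in for the file contents
data: "AA201xxxAA201yyyAA201zzz"
lines: copy []

start: data
parse/all next data [
    some [
        to "AA201" stop:                    ; STOP marks where the next record begins
        (append lines copy/part start stop) ; emit the record between the two markers
        start: 4 skip                       ; remember this position, then move past it
    ]
]
append lines copy start     ; the final record runs to the end of the string

probe lines
```

With the file version, the equivalent would presumably be one more write/append/lines of the remaining text after the parse has finished.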

 [6/6] from: moeller_thorsten:gmx at: 30-Jul-2001 9:20


Hi to all the helpers out there,

thanks for all the suggestions. They helped me out and encouraged me to go further with REBOL.

Thanks

Thorsten

Notes
  • Quoted lines have been omitted from some messages.