Mailing List Archive: 49091 messages

Problem with parse

 [1/5] from: maillist::peter::home::se at: 30-Oct-2008 13:52


Hello! I have a database file made up of lines with the following structure:

<date>|<string>|<url>

I parse each line using:

parse/all <line> "|"

This normally works as expected, but with the second line below it seems that parse does something wrong. Or am I missing something?
>> parse/all {2008-10-30|This is OK|http://www.example.com} "|"
== ["2008-10-30" "This is OK" "http://www.example.com"]
>> parse/all {2008-10-30|This "is" OK|http://www.example.com} "|"
== ["2008-10-30" {This "is" OK} "http://www.example.com"]
>> parse/all {2008-10-30|"This is" NOK|http://www.example.com} "|"
== ["2008-10-30" "This is" " NOK" "http://www.example.com"]

It seems that the problem occurs when a | is directly followed by a ". Does anyone have a solution?

Best regards,
Peter Carlsson

 [2/5] from: brock:kalef:innovapost at: 30-Oct-2008 9:28


Peter,

I'm a little confused as to which of the results you don't want. You indicate the second line, but you state in the example that it's "OK". I am not sure whether that is a bug or not, but you can simply work around it by doing a replace before doing the parse. The only drawback is that you will also have to do a replace to get the " characters back if they are needed.

>> test: parse/all replace/all {2008-10-30|"This is" NOK|http://www.example.com} {"} {~} "|"
== ["2008-10-30" {~This is~ NOK} "http://www.example.com"]

Here, the replace <your string> {"} {~} will occur prior to the parse.

The end result will need to have the ~ characters replaced with " if they are essential.

>> replace/all test/2 {~} {"}
== {"This is" NOK}

Or you can do the replace directly into the block like so...
>> test/2: replace/all test/2 {~} {"}
== {"This is" NOK}
>> test
== ["2008-10-30" {"This is" NOK} "http://www.example.com"]

I hope this helps work around the problem.

Brock
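
The replace-then-restore workaround above could be wrapped in a small helper. This is only a sketch, assuming REBOL 2 and assuming that ~ never appears in the real data (any ~ already in a line would come back as "). The name split-pipes is made up for illustration.

```rebol
; Hypothetical helper: hide the double quotes, split on "|", then restore them.
split-pipes: func [
    "Split a |-separated line via the replace workaround"
    line [string!]
    /local fields
][
    ; Work on a copy so the caller's string is left untouched.
    fields: parse/all replace/all copy line {"} {~} "|"
    ; replace/all modifies each field in place.
    foreach field fields [replace/all field {~} {"}]
    fields
]

probe split-pipes {2008-10-30|"This is" NOK|http://www.example.com}
```

Choosing a placeholder that cannot occur in the data matters here; a control character might be a safer choice than ~ for free-text fields.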

 [3/5] from: pwawood:g:mail at: 30-Oct-2008 22:10


Hello Peter!

I don't have the answer to why your third case doesn't work, but I have this parse-it-yourself solution that seems to give the result you are seeking. I'm sure there are many better ways to do what you want, and hopefully others will chip in with them.
>> parse-rule: [
[    2 [copy data to "|" skip (insert/only tail parse-output data)]
[    copy data to end (insert/only tail parse-output data)
[    ]

You will need to use it like this:
>> parse-output: copy []
== []
>> line: {2008-10-30|This is OK|http://www.example.com}
== "2008-10-30|This is OK|http://www.example.com"
>> parse/all line parse-rule
== true
>> head parse-output
== ["2008-10-30" "This is OK" "http://www.example.com"]
>> parse-output: copy []
== []
>> line2: {2008-10-30|This "is" OK|http://www.example.com}
== {2008-10-30|This "is" OK|http://www.example.com}
>> parse/all line2 parse-rule
== true
>> head parse-output
== ["2008-10-30" {This "is" OK} "http://www.example.com"]
>> parse-output: copy []
== []
>> line3: {2008-10-30|"This is" NOK|http://www.example.com}
== {2008-10-30|"This is" NOK|http://www.example.com}
>> parse/all line3 parse-rule
== true
>> head parse-output
== ["2008-10-30" {"This is" NOK} "http://www.example.com"]

Regards
Peter
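
The rule above is fixed at three fields. A possible generalisation to any number of fields is sketched below, assuming REBOL 2 parse behaviour (where copy sets its word to none when nothing is matched). The function name split-on and its argument names are invented for this example and are not from the thread.

```rebol
; Hypothetical helper: split on a delimiter with an explicit rule block,
; so that double quotes are treated as ordinary characters.
split-on: func [
    "Split LINE on DELIM without any special handling of quotes"
    line [string!]
    delim [string!]
    /local out field
][
    out: copy []
    parse/all line [
        ; Every field that is followed by a delimiter.
        any [copy field to delim delim (append out any [field copy ""])]
        ; The last field runs to the end of the line.
        copy field to end (append out any [field copy ""])
    ]
    out
]

probe split-on {2008-10-30|"This is" NOK|http://www.example.com} "|"
```

The any [field copy ""] guard stores an empty string for empty fields instead of none; whether that is the desired behaviour depends on how the fields are used afterwards.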

 [4/5] from: maillist::peter::home::se at: 30-Oct-2008 15:10


Hello!

I'm really sorry that I confused you (and everybody else, for that matter) by stating that the second line was NOK. I meant the third line. Both the first and second lines are OK.

Thanks for your reply!

/Peter

 [5/5] from: tom::conlin::gmail::com at: 30-Oct-2008 10:46


Sigh... I can remember this coming up almost a decade ago.

The answer I would prefer is that we, as users of REBOL, would be able to completely specify the internal 'stop chars used by parse in string-splitting mode.

The answer I have is: don't use parse/all's built-in simple string splitting. Use parse/all, but create your own rule block.

Note: I find "terminating" preferable to "separating", that is:

<date>|<string>|<url>|

over

<date>|<string>|<url>

Then a rule such as:

rule: [(blk: clear blk) 3 [copy token to "|" (insert tail blk token) "|"] (blk)]
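
A minimal sketch of how a terminated-field rule like this might be used, assuming REBOL 2 and assuming that blk has to be created before the rule runs (the post does not show that step). The sample line and the empty-field guard are additions for illustration only.

```rebol
blk: copy []   ; assumed setup: the rule clears and refills this block

; Three fields, each terminated by "|". The any [...] guard is an addition:
; it stores an empty string when a field is empty and copy yields none.
rule: [
    (clear blk)
    3 [copy token to "|" (insert tail blk any [token copy ""]) "|"]
]

line: {2008-10-30|"This is" NOK|http://www.example.com|}   ; note the trailing "|"

either parse/all line rule [
    probe blk
][
    print "line did not match the three-field layout"
]
```

With terminated fields, parse/all only returns true when exactly three fields are present and nothing follows the last "|", which gives a cheap validity check on each line.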