
[REBOL] Need Some PARSE Help

From: gerardcote:sympatico:ca at: 17-Oct-2005 20:28

Hi everybody,

In an effort to raise a friend's interest in REBOL, I recently tried to create a simple data-mining app that analyzes theatre listings: the days and hours at which films are shown. The information is retrieved from the French site http://cinemaquebec.com.

For the moment, my biggest problem comes from the fact that I don't fully understand how PARSE works when it encounters newline characters. Let me give a simplified example extracted from the site to illustrate my point:

t4: { Fri.: 1:00, 3:00, 7:00 Sat., sun., mon., tue., wed., thu.: 10:00am, 1:00, 3:00, 9:00, 10:00}

Here we have one day (Fri.) followed by a colon (:) and then 3 times. Right after that, the cycle repeats, this time with not one but 6 days separated by commas (,), again followed by a colon (:) and 5 more times.

I wrote a block of relatively simple rules that work well against this simple example. Here is the result I get from the parse:

>> parse t4 rules2/expr
which-day: "Fri." 4
Hour: "1" 1
Min: "00" 2
which-hour: " 1:00" 5
Hour: "3" 1
Min: "00" 2
which-hour2: " 3:00" 5
Hour: "7" 1
Min: "00" 2
which-hour2: " 7:00 " 6
which-days: "Fri.: 1:00, 3:00, 7:00 " 23
which-day: "Sat." 4
which-day2: " sun." 5
which-day2: " mon." 5
which-day2: " tue." 5
which-day2: " wed." 5
which-day2: " thu." 5
Hour: "10" 2
Min: "00" 2
which-hour: " 10:00" 6
Hour: "1" 1
Min: "00" 2
which-hour2: " 1:00" 5
Hour: "3" 1
Min: "00" 2
which-hour2: " 3:00" 5
Hour: "9" 1
Min: "00" 2
which-hour2: " 9:00" 5
Hour: "10" 2
Min: "00" 2
which-hour2: " 10:00" 6
which-days2: {Sat., sun., mon., tue., wed., thu.: 10:00am, 1:00, 3:00, 9:00, 10:00} 68
film-hours: { Fri.: 1:00, 3:00, 7:00 Sat., sun., mon., tue., wed., thu.: 10:00am, 1:00, 3:00, 9:00, 10:00}
----------------------------------------------------------
== true

Below are my parse rules, so that those interested can see how I did it (for convenience I have also attached them to this message). You'll notice the many PRINTs, which help me follow along as PARSE advances.

rules2: make object! [
    expr: [
        copy film-hours film-hours-rules
        (print ["film-hours: " mold film-hours newline
            "----------------------------------------------------------" newline])
        to end
    ]
    film-hours-rules: [
        copy which-days days-group
        (print ["which-days: " mold which-days length? which-days])
        any [
            copy which-days2 days-group
            (print ["which-days2: " mold which-days2 length? which-days2])
        ]
    ]
    days-group: [
        copy which-day day
        (print ["which-day: " mold which-day length? which-day])
        any [
            "," copy which-day2 day
            (print ["which-day2: " mold which-day2 length? which-day2])
        ]
        ":"
        copy which-hour show-hour
        (print ["which-hour: " mold which-hour length? which-hour])
        0 1 "am"
        any [
            "," copy which-hour2 show-hour
            (print ["which-hour2: " mold which-hour2 length? which-hour2])
            0 1 "am"
        ]
    ]
    digit: charset [#"0" - #"9"]
    hour: [digit 0 1 digit]
    minutes: [digit digit]
    show-hour: [
        copy this-hour hour
        (print ["Hour:" mold this-hour length? this-hour])
        ":"
        copy this-min minutes
        (print ["Min:" mold this-min length? this-min])
    ]
    english-day: ["Fri." | "Sat." | "Sun." | "Mon." | "Tue." | "Wed." | "Thu." | "Every day"]
    french-day: ["Ven." | "Sam." | "Dim." | "Lun." | "Mar." | "Mer." | "Jeu." | "Tous les jours"]
    day: ["Fri." | "Sat." | "Sun." | "Mon." | "Tue." | "Wed." | "Thu." | "Every day"]
]

Now my problem is this: when I submit the same data broken across two lines (with a newline), as in the new t4 below, my rules no longer work:

t4: { Fri.: 1:00, 3:00, 7:00
Sat., sun., mon., tue., wed., thu.: 10:00am, 1:00, 3:00, 9:00, 10:00}

The new results now look like this:

>> parse t4 rules2/expr
which-day: "Fri." 4
Hour: "1" 1
Min: "00" 2
which-hour: " 1:00" 5
Hour: "3" 1
Min: "00" 2
which-hour2: " 3:00" 5
Hour: "7" 1
Min: "00" 2
which-hour2: " 7:00" 5
which-days: "Fri.: 1:00, 3:00, 7:00" 22
film-hours: " Fri.: 1:00, 3:00, 7:00"
----------------------------------------------------------
== true

The second part of the results has been chopped off. Later, once I complete my rules to capture the film title that follows the last showtime, this chopped part gets mixed in with the next film's title.

Any help is appreciated.

Regards,
Gerard

-- Binary/unsupported file stripped by Ecartis --
-- Type: text/x-rebol
-- File: parse-film-times.r
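
For readers following along, here is a minimal sketch of one way to take implicit whitespace handling out of the picture: run PARSE with the /all refinement and match spaces, tabs and newlines explicitly. It is an illustration only, not the rules from the message above. The names ws and sp are hypothetical helpers, the day list is trimmed, the leading space in t4 is dropped, and the PRINT tracing is left out to keep it short.

```rebol
REBOL [Title: "Whitespace-explicit PARSE sketch (illustrative only)"]

digit: charset [#"0" - #"9"]
ws:    charset " ^-^/"          ; space, tab, newline (hypothetical helper)
sp:    [any ws]                 ; optional run of whitespace (hypothetical helper)

; Trimmed day list; string matching is case-insensitive unless /case is used
day: ["Fri." | "Sat." | "Sun." | "Mon." | "Tue." | "Wed." | "Thu."]

; One or two hour digits, a colon, two minute digits, optional "am"
show-hour: [1 2 digit ":" 2 digit opt "am"]

; One or more days, a colon, then one or more times;
; whitespace is allowed around every token, including across line breaks
days-group: [
    sp day any [sp "," sp day]
    sp ":"
    sp show-hour any [sp "," sp show-hour]
]

t4: {Fri.: 1:00, 3:00, 7:00
Sat., sun., mon., tue., wed., thu.: 10:00am, 1:00, 3:00, 9:00, 10:00}

; With /all, PARSE does not skip whitespace on its own,
; so every space and newline must be consumed by a rule (here: sp)
probe parse/all t4 [some days-group sp end]
```

The design choice here is to make whitespace visible in the grammar: since sp appears wherever a gap is allowed, a line break between two day groups is handled by the same rule as an ordinary space, and the trailing `sp end` checks that the whole input was consumed rather than silently skipped with `to end`.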