
[REBOL] Re: REBOL Newbie tries to convert C source to REBOL (long posting)

From: AJMartin:orcon at: 24-Dec-2003 22:49

Hi, Mike!

Mike wrote:
> P.S.: Is there any kind of REBOL-specific FORUM available on the web ? If
> not: why not ? I would be willing to create one - if there's demand.

Not quite a forum, but a chat program, AltME, is available from:
http://www.altme.com/
Look for the "Rebol" world.

> 1) open a data file (approx. 1 MB data)
>    filename: "test.dat"
>    data: read filename
> 2) parse the file for sections - every section is indicated by square
>    brackets and is valid to the next (new) section
> like this: (example data)

    --------------------------------------------
    [sec1]
    dataline1 111.11 N012.11.029 E034.31.110
    dataline2 131.11 N012.11.099 E034.31.110
    dataline3 111.11 N015.11.099 E034.31.110
    [sec2]
    datalinex HFD 111.11 N012.11.099 E034.31.114
    dataliney LKA 131.41 N011.11.049 E031.31.116
    datalinez JIH 111.11 N012.11.019 E032.31.114
    --------------------------------------------

I looked at the data above and noticed that it's not directly 'load-able with Rebol, as values like "N012.11.029" will get turned into Rebol words. So the next plan is to use 'parse. The basic application of 'parse, where whitespace is important, is:

    parse/all data rules

Now to work out the rules that are needed. I can see that there are several sections in the file, which seem to be terminated by newlines. So:

    rules: [
        some section_rule
        end
    ]

I can see that each section starts with an open square bracket, then there's a section name (which seems important) followed by a closing square bracket, (perhaps optional whitespace?) and a newline. After that, there's any number of data lines, which seem to form a table of values of various types, with perhaps a trailing empty line?

    section_rule: [
        ; The 'skip steps over the "unconsumed" "]".
        #"[" copy Section_Name to #"]" skip any #" " newline
        any [
            data_line_rule
        ]
        newline
    ]

    data_line_rule: [
        any value_item_rule
        any #" "
        newline
    ]

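For anyone who wants something to paste into the console, here is a minimal, self-contained sketch of rules along these lines. It is only one possible arrangement, written for Rebol 2, and it makes some assumptions that are not in the question: a value is any run of characters other than space, newline or "[", blank lines may follow a section, and the sample data is shortened. The names value-char, value-rule, line-rule, section-rule, sections and rows are made up for this illustration.

```rebol
REBOL [Title: "Section parsing sketch"]

; Shortened sample in the layout shown in the question.
data: {[sec1]
dataline1 111.11 N012.11.029 E034.31.110
dataline2 131.11 N012.11.099 E034.31.110
[sec2]
datalinex HFD 111.11 N012.11.099 E034.31.114
}

sections: copy []    ; filled as: name rows name rows ...
rows: line: item: section-name: none

; Assumption: a value is any run of characters other than
; space, newline or "[".
value-char: complement charset " ^/["

; One value: copy it as text and add it to the current line.
value-rule: [copy item some value-char (append line item)]

; One data line: values separated by spaces, ended by newline or end of input.
line-rule: [
    (line: copy [])
    value-rule any [some #" " value-rule] any #" " [newline | end]
    (append/only rows line)
]

; One section: "[name]" on its own line, then any number of data lines.
section-rule: [
    #"[" copy section-name to #"]" skip any #" " newline
    (rows: copy [])
    any line-rule
    any newline    ; tolerate blank lines after a section
    (repend sections [section-name rows])
]

ok: parse/all data [some section-rule end]

probe ok
probe select sections "sec1"
```

Every item is kept as text here, so each section ends up as a block of lines, and each line as a block of strings. 'select then pulls out the lines of one section by name.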
> I need to convert the data for each section into a different format.

    value_item_rule: [
        copy Item to #" " some #" "
    ]

The above rule picks out each item between space characters. So you just need to parse each item on each line in each section and convert it to the appropriate Rebol datatype. My %Patterns.r script file could be useful at this point. I'd use it something like this:

    value_item_rule: [
        copy Item to #" " some #" " (
            if Item [
                Item: if parse/all Item [
                    time^ end (Item: to time! Item)
                    | money^ end (Item: to money! Item)
                    | integer^ end (Item: to integer! Item)
                    ; and so on for the expected data types.
                ] [
                    Item
                ]
            ]
            insert tail Data_Line Item
        )
    ]

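If %Patterns.r is not at hand, the same idea can be tried with a couple of hand-written patterns. The sketch below is for Rebol 2 and is not the code from %Patterns.r: integer-pattern, decimal-pattern and convert-item are hypothetical names made up for this illustration, and they only cover plain integers and decimals. An item that matches neither pattern is kept as text.

```rebol
REBOL [Title: "Item conversion sketch"]

digit: charset "0123456789"

; Hypothetical stand-ins for the patterns in %Patterns.r.
integer-pattern: [opt #"-" some digit]
decimal-pattern: [opt #"-" some digit #"." some digit]

convert-item: func [
    "Convert a text item to a REBOL value when a pattern matches."
    item [string!]
][
    ; 'any returns the first result that is not none or false.
    any [
        if parse/all item decimal-pattern [to decimal! item]
        if parse/all item integer-pattern [to integer! item]
        item    ; nothing matched: keep the text as it is
    ]
]

row: copy []
foreach item ["dataline1" "111.11" "42" "N012.11.029"] [
    append row convert-item item
]
probe row    ; prints ["dataline1" 111.11 42 "N012.11.029"]
```

Because parse/all only returns true when the whole item matches, a value such as "N012.11.029" fails both patterns and stays a string.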
> 6) Final conversion of the original data will be to CSV-format - including
> most data that was read, but occasionally not all data is needed or an
> abbreviated form is sufficient.

This part is a bit trickier; some of the scripts in my %Values.r can be very helpful here. I'd put each section into its own block, with each line as a separate block within the surrounding block. That way it can be more easily torn apart, twisted around and put back together. (I do a lot of this at the school I work at.) Here's the first block, just loaded straight into Rebol:

    >> sec1: [
    [    [dataline1 111.11 N012.11.029 E034.31.110]
    [    [dataline2 131.11 N012.11.099 E034.31.110]
    [    [dataline3 111.11 N015.11.099 E034.31.110]
    [    ]
    == [
        [dataline1 111.11 N012.11.029 E034.31.110]
        [dataline2 131.11 N012.11.099 E034.31.110]
        [dataline3 111.11 N015.11....

Then using my 'transpose to give the array a 90 degree twist:

    >> probe sec1_columns: transpose sec1
    [[dataline1 dataline2 dataline3] [111.11 131.11 111.11] [N012.11.029 N012.11.099 N015.11.099] [E034.31.110 E034.31.110 E034.31.110]]
    == [[dataline1 dataline2 dataline3] [111.11 131.11 111.11] [N012.11.029 N012.11.099 N015.11.099] [E034.31.110 E034.31.110 E034.31.1...

Then swapping columns around, so the last column is first and the first column is last:

    >> probe t: transpose reduce [last sec1_columns second sec1_columns third sec1_columns first sec1_columns]
    [[E034.31.110 111.11 N012.11.029 dataline1] [E034.31.110 131.11 N012.11.099 dataline2] [E034.31.110 111.11 N015.11.099 dataline3]]
    == [[E034.31.110 111.11 N012.11.029 dataline1] [E034.31.110 131.11 N012.11.099 dataline2] [E034.31.110 111.11 N015.11.099 dataline3...

Then converting to CSV format (using my %CSV.r script):

    >> probe mold-csv t
    {E034.31.110,111.11,N012.11.029,dataline1
    E034.31.110,131.11,N012.11.099,dataline2
    E034.31.110,111.11,N015.11.099,dataline3
    }
    == {E034.31.110,111.11,N012.11.029,dataline1
    E034.31.110,131.11,N012.11.099,dataline2
    E034.31.110,111.11,N015.11.099,dataline3
    }

I hope that helps!

Andrew J Martin
Speaking in tongues and performing miracles.
ICQ: 26227169
http://www.rebol.it/Valley/
http://valley.orcon.net.nz/
http://Valley.150m.com/