[REBOL] Re: REBOL Newbie tries to convert C source to REBOL (long posting)
From: tomc:darkwing:uoregon at: 20-Nov-2003 21:20
Response at the bottom.
On Thu, 20 Nov 2003, Mike Loolard wrote:
> Hi everybody !
>
> I am currently trying to port an ANSI C application over to REBOL.
> I am pretty much a newbie to REBOL - and was pretty impressed by
> its capabilities.
>
> My problem now is that porting looked a lot easier than it actually turns
> out to be for me.
>
> I seem to lack some fundamental things in REBOL. Maybe I just think too
> "one-dimensionally" because of my ASM/C background: in C I parse the whole
> data BYTE by BYTE, something that doesn't seem effective in REBOL, as there
> seem to be more efficient methods available, as I understand it.
>
> Actually, I would think the application should be fairly easy to implement
> in REBOL. It's basically only 'simple' string processing, not much more.
> That's why I thought I should give it a try, also to be able to compare
> REBOL performance with my compiled C application.
>
> basically, this is what my C source does:
>
> 1) open a data file (approx. 1 MB data)
>
> filename: %test.dat   ; a file! value; READ does not accept a plain string
> data: read filename
>
> 2) parse the file for sections - every section is indicated by square
> brackets and is valid up to the next (new) section
>
> like this: (example data)
> --------------------------------------------
> [sec1]
> dataline1 111.11 N012.11.029 E034.31.110
> dataline2 131.11 N012.11.099 E034.31.110
> dataline3 111.11 N015.11.099 E034.31.110
>
> [sec2]
> datalinex HFD 111.11 N012.11.099 E034.31.114
> dataliney LKA 131.41 N011.11.049 E031.31.116
> datalinez JIH 111.11 N012.11.019 E032.31.114
> --------------------------------------------
> So each section contains several hundred lines of data, and the data
> syntax differs from section to section. That's why I initially separate
> the data only by whitespace in C and take care of the exact syntax at a
> later time, using parsing rules specified in another array.
>
> 3) In C I would now simply determine each section's offsets within memory
> and then store (memcpy) each section into its own multi-dimensional array,
> using the offsets I determined in the previous loop, where each line of
> data is directly accessible - but also every part of the data line
> (separated by whitespace).
>
> At first I would go and compute every section's beginning like this:
> ________________________________________
> sections: make block! 40
>
> parse data [
>     any [
>         to "[" indx: copy sec thru "]"
>         (append sections index? indx)
>         ; For debugging purposes I also emit the name of each
>         ; section that's found.
>         (print sec)
>     ]
> ]
> ________________________________________
>
> Then I have the position of each section in the data series and compute
> the offset of each section by calculating the difference between two
> adjacent sections. Accordingly, in C I would then extract/copy the data
> between the two offsets and put it all in another array to work with.
>
> That's exactly where my problems start to overwhelm me.
> While reading the file in and parsing it for sections does seem to work
> fairly well (and with MUCH less code!), I am having difficulty creating
> the multi-dimensional arrays for each section, where every piece of data
> is stored so that it is individually accessible.
>
> So as an example, having parsed [sec1], I want every element in each line
> to be individually accessible to simplify conversion.
>
> 4) Although the arrays don't work yet, I tried to port the parsing routines
> over. In C I take the "parent" array that contains all sub-arrays and parse
> each section individually. Parsing is done in C with a certain set of rules
> (also stored in an array): using regular expressions, I 'expect' a certain
> data format for particular sections. If a rule is matched, I parse with
> that rule and separate the data into sub-arrays.
>
> In REBOL the latter seems fairly easy using the parse command:
>
> 5) When all data has been processed and separated by its corresponding
> rule, I need to convert the data for each section into a different format.
> In C I am again using regular expressions to implement the conversion.
> That's why I stored the recognition rule and the conversion rule in the
> same array.
>
> 6) The final conversion of the original data will be to CSV format,
> including most of the data that was read, but occasionally not all data is
> needed or an abbreviated form is sufficient.
>
> Maybe you guys have some thoughts on my problem. I don't want/need actual
> code, rather some hints on how to accomplish my goals in REBOL. As you may
> be able to tell from the way I describe the problem, I might indeed be
> thinking too heavily in C. By the way: are you aware of any tutorials or
> books particularly targeted at C programmers? This really seems to be a
> situation where previous programming experience limits my way of thinking.
> I did read a lot of stuff on the REBOL web pages, but I still have
> problems grasping the inner concepts of REBOL.
>
> Another question that just came to my mind: is there any decent REBOL code
> editor available? Simple syntax highlighting would be one thing, but I am
> thinking more of a supportive editor that also offers syntax completion or
> smart tooltips, helping a newbie like me.
>
> I guess the power of REBOL is also some kind of problem for such an editor,
> as single objects/words can mean different things and dialects can easily
> be extended. Maybe some really good REBOL programmer should go and program
> an editor for REBOL IN REBOL (View) ;-)
>
> I'd really love to see a REBOL editor implemented in REBOL. It could easily
> be enhanced by everybody; there might even be some kind of "plugin" concept
> using REBOL scripts that are executed on demand. Using REBOL/View, it would
> also be available on pretty much all platforms.
>
> Thanks for any help and comments - and sorry for this rather long eMail ;-)
>
> P.S.: Is there any kind of REBOL-specific FORUM available on the web? If
> not, why not? I would be willing to create one if there's demand.
>
> --
> regards
>
> _________________
> ---------
> Mike
>
Hopefully this will get your mind moving slightly differently:
data: {
[sec1]
dataline1 111.11 N012.11.029 E034.31.110
dataline2 131.11 N012.11.099 E034.31.110
dataline3 111.11 N015.11.099 E034.31.110
[sec2]
datalinex HFD 111.11 N012.11.099 E034.31.114
dataliney LKA 131.41 N011.11.049 E031.31.116
datalinez JIH 111.11 N012.11.019 E032.31.114
}
sections: make block! 40
row-data: complement charset "[^/"  ; any char except "[" and newline
row: [
    copy r some row-data newline            ; break section into rows at newline
    (insert/only tail blk parse r none)     ; break row into fields at whitespace
]
parse data [
    some [
        to "["
        copy sec [thru "]"] (print sec)
        thru newline
        ; well, you are already here, so instead of storing the
        ; location and all that, just deal with it now.
        (insert/only tail sections blk: make block! 500)
        any row
    ]
]
; to see what native types REBOL would see in each field
foreach sec sections [
    foreach row sec [
        foreach field row [
            prin [type? load field field " "]
        ]
        print ""
    ]
]
;probe sections
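
Once the data sits in nested blocks like that, getting at a single field and writing CSV lines could look roughly like the following. This is only a minimal sketch: the literal sections block is a hand-written stand-in with the same shape the parse above builds, to-csv-line is a made-up helper name (not a REBOL built-in), and it assumes no field contains a comma or a quote, so nothing gets escaped.

```rebol
REBOL [Title: "Nested block access and CSV output (sketch)"]

; Hypothetical sample with the same shape the parse above produces:
; a block of sections, each a block of rows, each a block of string fields.
sections: [
    [
        ["dataline1" "111.11" "N012.11.029" "E034.31.110"]
        ["dataline2" "131.11" "N012.11.099" "E034.31.110"]
    ]
    [
        ["datalinex" "HFD" "111.11" "N012.11.099" "E034.31.114"]
    ]
]

; Path notation: section 1, row 2, field 3.
print sections/1/2/3

; The same kind of lookup with the indexes held in words.
s: 2  r: 1  f: 2
print sections/:s/:r/:f

; to-csv-line is an illustrative helper, not part of REBOL.
; It joins the fields of one row with commas (no quoting or escaping).
to-csv-line: func [fields [block!] /local line] [
    line: copy ""
    foreach field fields [
        append line field
        append line ","
    ]
    remove back tail line   ; drop the trailing comma
    line
]

foreach sec sections [
    foreach row sec [
        print to-csv-line row
    ]
]
```

To leave out or abbreviate fields for a particular section, you would build a shorter block from the row first and hand that to the helper instead of the whole row.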