[REBOL] REBOL Newbie tries to convert C source to REBOL (long posting)
From: WisD00M::gmx::net at: 20-Nov-2003 22:25
Hi everybody !
I am currently trying to port an ANSI C application over to REBOL.
I am pretty much a newbie to REBOL - and was pretty impressed by
its capabilities.
My problem now is that porting looked a lot easier than it is actually
turning out to be for me.
I seem to be missing some fundamentals in REBOL. Maybe I just think too
one-dimensionally because of my ASM/C background: in C I parse the whole
data byte by byte, which doesn't seem to be the effective way in REBOL, as
there seem to be more efficient methods available, as I understand it.
Actually, I would think the application should be fairly easy to
implement in REBOL - it's basically only 'simple' string processing -
not much more. That's why I thought I should give it a try - also to be able
to compare REBOL performance with my compiled C application.
Basically, this is what my C source does:
1) open a data file (approx. 1 MB data)
filename: %test.dat   ; file! literal (READ does not take a plain string)
data: read filename
2) parse the file for sections - every section starts with a name in square
brackets and extends to the next (new) section,
like this: (example data)
--------------------------------------------
[sec1]
dataline1 111.11 N012.11.029 E034.31.110
dataline2 131.11 N012.11.099 E034.31.110
dataline3 111.11 N015.11.099 E034.31.110
[sec2]
datalinex HFD 111.11 N012.11.099 E034.31.114
dataliney LKA 131.41 N011.11.049 E031.31.116
datalinez JIH 111.11 N012.11.019 E032.31.114
--------------------------------------------
So each section contains several hundred lines of data, and the data
syntax differs from section to section. That's why in C I initially split
the data only on whitespace and take care of the exact syntax at a later
time, using parsing rules specified in another array.
3) In C I would now simply determine each section's offsets within memory
and then store (memcpy) each section into its own multi-dimensional array,
using the offsets determined in the previous loop. In that array each line
of data is directly accessible, and so is every whitespace-separated part
of a line.
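To make the "every part of the data line" idea concrete, the whitespace
split could look roughly like the following in C. This is a simplified
sketch, not the actual source: the name split_fields and the MAX_FIELDS
limit are made up for illustration, and it assumes a line can be modified
in place.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_FIELDS 16   /* hypothetical upper bound on fields per line */

/* Split `line` in place on whitespace. A pointer to the start of each
   field is stored in `fields`; the return value is the number of fields
   found (at most MAX_FIELDS). The line is modified: the first whitespace
   character after each field is overwritten with '\0'. */
size_t split_fields(char *line, char *fields[MAX_FIELDS])
{
    size_t count = 0;
    char *p = line;

    assert(line != NULL);
    while (*p != '\0' && count < MAX_FIELDS) {
        while (isspace((unsigned char)*p))    /* skip leading whitespace */
            p++;
        if (*p == '\0')
            break;
        fields[count++] = p;                  /* a field starts here */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if (*p != '\0')
            *p++ = '\0';                      /* terminate the field */
    }
    return count;
}
```

Applied to "dataline1 111.11 N012.11.029 E034.31.110" this yields four
fields, so fields[0] is the line name and fields[2] the first coordinate.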
In REBOL, I would first compute the start of every section like this:
________________________________________
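sections: make block! 40
parse data [
    any [
        ; mark the position of the header, then copy its name
        to "[" indx: copy sec thru "]"
        (append sections index? indx)
        ; For debugging purposes I also emit the name of each section
        ; that's found. (A {...} string inside the rule would be matched
        ; as literal text, so the note is written as a real comment.)
        (print sec)
    ]
]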
________________________________________
Then I have the position of each section in the data series and can compute
the length of each section as the difference between two adjacent
positions.
Accordingly, in C I would then extract/copy the data between the two
offsets and put it all in another array to work with.
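For comparison, the offset scan and the memcpy step might look roughly like
this in C. Again this is only a sketch under assumptions, not the real
program: find_sections, copy_section and MAX_SECTIONS are hypothetical
names, and a section header is assumed to be any line that starts with '['.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SECTIONS 40   /* hypothetical limit on the number of sections */

/* Scan the data byte by byte and record the offset of every line that
   starts with '['. Returns the number of offsets stored. */
size_t find_sections(const char *data, size_t len,
                     size_t offsets[MAX_SECTIONS])
{
    size_t count = 0;
    size_t i;

    assert(data != NULL);
    for (i = 0; i < len && count < MAX_SECTIONS; i++) {
        int at_line_start = (i == 0 || data[i - 1] == '\n');
        if (at_line_start && data[i] == '[')
            offsets[count++] = i;
    }
    return count;
}

/* Copy section k (header line included) into a new NUL-terminated buffer.
   A section runs up to the next header, or to the end of the data for the
   last section. The caller frees the result; NULL means malloc failed. */
char *copy_section(const char *data, size_t len,
                   const size_t *offsets, size_t count, size_t k)
{
    size_t start, end, n;
    char *out;

    assert(k < count);
    start = offsets[k];
    end = (k + 1 < count) ? offsets[k + 1] : len;
    n = end - start;                 /* difference of adjacent offsets */
    out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, data + start, n);
    out[n] = '\0';
    return out;
}
```

The section length is exactly the "difference between two adjacent
positions" mentioned above; only the last section needs the total data
length as its end.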
That's exactly where my problems start to overwhelm me.
While reading the file in and parsing it for sections does seem to
work fairly well (and with MUCH less code!), I am having difficulty
creating the multi-dimensional arrays for each section, in which all data
is stored so that it is individually accessible.
As an example, having parsed [sec1], I want every element in each line to
be individually accessible to simplify conversion.
4) Although the arrays don't work yet, I tried to port the parsing routines
over. In C I take the "parent" array that contains all sub-arrays and parse
each section individually. Parsing is done in C with a certain set of rules
(also stored in an array): using regular expressions, I 'expect' a certain
data format for particular sections.
If a rule matches, I parse with that rule and separate the data into
sub-arrays.
In REBOL the latter seems fairly easy using the parse command:
5) When all data has been processed and separated by its corresponding
rule, I need to convert the data for each section into a different format.
In C I am again using regular expressions to implement the conversion.
That's why I stored the recognition rule and the conversion rule in the
same array.
6) The final conversion of the original data will be to CSV format,
including most of the data that was read; occasionally not all data is
needed, or an abbreviated form is sufficient.
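As an illustration of step 6, writing only some of the fields of a line as
CSV could be sketched in C as below. The function name to_csv_line and the
idea of a `pick` index list are assumptions made for this example, not part
of the actual program. No quoting is done, so it assumes that fields never
contain commas, quotes or newlines, and it uses snprintf, which needs C99
rather than strict ANSI C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write the fields selected by `pick` (indices into `fields`) to `out` as
   one comma-separated line without a trailing newline. Returns the number
   of characters written, or -1 if the output buffer is too small. */
int to_csv_line(char *out, size_t out_size,
                char *const fields[], const size_t pick[], size_t npick)
{
    size_t used = 0;
    size_t i;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (i = 0; i < npick; i++) {
        /* a comma goes in front of every field except the first */
        int n = snprintf(out + used, out_size - used, "%s%s",
                         i == 0 ? "" : ",", fields[pick[i]]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;                /* output did not fit */
        used += (size_t)n;
    }
    return (int)used;
}
```

With the four fields of "dataline1 111.11 N012.11.029 E034.31.110" and
pick = {0, 2, 3}, the second column is dropped and only the name and the
two coordinates are written.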
Maybe you guys have some thoughts on my problem - I don't want/need actual
code - rather some hints on how to accomplish my goals in REBOL. As you may
be able to tell from the way I describe the problem, I might indeed be
thinking too heavily in C. By the way: are you aware of any tutorials or
books particularly targeted at C programmers? This really seems to be a
situation where previous programming experience limits my way of thinking.
I did read a lot of stuff on the REBOL web pages, but I still have problems
grasping the inner concepts of REBOL.
Another question that just came to my mind: is there any decent REBOL code
editor available?
Simple syntax highlighting would be one thing - but I am thinking more of a
supportive editor that also offers syntax completion or smart tooltips,
helping a newbie like me.
I guess the power of REBOL is also something of a problem for such an
editor, as single objects/words can mean different things and dialects can
easily be extended.
Maybe some really good REBOL-programmer should go and program an editor
for REBOL IN REBOL (view) ;-)
I'd really love to see a REBOL-editor being implemented in REBOL - it could
easily be
enhanced by everybody, there might even be some kind of "plugin" concept
considered -
using rebol scripts, that are executed on demand.
Using REBOL/VIEW it would also be available on pretty much all platforms.
Thanks for any help and comments - and sorry for this rather long eMail ;-)
P.S.: Is there any kind of REBOL-specific FORUM available on the web ? If
not: why not ? I would be willing to
create one - if there's demand.
--
regards
_________________
---------
Mike