[REBOL] Data from mainframes
From: sunandadh::aol::com at: 16-Mar-2002 3:42
[joel--neely--fedex--com] writes:
> Since much data from mainframes, or obtained via OCR (from TSU/SSU
> sources, of course! ;-), is in fixed-record layout, it would be
> interesting to have a program that would attempt to infer "field"
> positions and lengths by examining the content of the data.
> Wouldn't REBOL be a good tool to write such a sort-of-AI task in?
Warning -- this post may be off-topic for anyone not interested in exchanging
data with a mainframe data center. Delete now!
Without denying the existence of Honeywell et al., when I write "mainframe" I
mean an IBM System/360 or later.
There are probably four sensible ways to get data from a mainframe:
-- The data center people could write you a nice extract program to put all
the data in a CSV or equivalent -- no problems with handling that. It just
might take them years to get round to it.
-- They could simply give you, in machine-readable format, an existing report
-- maybe say a spooled copy of last night's invoice run. You could devise a
set of Rebol heuristics to make sense of the data. But you'd probably be
better off buying a copy of Monarch -- it's been doing that for ten years.
-- You could find a pile of old fan-fold paper with important data on it, OCR
it and then try to make sense of the resulting file. Lots of possible fun
code here as you mention. Though it might be better to send the scans to one
of those keyprep farms in the Far East where they'll rekey the stuff
overnight with six nines accuracy. Then put it through Monarch to extract the
data.
-- You might have been landed with the original native mainframe file. Some
nice Rebol routines would be useful to convert this data -- but heuristics
are likely to fail. You'd really need the data layout spec. Us old
mainframers thought nothing of following a couple of fixed-length text fields
with some 16-bit binary fields and a few 32-bit binary fields, and finishing
up with some packed decimal fields. There's no way you could sensibly parse
that without heavy hinting.
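To make the "heavy hinting" point concrete, here is a minimal sketch of a
layout-driven record reader. It is written in Python rather than Rebol, purely
for illustration. The layout table, the field names and the sample record are
all invented for this example; the sketch also assumes EBCDIC code page 037
for text and big-endian signed integers for the binary fields, which a real
file's layout spec would have to confirm.

```python
from decimal import Decimal

def unpack_decimal(raw: bytes, scale: int = 0) -> Decimal:
    """Decode packed decimal: two digits per byte, sign in the last nibble."""
    nibbles = []
    for b in raw:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    sign = nibbles.pop()
    if any(n > 9 for n in nibbles) or sign < 0x0A:
        raise ValueError("not a valid packed decimal field")
    value = int("".join(map(str, nibbles)))
    if sign in (0x0B, 0x0D):          # B and D mark a negative value
        value = -value
    return Decimal(value).scaleb(-scale)

# Hypothetical layout: (field name, kind, length in bytes, implied decimals)
LAYOUT = [
    ("name",     "text",   8, 0),
    ("region",   "text",   2, 0),
    ("qty",      "int",    2, 0),     # 16-bit binary
    ("order_no", "int",    4, 0),     # 32-bit binary
    ("amount",   "packed", 3, 2),     # 5 digits, 2 after the decimal point
]

def parse_record(record: bytes, layout=LAYOUT) -> dict:
    """Slice one fixed-length record into fields according to the layout."""
    out, pos = {}, 0
    for name, kind, length, scale in layout:
        raw = record[pos:pos + length]
        if kind == "text":
            out[name] = raw.decode("cp037").rstrip()   # assumed code page
        elif kind == "int":
            out[name] = int.from_bytes(raw, "big", signed=True)
        elif kind == "packed":
            out[name] = unpack_decimal(raw, scale)
        pos += length
    return out

# Invented sample record matching the layout above
sample = (
    "SMITH   ".encode("cp037")
    + "UK".encode("cp037")
    + (12).to_bytes(2, "big")
    + (100234).to_bytes(4, "big")
    + bytes([0x12, 0x34, 0x5C])       # packed +123.45
)
record = parse_record(sample)
```

Nothing in the bytes themselves says where "qty" ends and "order_no" begins,
or that the last three bytes are packed rather than text. Only the layout
table carries that knowledge.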
A fifth possibility would be to host Rebol on a mainframe and write your own
extract and download processes in Rebol. To be useful in this area (i.e. file
extraction) we'd need routines that can handle fixed-length binary and packed
decimal. That's not difficult. And an extension to handle reading and writing
partitioned datasets (like ZIP/sea/gz: loads of files inside another) would
be very useful.
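As a rough indication of how small the writing side of such routines is, here
is a sketch in Python (not Rebol) of encoders for fixed-length binary and
packed decimal fields. The function names are made up for this example, and
it assumes big-endian signed binary and the usual C (positive) / D (negative)
sign nibbles.

```python
def pack_binary(value: int, length: int) -> bytes:
    """Encode an integer as a fixed-length big-endian signed binary field."""
    return value.to_bytes(length, "big", signed=True)

def pack_decimal(value: int, length: int) -> bytes:
    """Encode an integer as packed decimal occupying exactly `length` bytes."""
    sign = 0x0D if value < 0 else 0x0C
    digits = str(abs(value))
    capacity = 2 * length - 1         # last nibble is reserved for the sign
    if len(digits) > capacity:
        raise OverflowError("value does not fit in the field")
    nibbles = [int(d) for d in digits.zfill(capacity)] + [sign]
    # Pair the nibbles up: high nibble first, low nibble second
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

halfword = pack_binary(-2, 2)         # 16-bit field
amount = pack_decimal(-12345, 3)      # 5 digits plus sign in 3 bytes
```

Any implied decimal point is not stored in the field; as with reading, it has
to come from the layout spec.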
Core ought to be an easy port as a native z/OS (nee MVS: a mainframe OS)
program if RT wanted to. View is more problematical for the same reasons IBM
has never claimed full Posix compliance for its various Unix implementations:
your terminal is not directly attached to the processor, so you can't get
very real-time interactive from the keyboard.
Given that mainframes now run GNU/Linux, a port of Core to that environment
should be even easier.
There's never been -- as far as I know -- a scripting language that has
successfully migrated from mainframe to micros or vice versa. Rexx got as far
as OS/2 and gave up. Selcopy has a Unix and PC port, but it's never looked
right there, and never thrived.
But, if the port works, 75-80% of all machine-readable data is in our grasp!
The Rebolution is complete!!
Sunanda.