[REBOL] Re: rebol-framework: information
From: atruter:labyrinth:au at: 31-Dec-2002 17:16
Hi Robert,
Read your doco and the conceptual basis looks pretty sound to me. I've also
been working on an alternative to the traditional database approach and my
thoughts, or emphasis, differ from yours. Since your doco got me thinking
about other issues, I thought (hope!) I might return the favour by jotting
down my thoughts on the subject.
The "problem" I am attempting to solve is that the company I am trying to
run receives and generates a lot of data in multiple formats (many of them
not electronic). The formats include:
digital documents (doc, pdf, etc)
digital images (jpg, bmp, png, gif, tiff, etc)
physical (books, photos, brochures, sales lists, invoices, etc)
music (CD, tape, MP3, etc)
video (DVD, VHS, AVI, etc)
located and/or referenced in different "locations":
email client
browser (bookmarks and history)
website(s)
hard-drive
removable media
filing cabinet
manilla folders
notepad / sticky note
which then depend upon disparate means of archival and retrieval:
electronic file system
databases
sticky notes
filing cabinets
manilla folders
someone's memory
When we look at a source of data, eg a website or PDF, we spend some effort
to:
categorise what we are looking at
understand the content
remember where and when we found it
Some of this "manual" processing may be recorded (in disparate places),
but most often it is committed to our (fallible) memory!
I am trying to design a "system" that enables either the data itself (say
in the case of "contacts") or a reference to the data (say in the case of
PDF documents) to be stored without having to predefine an arbitrary data
structure. The "system" should allow me to do things like:
1) show me the location of all data to do with embedded systems, sorted by
what I have most recently looked at. [This should display an entry for each
relevant source. Clicking on the entry should take me to the source
(opening a PDF, logging onto a website, telling me the location of a
physical folder, etc)]
2) generate a phone list based on my contacts
3) tell me which CDs contain optometric images in JPG format
4) display my account details, including encrypted usernames / passwords /
PINs
5) tell me if the PDF on my HDD has been updated on the site I originally
obtained it from, and if so, get it.
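As a rough illustration of item 5, here is a minimal Python sketch (not REBOL, purely for illustration). It assumes the site sends an HTTP Last-Modified header for the PDF and that the local file's modification time reflects when it was downloaded; neither is guaranteed. The function name remote_is_newer is hypothetical, and fetching the header itself is left out.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def remote_is_newer(local_mtime, last_modified_header):
    """Return True if the remote copy looks newer than the local file.

    local_mtime is a POSIX timestamp (eg. from os.stat(path).st_mtime);
    last_modified_header is the raw HTTP Last-Modified header value.
    """
    remote = parsedate_to_datetime(last_modified_header)
    if remote.tzinfo is None:
        # Assumption: a header without a usable zone is treated as UTC.
        remote = remote.replace(tzinfo=timezone.utc)
    local = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return remote > local
```

If this returns True the system would go and fetch the file again; a stricter check could compare sizes or checksums instead of trusting timestamps.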
My attitude to data capture is that it should be as easy as writing it on a
piece of paper, because if it isn't then that's where it'll end up! The
system should also infer as much as it can about the data. With that in
mind I am trying to use a heuristic approach such that entering:
Mr James Smith
28 Oakfield Street
(03) 9876 1234 BH
will store this record as key/value pairs (with each line being a value),
but should also be able to infer things about compound values (eg. while "Mr James Smith"
is a name, the system should be able to work out gender, firstname and
surname based on the value). In essence, the system would extend REBOL's
knowledge of data types such that it knows what a value is and what it can
do with that value. At this stage I'm thinking that REBOL can be used to
determine the base datatype (eg. money, email, string, etc) pretty well,
but the system will need to infer a bit more with some types (eg. is this
string actually a phone number?) and offer logical choices and defaults
once the datatype is determined (eg. You entered $12.50 which I know is
money! and will default to "Price", but it could be "Amount", "Tax",
"Discount", etc).
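To make the heuristic a little more concrete, here is a minimal sketch in Python (not REBOL, purely for illustration). The patterns, the TITLES and LABELS tables and the function names are all assumptions made up for this example; a real system would need far richer rules, and REBOL's own datatype recognition would cover the base types.

```python
import re

# Hypothetical table: maps an honorific to an inferred gender.
TITLES = {"mr": "male", "mrs": "female", "ms": "female", "miss": "female"}

# Hypothetical labels offered once a base type is known; the first is the default.
LABELS = {
    "money": ["Price", "Amount", "Tax", "Discount"],
    "email": ["Email"],
    "phone": ["Phone"],
    "name": ["Name"],
    "string": ["Note"],
}

def infer_type(value):
    """Guess a type for one line of free-form input."""
    text = value.strip()
    if re.fullmatch(r"\$\d+(\.\d{2})?", text):
        return "money"
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", text):
        return "email"
    # Assumed phone layout: optional "(03)" area code, eight digits,
    # optional BH/AH (business/after hours) tag.
    if re.fullmatch(r"(\(\d{2}\)\s*)?\d{4}\s?\d{4}(\s+(BH|AH))?", text):
        return "phone"
    words = text.split()
    if words and words[0].rstrip(".").lower() in TITLES:
        return "name"
    return "string"

def split_name(value):
    """Break a titled name into gender, firstname and surname."""
    parts = value.split()
    title = parts[0].rstrip(".").lower()
    return {
        "gender": TITLES.get(title),
        # With only a title and one word, treat that word as the surname.
        "firstname": parts[1] if len(parts) > 2 else None,
        "surname": parts[-1],
    }

for line in ["Mr James Smith", "28 Oakfield Street", "(03) 9876 1234 BH"]:
    kind = infer_type(line)
    print(kind, "->", LABELS[kind][0], ":", line)
```

The address line falls through to a plain string here, which is exactly the sort of case where the system would need to infer more or ask the user.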
Depending upon the source of data, other attributes can be automatically
deduced and stored. For example, a PDF file has a file name, location,
create and modify dates, and with a little extra work the title, author and
description (if any) can be extracted from the document itself.
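The file-level attributes are the easy part. A minimal Python sketch, assuming only what the file system reports (the function name file_attributes and the dictionary keys are made up for this example; pulling title and author out of the PDF itself would need a PDF library and is not shown):

```python
import os
import tempfile
from datetime import datetime

def file_attributes(path):
    """Collect the attributes the file system already knows about a file."""
    info = os.stat(path)
    full = os.path.abspath(path)
    return {
        "name": os.path.basename(full),
        "location": os.path.dirname(full),
        "size": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime),
        # Note: st_ctime is the creation time on Windows, but the time of
        # the last metadata change on most Unix systems.
        "created_or_changed": datetime.fromtimestamp(info.st_ctime),
    }

# Demo on a throwaway file standing in for a real PDF.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as handle:
    handle.write(b"%PDF-1.4 placeholder")
demo_path = handle.name
attrs = file_attributes(demo_path)
os.remove(demo_path)
print(attrs["name"], attrs["size"], attrs["modified"])
```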
This approach, if achievable, does away with data structures and allows
freeform queries to generate tabulated result sets (eg. "Show all invoice#,
amounts" would know that amount is right justified and prompt to include a
total summary).
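A minimal sketch of the result-set side of this, again in Python and under stated assumptions: records are plain key/value dictionaries with no shared structure, the field list is passed in directly rather than parsed from a freeform query, and amounts are held as Decimal values. The RECORDS data and the show function are invented for the example.

```python
from decimal import Decimal

# Hypothetical records: no fixed structure, just key/value pairs.
RECORDS = [
    {"invoice#": "1001", "amount": Decimal("12.50")},
    {"invoice#": "1002", "amount": Decimal("230.00")},
    {"name": "James Smith"},  # has none of the requested fields, so it is skipped
]

def show(records, fields):
    """Tabulate the records that carry every requested field."""
    rows = [r for r in records if all(f in r for f in fields)]
    # A column counts as numeric when every value in it is a Decimal.
    numeric = {f for f in fields
               if rows and all(isinstance(r[f], Decimal) for r in rows)}
    body = [[str(r[f]) for f in fields] for r in rows]
    if numeric:
        total_row = [str(sum((r[f] for r in rows), Decimal("0"))) if f in numeric else ""
                     for f in fields]
        if fields[0] not in numeric:
            total_row[0] = "Total"
        body.append(total_row)
    widths = [max([len(f)] + [len(row[i]) for row in body])
              for i, f in enumerate(fields)]

    def fmt(cells):
        # Numeric columns are right justified, everything else left justified.
        return "  ".join(c.rjust(w) if f in numeric else c.ljust(w)
                         for c, w, f in zip(cells, widths, fields)).rstrip()

    return "\n".join([fmt(fields)] + [fmt(row) for row in body])

print(show(RECORDS, ["invoice#", "amount"]))
```

Here the total row is always appended when a numeric column is present; the "prompt to include a total" step would sit in front of that.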
Anyway, I hope something in the above is of value [to someone] ;)
Regards,
Ashley