Mailing List Archive: 49091 messages

[REBOL] Re: rebol-framework: information

From: atruter:labyrinth:au at: 31-Dec-2002 17:16

Hi Robert,

Read your doco and the conceptual basis looks pretty sound to me. I've also been working on an alternative to the traditional database approach, and my thoughts, or emphasis, differ from yours. Since your doco got me thinking about other issues, I thought (hope!) I might return the favour by jotting down my thoughts on the subject.

The "problem" I am attempting to solve is that the company I am trying to run receives and generates a lot of data in multiple formats (many of them not electronic). The formats include:

    digital documents (doc, pdf, etc)
    digital images (jpg, bmp, png, gif, tiff, etc)
    physical (books, photos, brochures, sales lists, invoices, etc)
    music (CD, tape, MP3, etc)
    video (DVD, VHS, AVI, etc)

located and/or referenced in different "locations":

    email client
    browser (bookmarks and history)
    website(s)
    hard-drive
    removable media
    filing cabinet
    manila folders
    notepad / sticky note

which then depend upon disparate means of archival and retrieval:

    electronic file system
    databases
    sticky notes
    filing cabinets
    manila folders
    someone's memory

When we look at a source of data, eg. a website or PDF, we spend some effort to:

    categorise what we are looking at
    understand the content
    remember where and when we found it

Some of this "manual" processing may be recorded (in disparate places), but most often it is committed to our (fallible) memory!

I am trying to design a "system" that enables either the data itself (say in the case of "contacts") or a reference to the data (say in the case of PDF documents) to be stored without having to predefine an arbitrary data structure. The "system" should allow me to do things like:

1) Show me the location of all data to do with embedded systems, sorted by what I have most recently looked at. [This should display an entry for each relevant source. Clicking on the entry should take me to the source (opening a PDF, logging onto a website, telling me the location of a physical folder, etc)]

2) Generate a phone list based on my contacts

3) Tell me which CDs contain optometric images in JPG format

4) Display my account details, including encrypted usernames / passwords / PINs

5) Tell me if the PDF on my HDD has been updated on the site I originally obtained it from, and if so, get it.

My attitude to data capture is that it should be as easy as writing it on a piece of paper, because if it isn't then that's where it'll end up! The system should also infer as much as it can about the data. With that in mind I am trying to use a heuristic approach such that entering:

    Mr James Smith
    28 Oakfield Street
    (03) 9876 1234 BH

will store this record as key/value pairs (with each line being a value), but will also be able to infer things about compound values (eg. while "Mr James Smith" is a name, the system should be able to work out gender, firstname and surname based on the value).

In essence, the system would extend REBOL's knowledge of datatypes such that it knows what a value is and what it can do with that value. At this stage I'm thinking that REBOL can be used to determine the base datatype (eg. money, email, string, etc) pretty well, but the system will need to infer a bit more with some types (eg. is this string actually a phone number?) and offer logical choices and defaults once the datatype is determined (eg. You entered $12.50 which I know is money! and will default to "Price", but it could be "Amount", "Tax", "Discount", etc).

Depending upon the source of data, other attributes can be automatically deduced and stored. For example, a PDF file has a file name, location, create and modify dates, and with a little extra work the title, author and description (if any) can be extracted from the document itself.

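As a rough illustration of the capture heuristic described above (classify each entered line, pick a default label, and offer alternatives), here is a minimal sketch. It is written in Python rather than REBOL purely for illustration, and everything in it is an assumption: the names `infer_type`, `split_name` and `capture`, the regular expressions, the title table and the label lists are hypothetical and not part of any existing system.

```python
import re

# Hypothetical title-to-gender table (an assumption for illustration only).
TITLES = {"Mr": "male", "Mrs": "female", "Ms": "female", "Miss": "female"}

# For each inferred type: the default label first, then the alternatives
# the user could be offered instead.
LABEL_CHOICES = {
    "money": ["Price", "Amount", "Tax", "Discount"],
    "email": ["Email"],
    "phone": ["Phone"],
    "address": ["Address"],
    "name": ["Name"],
    "string": ["Note"],
}


def infer_type(value):
    """Guess a datatype for one line of free-form input."""
    v = value.strip()
    if re.fullmatch(r"\$\d+(\.\d{2})?", v):
        return "money"
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v):
        return "email"
    # Optional "(03)"-style area code, two digit groups, and an optional
    # two-letter suffix such as "BH".
    if re.fullmatch(r"(\(\d{2,4}\)\s*)?\d{3,4}[\s-]?\d{3,4}(\s+[A-Z]{2})?", v):
        return "phone"
    # A street number followed by a name and a common street word.
    if re.fullmatch(r"\d+\s+\S.*\s(Street|St|Road|Rd|Avenue|Ave)", v):
        return "address"
    words = v.split()
    if len(words) >= 3 and words[0].rstrip(".") in TITLES:
        return "name"
    return "string"


def split_name(value):
    """Derive gender, firstname and surname from a titled name."""
    title, *rest = value.split()
    return {
        "gender": TITLES.get(title.rstrip("."), "unknown"),
        "firstname": rest[0],
        "surname": rest[-1],
    }


def capture(text):
    """Turn free-form lines into records with a default label and choices."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        kind = infer_type(line)
        default, *others = LABEL_CHOICES[kind]
        records.append({"label": default, "value": line.strip(),
                        "type": kind, "choices": others})
    return records


if __name__ == "__main__":
    for record in capture("Mr James Smith\n28 Oakfield Street\n(03) 9876 1234 BH"):
        print(record)
    print(split_name("Mr James Smith"))
```

The point of the sketch is only that no structure is declared up front: each line becomes a key/value pair whose key is a guess the user can override. A real system would need far more robust rules than these few patterns.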
This approach, if achievable, does away with predefined data structures and allows freeform queries to generate tabulated result sets (eg. "Show all invoice#, amounts" would know that amount is right justified and prompt to include a total summary).

Anyway, I hope something in the above is of value [to someone] ;)

Regards,

Ashley