Mailing List Archive: 49091 messages

[REBOL] Re: rebol-framework: information

From: robert::muench::robertmuench::de at: 31-Dec-2002 11:39

> -----Original Message-----
> From: [rebol-bounce--rebol--com] [mailto:[rebol-bounce--rebol--com]]
> On Behalf Of Ashley Truter
> Sent: Tuesday, December 31, 2002 7:17 AM
> To: [rebol-list--rebol--com]
> Subject: [REBOL] Re: rebol-framework: information
>
> Read your doco and the conceptual basis looks pretty sound to
> me.
Hi Ashley, thanks a lot. This is all still a very fast-moving target, as new ideas pop up daily. So there are still a lot of opportunities to make it better :-))
> The "problem" I am attempting to solve is that the company I
> am trying to run receives and generates a lot of data in multiple
> formats (many of them not electronic). The formats include:
>   digital documents (doc, pdf, etc)
>   digital images (jpg, bmp, png, gif, tiff, etc)
>   physical (books, photos, brochures, sales lists, invoices, etc)
>   music (CD, tape, MP3, etc)
>   video (DVD, VHS, AVI, etc)
>
> located and/or referenced in different "locations":
>   email client
>   browser (bookmarks and history)
>   website(s)
>   hard-drive
>   removable media
>   filing cabinet
>   manilla folders
>   notepad / sticky note
Oh yes, I know this (and I don't like this information fragmentation, it's evil in daily work).
> which then depend upon disparate means of archival and retrieval:
>   electronic file system
>   databases
>   sticky notes
>   filing cabinets
>   manilla folders
>   someone's memory
>
> When we look at a source of data, eg a website or PDF, we
> spend some effort to:
>   categorise what we are looking at
>   understand the content
>   remember where and when we found it
>
> Some of this "manual" processing may be recorded (in
> disparate places), most often it is committed to our
> (fallible) memory!
That's the problem of information overload and, much nastier, the problem that different people have a different understanding/interpretation of things. So IMO the effort to agree on and implement a common concept for handling such things, one accepted by everyone, will never work! It can't work, because we think differently... What looks logical to me could be far from what you would consider logical.
> I am trying to design a "system" that enables either the data
> itself (say in the case of "contacts") or a reference to the data (say
> in the case of PDF documents) to be stored without having to predefine
> an arbitrary data structure.
:-)) So far this sounds good...
> The "system" should allow me to do things like:
> 1) show me the location of all data to do with embedded
> systems, sorted by what I have most recently looked at.
> [This should display an entry for each relevant source.
> Clicking on the entry should take me to the source
> (opening a PDF, logging onto a website, telling me the location of a
> physical folder, etc)]
OK, I have already started to implement a "file!" datatype to be used for BOT. This will do some of the things you mentioned. BTW: I have exactly the same problem to solve ;-))
> 2) generate a phone list based on my contacts
That's pretty easy and could be done today with RFM. I'm using it at the moment to track my time and generate a report from this.
> 3) tell me which CD's contain optometric images in JPG format
Ok.
> 4) Display my account details, including encrypted username /
> passwords / PINs
No problem either.
> 5) Tell me if the PDF on my HDD has been updated on the site
> I originally obtained it from, and if so, get it.
This is more of a file/directory differ, but it could be done by building up a catalog entry with MD5 checksums.
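As a rough illustration of the checksum-catalog idea (not RFM code, which is not shown in this thread), here is a minimal Python sketch. The function names `md5_of_bytes`, `build_catalog` and `needs_update` are hypothetical, and fetching the remote copy is left out: the caller is assumed to have downloaded the bytes already.

```python
import hashlib
from pathlib import Path


def md5_of_bytes(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def build_catalog(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the MD5 checksum of its content."""
    catalog = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            catalog[key] = md5_of_bytes(path.read_bytes())
    return catalog


def needs_update(catalog: dict[str, str], name: str, remote_data: bytes) -> bool:
    """True when the remote copy differs from the cataloged checksum.

    A file that is not in the catalog at all also counts as needing an update.
    """
    return catalog.get(name) != md5_of_bytes(remote_data)
```

Storing only the checksum keeps the catalog entry small; the trade-off is that the remote file still has to be downloaded once to compare it, unless the site publishes its own checksum.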
> My attitude to data capture is that it should be as easy as
> writing it on a piece of paper, because if it isn't then that's
> where it'll end up!
Yes, totally agreed! That's why I want a GUI that is not fancy. It should just display a simple data form and support me as much as possible with default values, a good tab sequence, etc. The GUI should look as consistent as possible.
> The system should also infer as much as it can about the data.
;-)
> With that in mind I am trying to use a heuristic approach such that
> entering:
>
>   Mr James Smith
>   28 Oakfield Street
>   (03) 9876 1234 BH
>
> will store this record as key/value pairs (with each line
> being a value), but able to infer things about compound values
> (eg. while "Mr James Smith" is a name, the system should be able to
> work out gender, firstname and surname based on the value).
I see. I would do this a bit differently: I would store the broken-down data and have a heuristic function that splits the input BEFORE saving it. So this looks to me like an intelligent input form.
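To make the "split before saving" idea concrete, here is a small Python sketch applied to the three-line entry quoted above. Everything in it is an assumption made for illustration: the title table, the phone pattern, and the field names (`title`, `gender`, `firstname`, `surname`, `street`, `phone`, `phone_note`) are hypothetical, and a real heuristic would need far more rules.

```python
import re

# Hypothetical lookup: title -> inferred gender (None when the title does not say).
TITLE_GENDER = {"Mr": "male", "Mrs": "female", "Ms": "female", "Miss": "female", "Dr": None}

# Assumed phone shape: optional bracketed 2-digit area code, digits and spaces,
# then an optional alphabetic qualifier such as "BH".
PHONE_RE = re.compile(r"^(\(?\d{2}\)?[\s\d]{6,}\d)\s*([A-Za-z]*)$")


def split_name(line):
    """Break a name line into title, gender, firstname and surname where possible."""
    parts = line.split()
    record = {}
    if parts and parts[0].rstrip(".") in TITLE_GENDER:
        title = parts.pop(0).rstrip(".")
        record["title"] = title
        record["gender"] = TITLE_GENDER[title]
    if parts:
        record["firstname"] = parts[0]
    if len(parts) > 1:
        record["surname"] = parts[-1]
    return record


def split_entry(text):
    """Turn free-form input lines into key/value pairs before they are stored."""
    record = {}
    for line in (ln.strip() for ln in text.splitlines()):
        if not line:
            continue
        phone = PHONE_RE.match(line)
        if phone:
            record["phone"] = phone.group(1)
            if phone.group(2):
                record["phone_note"] = phone.group(2)
        elif line[0].isdigit():
            record["street"] = line
        elif "firstname" not in record:
            record.update(split_name(line))
        else:
            # Anything the heuristics cannot classify is kept, not discarded.
            record.setdefault("other", []).append(line)
    return record
```

With the quoted entry as input, `split_entry` yields separate `title`, `gender`, `firstname`, `surname`, `street` and `phone` values, so the stored record is already broken down and no inference is needed at query time.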
> In essence, the system would extend REBOL's knowledge of data types
> such that it knows what a value is and what it can
> do with that value.
I don't know if we have to extend REBOL's datatype system. REBOL is the base we can build on. For example, I have implemented a date! datatype for RFM that has some intelligence like you mentioned. At the moment this is buried in the engine code, which doesn't make sense... So I will move this code into a customizable data structure that gets used at runtime in all the places that need it.
> Depending upon the source of data, other attributes can be
> automatically deduced and stored. For example, a PDF file has a file
> name, location, create and modify dates, and with a little extra work
> the title, author and description (if any) can be extracted from the
> document itself.
Yes, that's what I would expect people to add. To stay with your example: I have this filecollection datatype. The user picks some files, then the filecollection BOT runs some code to analyze the selected files. For some (like PDF) it knows how to deduce additional information. It will then instantiate a BOT for PDF files, fill in what's possible, and link this new data record to itself. Voila!
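The flow described above (pick files, record the generic attributes, then hand known file types to a specialised analyzer and link its result) could be sketched roughly as follows in Python. This is not the BOT implementation: the `ANALYZERS` registry, the `analyzer` decorator and the record fields are hypothetical, and the PDF analyzer is a stub that does not actually parse the document.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical plug-in registry: file extension -> analyzer function.
ANALYZERS = {}


def analyzer(extension):
    """Register a function as the analyzer for one file extension."""
    def register(func):
        ANALYZERS[extension.lower()] = func
        return func
    return register


@analyzer(".pdf")
def analyze_pdf(path):
    # Stub: a real analyzer would read title/author from the document itself.
    return {"kind": "pdf", "title": None, "author": None}


def describe_file(path):
    """Collect generic file attributes and link a type-specific record if one exists."""
    stat = path.stat()
    record = {
        "name": path.name,
        "location": str(path.parent),
        "size": stat.st_size,
        "modified": datetime.fromtimestamp(stat.st_mtime),
    }
    extra = ANALYZERS.get(path.suffix.lower())
    if extra is not None:
        record["linked"] = extra(path)
    return record


def describe_collection(paths):
    """Analyze every selected file and return one record per file."""
    return [describe_file(Path(p)) for p in paths]
```

Supporting a new file type then only requires registering one more function, which matches the plug-in style of extension discussed in this thread.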
> This approach, if achievable, does away with data structures
> and allows freeform queries to generate tabulated result sets (eg.
> A "Show all invoice#, amounts" would know that amount is right
> justified and prompt to include a total summary).
This is something I want to include in a QUERY and REPORT dialect. I don't know what this will look like yet, but I hope not like SQL ;-))
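The dialect itself is undecided, so the following Python sketch only illustrates the formatting behaviour from the quoted example: columns whose values are all numbers are right-justified and get a total line. The function name `render_report`, the two-decimal formatting and the sample column names are assumptions.

```python
def render_report(rows, columns):
    """Render rows as a text table; numeric columns are right-justified and totalled."""
    # A column counts as numeric when every value in it is an int or float.
    numeric = {c: all(isinstance(r[c], (int, float)) for r in rows) for c in columns}

    def fmt(value, col):
        return f"{value:.2f}" if numeric[col] else str(value)

    cells = [[fmt(r[c], c) for c in columns] for r in rows]
    totals = [fmt(sum(r[c] for r in rows), c) if numeric[c] else "" for c in columns]

    # Each column is as wide as its longest header, cell or total.
    widths = [
        max(len(col), len(totals[i]), *(len(row[i]) for row in cells))
        for i, col in enumerate(columns)
    ]

    def line(values):
        parts = [
            v.rjust(widths[i]) if numeric[columns[i]] else v.ljust(widths[i])
            for i, v in enumerate(values)
        ]
        return "  ".join(parts).rstrip()

    rule = line(["-" * w for w in widths])
    return "\n".join([line(list(columns)), rule, *map(line, cells), rule, line(totals)])
```

For example, `render_report([{"invoice": "INV-001", "amount": 120.5}, {"invoice": "INV-002", "amount": 79.5}], ["invoice", "amount"])` lines the amounts up on the right and ends with a `200.00` total under the amount column.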
> Anyway, I hope something in the above is of value [to someone] ;)
Thanks a lot! Some of your ideas are what I want to achieve with RFM; some are new and can/should be doable. I want to create an extensible, plug-in based system that can be extended very simply.

Robert