choice of representation - was paths & lookups & change

[1/5] from: brett::codeconscious::com at: 24-Oct-2003 17:42

From Elan:
> My preference for this kind of tasks is using objects. Even though it is
> a little more verbose, I find it quite intuitive to use the get and set
> functions, as in:

Choice of an "information model" and representation is something I've struggled/ruminated over for a while. I guess I am spoilt for choice.

For some applications I think a distinction can be made between informational/database representations depending on whether they are out-of-process in a serialised form (on disk / on the wire) or in-process in a tree or other structure suitable for fast evaluations. I habitually try to find a single REBOL representation that solves both needs well, but after a while I wonder whether I should really be considering multiple representations of the same information for different needs.

My latest feeling (I certainly have no firm conclusions) is that when I need the information stored on disk, a straight loadable block format (complex dialect or simple sequence) is better for saving in a textual form. Where I need a representation to facilitate in-process evaluations, or I need to affect the evaluation itself (e.g. bind), objects can be a good candidate. Bridging the two, if necessary, would be a dialect.

As I said, no firm conclusions, just more ruminations. I'm interested to know what other people think about their preference of representation (e.g. Robert's preference for blocks, Elan's for objects) when you consider the in-process / out-of-process distinction. Or do people feel that the distinction itself is not useful?

I'm sort of assuming here that we're discussing a REBOL-only storage and processing application. Another interesting line of discussion would be on useful REBOL idioms when programming a REBOL app to talk SQL.

Regards,
Brett
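The in-process / out-of-process distinction can be pictured with a small sketch. This is only an illustration under stated assumptions: the record contents, the file name %record.dat and the words `record`, `loaded`, `spec` and `obj` are all made up for the example, and it is not taken from any poster's code.

```rebol
REBOL []

; On-disk form: a plain loadable block of name/value pairs.
; (The record and the file name are hypothetical.)
record: [name "Snoopy" age 30]
save %record.dat record          ; written out as loadable text
loaded: load %record.dat         ; read back in as a block

; In-process form: build an object spec from the same pairs.
; Note: values are evaluated by MAKE OBJECT!, so this simple
; approach assumes literal values such as strings and numbers.
spec: copy []
foreach [key value] loaded [
    append spec reduce [to-set-word key value]
]
obj: make object! spec

print obj/name
print obj/age
```

The block form stays easy to read and edit as text, while the object gives path access and a context that code can be bound to.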

[2/5] from: atruter:labyrinth:au at: 24-Oct-2003 19:23

Hi Brett, a topic near and dear to me ;)

For me, I find the model I choose is impacted by:

- data structure
- volume
- time [development]
- performance requirements
- whether the data is static / dynamic

For key / value pairs I might use:

    states: ["VIC" "Victoria" "NSW" "New South Wales"]
    select states "NSW"

or

    states: [VIC "Victoria" NSW "New South Wales"]
    states/NSW

The first form being useful in conjunction with parse, e.g.

    ini: parse read %settings.ini "="

For large key / value sets I might use hash! instead of block!

If the data takes the form of key/values I tend to use composed objects such as:

    attributes: context do has [spec][
        spec: make block! 1024
        foreach [attribute t w] load %Database/Attributes.dat [
            insert tail spec compose/only/deep [
                (to-set-word attribute) context [type: (t) width: (w) used: 0]
            ]
        ]
        spec
    ]

For large amounts of data (100,000+ records), I try to express / represent the data as tables of columns and rows (I'm an RDBMS guy after all ;) ). I also try to store as much meta-data in the file-system as I can. For example, I may have a file like the following:

    /c/rSQL/Contacts/Name.s24

which tells me that the "Contacts" table has a "string" column of 24 characters width named "Name".
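As a rough illustration of reading meta-data back out of such a file name, here is a minimal sketch. It assumes only what the message states: the directory names the table, the file name names the column, and the suffix is a type code followed by a width. Only the "s" (string) code appears in the message; the words `parts`, `table`, `column`, `code`, `type-code` and `width` are invented for the example.

```rebol
REBOL []

; Path following the naming convention described above.
path: "/c/rSQL/Contacts/Name.s24"

; Split on "/" and "." and keep the last three parts.
parts: parse path "/."
set [table column code] skip tail parts -3

type-code: first code            ; #"s" (string column)
width: to-integer next code      ; 24

print [table column type-code width]
```

Taking the parts from the tail of the block keeps the sketch independent of how deep the database directory sits.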

> Another interesting line of discussion would be on
> useful REBOL idioms when programming a REBOL app to talk SQL.

Understatement! Having spent the last year [on and off] designing a REBOL RDBMS, I can assure you that there is more to it than will fit in a quick email reply (I am writing my "thesis" [i.e. documentation] on my REBOL RDBMS design as we speak). ;)

An interesting question that this raises: Is it better to have a SQL parser or a SQL dialect?

Regards,
Ashley

[3/5] from: maximo:meteorstudios at: 24-Oct-2003 9:32

I often use objects and store them on disk as code using mold. With REBOL, writing text files is so easy.

But for larger data, I usually go with blocks, using switch like so:

    data: [name ["snoopy"] age [30] race ["beagle"]]
    name: switch 'name data
    == "snoopy"

I know many use blocks of the type:

    data: [[value data] [value data]]

but then access time is much slower, as you have to scan the whole list in your code instead of letting the fast 'switch or 'select functions take care of it.

Why use switch rather than select? I can use code when I need it, directly in my values.

    default-value: "n/a"
    data: [name [default-value] age [30] race ["beagle"]]
    name: switch 'name data
    == "n/a"

Also, the fact that values are within blocks makes the use of words as data possible.

-MAx
---
You can either be part of the problem or part of the solution, but in the end, being part of the problem is much more fun.
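The difference between the two lookups can be seen side by side in a short sketch. It reuses the data layout from the message; the `kind` and `colour` keys and the "unknown" fallback are assumptions added for illustration.

```rebol
REBOL []

default-value: "n/a"
data: [name [default-value] age [30] kind ['beagle]]

; SELECT returns the stored block untouched.
probe select data 'name      ; [default-value]

; SWITCH finds the same block and then evaluates it.
probe switch 'name data      ; "n/a"

; A lit-word inside the block evaluates to a plain word,
; so words can be carried as data.
probe switch 'kind data      ; beagle

; A key that is not present: supply a fallback with /default.
probe switch/default 'colour data ["unknown"]   ; "unknown"
```

Because every value sits inside its own block, a word stored as data is never mistaken for a key during the lookup.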

[4/5] from: brett:codeconscious at: 25-Oct-2003 11:13

Hi Ashley, Interesting samples and comments.

> I also try to store as much meta-data in the file-system as I can, for
> example, I may have a file like the following:
>
> /c/rSQL/Contacts/Name.s24
>
> which tells me that the "Contacts" table has a "string" column of 24
> characters width named "Name".

That's an innovative convention! Wasn't clear to me though what that particular file actually stored - all rows of a single column of data in a fixed width form?

> > Another interesting line of discussion would be on
> > useful REBOL idioms when programming a REBOL app to talk SQL.

<<quoted lines omitted: 3>>

> RDBMS design as we speak). ;) An interesting question that this raises: Is
> it better to have a SQL parser or a SQL dialect?

If my comment was understatement, your question is the tip of an iceberg! :-)

1) Assuming the SQL parser vs SQL dialect question relates to your RDBMS:

A SQL parser for your RDBMS would:
* Be great for fast take-up by people who already know SQL - could be a very important factor.
* Mean your RDBMS is swappable with other RDBMSs.
* Cost you the specific REBOL integration with the database, because a "semantic barrier" has been erected (namely SQL), for good and bad.

A SQL dialect for your RDBMS would:
* Allow REBOL database client scripts to embed, generate, load and manipulate queries as REBOL values instead of doing those operations with hard-to-manipulate strings.
* Perhaps allow some specific REBOL integration not available with the SQL parser option - e.g. predicates expressed as REBOL expressions.

2) If your question of SQL parser vs SQL dialect is separated from your RDBMS - rephrasing it as "Would it be more useful to have a SQL parser or a SQL dialect in REBOL?" - then different ideas emerge. It might be useful to have both, making the SQL parser subordinate (a helper) to the SQL dialect.
* A REBOL client program can use the SQL dialect to generate/load/manipulate SQL that will be sent to an external DB. Side note: the current REBOL/Command database interface has parameter markers to fill in bits of the SQL with REBOL values, but I was thinking of something more sophisticated.
* The dialect could emit different SQL depending on different DB backends to take account of extensions and differences.
* A higher order dialect might provide some value-add like transaction control or tracing.

3) Reconcile (1) and (2)? Create the SQL parser and SQL dialect as standalones, with a way to translate backwards and forwards between them, allowing them to be used where they are needed. After all, with your RDBMS, if you accept SQL you'll need to create a query plan at some point; one of the steps to do that could be a translation into a SQL dialect (a relational algebra?).

It costs me nothing to throw out these ideas; actually building the stuff could be challenging :-)

Regards,
Brett
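To make the "queries as REBOL values" idea from point 1 more concrete, here is a minimal sketch of a toy dialect that is parsed as a block and emitted as a SQL string. Everything in it is hypothetical: the dialect shape, the function name `to-sql` and the sample table and column names are invented for illustration, and none of it reflects Ashley's RDBMS or the REBOL/Command interface.

```rebol
REBOL []

; Hypothetical mini-dialect:
;   [select <block of columns> from <table> where <column> <string>]
to-sql: func [
    "Turn the toy dialect block into a SQL string (none if it does not match)."
    q [block!]
    /local cols table col val out
][
    if not parse q [
        'select set cols block!
        'from set table word!
        opt ['where set col word! set val string!]
    ][
        return none
    ]
    out: copy "SELECT "
    foreach c cols [
        append out form c
        append out ", "
    ]
    clear skip tail out -2                 ; drop the trailing ", "
    append out rejoin [" FROM " table]
    ; Illustration only: the value is not escaped.
    if col [append out rejoin [" WHERE " col " = '" val "'"]]
    out
]

query: [select [name phone] from contacts where name "Smith"]
print to-sql query
; SELECT name, phone FROM contacts WHERE name = 'Smith'
```

Since the query is an ordinary block, a client script can build or alter it with the usual series functions before it is turned into text, which is the advantage over string manipulation mentioned above.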

[5/5] from: brett:codeconscious at: 25-Oct-2003 11:26

Hi Max,

> why use switch rather than select?
> I can use code when I need it, directly in my values.

<<quoted lines omitted: 3>>

> == "n/a"
> also, the fact that values are within blocks makes the use of words as
> data possible.

That's interesting, I don't think I would have considered switch for data access like that. Another tool for the tool-belt. :-)

Regards,
Brett.

Notes

Quoted lines have been omitted from some messages.