[REBOL] Re: possible data format ... Re: Re: dbms3.r 01

From: rgaither:triad:rr at: 15-Jan-2002 11:31


Hi Joel,

>Nice summaries!

Thanks! :-)

>> This would be nice but I'm not sure it is reasonable for records
>> with a large text block.  One of the things I'd like to avoid is
>> imposing limits on record size so something like a webpage could
>> be a record if desired.
>>
>
>I think the underlying issue is which data types (in the generic
>sense of the word) should be supported.  Many databases with which
>I'm familiar treat "running text" (with large size potential) as a
>different type than "character data field" (with some upper limit
>on capacity).  To avoid dependence on filesystem issues (assuming
>we DO want to stay away from entanglement with any specific OS
>features/limits) one strategy is to store string data up to a
>certain size within the record; above that size, each such value
>would be kept in a separate file, with the name of the file as a
>string within the original record.  Certainly not the fastest
>option for some purposes, but it does allow such things as searches
>on non-running-text fields (e.g. keys) to be done without the
>overhead of reading/skipping the big chunks.

A good point.  I might lean towards a single file for each kind of
BLOB (or should that be TLOB :-)).  I still have a strong aversion to
creating lots of files to represent the DB.  Or perhaps even better
would be to offer both options: a plain file reference field type and
a large text block collection field type.
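
A minimal sketch of the single-file variant, in Python for illustration
rather than REBOL.  Short strings stay inside the record; longer ones are
appended to one overflow file and the record keeps only an (offset, length)
reference.  The names `INLINE_LIMIT`, `store_text` and `load_text`, the
threshold value, and the reference layout are all assumptions, not anything
settled in this thread.

```python
import os
import tempfile

INLINE_LIMIT = 64  # hypothetical threshold, in characters


def store_text(value, overflow_path):
    """Return the field as it would be stored in the record."""
    if len(value) <= INLINE_LIMIT:
        return ("inline", value)
    data = value.encode("utf-8")
    # Append to the single overflow file and remember where the bytes landed.
    with open(overflow_path, "ab") as f:
        f.seek(0, os.SEEK_END)
        offset = f.tell()
        f.write(data)
    return ("ref", offset, len(data))


def load_text(field, overflow_path):
    """Resolve a stored field back to its full text."""
    if field[0] == "inline":
        return field[1]
    _, offset, length = field
    with open(overflow_path, "rb") as f:
        f.seek(offset)
        return f.read(length).decode("utf-8")


# Usage: one short value, one page-sized value.
workdir = tempfile.mkdtemp()
overflow = os.path.join(workdir, "text.dat")
short = store_text("Oak Ridge", overflow)
page = "<html>" + "x" * 500 + "</html>"
long_ref = store_text(page, overflow)
```

A search on key fields only ever touches the records, so the big text is
read just when `load_text` is called for a reference field.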

>The same approach can be used for BLOb data.

Yes indeed.

>Perhaps I should have been more explicit.  I assumed the existence

No, I got those parts as valid assumptions.

>The "append all stuff" strategy essentially puts the log (audit trail)
>within the file itself.  I believe it's faster to do that with one
>file (scanned as needed when un-packed) than with a separate main
>and transaction files.

Part of my reason for the separate files was to vary the format as
needed.  It also allows the main file to be read-only and assumed
static, so it would not have to be altered if a transaction was not
applied.  I don't know about the speed impacts, though; I believe there
are lots of issues depending on how the db is used.
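
A rough sketch of the separate-files arrangement, again in Python for
illustration.  The main file is only ever read; the current state is built
by replaying a transaction file over it.  The tab-separated line layout and
the `put`/`del` operation names are assumptions made up for this example.

```python
def load_table(main_lines, txn_lines):
    """Build the current table from a static main file plus a transaction log.

    Both arguments are lists of lines as they would be read from the two
    files.  Main lines look like "key<TAB>value"; transaction lines look
    like "put<TAB>key<TAB>value" or "del<TAB>key".
    """
    table = {}
    for line in main_lines:
        key, value = line.split("\t", 1)
        table[key] = value
    for line in txn_lines:
        op, key, *rest = line.split("\t", 2)
        if op == "put":
            table[key] = rest[0]
        elif op == "del":
            table.pop(key, None)
    return table


main = ["1\talpha", "2\tbeta"]
txns = ["put\t3\tgamma", "del\t1", "put\t2\tbeta-revised"]

current = load_table(main, txns)   # main file with transactions applied
untouched = load_table(main, [])   # transactions discarded, main unchanged
```

Discarding a transaction here just means not replaying it; the main file is
never rewritten.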

>> All of these issues show the trade offs in the different operations
>> you need to do on the database.  Read and search performance
>> versus update and write operations and so on.  They conflict quite
>> a bit with the organization I would pick to keep the data in a single
>> file with visually "nice" organization. :-(
>>
>
>Simplicity imposes limits.  I don't know how to define it except
>with respect to intended uses.

That is the problem.  The one-size-fits-all kind of solution is very
hard to find, perhaps impossible if you want a "Good" fit.  :-)

I keep coming back to wanting this to be binary. :-)

[snip good simple examples]

I do like Pekr's simple example as well and would only want to
include some table and column name information.  Sort of a
block-oriented REBOL version of a CSV file.  I can live with the
one-line record approach, with related storage for the big text or
binary objects, for the benefits it returns.
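
To make the shape concrete, here is a small Python sketch of a file with a
header line naming the table and its columns, followed by one bracketed
record per line.  The exact layout is an assumption loosely modeled on a
REBOL block, not Pekr's format, and the quoting is simplified: string values
are assumed to contain no double quotes, backslashes, or newlines.
`dump_table` and `parse_table` are hypothetical names.

```python
import shlex


def dump_table(name, columns, rows):
    """Serialize a table as a header line plus one bracketed record per line."""
    lines = [f"{name} [{' '.join(columns)}]"]
    for row in rows:
        cells = []
        for value in row:
            if isinstance(value, str):
                # Assumption: no double quotes, backslashes, or newlines inside.
                cells.append(f'"{value}"')
            else:
                cells.append(str(value))
        lines.append("[" + " ".join(cells) + "]")
    return "\n".join(lines)


def parse_table(text):
    """Read the format back; every cell is returned as a string."""
    header, *records = text.splitlines()
    name, cols = header.split(" ", 1)
    columns = cols.strip("[]").split()
    rows = [shlex.split(rec[1:-1]) for rec in records]
    return name, columns, rows


text = dump_table(
    "people",
    ["id", "name", "bio"],
    [[1, "Ann Lee", "bio-0001"], [2, "Bo", "bio-0002"]],
)
name, columns, rows = parse_table(text)
```

The `bio` column holds only a short reference string, standing in for the
big text kept in related storage.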

>I have a strong motivation to make it *possible* for humans (e.g.,
>me!) to read my data files since that's often useful for debugging
>and troubleshooting.  However most of the access is done by programs,
>so I tend to make it "just enough" human readable and prefer to ease
>the parsing burden on the program.

I also like to have the format simple enough to read and even
create manually if possible.  I know this implies a limit on the
size and style that does not match Gabriele's requirements, though,
so some compromise is needed.

Thanks, Rod.

Rod Gaither
Oak Ridge, NC - USA
[rgaither--triad--rr--com]