r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[!RebDB] REBOL Pseudo-Relational Database

Graham
9-Feb-2006
[51]
oohh.. looks like a Cobalt Raq.
Ashley
9-Feb-2006
[52]
Back to Pekr's JOIN problem. First, let's reformat the SQL into something 
more readable:


select o.order-id, o.date, o.amount, o.total, oi.item-id, io.item-price, 
oi.goods-name, c.company-name
from   orders o
,      order-items oi
,      companies c
where  o.order.id = oi.order-id
  and  o.company-id = c.company-id
  and  o.amount > 2000
order by c.company-name, o.order-id, o-item-id
Graham
9-Feb-2006
[53]
not here either http://www.colellachiara.com/soft/
Ashley
9-Feb-2006
[54x2]
Then break it down into discrete queries and wrap it in some loops:

blk: copy []

foreach [company-id order-id date amount total] sql [

 select [company-id order-id date amount total] from orders where 
 [amount > 2000]
][
	company-name: second sql compose [lookup companies (company-id)]

 foreach [item-id item-price goods-name] sql compose [select * from 
 order-items where (order-id)] [

  insert tail blk reduce [order-id date amount total item-id item-price 
  goods-name company-name]
	]
]

sort/skip/compare blk 8 [8 1 5]
The final [untested] solution is about as efficient as you can get. 
Adding JOIN support to RebDB so it can break the query down into 
similar steps is not a simple task. In all but the most trivial of 
cases you'd be better off coding it yourself.


What might be a good idea is to add something that lets you more 
easily specify the most common JOIN operation - master/detail with 
optional LOV (List Of Values) lookup(s). Your query is a classic 
example of this construct and it accounts for a surprisingly large 
number of queries.


The function would accept two queries, a master query (the orders 
table in your case) and a details query (the order-items table) and 
an optional block of column/LOV-table pairs (that perform substitutions 
such as company-name). The skeleton would look like:


 sql-join [master-query [block!] detail-query [block!] /order /lov 
 [block!] ]  [
		buffer: copy []
		...
		buffer
	]

and would be used as such:

	sql-join/order/lov [

  select [company-id order-id date amount total] from orders where 
  [amount > 2000]
	] [
		select * from order-items where %ID%
	] [8 1 5] [company-id companies]

Would this make things a tad easier?
Pekr
9-Feb-2006
[56x5]
not sure your sql query rewrite was much clearer, but - your last 
post makes so much sense!
yes, basically it is typical two cases - master-detail, and lookup!!!
IIRC MySQL docs join section describes very nicely the technique 
of optimisation. And after reading it some few monts ago I agree, 
that 'join is not trivial task at all, basically due to get it optimised. 
But you are right that most of the time those two mentioned cases 
are needed. The thing which complicates the optimisation part is 
- is your column a key? primary key? is it at least indexed? etc.
anyway .... maybe a good question for RT, when finally RIF arrives, 
because if I understand you correctly, till then, RebDB will be memory 
only database (although open/seek is working already)
Graham - I noticed you used Firebird for your medical system - is 
there kind of embedded version as with sqlite?
Coccinelle
9-Feb-2006
[61]
Ashley, sql-protocol generates the code used to extract and join 
the data is grouped in 2 functions :
- make-do-select
- make-do-loop

They receive the column list, the table list (a block of table/alias 
pair), the where code to apply and the database (the port).


If you provide these parameters and change the code to invoke RebDB 
fuction to get the data, you will have a basic join implementation 
for RebDB. You can use it and extend it for RebDB, if you want, I 
will be happy if you do so.
Gabriele
9-Feb-2006
[62x2]
ashley: sent latest dbms3.r via email.
ashley: in your example you are doing a query for each record returned 
in the first query, depending on the overhead of doing a query this 
might be a lot slower than a support for ioins in the dbms itself, 
even without any optimization (my code does not do any optimization, 
the programmer has to do optimization manually)
Ashley
11-Feb-2006
[64]
Thanks guys, I've had a good look at both implementations and I've 
got ideas from both for a future full JOIN implementation; but at 
the moment my master/detail code has come along nicely. I've now 
enhanced the db-select function to accept statements in these additional 
forms:


 select * from master joins [select * from details where &id] on id

 select * from master joins [select * from details where [all [master-id 
 = &id master-date = &date]] on [id date]


which works exactly like a normal join with the following differences:

	a) It can only join one table to another

 b) Detail columns are always joined to the right of master columns

 c) Table.column prefixes are not supported so all columns in the 
 join must be uniquely named


Apart from that you get all the benefits of db-select (can replace 
* with specific column combinations, order and group by on the final 
result set, etc) *and* it's significantly faster than even the raw 
REBOL code example I gave before (as the SQL is parsed once within 
db-select and all loop sub-selects are done in-line).

I've also implemented “lookups” with the following form:

	select * from table replaces id with name
	select * from table replaces [id-1 id-2] with [table-1 table-2]


which performs a highly optimized db-lookup for each replaced value, 
but has the following restrictions:


 a) The lookup expects lookup tables in the form [id label other-column(s)]
	b) Only single-key lookups are supported
	c) A lookup that fails will replace the column value with none!


I'm now in the process of benchmarking these changes against sqlite 
to see where the bottlenecks (if any) are. Feedback on the design 
decisions is welcome.


While I was doing this, I was once again reminded how cumbersome 
it is to construct SQL statements (not just for RebDB, same goes 
for the other SQL protocols), as the heavy use of 'compose, 'rejoin, 
etc adds noise that reduces legibility. The design goal is to provide 
alternatives to:


 sql compose/deep [select * from table where [all [col1 = (val1) col2 
 = (val2)]]]


so for a start the 'sql function should probably accept a string, 
to allow:

	sql join “select * from “ table


type constructs; but this doesn't make the first example easier. 
So how about the 'sql function accept a block containing a string 
statement followed by a number of substitution variables, as in:


 sql reduce [“select * from table where [all [col1 = &1 col2 = &2]]” 
 val1 val2]


which makes things a bit more readable (and shortens the expression 
if longer word names are used multiple times). So the two questions 
here are:

	a) Is this a good idea?

 b) If so, what substitution character (& % $ @ other) will cause 
 the least conflict with REBOL and/or SQL?
Sunanda
11-Feb-2006
[65]
Traditional with embedded SQL, the technique is to use "host variables" 
which start with a colon:

 sql reduce “select * from table where [all [col1 = :var1 col2 = :var2 
 ]]”
And you'd magically replace :var1 with the value of var1.


Which is almost exactly the behaviour you'd expect from :var1 in 
REBOL too.

If you insist that that host variables always have a space before 
and after, that makes the whole substitution process a fairly simple 
parse operation.
Pekr
12-Feb-2006
[66x2]
here's some interesting reading on mySQL optimisations - http://dev.mysql.com/doc/refman/5.0/en/optimization.html
Ashley - why 'replace? Do you want to really replace id with looked-up 
value? I can imagine having some usage for 'id, especially in the 
case where 'id is of some meaning. I know it should not be, but e.g. 
some ppl might use, for companies db, companie's registration number, 
which is not anonymous id, and could be further used in the resultset 
...
Ashley
12-Feb-2006
[68]
Perhaps another keyword, 'appends, that inserts the matching label 
after the column that would otherwise have been replaced. Or, allow 
column names to be specified multiple times in the select clause 
with replacements occurring from the tail, so:

	select [id id date] from orders replaces id with name

might return:

	1 "Bob" 3-Jan-2006
	2 "Fred" 4-Jan-2006
	etc
Pekr
13-Feb-2006
[69x2]
I vote for 'appends keyword ...
Ashley - is that a techical limitation of RebDB design, or more a 
parser problem, that aggregates (master-detail) are supported only 
to second level?
Ashley
13-Feb-2006
[71]
Quick hack to get this one common case working [well]. Proper solution 
(full JOIN support) is a major redesign.
Pekr
13-Feb-2006
[72x2]
ok, thanks ....
well, as RIF is not coming, we are still far away from non-memory 
-only version, right?
Ashley
13-Feb-2006
[74]
Not quite ... see SQLite group ;)
Pekr
13-Feb-2006
[75]
Ashley, I have the question from Bobik - he asks, if proposed changes 
will be implemented anytime soon?
Coccinelle
13-Feb-2006
[76]
sql-protocol don't need these heavy use of 'compose, 'join, 'reduce 
 if you use the dialect. Something like this work :
   var1: "Marco" 
   insert my-db [select * from my-table where name = var1]
   my-result: copy my-db

Another example to ilustrate this :

   insert my-db [select * from my-table where name like rejoin ["*" 
   var1 "*"]]


This is only if you use the dialect to query the database. If you 
use the standard SQL string, you need to compose the query.
Maxim
9-Mar-2006
[77x2]
ashley, I'm starting to use rebdb this week for a very cool segmented 
code versioning project.
wrt simplyfing the use of "noise"  ... why not just call compose/deep 
by default within the 'execute call of the client on ALL sql calls? 
 it could be a global option and IMHO the use of parens within the 
code is quite obvious and is used in many dialects. and its less 
cumbersome than to add the reduce word in your code, a string and 
then variables out of context of the sql statement.
Ashley
9-Mar-2006
[79]
See my post from 12-Feb. The answer to "noise" is substitution variables 
(as it's a form of expression more familiar to long time SQLers). 
Also, I've never liked the idea of performing an operation (be it 
'compose, 'reduce, 'copy, etc) that takes away that choice from the 
coder ... someone may use statements that *never* need to be composed 
and they shouldn't have to carry that cost just because some else 
needs it.
Maxim
9-Mar-2006
[80x2]
I understand... which is why I noted, it could be an option... cause 
although  substitution variables are closer to SQL,  expression embedding 
within parenthesis is quite familiar in rebol ... 


but in any case, We can all just wrap the rebdb calls within our 
own I guess ;-)
and btw... good job on rebdb... for what it does,  I like it a lot.
Thør
2-Apr-2006
[82]
.
Normand
2-Apr-2006
[83x5]
Is it possible to parametrize the rowid? If I read the error message, 
it seems not :
>> pickno: 7
== 7
>> db-select/where * prospectDB [rowid = :pickno]
** Script Error: Cannot use subtract on get-word! value
** Where: switch
** Near: insert tail buffer copy/part skip
>> :pickno
== 7
>> db-select/where * prospectDB [rowid = 7]
== [

    7 "7-02-2006" "Nom-7" "Prénom-7" "Québec" "Cie-7" "Référence: US" 
    "Conjoint-7" "Enfant-7" "Bur: 418-845-7" "Rés: 418-845-7...
>>
Also should I select a row before to delete it or update it?  Reading 
the Quick start guide, il may be implicit from the examples.
The Parenthesis does not work too:
>> db-select/where * prospectDB [rowid = (pickno)]
** Script Error: Cannot use subtract on paren! value
Ashley
2-Apr-2006
[88]
>> db-select/where * prospectDB compose [rowid = (pickno)]
Normand
4-Apr-2006
[89x2]
Thanks that works.  Can someone provide the syntax in db.r of the 
following command in sql.r.  I am looking for the function to update 
all or some 20 fields, loading only db.r in my interface.  The manual 
is sparse, so I tried it in sql.r but it does not give the equivalent 
function.  Is there an echo to toggle to have such translation? It 
would be helpfull to later to discover db.r.  The function I tried 
is :
update tbtst set [id name surname] to [2 "Leclerc2" "Normand2"] where 
[rowid = 2]
Ashley
5-Apr-2006
[91]
help db-update is a good start. ;) For a full list of db functions 
just enter "help db-" at the console.
Maxim
7-Apr-2006
[92x2]
is anyone here interested in a contextualised version of rebdb ? 
 I mean one with everything hidden within an object, and most functions 
fixed to support it?
it allows us to use rebdb by supplying table names within values, 
instead of having to supply them directly within code.
Pekr
7-Apr-2006
[94]
that is not the problem of RebDB imo ... e.g. me and my friend stopped 
using it because of missing 'join and because Ashley stated, that 
instead of reinventing the wheel (RIF based on-disk version), he 
suggests SQLite ....
Maxim
7-Apr-2006
[95]
the only drawback is that sql use needs ticks for table names.   
acter using it for a while I actually like this better, cause it 
clearly identifies what is a reserved word and was is an argument, 
within the statements.
Pekr
7-Apr-2006
[96]
other than that - I like RebDB very much and I regard it being one 
of the best rebol tools so far ....
Maxim
7-Apr-2006
[97x3]
I have a merge function.   :-)
which can do some of the joining.
but you do need to do two queries beforehand... which is not exactly 
a join.. but anyhow you do end up with a block containing interleaved 
data from two tables ... and its pretty quick.
Normand
17-Jun-2006
[100]
Another newbee one: db-select can retreive the rows on multiples 
values like the query  db-select/where * noms [any [nom = 'leclerc 
prenom = 'marion]] but can I query a column's values to find all 
rows where a column contains a substring, like the query db-select/where 
* noms [find ["lec"] nomprenom] ?  This last one is not the answer. 
 What would be the good query for such a purpose?