r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[!RebDB] REBOL Pseudo-Relational Database

Pekr
8-Feb-2006
[16x2]
then, when you enter new order into 'orders, you simply assign it 
to company, by using and storing foreign key, so 'company-id ...
And now you have typical problem - you surely don't want your grid 
to show 'company-id, but 'company-name or other related data - but 
those are not present in 'orders database ...
Anton
8-Feb-2006
[18]
Orders

Order-Items

Companies
	company-id, company-name, company-address
?
Pekr
8-Feb-2006
[19x3]
that is imo correct design to simply avoid data inconsistency. Imagine 
if you would put 'company-name into 'oders, and later on company 
changes it's name a bit - you simply want that info to have stored 
only once ...
yes ...
would you store company-name in orders directly? Surely not (although 
we are doing it too for some purposes, as getting grid drawn fast)
Ashley
8-Feb-2006
[22]
Data structure makes sense, what's the problematic SQL statement? 
(in its simplest form)
Pekr
8-Feb-2006
[23x2]
inner join orders, order items, left outer join companies ....
so the inner join? (agregate)
Ashley
8-Feb-2006
[25]
Just write the SQL statement as you would in MySQL, etc ... then 
we can see where the issues are.
Pekr
8-Feb-2006
[26x5]
ok, here's my rebol odbc code for ADS (Advantage Database Server):

insert db-port trim/lines {

   SELECT DISTINCT ko.kodfyzak, fk.nazev, fk.ulice, fk.mesto, fk.psc, 
   fk.kodzeme, zk.land1, upper(fk.kodmeny),
                   upper(e.ico),

                   ko.kodfypri1, fp.nazev, fp.ulice, fp.mesto, fp.psc, fp.kodzeme, zp.land1, 
                   upper(fp.kodmeny)
                      
   FROM zakp_exp e

   INNER JOIN Kontjkv5 k5 ON k5.zakazka = concat(e.miv, left(e.czak, 
   6)) 
   LEFT OUTER JOIN kontrakt ko ON ko.ciskontr = k5.ciskontr

   LEFT OUTER JOIN firma fk ON fk.kodfirmy = ko.kodfyzak  AND fk.aktualni 
   = 'A'

   LEFT OUTER JOIN firma fp ON fp.kodfirmy = ko.kodfypri1 AND fp.aktualni 
   = 'A'
   LEFT OUTER JOIN zem zk ON fk.kodzeme = zk.kodzeme
   LEFT OUTER JOIN zem zp ON fp.kodzeme = zp.kodzeme
   LEFT OUTER JOIN delka_psc psc ON zp.land1 = psc.land1
   ORDER BY fk.nazev, fp.nazev
}
so let's forget those aliases, which are nice things to have too 
:-)
left outer join seems like kind of subselect)
while inner join seems to be kind of aggregate ...
but that is maybe unnecessarily complicated example, we were migrating 
data to SAP R3 from our old systems .....
Ashley
8-Feb-2006
[31]
An example based on what you are trying to do with Orders, Items 
and Companies might be better to start with.
Pekr
8-Feb-2006
[32]
that was just an example simple schema, but well, I may try to write 
short SQL query from that ;-)
Ashley
8-Feb-2006
[33]
Stick to basic JOIN syntax in the form:

	select b.name, b.address
	from a, b
	where a.id = b.id
	and ...


those INNER JOIN and LEFT OUTER JOIN statements are unfamiliar to 
me and I get a headache just looking at them. ;)
Pekr
8-Feb-2006
[34x4]
SELECT o.order-id, o.date, o.amount, o.total, oi.item-id, io.item-price, 
oi.goods-name, c.company-name
FROM orders o
INNER JOIN order-items oi ON oi.order-id =  o.order.id
LEFT OUTER JOIN companies c ON c.company-id = o.company-id
WHERE o.amount > 2000
ORDER BY c.company-name, o.order-id, o-item-id
not sure I got it syntactically 100% correct, but you got the idea 
:-)
so basically select row defines you final format of result block 
...
inner join simply  "adds rows" of order-items per order
Sunanda
8-Feb-2006
[38]
Not sure if I'm off topic here.....But you can do pretty much everything 
in SQL without using JOIN. Though you may need UNION, subselects 
and WHERE.

Which makes for more portable SQL as JOIN syntax often used vendor-specific 
extensions to the SQL-92 standard.
Pekr
8-Feb-2006
[39x2]
outer join just selects company name ...
Yes, that might be possible ... I am not that skilled in SQL yet, 
so maybe it could be done other way ....
Sunanda
8-Feb-2006
[41]
I learned before JOIN existed.....Means you are more skilled than 
me :-)
Pekr
8-Feb-2006
[42x2]
maybe it is a pity rebol can't do union and join on more per record 
base, but per item base ...
that join simply creates one big table for you, which is nice, as 
you simply can traverse in one loop ....
Ashley
8-Feb-2006
[44]
UNION cannot substitute for JOIN. UNION aggregates the rows returned 
by multiple statements while JOIN [can] aggregate the columns returned 
by referring to multiple tables. A subtle but important distinction. 
For example:

Table-A
	Col
	====
	1
	2

Table-B
	Col
	====
	A
	B

	select a.col, b.col from a, b

	1 A
	1 B
	2 A
	2 B

	select * from a union select * from b

	1
	2
	A
	B
Pekr
8-Feb-2006
[45x2]
do not understand the syntax of select from a,b ... actually - never 
understood it :-) our db allows select, the rest is crosslinked via 
joins :-)
we would need some join/key ... to define key on which to join two 
blocks to create third one ... or maybe union/key ...
Gabriele
8-Feb-2006
[47]
ashley, my old dbms3.r supports joins, though probably far from doing 
it efficiently. in case you need the code feel free to use it.
Ashley
8-Feb-2006
[48]
Thanks, I'll take a look at it. (Marco's sql-protocol.r is also a 
good read).
Ashley
9-Feb-2006
[49]
dbms3.r isn't in the library, where can I grab a copy from?
Graham
9-Feb-2006
[50x2]
rebol.it is a good bet.
oohh.. looks like a Cobalt Raq.
Ashley
9-Feb-2006
[52]
Back to Pekr's JOIN problem. First, let's reformat the SQL into something 
more readable:


select o.order-id, o.date, o.amount, o.total, oi.item-id, io.item-price, 
oi.goods-name, c.company-name
from   orders o
,      order-items oi
,      companies c
where  o.order.id = oi.order-id
  and  o.company-id = c.company-id
  and  o.amount > 2000
order by c.company-name, o.order-id, o-item-id
Graham
9-Feb-2006
[53]
not here either http://www.colellachiara.com/soft/
Ashley
9-Feb-2006
[54x2]
Then break it down into discrete queries and wrap it in some loops:

blk: copy []

foreach [company-id order-id date amount total] sql [

 select [company-id order-id date amount total] from orders where 
 [amount > 2000]
][
	company-name: second sql compose [lookup companies (company-id)]

 foreach [item-id item-price goods-name] sql compose [select * from 
 order-items where (order-id)] [

  insert tail blk reduce [order-id date amount total item-id item-price 
  goods-name company-name]
	]
]

sort/skip/compare blk 8 [8 1 5]
The final [untested] solution is about as efficient as you can get. 
Adding JOIN support to RebDB so it can break the query down into 
similar steps is not a simple task. In all but the most trivial of 
cases you'd be better off coding it yourself.


What might be a good idea is to add something that lets you more 
easily specify the most common JOIN operation - master/detail with 
optional LOV (List Of Values) lookup(s). Your query is a classic 
example of this construct and it accounts for a surprisingly large 
number of queries.


The function would accept two queries, a master query (the orders 
table in your case) and a details query (the order-items table) and 
an optional block of column/LOV-table pairs (that perform substitutions 
such as company-name). The skeleton would look like:


 sql-join [master-query [block!] detail-query [block!] /order /lov 
 [block!] ]  [
		buffer: copy []
		...
		buffer
	]

and would be used as such:

	sql-join/order/lov [

  select [company-id order-id date amount total] from orders where 
  [amount > 2000]
	] [
		select * from order-items where %ID%
	] [8 1 5] [company-id companies]

Would this make things a tad easier?
Pekr
9-Feb-2006
[56x5]
not sure your sql query rewrite was much clearer, but - your last 
post makes so much sense!
yes, basically it is typical two cases - master-detail, and lookup!!!
IIRC MySQL docs join section describes very nicely the technique 
of optimisation. And after reading it some few monts ago I agree, 
that 'join is not trivial task at all, basically due to get it optimised. 
But you are right that most of the time those two mentioned cases 
are needed. The thing which complicates the optimisation part is 
- is your column a key? primary key? is it at least indexed? etc.
anyway .... maybe a good question for RT, when finally RIF arrives, 
because if I understand you correctly, till then, RebDB will be memory 
only database (although open/seek is working already)
Graham - I noticed you used Firebird for your medical system - is 
there kind of embedded version as with sqlite?
Coccinelle
9-Feb-2006
[61]
Ashley, sql-protocol generates the code used to extract and join 
the data is grouped in 2 functions :
- make-do-select
- make-do-loop

They receive the column list, the table list (a block of table/alias 
pair), the where code to apply and the database (the port).


If you provide these parameters and change the code to invoke RebDB 
fuction to get the data, you will have a basic join implementation 
for RebDB. You can use it and extend it for RebDB, if you want, I 
will be happy if you do so.
Gabriele
9-Feb-2006
[62x2]
ashley: sent latest dbms3.r via email.
ashley: in your example you are doing a query for each record returned 
in the first query, depending on the overhead of doing a query this 
might be a lot slower than a support for ioins in the dbms itself, 
even without any optimization (my code does not do any optimization, 
the programmer has to do optimization manually)
Ashley
11-Feb-2006
[64]
Thanks guys, I've had a good look at both implementations and I've 
got ideas from both for a future full JOIN implementation; but at 
the moment my master/detail code has come along nicely. I've now 
enhanced the db-select function to accept statements in these additional 
forms:


 select * from master joins [select * from details where &id] on id

 select * from master joins [select * from details where [all [master-id 
 = &id master-date = &date]] on [id date]


which works exactly like a normal join with the following differences:

	a) It can only join one table to another

 b) Detail columns are always joined to the right of master columns

 c) Table.column prefixes are not supported so all columns in the 
 join must be uniquely named


Apart from that you get all the benefits of db-select (can replace 
* with specific column combinations, order and group by on the final 
result set, etc) *and* it's significantly faster than even the raw 
REBOL code example I gave before (as the SQL is parsed once within 
db-select and all loop sub-selects are done in-line).

I've also implemented “lookups” with the following form:

	select * from table replaces id with name
	select * from table replaces [id-1 id-2] with [table-1 table-2]


which performs a highly optimized db-lookup for each replaced value, 
but has the following restrictions:


 a) The lookup expects lookup tables in the form [id label other-column(s)]
	b) Only single-key lookups are supported
	c) A lookup that fails will replace the column value with none!


I'm now in the process of benchmarking these changes against sqlite 
to see where the bottlenecks (if any) are. Feedback on the design 
decisions is welcome.


While I was doing this, I was once again reminded how cumbersome 
it is to construct SQL statements (not just for RebDB, same goes 
for the other SQL protocols), as the heavy use of 'compose, 'rejoin, 
etc adds noise that reduces legibility. The design goal is to provide 
alternatives to:


 sql compose/deep [select * from table where [all [col1 = (val1) col2 
 = (val2)]]]


so for a start the 'sql function should probably accept a string, 
to allow:

	sql join “select * from “ table


type constructs; but this doesn't make the first example easier. 
So how about the 'sql function accept a block containing a string 
statement followed by a number of substitution variables, as in:


 sql reduce [“select * from table where [all [col1 = &1 col2 = &2]]” 
 val1 val2]


which makes things a bit more readable (and shortens the expression 
if longer word names are used multiple times). So the two questions 
here are:

	a) Is this a good idea?

 b) If so, what substitution character (& % $ @ other) will cause 
 the least conflict with REBOL and/or SQL?
Sunanda
11-Feb-2006
[65]
Traditional with embedded SQL, the technique is to use "host variables" 
which start with a colon:

 sql reduce “select * from table where [all [col1 = :var1 col2 = :var2 
 ]]”
And you'd magically replace :var1 with the value of var1.


Which is almost exactly the behaviour you'd expect from :var1 in 
REBOL too.

If you insist that that host variables always have a space before 
and after, that makes the whole substitution process a fairly simple 
parse operation.