r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[!RebDB] REBOL Pseudo-Relational Database

Ashley
7-Feb-2006
[1x4]
From !RebGUI group.
<Pekr>
Ashley, just wanted to ask and can't find rebdb group here ...


Isn't it possible to implement 'join? You once said that you will 
wait once RT adds RIF, but that will probably come who knows when 
- it is year and half late already. Do you think it would not be 
possible to proceed without RIF and switching to on-disk storage?


'join is really badly missing with rebdb and I am thinking switching 
to sqlite only because of that one feature. Once you have your data 
spread across many tables, it is difficult to work without it. Or 
how you do it?
</Pekr>
<Jaime>
Pekr. Join is relative easy to implement.


Just do a search on each table, and then create a new block by using 
the key that joins them.

foreach key table1 [
	values-table1: select key table1
	values-table2: select key table2
	append join-table reduce [values-table1 values-table2]
]

That is more or less the pseudo algorithm
</Jaime>
Also note that many join operations can be rewritten as sub-selects, 
as in:

	sql compose/deep [
		select * from a where [
			col < (
				sql [select max [col] from b]
			)
		]
	]

or:

	sql compose/deep/only [
		select * from a where [
			find (sql [select [col] from b]) col
		]
	]
Pekr
8-Feb-2006
[5x2]
Ashley - subselect is not 'join, is it? In y our above second example 
for e.g., it will mean that for each-record in table 'a, there will 
be one subselect ... that is going to be slow, no?
hmm, actually join has to do the same, so :-)
Ashley
8-Feb-2006
[7]
JOIN differs from SUB-SELECT where you want to aggregate the columns 
of more than one table, so:

	select a.col, b.col from a, b


cannot be refactored as a sub-select. There are two reasons why I 
have not implemented JOINs in RebDB [yet]:

1) Dramatic increase in code complexity

2) You can almost always do it more efficiently in REBOL as you *know* 
the data structures and desired result set *in advance*.


About the only time this does not work well [in RebDB] is where you 
have to pull the contents of more than one table across a network 
to derive a small subset as the result set. So while this SQL would 
not suffer:

	select a.col, b.col from a, b

this might:

	select a.key, b.val from a, b where a.key = b.key

depending on the size of b.
Pekr
8-Feb-2006
[8]
yes, but what you describe is more than day-by-day example of proper 
db usage. Even with small projects, when using 3NF notation, you 
simply store only foreign keys in tables, so those "aggregate" functions 
are needed too often ...
Anton
8-Feb-2006
[9x2]
So it looks like the problem is Pekr's expectation that the database 
have certain common functions, often implemented in other databases. 
Switching from one database to another is hard because they don't 
all support the same functions. I think RebDB, as a youthful, new, 
growing database, should not be expected to have everything yet.
But, Petr, perhaps it would be good if you could show the operation 
and tables that you are having trouble with, and Ashley can help 
you find the best way. Maybe the performance will be more than you 
need, and the expression simpler in rebol.
Pekr
8-Feb-2006
[11]
no, Anton, everything is ok, no problems with columns, or switching 
.... I will describe you one example ....
Ashley
8-Feb-2006
[12]
It's called a "Pseudo-Relational Database" for a good reason. ;)


But in answer to Pekr's previous point; *you* can work out *how* 
to do that aggregation more efficiently than the limited AI of most 
RDBMS systems. I've worked at DB2 and Oracle shops where it was mandated 
that all JOINs be performed inline! SQL optimizers are not trivial 
to write, and the meta-data overhead (indexes, statistics, hints, 
etc) required to get them to make the "right" choices are often a 
false economy.
Anton
8-Feb-2006
[13]
So, Petr, you solved the original problem which occurred for you 
?
Pekr
8-Feb-2006
[14x4]
ok, here it is: you have tables called 'orders, 'order-items, 'companies 
.....
in 'companies, you have 'company-id, 'company-name, 'company-address 
....
then, when you enter new order into 'orders, you simply assign it 
to company, by using and storing foreign key, so 'company-id ...
And now you have typical problem - you surely don't want your grid 
to show 'company-id, but 'company-name or other related data - but 
those are not present in 'orders database ...
Anton
8-Feb-2006
[18]
Orders

Order-Items

Companies
	company-id, company-name, company-address
?
Pekr
8-Feb-2006
[19x3]
that is imo correct design to simply avoid data inconsistency. Imagine 
if you would put 'company-name into 'oders, and later on company 
changes it's name a bit - you simply want that info to have stored 
only once ...
yes ...
would you store company-name in orders directly? Surely not (although 
we are doing it too for some purposes, as getting grid drawn fast)
Ashley
8-Feb-2006
[22]
Data structure makes sense, what's the problematic SQL statement? 
(in its simplest form)
Pekr
8-Feb-2006
[23x2]
inner join orders, order items, left outer join companies ....
so the inner join? (agregate)
Ashley
8-Feb-2006
[25]
Just write the SQL statement as you would in MySQL, etc ... then 
we can see where the issues are.
Pekr
8-Feb-2006
[26x5]
ok, here's my rebol odbc code for ADS (Advantage Database Server):

insert db-port trim/lines {

   SELECT DISTINCT ko.kodfyzak, fk.nazev, fk.ulice, fk.mesto, fk.psc, 
   fk.kodzeme, zk.land1, upper(fk.kodmeny),
                   upper(e.ico),

                   ko.kodfypri1, fp.nazev, fp.ulice, fp.mesto, fp.psc, fp.kodzeme, zp.land1, 
                   upper(fp.kodmeny)
                      
   FROM zakp_exp e

   INNER JOIN Kontjkv5 k5 ON k5.zakazka = concat(e.miv, left(e.czak, 
   6)) 
   LEFT OUTER JOIN kontrakt ko ON ko.ciskontr = k5.ciskontr

   LEFT OUTER JOIN firma fk ON fk.kodfirmy = ko.kodfyzak  AND fk.aktualni 
   = 'A'

   LEFT OUTER JOIN firma fp ON fp.kodfirmy = ko.kodfypri1 AND fp.aktualni 
   = 'A'
   LEFT OUTER JOIN zem zk ON fk.kodzeme = zk.kodzeme
   LEFT OUTER JOIN zem zp ON fp.kodzeme = zp.kodzeme
   LEFT OUTER JOIN delka_psc psc ON zp.land1 = psc.land1
   ORDER BY fk.nazev, fp.nazev
}
so let's forget those aliases, which are nice things to have too 
:-)
left outer join seems like kind of subselect)
while inner join seems to be kind of aggregate ...
but that is maybe unnecessarily complicated example, we were migrating 
data to SAP R3 from our old systems .....
Ashley
8-Feb-2006
[31]
An example based on what you are trying to do with Orders, Items 
and Companies might be better to start with.
Pekr
8-Feb-2006
[32]
that was just an example simple schema, but well, I may try to write 
short SQL query from that ;-)
Ashley
8-Feb-2006
[33]
Stick to basic JOIN syntax in the form:

	select b.name, b.address
	from a, b
	where a.id = b.id
	and ...


those INNER JOIN and LEFT OUTER JOIN statements are unfamiliar to 
me and I get a headache just looking at them. ;)
Pekr
8-Feb-2006
[34x4]
SELECT o.order-id, o.date, o.amount, o.total, oi.item-id, io.item-price, 
oi.goods-name, c.company-name
FROM orders o
INNER JOIN order-items oi ON oi.order-id =  o.order.id
LEFT OUTER JOIN companies c ON c.company-id = o.company-id
WHERE o.amount > 2000
ORDER BY c.company-name, o.order-id, o-item-id
not sure I got it syntactically 100% correct, but you got the idea 
:-)
so basically select row defines you final format of result block 
...
inner join simply  "adds rows" of order-items per order
Sunanda
8-Feb-2006
[38]
Not sure if I'm off topic here.....But you can do pretty much everything 
in SQL without using JOIN. Though you may need UNION, subselects 
and WHERE.

Which makes for more portable SQL as JOIN syntax often used vendor-specific 
extensions to the SQL-92 standard.
Pekr
8-Feb-2006
[39x2]
outer join just selects company name ...
Yes, that might be possible ... I am not that skilled in SQL yet, 
so maybe it could be done other way ....
Sunanda
8-Feb-2006
[41]
I learned before JOIN existed.....Means you are more skilled than 
me :-)
Pekr
8-Feb-2006
[42x2]
maybe it is a pity rebol can't do union and join on more per record 
base, but per item base ...
that join simply creates one big table for you, which is nice, as 
you simply can traverse in one loop ....
Ashley
8-Feb-2006
[44]
UNION cannot substitute for JOIN. UNION aggregates the rows returned 
by multiple statements while JOIN [can] aggregate the columns returned 
by referring to multiple tables. A subtle but important distinction. 
For example:

Table-A
	Col
	====
	1
	2

Table-B
	Col
	====
	A
	B

	select a.col, b.col from a, b

	1 A
	1 B
	2 A
	2 B

	select * from a union select * from b

	1
	2
	A
	B
Pekr
8-Feb-2006
[45x2]
do not understand the syntax of select from a,b ... actually - never 
understood it :-) our db allows select, the rest is crosslinked via 
joins :-)
we would need some join/key ... to define key on which to join two 
blocks to create third one ... or maybe union/key ...
Gabriele
8-Feb-2006
[47]
ashley, my old dbms3.r supports joins, though probably far from doing 
it efficiently. in case you need the code feel free to use it.
Ashley
8-Feb-2006
[48]
Thanks, I'll take a look at it. (Marco's sql-protocol.r is also a 
good read).
Ashley
9-Feb-2006
[49]
dbms3.r isn't in the library, where can I grab a copy from?
Graham
9-Feb-2006
[50]
rebol.it is a good bet.