Mailing List Archive: 49091 messages

Profiling Rebol API to DyBASE

 [1/18] from: knizhnik::garret::ru at: 18-Dec-2003 20:43


Hello Konstantin,

Looks like there is no profiler in Rebol :( So I had to do the profiling myself using now/time/precise. I got the following results, which seem interesting:

Elapsed time for inserting 100000 records: 0:01:01
Elapsed time for performing 200000 index searches: 0:02:06
    Time spent in lookup                  0:01:24.312
    Time spent in lookup/select           0:00:53.348
    Time spent in lookup/make             0:00:11.418
    Time spent in lookup/insert           0:00:01.501
    Time spent in lookup/fetch            0:00:11.401
    Time spent in find                    0:01:58.464
    Time spent in find/iterator creation  0:00:14.281
    Time spent in find/append             0:01:36.118
Elapsed time for iterating through 200000 records: 0:03:02
    Time spent in lookup                  0:02:54.861
    Time spent in lookup/select           0:01:56.736
    Time spent in lookup/make             0:00:23.251
    Time spent in lookup/insert           0:00:03.774
    Time spent in lookup/fetch            0:00:23.006
Elapsed time for deleting 100000 records: 0:03:53
    Time spent in lookup                  0:01:52.034
    Time spent in lookup/select           0:01:21.9
    Time spent in lookup/make             0:00:11.587
    Time spent in lookup/insert           0:00:01.703
    Time spent in lookup/fetch            0:00:12.389

So, during the index search, 1.5 of the 2 minutes were spent in the lookup function, while searching the index itself takes 14 seconds. Of those 1.5 minutes, most of the time was spent on this line:

    obj: select obj-by-oid-map oid

I wrote a large number of test scripts experimenting with hashes, but could not find the cause of the problem - 100000 searches can be executed in less than one second. Below is the text of the _lookup-object function with the profiling statements inserted. Maybe somebody can explain such strange behavior to me?

obj-by-oid-map maps an OID (an integer number - the object identifier) to an object instance. It is periodically cleared (using "clear obj-by-oid-map"), so the number of elements in the hash should not be larger than 100. And select should always return none.
_lookup-object: function [oid recursive] [h this obj hnd class-name field-name start start2] [
    either oid = 0 [
        none
    ] [
        start: now/time/precise
        obj: select obj-by-oid-map oid
        elapsed-select: elapsed-select + (now/time/precise - start)
        either not obj [
            start2: now/time/precise
            hnd: dybase_begin_load db oid
            cls: to word! dybase_get_class_name hnd
            this: self
            obj: make get cls [__oid__: oid __storage__: this __class__: cls]
            if not recursive [
                either obj/recursive-loading? [
                    recursive: true
                ] [
                    obj/__raw__: true
                ]
            ]
            elapsed-make: elapsed-make + (now/time/precise - start2)
            start2: now/time/precise
            insert insert tail obj-by-oid-map oid obj
            elapsed-insert: elapsed-insert + (now/time/precise - start2)
            start2: now/time/precise
            either find [string-index integer-index decimal-index] cls [
                dybase_next_field hnd
                obj/index: dybase_get_ref hnd
                dybase_next_field hnd
            ] [
                while [not empty? field-name: dybase_next_field hnd] [
                    set in obj (to word! field-name) _fetch-component hnd recursive
                ]
                if recursive [obj/on-load]
            ]
            elapsed-fetch: elapsed-fetch + (now/time/precise - start2)
        ] [
            if recursive and obj/__raw__ [load-object obj]
        ]
        elapsed-lookup: elapsed-lookup + (now/time/precise - start)
        obj
    ]
]

--
Thanks in advance,
Konstantin mailto:[knizhnik--garret--ru]

 [2/18] from: greggirwin:mindspring at: 18-Dec-2003 13:46


Hi Konstantin,

KK> So, during index search 1.5 minutes from 2 were spent in lookup
KK> function. And 14 seconds takes searching index itself.
KK> From these 1.5 minutes most of the time was spent in this line:
KK> obj: select obj-by-oid-map oid

A quick test seems to show that the SELECT part of the lookup is faster for smaller numbers of records (e.g. 10,000), and gets progressively slower as the numbers increase. That is, it starts out faster than the FETCH and MAKE parts and ends up a lot slower than they are with larger numbers of records.

-- Gregg
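[The slowdown Gregg describes can be reproduced with a small harness along these lines - a sketch only, where time-select is a name invented here, and the 2 * n sizing assumes one key plus one value per entry:]

    ; Time only the SELECT part against hash! tables of growing size.
    time-select: func [n /local h start] [
        h: make hash! 2 * n
        repeat i n [insert insert tail h i make object! [__oid__: i]]
        start: now/time/precise
        repeat i n [select h random n]
        now/time/precise - start
    ]

    foreach n [10000 50000 100000] [print [n "keys:" time-select n]]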

 [3/18] from: knizhnik:garret:ru at: 19-Dec-2003 1:22


Hello Gregg,

I was able to isolate the problem. The following script shows almost the same time as testindex.r searching for 200000 objects:

n: 200000
h: make hash! n
start: now/time/precise
repeat i n [
    oid: random n
    obj: select h oid
    if none? obj [
        obj: make object! [__oid__: oid]
        insert insert tail h oid obj
    ]
    if (i // 100) = 0 [clear h]
]
print ["Elapsed time for adding" n "records" (now/time/precise - start)]

On my computer, execution of this script takes about 70 seconds. By replacing it with:

n: 200000
h: make hash! n
l: make block! n
start: now/time/precise
repeat i n [
    oid: random n
    pos: find h oid
    either none? pos [
        obj: make object! [__oid__: oid]
        insert tail h oid
        insert tail l obj
    ] [obj: pick l index? pos]
    if (i // 100) = 0 [clear h clear l]
]
print ["Elapsed time for adding" n "records" (now/time/precise - start)]

I was able to reduce the execution time to 33 seconds. Are there any better ideas on how to improve the performance of this piece of code?

Thursday, December 18, 2003, 11:46:41 PM, you wrote:

GI> Hi Konstantin,

KK>> So, during index search 1.5 minutes from 2 were spent in lookup
KK>> function. And 14 seconds takes searching index itself.
KK>> From these 1.5 minutes most of the time was spent in this line:
KK>> obj: select obj-by-oid-map oid

GI> A quick test seems to show that the SELECT part of the lookup is
GI> faster for smaller numbers of records (e.g. 10,000), and gets
GI> progressively slower as the numbers increase. That is, it starts out
GI> faster than the FETCH and MAKE parts and ends up a lot slower than
GI> they are with larger numbers of records.

GI> -- Gregg

--
Best regards,
Konstantin mailto:[knizhnik--garret--ru]

 [4/18] from: knizhnik:garret:ru at: 19-Dec-2003 1:45


I improved my own "record". Looks like the best solution (not only for this example) is:

n: 200000
h: make hash! 101
cache-size: 101
cache-used: 0
start: now/time/precise
repeat i n [
    oid: random n
    obj: select h oid
    if none? obj [
        obj: make object! [__oid__: oid]
        if cache-used = cache-size [
            cache-size: cache-size * 2
            new-hash: make hash! cache-size
            insert tail new-hash h
            h: new-hash
        ]
        insert insert tail h oid obj
        cache-used: cache-used + 1
    ]
    if (i // 100) = 0 [h: make hash! 101 cache-size: 101 cache-used: 0]
]
print ["Elapsed time for adding" n "records" (now/time/precise - start)]

Friday, December 19, 2003, 1:22:13 AM, you wrote:

<<quoted lines omitted>>

--
Best regards,
Konstantin mailto:[knizhnik--garret--ru]

 [5/18] from: rotenca:telvia:it at: 19-Dec-2003 4:49


Hi,
> I was able to isolate the problem. > The following script shows almost the same time as testindex.r
<<quoted lines omitted: 12>>
> ] > print ["Elapsed time for adding" n "records" (now/time/precise - start)]
1) From what I understand, you allocate a 200,000-slot hash, then clear the cache every 100 items. Why?

2) The last thing: the time-consuming task here is CLEAR.

3) Another thing: if you make hash! n, then you should insert n items, not n * 2 (with insert insert).

Remember that Rebol uses internal and hidden keys to hash. Your oid value is internally hashed like any other value.

---
Ciao
Romano
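[Romano's third point in a minimal sketch: with insert insert, every cached entry occupies two slots of the hash, so the table should be sized for twice the number of entries. The join "obj-" i values are just illustrative:]

    n: 1000
    h: make hash! 2 * n                 ; n key/value pairs = 2 * n values
    repeat i n [insert insert tail h i join "obj-" i]
    print select h 42                   ; finds the value stored after key 42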

 [6/18] from: dockimbel:free at: 19-Dec-2003 10:58


Hi Konstantin,

Here's a version executing 10 times faster. I just changed the type of the h series from hash! to list!. Looks like in your case, the cost of adding data is much higher than that of searching keys...

Regards,
-DocKimbel

n: 200000
h: make list! n
l: make block! n
start: now/time/precise
repeat i n [
    oid: random n
    pos: find h oid
    either pos [
        obj: pick l index? pos
    ][
        obj: make object! [__oid__: oid]
        insert tail h oid
        insert tail l obj
    ]
    if zero? i // 100 [clear h clear l]
]
print ["Elapsed time for adding" n "records" (now/time/precise - start)]

Quoting Konstantin Knizhnik <[knizhnik--garret--ru]>:

 [7/18] from: rotenca:telvia:it at: 19-Dec-2003 12:51


Hi,

It just seems that CLEAR wastes the hash table; perhaps this is a bug in clear. These are my tests:

>> i: 100000 loop 5 [recycle s: now/precise h: make hash! i * 2 + 2 repeat n i
[find h n insert insert tail h n make object! [a: 1] if 99 = (n // 100) [clear h]] print [i difference now/precise s] i: i + 25000]
100000 0:00:12.97
125000 0:00:19.28
150000 0:00:26.86
175000 0:00:35.37
200000 0:00:47.01

>> i: 100000 loop 5 [recycle s: now/precise h: make hash! i * 2 + 2 repeat n i
[find h n insert insert tail h n make object! [a: 1]] print [i difference now/precise s] i: i + 25000]
100000 0:00:03.96
125000 0:00:05.33
150000 0:00:06.98
175000 0:00:09.06
200000 0:00:09.83

---
Ciao
Romano

 [8/18] from: knizhnik:garret:ru at: 19-Dec-2003 18:26


Hello Romano,

Friday, December 19, 2003, 2:51:19 PM, you wrote:

RPT> Hi,
RPT> Just seem that clear waste hash table, perhaps this is a bug in clear.

Yes, I had already noticed it. So it is much cheaper to create a new object than to clean an existing one.

RPT> These are my tests:
<<quoted lines omitted>>
-- Best regards, Konstantin mailto:[knizhnik--garret--ru]

 [9/18] from: doug:vos:eds at: 19-Dec-2003 16:23


Result of latest DocKimbel version on my Dell laptop...

 [10/18] from: robert:muench:robertmuench at: 20-Dec-2003 14:27


On Fri, 19 Dec 2003 01:45:05 +0300, Konstantin Knizhnik <[knizhnik--garret--ru]> wrote:
> I improve my own "record". > Looks like the best solution (not only for this example) is: > ...
Hi, for this version I get:

Elapsed time for adding 200000 records 0:00:01.452

--
Robert M. Münch
Management & IT Freelancer
Mobile: +49 (177) 245 2802
http://www.robertmuench.de

 [11/18] from: knizhnik:garret:ru at: 20-Dec-2003 17:04


Hello Robert,

With the help of the Rebol mailing list members, I was able to significantly improve the performance of the Rebol API. Now the difference with, for example, Python is not so high. I do not believe it is possible to do much more to improve performance. Certainly it is possible to speed up testlink.r (for example by disabling the garbage collector), but such "optimization" is not useful.

The problem with hash table performance is more or less fixed. There is also yet another problem with Rebol - objects seem to be stored very inefficiently. Creating 100000 objects with 18 fields (as in the testlink.r example) increases the size of rebol.exe to 200 MB. You can reproduce it if you remove reset-hash in testindex.r, which causes all 100000 objects to be present in memory. It will significantly slow down the application, and you can see a lot of page faults for the Rebol process in Task Manager. So, to achieve good performance, you should avoid loading a large number of persistent objects from the database (by periodically resetting the hash).

--
Best regards,
Konstantin mailto:[knizhnik--garret--ru]
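[The periodic-reset strategy described above can be sketched like this - names such as cache-object are illustrative, not the actual DyBASE API:]

    ; Keep at most CACHE-LIMIT loaded objects. Instead of CLEAR (which
    ; proved slow on large hash! values in this thread), drop the old
    ; table and allocate a fresh one, letting the GC reclaim the rest.
    cache: make hash! 202
    cache-count: 0
    cache-limit: 100

    cache-object: func [oid obj] [
        if cache-count = cache-limit [
            cache: make hash! 202      ; fresh table instead of CLEAR
            cache-count: 0
        ]
        insert insert tail cache oid obj
        cache-count: cache-count + 1
    ]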

 [12/18] from: greggirwin:mindspring at: 20-Dec-2003 10:02


Hi Konstantin,

Thanks for your continued efforts! Thanks to your code, and Romano's detective work, it seems a bug in CLEAR with hash! values may have been found, which is great!

Also, some of us have talked things over a bit, and an expert opinion is that REBOL's prototype object approach just isn't a great fit for DyBASE, so things may never be as good as we would like in that regard. Having it work, though, and be much faster now, will still be very useful.

KK> There is also yet another problem with Rebol - objects seems to be stored
KK> very inefficiently.

It's also creating them that has a lot of overhead. Someone noted during a discussion that if you're creating 100,000 objects with 10 functions each, you're creating, and binding, 1,000,000 functions. Two ideas that have been mentioned are:

1) use block! values instead of objects to interface with DyBASE
2) break the functions out of the objects into a supporting context, or make them free functions, so you don't have to create and bind them with every object.

Either one may require work, but could be worth investigating if someone really needs the potential improvement. I'm looking forward to trying the latest release. Thanks again!

-- Gregg
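[Idea 2 above can be sketched as follows - a hypothetical point example, not DyBASE code: the per-instance objects hold only data, while one shared context holds the functions, so nothing is cloned or rebound per object:]

    ; One shared context of functions, created and bound exactly once.
    point-ops: context [
        move: func [p dx dy] [p/x: p/x + dx p/y: p/y + dy]
        sum: func [p] [p/x + p/y]
    ]

    ; Instances carry only data fields.
    make-point: func [x y] [make object! compose [x: (x) y: (y)]]

    p: make-point 3 4
    point-ops/move p 1 1
    print point-ops/sum p    ; 100,000 points would share the same two functions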

 [13/18] from: knizhnik:garret:ru at: 20-Dec-2003 21:16


Hello Gregg,

Saturday, December 20, 2003, 8:02:18 PM, you wrote:

GI> It's also creating them that has a lot of overhead. Someone noted
GI> during a discussion that if you're creating 100,000 objects with 10
GI> functions each, that you're creating, and binding, 1,000,000
GI> functions.

This is really awful. I have also noticed that when the application creates about 100000 objects, the size of the process exceeds 200 MB, and the number of page faults increases very fast. I thought that once an object is created using a prototype object, the values of the prototype object's fields are just copied to the new object (no matter whether they are scalars, series or functions). In that case there would be only one instance of the function, compiled and bound only once. But it looks like that is not the case.

So serious OO programming in Rebol is not possible, and I have to look for an alternative solution for adding persistency to Rebol. Unfortunately, DyBASE is an object-oriented database - i.e. it is oriented toward working with objects. An object is assumed to be a set of fields with data and a set of methods. Data is stored in DyBASE; methods are not (to make it possible to easily fix errors in the program). Each object field has a name and a value. Objects with the same set of fields and methods belong to one class.
To minimize overhead, DyBASE creates one class descriptor, and all object instances belonging to the class contain a reference to this class descriptor. The class descriptor contains the class name and the list of field names. Arrays and hashes are considered values in DyBASE - they are stored inside the object, and if N objects refer to the same array or hash, N instances will be stored (and loaded the next time the objects are accessed).

Rebol objects fit this model (the only problem is that there are no classes, so I will have to make the programmer specify a prototype object for the class). But if we provide persistency not for objects but for blocks, then it is not clear to me how to store them. There will be no field names (but there will be "hidden" fields, like the current position in variables referencing series).

So it is possible to write a normal serialization mechanism for Rebol (one that will preserve relations between objects), and even to create a database which allows not only load/store but also change/retrieve of the data. But it could not be done within the DyBASE object model. The first task is significantly simpler, and if such a module would really be interesting to many Rebol users, I can try to develop one.

GI> Two ideas that have been mentioned are
GI> 1) use block! values instead of objects to interface with DyBase
GI> 2) Break the functions out of the objects into a supporting
GI> context, or as free functions, so you don't have to create and
GI> bind them with every object.

GI> Either one may require work, but could be worth investigating if
GI> someone really needs the potential improvement.

GI> I'm looking forward to trying the latest release. Thanks again!

GI> -- Gregg

--
Best regards,
Konstantin mailto:[knizhnik--garret--ru]

 [14/18] from: nitsch-lists:netcologne at: 20-Dec-2003 23:36


Hi Konstantin,

On Saturday, 20 December 2003 at 19:16, Konstantin Knizhnik wrote:
> Hello Gregg,
>
> Saturday, December 20, 2003, 8:02:18 PM, you wrote:
>
> GI> Hi Konstantin,
>
> GI> Thanks for your continued efforts! Thanks to your code, and Romano's
> GI> detective work, it seems a bug in CLEAR with hash! values may have
> GI> been found, which is great!
>
clear, it's a bug! (SCNR :) (More serious text below)
> GI> Also, some of us have talked about things a bit, and an expert opinion > GI> is that REBOL's prototype object approach just isn't a great fit for
<<quoted lines omitted: 15>>
> In this case there will be only one instance of the function compiled > and bounded only once. But looks like it is not the case.
Right. First, the typical solution: make your own vtable and put that in the objects (vtable: C++ for "table of method references"). This is the trick Rebol itself uses in some places.

In short, the cloning rules of [make object!]:
- series are cloned, and so are functions.
- objects are shared by reference.

So the right thing for a vtable is an object. Now some ultra-fresh code from the beta 1.2.17 (which changes daily currently). Here Carl Sassenrath introduces a new vtable. The object is a 'face, the vtable is called face/access. Let's see..

!>> demo-face: system/view/vid/vid-face  ; the base class
!>> probe demo-face/access               ; the vtable
make object! [
    set-face*: func [face value] [face/data: value]
    get-face*: func [face] [face/data]
    clear-face*: func [face] [face/data: false]
    reset-face*: func [face] [face/data: false]
]

You see, calling that is ugly. We would need

    demo-face/access/get-face* demo-face

to get something. But this is optimizing stuff, not for daily use - more for stuff like DyBASE ;) So we add global accessor functions, basically "method senders"; then we can write

    get-face demo-face

!>> source get-face
get-face: func [
    "Returns the primary value of a face."
    face /local access
][
    if all [
        access: get in face 'access
        in access 'get-face*
    ] [
        access/get-face* face
    ]
]

This one also checks if the face has an /access at all, and acts as a no-op otherwise. The essence of it is:

    get-face: func [face] [face/access/get-face* face]

And now we have good memory performance, because the face itself does not clone these methods, and good usability through the wrapper 'get-face. And if we need speed, we can ignore the wrapper and call the ugly way.
> So serious OO programming in Rebol is not possible and I have to look > for alternative solution for adding persistency to Rebol.
It is, as shown above. Because "not possible" sounds so depressing, let me try to explain why Rebol does it this way. In Rebol we can do:

global-head: none
demo: func [s /local local-tail] [
    parse/all s [
        copy global-head to " " skip
        copy local-tail to end
    ]
    reduce [global-head local-tail]  ; return something
]

Here 'parse is a dialect. It gets the block [copy global-head to " " skip copy local-tail to end] and interprets it in quite a different way than Rebol does - the 'copy, for example. Now:

1) Before execution, Rebol does not know if parse is a dialect or a function, so it cannot prepare anything for it at "compile time".

2) In this block we use two contexts: 'global-head is in one, 'local-tail in another. So it does not help to pass the object in a hidden variable 'self, as other OO languages do.

3) Sometimes dialect code is generated dynamically and often, and supersmart compilers would be no real option. For example, in the view dialect for GUIs:

view layout [
    ta: area 400x200 return
    sl: slider [scroll-para ta sl] 16x200
]

(hope this works, not tested..) The [view layout] is interpreted as Rebol code; the following block is interpreted by 'layout to create a GUI. 'layout sets the words 'ta and 'sl, and it knows nothing about their contexts - as with 'parse, they could be wildly mixed from different contexts. And the [scroll-para ta sl] is Rebol-interpreted again, stored by 'layout somewhere and called when the slider moves.

So how do these dialects know where to look? Because each word knows its context: simply [set 'word 15] and the right context is chosen. But what happens now?

class: context [
    v: none
    m: func [s] [parse s [copy v to end]]
]
instance: make class []

Now [instance/m "Hello"] calls parse, and parse sets 'v. Fine - but which 'v? The global one? The one in 'm? Or in 'class? Let's see: no 'self variable is used; the words are bound. [class/m "Hello"] would set [class/v]. If the function were simply shared, [instance/m "Hello"] would set [class/v] too!! It would not know better.
To avoid that, Rebol clones all functions and rebinds all words from the old context to the new one. So [instance/m] is a clone of [class/m], but modified: all [v] in it are now bound to [instance/v]. And now 'parse can do it right. :)

It is overhead, yes. But it is also a lot of comfort when used right. And, after all, a lot of mind-boggling fun sometimes. ('layout is written in Rebol btw, and if you want to see its source, simply [source layout]. (Hmm, better don't, there are shorter examples of dialects..) You get a lot of source in this "closed source" interpreter that way. ;)

(btw2: because of Rebol's dynamics, it would be possible to generate the wrapper functions automatically to save some work. When generating more complex code than that, binding suddenly becomes handy. ;)
> Unfortunately, DyBASE is object oriented database - i.e. it is > oriented on work with objects. Object is assumed as set of fields with
<<quoted lines omitted: 9>>
> N instances will be stored (and loaded next time when objects are > accessed).
You can create the data object and plug in the vtable later, of course. Instead of the current class name you could store the vtable there, and store the class name in the vtable if you need it. BTW, Carl mentioned that, due to an internal sharing trick, objects are as good as blocks memory-wise: they share the field names internally, and only the data is per instance.
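[That suggestion might look like this - all names here are illustrative, and the deposit* naming just follows the face/access convention shown above:]

    ; One shared vtable per class; it also carries the class name.
    account-access: context [
        class-name: "account"
        deposit*: func [obj amount] [obj/balance: obj/balance + amount]
    ]

    ; Data objects only reference the vtable; nothing is cloned per instance.
    new-account: func [start] [
        make object! compose [balance: (start) access: account-access]
    ]

    acc: new-account 10
    acc/access/deposit* acc 5
    print acc/balance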
> Rebol objects fits in this model (the only problem is that there are > no classes, so I will have to make programmer to specify prototype
<<quoted lines omitted: 10>>
> GI> Two ideas that have been mentioned are > GI> 1) use block! values instead of objects to interface with DyBase
Carl says they are equal. I was surprised too :)
> GI> 2) Break the functions out of the objects into a supporting > GI> context, or as free functions, so you don't have to create and > GI> bind them with every object. >
Hope my explanations make sense..
> GI> Either one may require work, but could be worth investigating if > GI> someone really needs the potential improvement. > > GI> I'm looking forward to trying the latest release. > > GI> Thanks again! > > GI> -- Gregg
HTH. And hopefully you get some fun with rebol in return for your work! :) -Volker

 [15/18] from: robert:muench:robertmuench at: 21-Dec-2003 14:41


On Sat, 20 Dec 2003 21:16:01 +0300, Konstantin Knizhnik <[knizhnik--garret--ru]> wrote:
> I thought, that once objet is created using prototype object, the > values of prototype object fields are just copied to the new objects > (doesn't matter whether them are scalars, series or functions). > In this case there will be only one instance of the function compiled > and bounded only once. But looks like it is not the case.
Hi, that's the difference in Rebol. It's not an OO language, and its object! datatype isn't OO related. Every word carries its own context. That's why you get the same words bound to different (new) contexts each time. The solution is to move the functions out of the context.
> Rebol objects fits in this model (the only problem is that there are > no classes, so I will have to make programmer to specify prototype > object for the class). But if we provide persistency not for objects, > but for blocks, then it is not clear to me how to store them. > There will be no field names (but will be "hidden" fields, like > current position name in variables referencing series).
Here is a rough sketch of an idea. A database could look like this:

mydb: [data [...] indices [...] db-meta-data [...]]

Making a block persistent will add an OID and a block reference to mydb/data as name/value pairs:

mydb: [data [OID1 mydata1] ...]

To index data there are several ways: 1) let the programmer add the fields, and the DB only maintains the value/OID mapping, or 2) register which fields should be indexed:

mydb: [to-index [name street city ...] ...]

The make-persistent function will look into mydb/to-index and check if the block-to-be-made-persistent contains any of the listed fields. If so, it takes the value and the new OID and builds the index.

IMO the Rebol way to persistence is to dissolve the need for a fixed object structure. What we need is a lookup mechanism to answer things like: give me all blocks that contain a 'name field whose value is in the range from A to B. What kind of blocks the result contains doesn't matter to the DB; the programmer needs to take care to structure this. For example, I can add an internal field to my blocks to specify a class. What I do with this information is up to me and my program:

mydata1: [_class address/germany name "" street "" city "" zip ""]
mydata2: [_class address/usa ...]

Both can have different field sets, but the overall semantics to my program is an 'address. I hope you get the idea.
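[A minimal version of this sketch might look like the following - all names (make-persistent, last-oid, the sample data) are hypothetical, and no DyBASE code is involved:]

    mydb: [data [] to-index [name city] indices []]
    last-oid: 0

    ; Store a block under a fresh OID and index any registered fields.
    make-persistent: func [blk /local oid pos] [
        oid: last-oid: last-oid + 1
        repend mydb/data [oid blk]
        foreach field mydb/to-index [
            if pos: find blk field [
                repend mydb/indices [field first next pos oid]
            ]
        ]
        oid
    ]

    make-persistent [_class address/germany name "Hans" city "Bonn"]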
> So it is possible to write normal serialization mechanism for Rebol > (one that will preserve relations between objects), and even create > database, which allows not only load/store, but also change/retrieve > the data.
That sounds good to me! It should (in the first run) only be a persistent storage.
> But it could not be done within DyBASE object model.
That's a pity. I looked at DyBASE because it's intended for scripting languages, and I thought maybe it would fit the Rebol world.
> First task is significantly simpler, and if such module will be really > interesting to many Rebol users, I can try to develop one.
I vote for it! We can then see how far we can go and what features need to be added to the persistence engine or to the Rebol code side. What do you think?

--
Robert M. Münch
Management & IT Freelancer
Mobile: +49 (177) 245 2802
http://www.robertmuench.de

 [16/18] from: jason:cunliffe:verizon at: 21-Dec-2003 11:39


Robert M. Münch wrote:
> Hi, that's the difference in Rebol. It's not an OO language and it's > object! datatype isn't OO related. Every word carries its own context. > That's why you get the same words bound to different (new) contexts each > time. The solution is to move the functions out of the context.
--snip-- Great post. Thank you :-) - Jason

 [17/18] from: knizhnik:garret:ru at: 22-Dec-2003 0:32


Hello Robert,

I have released a new version of DyBASE (0.16). Graham Chiu implemented in Rebol a program which uses DyBASE to store mails extracted from the server. It detected a problem in DyBASE related to storing large strings; that is why I had to release a new version. It also includes a new version of the Rebol API:

- a generic index factory implemented by Gregg Irwin
- I made all methods of the persistent class into external functions. So now, instead of doing "obj/modify", you should use "modify obj". The names of all methods are preserved, except the "load" method, which is renamed to load-persistent (to avoid a conflict with the system function). I hope that the other methods (store, modify, deallocate, ...) will not cause conflicts. In the future, if Rebol provides an efficient mechanism for methods, I will return these methods back to the persistent class. Right now this reduces the overhead of object prototyping and saves a lot of memory. Now it is the programmer's choice whether to use object methods, external functions, or a special class with method definitions.

Also, I have updated the comparison table, using my home computer as the reference system. The results are different, but in general the performance ratio of the different languages is the same.

Sunday, December 21, 2003, 4:41:53 PM, you wrote:

RMM> On Sat, 20 Dec 2003 21:16:01 +0300, Konstantin Knizhnik
RMM> <[knizhnik--garret--ru]> wrote:
<<quoted lines omitted>>

--
Best regards,
Konstantin mailto:[knizhnik--garret--ru]

 [18/18] from: robert:muench:robertmuench at: 22-Dec-2003 9:31


On Mon, 22 Dec 2003 00:32:02 +0300, Konstantin Knizhnik <[knizhnik--garret--ru]> wrote:
> I made all methods from persistent external functions. So now nstead > of doing "obj/modify" you should use "modify obj". The name of all
<<quoted lines omitted: 4>>
> mechanism for methods, I return these methods back to persistent > class.
Hi, you can avoid name clashes today too. Just put the functions into their own context:

dybase: context [...]

Then we can use those functions like 'dybase/load, and you don't have to move them back into the persistent context, as this will never make any sense in Rebol. The "problem" we have seen is not an implementation issue of Rebol; it's a design issue of the interface. This will never change, because it is an intended feature of Rebol.

--
Robert M. Münch
Management & IT Freelancer
Mobile: +49 (177) 245 2802
http://www.robertmuench.de
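[A sketch of that suggestion - the function bodies here are placeholders, not the real DyBASE API:]

    ; Wrapping the API in one context keeps names like LOAD and OPEN
    ; from shadowing the system words of the same name.
    dybase: context [
        open: func [path] [print ["open" path] path]
        load: func [oid] [print ["load" oid]]
        store: func [obj] [print ["store" mold obj]]
    ]

    dybase/open %test.dbs
    dybase/load 42        ; does not clash with the system LOAD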

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted