[REBOL] Re: Profiling Rebol API to DyBASE
From: nitsch-lists:netcologne at: 20-Dec-2003 23:36
Hi Konstantin,
On Saturday, 20 December 2003 19:16, Konstantin Knizhnik wrote:
> Hello Gregg,
>
> Saturday, December 20, 2003, 8:02:18 PM, you wrote:
>
> GI> Hi Konstantin,
>
> GI> Thanks for your continued efforts! Thanks to your code, and Romano's
> GI> detective work, it seems a bug in CLEAR with hash! values may have
> GI> been found, which is great!
>
Clear, it's a bug! (SCNR :)
(More serious text below)
> GI> Also, some of us have talked about things a bit, and an expert opinion
> GI> is that REBOL's prototype object approach just isn't a great fit for
> GI> DyBase, so things may never be as good as we would like in that
> GI> regard. Having it work, though, and be much faster now, will still be
> GI> very useful.
>
> KK>> There is also yet another problem with Rebol - objects seems to be
> KK>> stored very inefficiently.
>
> GI> It's also creating them that has a lot of overhead. Someone noted
> GI> during a discussion that if you're creating 100,000 objects with 10
> GI> functions each, that you're creating, and binding, 1,000,000
> GI> functions.
>
> This is really awful. I also have noticed that when application
> creates about 100000 objects, the size of process exceeds 200Mb, and
> number of page faults is increased very fast.
> I thought, that once object is created using prototype object, the
> values of prototype object fields are just copied to the new objects
> (doesn't matter whether them are scalars, series or functions).
> In this case there will be only one instance of the function compiled
> and bounded only once. But looks like it is not the case.
>
Right. First, the typical solution:
Make your own vtable and put that in the objects.
(vtable: the C++ term for a "table of method references".)
This is the trick rebol itself uses in some places.
In short, the cloning rules of [make object!]:
- series are cloned, and so are functions.
- objects are shared by reference.
So the right thing for a vtable is an object.
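A minimal sketch to check those rules yourself, assuming REBOL 2 behaves as described above. The names proto, inst, label, helper and hello are made up for illustration:

```rebol
REBOL [Title: "cloning rules sketch"]

proto: make object! [
    label: "abc"                      ; a series
    helper: make object! [n: 1]       ; a nested object
    hello: func [] [print label]      ; a function
]
inst: make proto []

; Per the rules above, the series and the function should be
; per-instance copies, while the nested object should be the very
; same object in both.
print ["series shared?  " same? proto/label inst/label]
print ["function shared?" same? get in proto 'hello get in inst 'hello]
print ["object shared?  " same? proto/helper inst/helper]
```

Only the nested object costs nothing per instance, which is why the methods go into one.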
Now some ultra-fresh code from the beta 1.2.17 (it changes daily at the moment).
Here Carl Sassenrath introduces a new vtable.
The object is a 'face, and the vtable is called face/access. Let's see..
!>> demo-face: system/view/vid/vid-face ;the base-class
!>> probe demo-face/access ; The vtable.
make object! [
    set-face*: func [face value][face/data: value]
    get-face*: func [face][face/data]
    clear-face*: func [face][face/data: false]
    reset-face*: func [face][face/data: false]
]
You see, calling that is ugly. We would need
    demo-face/access/get-face* demo-face
to get something. But this is optimizing stuff, not for daily use.
More for stuff like DyBase ;)
So we add global accessor functions, basically "method-senders".
Then we can write
    get-face demo-face
!>> source get-face
get-face: func [
    "Returns the primary value of a face."
    face
    /local access
][
    if all [
        access: get in face 'access
        in access 'get-face*
    ] [
        access/get-face* face
    ]
]
This one also checks whether the face has an /access at all, and acts as a
no-op otherwise. The essence of it is:
get-face: func [face] [
    face/access/get-face* face
]
And now we have good memory performance, because the face itself does not
clone these methods.
We have good usability through the wrapper 'get-face.
And if we need speed, we can skip the wrapper and call it the ugly way.
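The same pattern outside of View could look roughly like this. It is only a sketch: point-methods, point, sum-of and shift-point are hypothetical names, not anything from DyBASE or from the rebol sources:

```rebol
REBOL [Title: "vtable sketch"]

; The vtable: one object holding the methods, created only once.
point-methods: make object! [
    class-name: "point"
    sum*: func [obj] [obj/x + obj/y]
    shift*: func [obj dx dy] [
        obj/x: obj/x + dx
        obj/y: obj/y + dy
        obj
    ]
]

; The prototype holds only data plus a reference to the vtable.
point: make object! [
    x: 0
    y: 0
    methods: point-methods
]

; Global "method-senders" that hide the ugly call.
sum-of: func [obj] [obj/methods/sum* obj]
shift-point: func [obj dx dy] [obj/methods/shift* obj dx dy]

p1: make point [x: 1 y: 2]
p2: make point [x: 10]
shift-point p2 5 5

print sum-of p1
print sum-of p2
; Per the cloning rules, both instances should point at one vtable.
print same? p1/methods p2/methods
```

Each new point then carries two numbers and one reference, however many methods the vtable grows.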
> So serious OO programming in Rebol is not possible and I have to look
> for alternative solution for adding persistency to Rebol.
It is, as shown above. Because "not possible" sounds so depressing,
I will try to explain why rebol does it this way.
In rebol we can do
global-head: none
demo: func [s /local local-tail] [
    parse/all s [copy global-head to " " skip copy local-tail to end]
    reduce [global-head local-tail] ; return both captured parts
]
Here 'parse is a dialect.
It gets the block [ copy global-head to " " skip copy local-tail to end ]
and interprets it in quite a different way than rebol does.
Take the 'copy, for example: inside 'parse it captures the matched input into
a word instead of copying a series.
Now
1) before execution Rebol does not know if parse is a dialect or function.
So it can not prepare something for it at "compile-time".
2) in this block we use two contexts, 'global-head is in one, 'local-tail in
another. So it does not help to pass the object in a hidden variable 'self,
like other OO-languages do.
3) Sometimes dialect code is generated dynamically and often, so super-smart
compilers would be no real option.
For example, in the view dialect for GUIs:
view layout [
    ta: area 400x200 return
    sl: slider [scroll-para ta sl] 16x200
] ; (hope this works, not tested..)
The [view layout] is interpreted as rebol code; the following block
is interpreted by 'layout to create a GUI.
'layout sets the words 'ta and 'sl. And it knows nothing about their context;
as with 'parse, they could be wildly mixed from different contexts.
And the [scroll-para ta sl] is interpreted as rebol again: it is stored by
'layout somewhere and called when the slider moves.
So how do these dialects know where to look?
Because each word knows its context. Simply [set 'word 15] and the right
context is chosen.
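A small sketch of "each word knows its context". The contexts a and b and the block blk are made up for the illustration:

```rebol
REBOL [Title: "word binding sketch"]

a: context [x: 1]
b: context [x: 2]

; Two words with the same spelling, but bound to different contexts.
blk: reduce [in a 'x in b 'x]
probe blk

print get first blk     ; the x that lives in a
print get second blk    ; the x that lives in b

; 'set needs no hint about where to look, the word carries it.
set first blk 15
print a/x
print b/x
```

A dialect like 'parse or 'layout does the same thing with the words it finds in its block.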
But what happens now?
class: context [
    v: none
    m: func [s] [parse s [copy v to end]]
]
instance: make class []
Now [instance/m "Hello"] calls parse, and parse sets 'v. Fine.
- But which 'v? The global one? The one in 'm? Or the one in 'class?
Let's see: no 'self variable is used, the words are bound.
[class/m "Hello"] would set [class/v].
If we simply shared the function, [instance/m "Hello"] would set
[class/v] too!! It would not know better.
To avoid that, rebol clones all functions and rebinds all words from the old
context to the new one.
So [instance/m] is a clone of [class/m], but modified:
all [v] in it are now bound to [instance/v].
And now 'parse can do it right. :)
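To watch the rebinding happen, a short sketch along the lines of the example above (same names, REBOL 2 assumed):

```rebol
REBOL [Title: "clone and rebind sketch"]

class: context [
    v: none
    m: func [s] [parse s [copy v to end]]
]
instance: make class []

instance/m "Hello"

; The cloned function should have written into the instance
; and left the prototype alone.
probe instance/v
probe class/v
```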
It is overhead, yes. But it is also a lot of comfort when used right.
And, after all, a lot of mind-boggling fun sometimes.
('layout is written in rebol btw, and if you want to see its source,
simply [source layout]. (Hmm, better don't, there are shorter examples of
dialects..) You get a lot of source in this "closed source"
interpreter that way. ;)
(btw2: because of rebol's dynamics, it would be possible to generate the
wrapper functions automatically to save some work. When generating more complex
code than that, binding suddenly becomes handy. ;)
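One way such generated wrappers might look. This is an untested sketch under assumptions: the names counter-methods, counter and make-sender are hypothetical, every method takes just the object as its argument, and method names end in "*" as in face/access:

```rebol
REBOL [Title: "generated method-senders sketch"]

; Hypothetical vtable and data object.
counter-methods: make object! [
    get-count*: func [obj] [obj/count]
    bump*: func [obj] [obj/count: obj/count + 1]
]
counter: make object! [count: 0 methods: counter-methods]

; Builds  func [obj] [obj/methods/<name> obj]  for a given method name.
make-sender: func [name [word!]] [
    func [obj] reduce [to path! reduce ['obj 'methods name] 'obj]
]

; One global wrapper per method: get-count* gives get-count, and so on.
; (first of an object lists its words, starting with self.)
foreach name next first counter-methods [
    set to word! head remove back tail form name make-sender name
]

c: make counter []
bump c
bump c
print get-count c
```

Methods with more arguments would need the spec block generated as well, and that is where binding starts to matter.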
> Unfortunately, DyBASE is object oriented database - i.e. it is
> oriented on work with objects. Object is assumed as set of fields with
> data and set of methods. Data is stored in DyBASE, methods - not (to
> make it possible to easily fix error in the program).
> Each object field has name and value. Objects with the same set of
> fields and methods belongs to one class. To minimize overhead, DyBASE
> create one class descriptor and all object instances, belonging to the
> class, contains reference to this class descriptor.
> Class descriptor contains class name and list of field names.
> Arrays and hashes are considered in DyBASE as values - so them are
> stored inside object and if N objects refer the same array or hash,
> N instances will be stored (and loaded next time when objects are
> accessed).
You can create the data-object and plug in the vtable later of course.
Instead of the current class-name you could store the vtable there.
And store the class-name in the vtable if you need it.
BTW, Carl mentioned that, due to an internal sharing trick, objects are as
good as blocks memory-wise. They share the names internally; only the data is
per instance.
> Rebol objects fits in this model (the only problem is that there are
> no classes, so I will have to make programmer to specify prototype
> object for the class). But if we provide persistency not for objects,
> but for blocks, then it is not clear to me how to store them.
> There will be no field names (but will be "hidden" fields, like
> current position name in variables referencing series).
>
> So it is possible to write normal serialization mechanism for Rebol
> (one that will preserve relations between objects), and even create
> database, which allows not only load/store, but also change/retrieve
> the data. But it could not be done within DyBASE object model.
> First task is significantly simpler, and if such module will be really
> interesting to many Rebol users, I can try to develop one.
>
> GI> Two ideas that have been mentioned are
>
> GI> 1) use block! values instead of objects to interface with DyBase
>
Carl says they are equal, memory-wise. I was surprised too :)
> GI> 2) Break the functions out of the objects into a supporting
> GI> context, or as free functions, so you don't have to create and
> GI> bind them with every object.
>
Hope my explanations make sense..
> GI> Either one may require work, but could be worth investigating if
> GI> someone really needs the potential improvement.
>
> GI> I'm looking forward to trying the latest release.
>
> GI> Thanks again!
>
> GI> -- Gregg
HTH. And hopefully you get some fun with rebol in return for your work! :)
-Volker