World: r3wp
[Core] Discuss core issues
BrianH 24-Mar-2009 [13063] | It is faster to allocate in chunks because you don't have to reallocate as often. |
[unknown: 5] 24-Mar-2009 [13064] | I suppose Carl has something more being allocated than just the string data. |
BrianH 24-Mar-2009 [13065] | (bbl) |
[unknown: 5] 24-Mar-2009 [13066] | So I'm wondering where the length is stored at. I'm wondering if it is stored preceding the string data. |
Steeve 24-Mar-2009 [13067x2] | no, it's stored in another slot, anywhere in memory |
by default, an empty string allocates 3 slots: 1 storing the logical address, 1 storing the physical address and the length, 1 storing the real data. These slots can be stored at any place | |
[unknown: 5] 24-Mar-2009 [13069] | interesting. But doesn't seem very efficient. |
Steeve 24-Mar-2009 [13070] | when a string is expanded, the data can be moved to another place. So the physical address slot is updated, not the logical one |
[unknown: 5] 24-Mar-2009 [13071] | Well this is where I have the concern. Because at a lower level you would want to allocate memory for string large enough for certain variations of the string during runtime. And that approach seems to negate that possibility. |
Steeve 24-Mar-2009 [13072] | each reference to the same string has its own logical slot pointing to the same physical slot |
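A minimal sketch of the slot indirection Steeve describes, written in Python only as an analogy. The class names `LogicalSlot` and `PhysicalSlot` and the helper `expand` are hypothetical, and nothing here reflects REBOL's actual internal structures. It shows only that several references can share one physical slot, and that moving the data touches that slot alone.

```python
from dataclasses import dataclass


@dataclass
class PhysicalSlot:
    """Hypothetical slot holding the data buffer and its length."""
    data: bytearray  # stands in for the address of the real data
    length: int      # number of bytes in use


@dataclass
class LogicalSlot:
    """Hypothetical per-reference slot pointing at a physical slot."""
    physical: PhysicalSlot


def expand(phys: PhysicalSlot, extra: int) -> None:
    """Move the data into a larger buffer; only the physical slot is updated."""
    bigger = bytearray(len(phys.data) + extra)
    bigger[:phys.length] = phys.data[:phys.length]
    phys.data = bigger


shared = PhysicalSlot(data=bytearray(b"abc"), length=3)
ref_a = LogicalSlot(shared)  # two references, each with its own logical slot
ref_b = LogicalSlot(shared)

old_buffer = shared.data
expand(shared, 16)

same_slot = ref_a.physical is ref_b.physical      # references still share one slot
buffer_moved = shared.data is not old_buffer      # the data itself was relocated
contents = bytes(ref_b.physical.data[:ref_b.physical.length])
```

Neither `ref_a` nor `ref_b` is modified by `expand`, which is the point of the extra level of indirection.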
[unknown: 5] 24-Mar-2009 [13073x2] | Not negate it necessarily but make it less efficient as you would have to allocate storage for the string each time for the new size. |
I'm obviously talking low level here and not what we have ability to do via REBOL. | |
Steeve 24-Mar-2009 [13075] | So basically, the string " " uses (16 * 3) = 48 bytes; the char #" " uses 16 bytes. Make your choice :-) |
[unknown: 5] 24-Mar-2009 [13076x2] | That seems crazy to me. |
I guess Carl has his reasons. | |
Maxim 24-Mar-2009 [13078x2] | actually all series store these pointers no? |
paul, rebol does mutable series. | |
Steeve 24-Mar-2009 [13080x2] | yest |
*yes | |
Maxim 24-Mar-2009 [13082] | there is no other way... you have to know the bounds, and allow ram to be recycled. indirection is the only way to do this. rebol has its own memory manager. |
Steeve 24-Mar-2009 [13083] | but new references to the same series only consume a new slot of 16 bytes |
[unknown: 5] 24-Mar-2009 [13084x2] | Maxim, there are other ways. |
Maybe not in REBOL but there are in other languages for example. | |
Maxim 24-Mar-2009 [13086x3] | other languages do not use mutable series. they use immutable strings within a single index. every operation is a memcopy and then replace the pointer. |
and all references to a string are actually independent... change the string in varA and varB doesn't reflect it. | |
in rebol, they really are the same actual string object. | |
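The difference Maxim describes can be seen in Python, used here purely as an analogy: `bytearray` stands in for a mutable series and `str` for an immutable string. The variable names are illustrative.

```python
# Mutable buffer: two names share one object, so an in-place change
# made through one name is visible through the other.
var_a = bytearray(b"hello")
var_b = var_a
var_a[0:1] = b"J"            # modifies the bytes in place
shared_after = bytes(var_b)  # var_b sees the change

# Immutable string: "changing" it builds a new object and rebinds
# only the name that was assigned to.
str_a = "hello"
str_b = str_a
str_a = "J" + str_a[1:]      # new string; str_b still refers to the old one
```

After this runs, `var_a` and `var_b` are still the same object, while `str_a` and `str_b` refer to different strings.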
[unknown: 5] 24-Mar-2009 [13089] | I'll have to learn more about mutable series versus immutable series. |
Maxim 24-Mar-2009 [13090x2] | python string objects are comparable to rebol's series, but use immutable strings internally, to be compatible with C. |
mutables actually make rebol harder to interface to most external code, cause we don't use the normal string end concept of terminating with a 0 char. | |
[unknown: 5] 24-Mar-2009 [13092] | So what do you mean when you say "mutable"? |
Maxim 24-Mar-2009 [13093] | rebol changes the actual bytes within the ram. most languages, create new strings and assign the new pointer to the variable. |
[unknown: 5] 24-Mar-2009 [13094] | Ok, so that means mutable? |
Maxim 24-Mar-2009 [13095] | yep... the ram mutates, "in-place". |
[unknown: 5] 24-Mar-2009 [13096] | Got ya. That is easy enough to understand. |
Henrik 24-Mar-2009 [13097] | wasn't that also the difference between R1 and R2? |
[unknown: 5] 24-Mar-2009 [13098] | But if that is the case then if the string changes such that it doesn't fit the size of the existing allocation, then what happens? |
Maxim 24-Mar-2009 [13099] | then rebol reallocates a new region of ram and copies the current data into it, adding a few extra bytes based on heuristics, so that small changes don't need to constantly re-allocate ram. |
[unknown: 5] 24-Mar-2009 [13100] | So do you believe that REBOL is using the Pascal like length-prefixed strings? |
Maxim 24-Mar-2009 [13101] | which is why you should do : s: make string! 10003 when you know that your algorithm will eventually reach 10000 bytes |
[unknown: 5] 24-Mar-2009 [13102] | Correct, that is how I see it and why I ask about this. Because to me this makes more sense as to assigning the length beforehand. |
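A toy model of the reallocate-with-headroom behaviour and of the benefit of preallocating. The `GrowableBuffer` class and its growth rule (1.5x the needed size plus 8 bytes) are assumptions made for illustration. The chat does not say what heuristic REBOL actually uses.

```python
class GrowableBuffer:
    """Toy buffer that tracks allocated capacity separately from length."""

    def __init__(self, capacity: int = 0) -> None:
        self.data = bytearray(capacity)  # allocated storage
        self.length = 0                  # bytes actually in use
        self.reallocations = 0           # how often the data had to move

    def append(self, chunk: bytes) -> None:
        needed = self.length + len(chunk)
        if needed > len(self.data):
            # Assumed heuristic: allocate more than strictly needed so that
            # small follow-up appends fit without another reallocation.
            new_capacity = needed + needed // 2 + 8
            bigger = bytearray(new_capacity)
            bigger[:self.length] = self.data[:self.length]
            self.data = bigger
            self.reallocations += 1
        self.data[self.length:needed] = chunk
        self.length = needed


def fill(buf: GrowableBuffer, total: int) -> int:
    """Append `total` single bytes and report how many reallocations occurred."""
    for _ in range(total):
        buf.append(b"x")
    return buf.reallocations


grown = fill(GrowableBuffer(), 10000)              # starts empty, must grow
preallocated = fill(GrowableBuffer(10000), 10000)  # sized up front, never moves
```

In this model the preallocated buffer never reallocates, while the one that starts empty reallocates a small number of times rather than once per append.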
Maxim 24-Mar-2009 [13103x3] | no since strings are objects, just like all datatypes. they have internal counters for offset, length, etc. the string itself really is just a buffer. which is why in R2 strings and binary really are the same thing. in R3 this is quite different. |
the binary and character lengths of strings aren't the same thing, depending on the encoding of strings. | |
(in R3) | |
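A small Python illustration of why character length and byte length can differ once strings are encoded. UTF-8 is used here only as an example encoding. The chat does not state which encoding R3 uses internally.

```python
text = "na\u00efve"                      # "naive" with a diaeresis on the i
char_length = len(text)                  # number of characters
byte_length = len(text.encode("utf-8"))  # number of bytes in UTF-8
```

Here `char_length` is 5 and `byte_length` is 6, because the non-ASCII character occupies two bytes in UTF-8.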
Steeve 24-Mar-2009 [13106x2] | that's why as-binary and as-string can no longer exist in R3 |
in R2, it's really fast because only the type of the value is changed (no boring things like copy are done) | |
[unknown: 5] 24-Mar-2009 [13108x5] | See, I see strings stored in memory as nothing more than a character array. |
The actual data part rather. | |
HLA is a language that stores string data a bit differently, it sounds. It uses null termination but also allows nulls inside the string. | |
It got me thinking as to how REBOL does its length handling. | |
In HLA, the string is prefixed with a dword value indicating max-length, then a dword value indicating current length, then the string characters and then a null termination. | |
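The layout described above can be sketched with Python's `struct` module. The function names are hypothetical, and little-endian 32-bit dwords plus zero padding up to the maximum length are assumptions. This models the described layout only, not HLA's actual runtime.

```python
import struct

HEADER = "<II"  # assumed: two little-endian 32-bit dwords (max length, length)
HEADER_SIZE = struct.calcsize(HEADER)


def pack_prefixed(text: bytes, max_length: int) -> bytes:
    """Lay out: max-length dword, current-length dword, characters, null."""
    if len(text) > max_length:
        raise ValueError("text does not fit in max_length")
    header = struct.pack(HEADER, max_length, len(text))
    padding = b"\x00" * (max_length - len(text))  # unused reserved capacity
    return header + text + b"\x00" + padding


def unpack_prefixed(blob: bytes) -> tuple[int, bytes]:
    """Read the string using the stored length, not the null terminator."""
    max_length, length = struct.unpack_from(HEADER, blob, 0)
    return max_length, blob[HEADER_SIZE:HEADER_SIZE + length]


blob = pack_prefixed(b"a\x00b", 8)  # embedded null is part of the string
max_length, recovered = unpack_prefixed(blob)
```

Because the length is stored explicitly, the embedded null byte survives the round trip, while the trailing null still lets code that expects zero-terminated strings read up to the first null.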