Overhead for insert and blocks
[1/7] from: chris:starforge:demon at: 30-Oct-2000 15:36
Hi,
I'm developing a package of scripts that store a
large number (potentially in the thousands) of usernames,
passwords, email addresses and other details. Insertions
and removals will be uncommon, but updates and searches
will be very common and they must be fast. For this reason
I've decided to split the data into 26 files and the file
any particular user will be in is determined by the first
character of the username. Internally the files look like:
[
[ username
password
email
......
]
[ username
... etc...
]
.....
]
I chose this structure to make accessing, adding and
removing records easier for me. The only problem I have
is that I can't use append - new items must be inserted
at the start of the list for reasons that would take too
long to go into now.
What is the overhead of inserting new user blocks at the
start or how are blocks stored? Do blocks containing
nested blocks contain pointers to the nested data, or is
the data arranged in a single continuos memory area?
Chris
--
New sig in the works
Explorer2260 Designer and Coder
http://www.starforge.co.uk
[2/7] from: joel:neely:fedex at: 30-Oct-2000 10:14
Hi, Chris,
[cpage--starforge--demon--co--uk] wrote:
...
> I chose this structure to make accessing, adding and
> removing records easier for me. The only problem I have
> is that I can't use append - new items must be inserted
> at the start of the list for reasons that would take too
> long to go into now.
>
In that case (without going into "too long" issues), how
about trying either list! storage or (it you can separate
the "build" from "use" phases of the blocks)...
typical-block: make block! big-enough-number
...
insert/only tail typical-block new-sub-block
...
Then, after the blocks are built...
typical-block: reverse typical-block
...
...use typical-block, which is now in chronological order...
-jn-
--
; Joel Neely [joel--neely--fedex--com] 901-263-4460 38017/HKA/9677
REBOL [] do [ do func [s] [ foreach [a b] s [prin b] ] sort/skip
do function [s] [t] [ t: "" foreach [a b] s [repend t [b a]] t ] {
| e s m!zauafBpcvekexEohthjJakwLrngohOqrlryRnsctdtiub} 2 ]
[3/7] from: al:bri:xtra at: 31-Oct-2000 21:39
> Internally the files look like:
> [
<<quoted lines omitted: 11>>
> ...is the data arranged in a single continuos memory area?
>> User-Z: [
[ [
[ Zachary "Xyz" [Zach--z--com]
[ ]
[ [
[ Zebediah "ABC" [Zeb--zebulon--co--za]
[ ]
[ [
[ Zephyr
[ "Zone"
[ [Wind--Storm--com]
[ ]
[ ]
== [
[
Zachary "Xyz" [Zach--z--com]
]
[
Zebediah "ABC" [Zeb--zebulon--co--za]
]
[
Zephyr
...
>> print mold User-Z
[
[
Zachary "Xyz" [Zach--z--com]
]
[
Zebediah "ABC" [Zeb--zebulon--co--za]
]
[
Zephyr
"Zone"
[Wind--Storm--com]
]
]
Watch carefully...
Andrew Martin
I stand betwixt heaven and hell...
ICQ: 26227169
http://members.nbci.com/AndrewMartin/
[4/7] from: chris:starforge:demon at: 31-Oct-2000 10:22
Andrew Martin wrote:
> > Internally the files look like:
<snip>
> > ...how are blocks stored?
>
> > ...is the data arranged in a single continuos memory area?
>
<snip>
>
> Watch carefully...
But that doesn't actually answer my question. I'm not talking
about the representation the user sees, but how the data is
actually held in memory. Is it a simple memory block, single
or double linked list, how are nested blocks handled: are
they interleaved or does the parent block contain pointers
to the memory containing the nested blocks? What exactly happens
when a block is inserted into a block with /only?
These implementation issues can have a huge effect on the
speed of the script. I guess I shouldn't really be worrying
about such things but I'm an unrepentant C programmer and
habits are hard to break ;)
Chris
--
New sig in the works
Explorer2260 Designer and Coder
http://www.starforge.co.uk
[5/7] from: g:santilli:tiscalinet:it at: 31-Oct-2000 12:28
Chris wrote:
> But that doesn't actually answer my question. I'm not talking
> about the representation the user sees, but how the data is
> actually held in memory. Is it a simple memory block, single
[...]
A block like:
[
[ "one" "two" "three" ]
[ "four" "five" "six" ]
]
is in memory something like:
[ * * ]
| |
V V
[ * * * ] [ * * * ]
| | | | | |
V V V V V V
"one" "two" "three" "four" "five" "six"
Inserting at the head means shifting all the pointers, but not all
the data.
HTH,
Gabriele.
--
Gabriele Santilli <[giesse--writeme--com]> - Amigan - REBOL programmer
Amiga Group Italia sez. L'Aquila -- http://www.amyresource.it/AGI/
[6/7] from: chris::starforge::demon::co::uk at: 31-Oct-2000 11:37
Gabriele Santilli wrote:
> Inserting at the head means shifting all the pointers, but not all
> the data.
>
> HTH,
Just what I was looking for :) Thanks
Chris
--
New sig in the works
Explorer2260 Designer and Coder
http://www.starforge.co.uk
[7/7] from: al:bri:xtra at: 1-Nov-2000 8:13
Chris wrote:
> Andrew Martin wrote:
> > > Internally the files look like:
<<quoted lines omitted: 3>>
> But that doesn't actually answer my question. I'm not talking about the representation
the user sees, but how the data is actually held in memory. Is it a simple memory block,
single or double linked list, how are nested blocks handled: are they interleaved or
does the parent block contain pointers to the memory containing the nested blocks? What
exactly happens when a block is inserted into a block with /only?
> These implementation issues can have a huge effect on the speed of the script. I guess
I shouldn't really be worrying about such things but I'm an unrepentant C programmer
and habits are hard to break ;)
Note that the lines remain for the most part as I typed them in. This indicates that
Rebol stores the blocks as _text_ in a linear string after a little massaging.
It's only the list! data-type that's a bit different, but not by much.
And, as always with benchmarks, the best way is to test the speed yourself on your chosen
hardware.
Andrew Martin
Notes
- Quoted lines have been omitted from some messages.
View the message alone to see the lines that have been omitted