[REBOL] Re: Sort by first part of line
From: greggirwin:mindspring at: 5-Sep-2002 12:52
Hi Louis,
<< What exactly does hash do... >>
Maybe Joel will chime in with a really good definition, but here's a quick
quote and my (probably poor) layman's explanation.
per Knuth, regarding searching algorithms "A third possibility is to avoid
all this rummaging around by doing some arithmetical calculation on K,
computing a function f(K) that is the location of K and the associated data
in the table."
Basically, and someone please correct me if I mis-state things, rather than
doing comparisons of items in a linear or tree structure, you generate a
location from the item/key and that's where the items lives. To find an
item, you just go to the location returned by your "hash function" for that
item.
Since you'll rarely get a totally unique set of results, the hashing process
also has to handle "collisions", where two keys generate the same result
when hashed.
Insertions are slower because more work is involved, but retrievals are very
fast.
<<..., and could hash be used to speed up sort? >>
I wish I could remember a thread from a while back the did some comparisons.
Normally, you don't think of a hash table as a sorted structure but REBOL is
so very cool, that you can just change the datatype from block! to list! to
hash! and try things out. Do some tests and see what happens with your
actual data. Lists are good if you're doing lots of insertions and
deletions, and hashes are good if you're doing a lot of lookups. Testing
will tell, though, how SORT interacts with the datatypes, your particular
data, and how you use it. All sorting algorithms are not created equal. :)
--Gregg