Reading directories faster.

[1/7] from: reboler::programmer::net at: 8-Oct-2002 14:50

I am using the code below to read and sort the current directory contents. It is _terribly_ slow on large directories (> 1000 files/directories)

    either none? df: attempt [read %./][
        "Error reading Directory"
    ][
        foreach file df [insert tail either (dir? file)[dirs][files] file]
    ]

Any ideas on how to speed it up?

[2/7] from: petr:krenzelok:trz:cz at: 8-Oct-2002 22:21

alan parman wrote:

> I am using the code below to read and sort the current directory contents.
> It is _terribly_ slow on large directories (> 1000 files/directories)

<<quoted lines omitted: 3>>

>     foreach file df [insert tail either (dir? file)[dirs][files] file]
> ]

I am not sure you can. Some time ago I asked RT to add load/dirs and load/files refinements or something like that, but even people here argued that REBOL is fast enough, which is not true on larger and especially network drives, where it is quite unusable. But that was some two years ago.

Maybe you could look at the new native called 'remove-each? For a further description go here: http://www.reboltech.com/downloads/changes.html#sect4.4

It has pretty much the example you are looking for, but if you want to get lists of both files and directories, you will have to copy the original block and then run it two times: once for files, a second time for dirs.

Let us know if the result is faster now ...

-pekr-

[3/7] from: greggirwin:mindspring at: 8-Oct-2002 14:27

Hi Alan,

<< I am using the code below to read and sort the current directory contents. It is _terribly_ slow on large directories (> 1000 files/directories)

    either none? df: attempt [read %./][
        "Error reading Directory"
    ][
        foreach file df [insert tail either (dir? file)[dirs][files] file]
    ]

DIR? does a bit of work to find out if a file has the directory flag set for it (look at the source for it). Since REBOL is very consistent about how it forms directory names, you can cheat a bit and avoid the extra disk hits for every file by doing something like this:

    either none? df: attempt [read %.][
        "Error reading Directory"
    ][
        foreach file df [
            insert tail either (#"/" = last file)[dirs][files] file
        ]
    ]

--Gregg

[4/7] from: brett:codeconscious at: 9-Oct-2002 9:31

Hi Alan,

> either none? df: attempt [read %./][
>     "Error reading Directory"
> ][
>     foreach file df [insert tail either (dir? file)[dirs][files] file]
> ]
>
> Any ideas on how to speed it up?

The dir? function is based on the info? function. info? queries the target system to get the attributes of the target. But the result of read %./ already contains enough information about whether an item is a file or a directory: just look for a slash #"/" at the end of the name. So by using dir? you end up making an unnecessary call to the file system for every file.

So try changing the code to:

    either none? df: attempt [read %./][
        "Error reading Directory"
    ][
        foreach file df [insert tail either #"/" = last file [dirs][files] file]
    ]

On my system your code took 9 - 10 seconds. After the change, approximately 0.05 of a second.

Also, don't forget to preallocate the dirs and the files block! or list! or whatever you are using to be a "reasonable size". For example,

    files: make block! 2000

This avoids a lot of the time taken to automatically expand the block as it is being filled.

Regards,
Brett.

[5/7] from: atruter:hih:au at: 9-Oct-2002 9:32

> I am using the code below to read and sort the current directory contents.

If you only need files or dirs the following may help:

    read-dir: func [path /files /dirs] [
        either dirs [
            sort remove-each dir read path [#"/" <> last dir]
        ][
            sort remove-each dir read path [#"/" = last dir]
        ]
    ]

Regards,
Ashley

[6/7] from: atruter:hih:au at: 9-Oct-2002 10:03

> Any ideas on how to speed it up?

Alternatively, how about the following from left field? (not tested ;) )

    sort/compare read %. func [a b] [
        either (last a) = (last b) [
            a < b
        ][
            (last a) < (last b)
        ]
    ]

Regards,
Ashley

[7/7] from: reboler:programmer at: 10-Oct-2002 12:41

Re: Reading directories faster

Thanks all! It _was_ the 'dir? portion that was slowing things down. 'dir? uses 'info?, which uses 'query, so, as mentioned, there was a disk read on every file.

I am currently using ...

    either none? df: attempt [read %./][
        tell directory "Error reading Directory"
    ][
        forall df [insert either #"/" = last df/1 [dirs][files] df/1]  ; either dir? df/1 ; old method
    ]

This is supremely superior to the old 'dir? method!

I do pre-set the size of the dirs & files blocks, but before I go to a new directory, I 'clear them rather than resetting them ("clear dirs" instead of "dirs: make block! 16"). Then they do have to grow dynamically, but only when I go to a larger directory. This way they never grow larger than needed for the largest directory I visit, so there is no wasted memory allocation.

Ashley, your "left field" sort works fine. It puts the directories at the head of the output, but I need to have them separated from the files. This is also a neat way to group files by extension!

    probe sort/compare read %. func [a b] [
        either (last a) = (last b) [
            a < b
        ][
            (last a) < (last b)
        ]
    ]

Notes

Quoted lines have been omitted from some messages.