[REBOL] Re: Autogenerating TOC
From: brett:codeconscious at: 18-Nov-2000 2:35
Hi Sharriff!
> I'm trying to code a function that would automatically
> generate a TOC of files in a directory
...
> and every alphabet is a link to a sublist HTML file that contains links to
> files starting with the alphabet displayed in the main TOC HTML file
...
> but I dont know how to connect both code snippets together, I know this
> code looks weird, If I can get the principles right I'll try to clean it up
> somehow.
I want to put forward some ideas I use all the time with any programming I
do. Here's a good one - clean it up first and the solution to making it work
will probably become obvious. If it doesn't, you will at least have an idea
of where to look.
Structuring your code through functions can help tremendously. If you make
each function responsible for one task and give it a meaningful name your
program becomes so much more readable. One consequence of doing this is that
you come up with alternative ways to structure your programs or other
opportunities to make it better - the success feeds off itself. The obvious
other benefit of this, of course, is that you can parameterise your code so
that it becomes useful in other situations.
So right now, you have a script that is close to what you want but is hard
to change and it doesn't quite work. You need to methodically go through and
make it better. Doing this will very likely make it obvious to you how to
make your code work where it doesn't now.
[BTW, I discovered a fancy word for this last year - "refactoring". Somebody
coined the term and related it to the Object Oriented field. Then people
could write and sell books on it. Many programmers have been doing this as a
matter of course for years. ....Oh well, I thought it was interesting.]
So...
1) First isolate a basic function of your script, put brackets around it,
give it a function name and call it using the new function name from the
place where you cut it out.
Looking at your code maybe one such function could be GET-TOC-LIST.
2) Look through this function and try to define as many of the variables
that you find as local.
3) Then identify the parameter variables.
4) The other variables are likely to be globals - can they be economically
passed in as parameters using another form (object?) or should you redefine
your function a bit? Or maybe they just want to be global. Make sure your
call to your function reflects your new parameter variables (or
refinements).
So now the function might be

    GET-TOC-LIST: function [
        directory "Directory to process"
    ][...
5) Repeat for more functions
Maybe another function is
MAKE-TOC-HTML toc-list
(that will return a string of html code)
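As a rough illustration of steps 1 to 4, here is a sketch of pulling a few inline lines into a named function. It assumes REBOL 2, where FUNCTION takes three blocks (spec, local words, body). The name LIST-NAMES and the "before" code in the comments are made up for the example and are not taken from the original script.

```rebol
REBOL [Title: "Extracting a function (illustrative sketch)"]

; Before: inline code that leaves FILES and NAMES behind as globals.
;   files: read %docs/
;   names: copy []
;   foreach file files [append names form file]

; After: the same task wrapped up and named (step 1).
; LIST-NAMES is a hypothetical name chosen for this sketch.
list-names: function [
    directory [file!] "Directory to process"    ; step 3: the parameter
][
    names                                       ; step 2: local words
][
    names: copy []
    foreach file read directory [append names form file]
    names                                       ; last value is returned
]

; The call site now says what it does, for any directory.
probe list-names %./
```

Nothing here touches a global word any more, so step 4 has nothing left to decide in this small case.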
Think now of the consequences of your function choices. I choose
GET-TOC-LIST with a parameter of a directory to process. Great! This allows
me to do the same thing for lots of directories. What if I changed the
definition of the function a little and added a parameter,
first-filename-character? With this I could process all the files in a
directory that start with the letter "X" any time I liked (eg. after I know
I just added a XANADU file).
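A minimal sketch of what such a second parameter could look like, assuming REBOL 2. FILES-STARTING-WITH is a hypothetical helper invented for this example; only the parameter name first-filename-character comes from the text above.

```rebol
REBOL [Title: "Filtering by first character (illustrative sketch)"]

; FILES-STARTING-WITH is a hypothetical helper, not part of the original script.
files-starting-with: func [
    directory [file!] "Directory to process"
    first-filename-character [char!] "Keep names that start with this character"
    /local result
][
    result: copy []
    foreach file read directory [
        ; FIRST on a file! value gives its first character.
        if first-filename-character = first file [append result file]
    ]
    result
]

; Process only the "X" files, e.g. after adding a XANADU file.
probe files-starting-with %./ #"X"
```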
Another general point I'd like to make is that mixing two different types
of stuff together can be confusing. In this case the two are an algorithm
for generating information from directories, and HTML. My suggestion of a
function MAKE-TOC-HTML confines the HTML stuff to one part of the script,
which clears the fog around the algorithm parts of the script. Other ways to
do this might be to use objects or other scripts.
After doing this you might decide that some of the original script just
doesn't fit the new scheme of things. What is it there for? Does it work? Is
it redundant? How can I get rid of it or legitimise it in the new script?
As you gain experience with this sort of thing you end up visualising the
consequences of choosing particular function tasks and parameters over other
alternatives. Also, this process may stimulate ideas for radically
different alternatives in approaching the problem.
Ok, adding a second parameter that is a character seems good but this has
implications of how toc-list is stored and how MAKE-TOC-HTML would work.
This brings us to structuring of the data.
Ok a change of voice - my thoughts on the problem as I write.
How will toc-list hold its information? In your solution you have used
multiple containers to deal with this. You have also read the directory
again - not a sin - but hey you've read it once already, you should be able
to collect a lot of the info you need this first time.
Here are some alternatives.
1) toc-list holds all the files for one character as a block
Well this is straightforward, but it probably means I need a second
parameter to MAKE-TOC-HTML to tell that function what to put in as a
heading.
2) toc-list holds all the first-characters and the related files in some
sort of block structure
This is more complex, but I don't need to have the second parameter to
MAKE-TOC-HTML because all the information is already in the block.
But it means that MAKE-TOC-HTML will generate multiple files - uh oh,
what am I going to call them? Should the decision about what to call them
reside inside MAKE-TOC-HTML?
I'm not really happy yet. Meditating on my disquiet about how much these
functions know about the overall task of making TOCs reminds me that the
objective was a main table-of-contents pointing to other table-of-contents
that finally point to the files. MAKE-TOC-HTML could be able to produce both
of these types of files. All it needs is the relevant information. What is
relevant to MAKE-TOC-HTML - surely just a heading and a pile of link
information!
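To make "a heading and a pile of link information" concrete, here is one possible shape for MAKE-TOC-HTML, assuming REBOL 2. The HTML layout, the parameter names and the use of each file name as its own link text are assumptions of this sketch, and no HTML escaping is attempted.

```rebol
REBOL [Title: "MAKE-TOC-HTML (illustrative sketch)"]

make-toc-html: func [
    heading [string!] "Text for the page title and heading"
    links [block!] "Files to link to"
    /local html
][
    html: copy ""
    append html rejoin [
        "<html><head><title>" heading "</title></head>" newline
        "<body><h1>" heading "</h1>" newline
        "<ul>" newline
    ]
    foreach link links [
        ; The file name doubles as link target and link text (an assumption).
        append html rejoin [
            {<li><a href="} link {">} link "</a></li>" newline
        ]
    ]
    append html rejoin ["</ul>" newline "</body></html>" newline]
    html
]

print make-toc-html "A" [%apples.html %asparagus.html]
```

Because the function only sees a heading and a block of links, the same code can serve for a letter page and for the main page.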
Ok putting together some of the bits I've mentioned I come up with this
combo
toc-list might be structured like
[
"A" [%apples.html %asparagus.html %almond.html]
"C" [%carrots.html %candy.html %cute.html]
...etc...
]
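A short sketch of how a flat heading/block structure like this can be walked and queried in REBOL, using a cut-down copy of the sample data above. The words heading and links are just names picked for the example.

```rebol
REBOL [Title: "Walking the toc-list structure (illustrative sketch)"]

toc-list: [
    "A" [%apples.html %asparagus.html %almond.html]
    "C" [%carrots.html %candy.html %cute.html]
]

; FOREACH with two words takes the values two at a time:
; a heading string, then its block of files.
foreach [heading links] toc-list [
    print [heading "has" length? links "files"]
]

; SELECT finds a heading and returns the value that follows it.
probe select toc-list "C"
```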
So my processing script would be like (untested)

    ; Get the directory
    ; (ASK returns a string, so convert it to a file! before using it)
    directory-to-process: dirize to-file ask "What directory? "
    ; Get the TOC data
    toc-list: GET-TOC-LIST directory-to-process
    ; Generate the letter TOCs (and collect the file names for later)
    toc-file-list: make block! (divide length? toc-list 2)
    foreach [toc-heading toc-links] toc-list [
        toc-file: join to-file toc-heading %.html
        append toc-file-list toc-file ; Save the name for later.
        write (join directory-to-process toc-file) (MAKE-TOC-HTML toc-heading toc-links)
    ]
    ; Generate the main TOC
    write (join directory-to-process %toc.html) (MAKE-TOC-HTML "Alphabetical TOC" toc-file-list)
    print "Whohoo finished!"
Ok so I leave it to you now to decide if you want to go this route. If you
do I suspect you will not have much trouble with MAKE-TOC-HTML.
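For GET-TOC-LIST here are a couple of tips
1) append result-block reduce [current-character file-list-block]
(plain append rather than append/only, so that each heading and its block
of files sit side by side as in the toc-list structure above)
2) You don't necessarily need to to-string a file. For example, try this
first first read %.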
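For comparison, one possible GET-TOC-LIST along these lines, assuming REBOL 2 and the flat heading/block layout shown earlier. The local word names, the upper-casing of the heading and the skipping of sub-directories are choices made for this sketch only.

```rebol
REBOL [Title: "GET-TOC-LIST (illustrative sketch)"]

get-toc-list: func [
    directory [file!] "Directory to process"
    /local result key links
][
    result: copy []
    foreach file sort read directory [
        ; Sub-directories come back with a trailing slash; skip them.
        if #"/" <> last file [
            ; First character of the name as an upper-case string, e.g. "A".
            key: uppercase form first file
            links: select result key
            if none? links [
                ; New heading: add it and an empty block side by side.
                links: copy []
                append result reduce [key links]
            ]
            append links file
        ]
    ]
    result
]

probe get-toc-list %./
```

The directory is read once and every heading is collected in that single pass.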
Ok the code now may be bigger than your original script, but look what you
get. You can process any directory and the logic of the script almost jumps
out at you when you read it. You also now have the chance, if you like,
to add some extra capabilities using some refinements on the functions.
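As a sketch of what a refinement could add, here is a hypothetical LIST-FILES function, assuming REBOL 2. The /starting refinement and its letter argument are invented for this example; they make the first-character filter optional instead of mandatory.

```rebol
REBOL [Title: "Optional behaviour via a refinement (illustrative sketch)"]

; LIST-FILES and /STARTING are hypothetical names for this sketch.
list-files: func [
    directory [file!] "Directory to process"
    /starting "Keep only names that begin with LETTER"
    letter [char!]
    /local result
][
    result: copy []
    foreach file read directory [
        ; Without /starting every file is kept; with it, only matches are.
        if any [not starting  letter = first file] [append result file]
    ]
    result
]

probe list-files %./                    ; everything
probe list-files/starting %./ #"X"      ; only names starting with X
```

Existing callers keep working unchanged, and only the callers that want the extra behaviour need to use the refinement.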
Second last point ;) When you make changes to a working script that you
want to stay working, implement only one change at a time across the script
and test it. This incremental way of changing your script is the safest way
to "refactor".
Last point. One of the "refactorings" with the biggest payback is to change
variable names to something meaningful. Just making this one change across
scripts can be extremely illuminating. The script starts to tell you about
itself. [It's ok, I'm not crazy yet ;)]
Just some simple ideas, but with big payoffs. I hope they are of value to
you.
Brett.