[REBOL] Re: building a dynamic path to elements in block
From: joel:neely:fedex at: 3-Nov-2000 6:50
Hi, Gary,
[rebol-bounce--rebol--com] wrote:
> ...
> The error in the blocks are probably from my hasty cutting and
> pasting - yes they should be strings.
Hmmm... See below
> As to what I'm doing, ... I've been learning Ancient Greek and
> I came across the Perseus collection of texts (if you set it up
> correctly you can get the texts in Greek with the pitch accents)
Could you pass on the URL? I know someone who might be interested.
> ... I noticed from errors and other clues in the html that they
> were probably based on TEI.2 (Text encoding Initiative) xml or
> sgml,
Ahhh... SGML has a far ...ummm... "richer" grammar than XML. If
the source documents are really SGML, then all bets are off as to
how Parse-XML is going to digest them, and what it's going to give
you back as a result.
Parse-XML really is a minimalist parser, and doesn't even handle
HTML very well as most web pages write it (XHTML is the exception).
If you'll send either the URL for a document of interest, or send
a sample of the document source, I'll be glad to take a peek and
see if I can tell whether you've got SGML on your hands.
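
One rough first check, sketched here in Python rather than REBOL: run the
source through a strict XML parser and see whether it complains. The helper
name is_well_formed_xml and the sample strings are made up for illustration.
A failure only shows the text is not well-formed XML (omitted end tags and
unquoted attribute values are typical SGML habits); it does not prove the
source is SGML.

```python
import xml.parsers.expat


def is_well_formed_xml(text):
    """Return True if a strict XML parser accepts the text, else False."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


# Hypothetical samples, not taken from any real TEI document.
xml_style = '<div type="book"><p>one</p><p>two</p></div>'
sgml_style_omitted_end_tags = '<div type="book"><p>one<p>two</div>'
sgml_style_unquoted_attr = '<div type=book><p>one</p></div>'

for sample in (xml_style, sgml_style_omitted_end_tags, sgml_style_unquoted_attr):
    print(is_well_formed_xml(sample), sample)
```

If the second and third kinds of markup turn up in the real files, a
minimalist XML parser is unlikely to give back a usable tree.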
> and since I'm particularly interested in multi-lingual etext and
> xml I thought I'd learn REBOL by writing some scripts to reverse
> engineer the texts into something like the original xml/sgml,
> and from there I could generate any number of layouts for the texts.
As an old text-formatting hacker, I think that's an excellent use of
REBOL (though slightly ambitious as a learning project ;-). If you
run into any quicksand, I'll be glad to try to throw you a rope.
> For later reference I was wondering whether there was any clever
> way to handle the two to three character strings that UTF-8 uses
> to encode non-ASCII Unicode. For the moment I can avoid the issue...
> But I could envisage transcoding problems down the track.
I believe I saw something that said Unicode support was a future
enhancement planned for REBOL. That would be A Good Thing, and
might give you another reason to put off tackling UTF-8 for now.
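
For whenever the issue can no longer be avoided, here is a minimal sketch (in
Python, not REBOL) of how the multi-byte sequences can be told apart: the
first byte of each UTF-8 sequence says how many bytes belong to that
character. The function names are invented for illustration, the lead-byte
ranges follow the current UTF-8 definition, and continuation bytes are not
validated here.

```python
def utf8_sequence_length(lead_byte):
    """Return the byte length of a UTF-8 sequence, judged from its first byte."""
    if lead_byte < 0x80:
        return 1  # 0xxxxxxx: plain ASCII
    if 0xC2 <= lead_byte <= 0xDF:
        return 2  # 110xxxxx: e.g. the basic Greek letters
    if 0xE0 <= lead_byte <= 0xEF:
        return 3  # 1110xxxx: e.g. the Greek Extended (accented) block
    if 0xF0 <= lead_byte <= 0xF4:
        return 4  # 11110xxx: characters beyond U+FFFF
    raise ValueError("not a valid UTF-8 lead byte: 0x%02X" % lead_byte)


def split_utf8(data):
    """Split a bytes object into one bytes chunk per encoded character."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks


# 'a', GREEK SMALL LETTER LAMDA (U+03BB), and
# GREEK SMALL LETTER OMICRON WITH OXIA (U+1F79)
sample = "a\u03bb\u1f79".encode("utf-8")
print([len(chunk) for chunk in split_utf8(sample)])
```

Treating each chunk as a unit, rather than each byte, is what keeps a
transcoding step from splitting an accented letter in half.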
Bona Fortuna! (ooops! wrong empire! ;-)
-jn-