Mailing List Archive: 49091 messages

[REBOL] Re: building a dynamic path to elements in block

From: joel:neely:fedex at: 3-Nov-2000 6:50

Hi, Gary,

[rebol-bounce--rebol--com] wrote:
> ...
> The error in the blocks are probably from my hasty cutting and
> pasting - yes they should be strings.
Hmmm... See below
> As to what I'm doing, ... I've been learning Ancient Greek and
> I came across the Perseus collection of texts (if you set it up
> correctly you can get the texts in Greek with the pitch accents)
Could you pass on the URL? I know someone who might be interested.
> ... I noticed from errors and other clues in the html that they
> were probably based on TEI.2 (Text encoding Initiative) xml or
> sgml,
Ahhh... SGML has a far ...ummm... "richer" grammar than XML. If the source documents are really SGML, then all bets are off as to how Parse-XML is going to digest them, and what it's going to give you back as a result. Parse-XML really is a minimalist parser, and doesn't even handle HTML (except for XHTML) very well the way most web pages use it.

If you'll send either the URL for a document of interest, or a sample of the document source, I'll be glad to take a peek and see if I can tell whether you've got SGML on your hands.
> and since I'm particularly interested in multi-lingual etext and
> xml I thought I'd learn REBOL by writing some scripts to reverse
> engineer the texts into something like the original xml/sgml,
> and from there I could generate any number of layouts for the texts.
As an old text-formatting hacker, I think that's an excellent use of REBOL (though slightly ambitious as a learning project ;-). If you run into any quicksand, I'll be glad to try to throw you a rope.
> For later reference I was wondering whether there was any clever
> way to handle the two to three character strings that UTF-8 uses
> to encode non-ASCII Unicode. For the moment I can avoid the issue...
> But I could envisage transcoding problems down the track.
I believe I saw something that said Unicode support was a future enhancement planned for REBOL. That would be A Good Thing, and might give you another reason to put off tackling UTF-8 for now.

Bona Fortuna! (ooops! wrong empire! ;-)

-jn-
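
On the UTF-8 question quoted above, here is a minimal sketch of how those two- and three-byte sequences can be recognized. It is written in Python rather than REBOL, purely as an illustration, and the function names (utf8_sequence_length, split_utf8) are made up for this example. The idea is that the lead byte of each sequence tells you how many bytes the character occupies, so a byte string can be cut into per-character chunks without a full Unicode library. The sample word is an assumption chosen to mix basic Greek letters (two bytes each) with an accented letter from the Greek Extended block (three bytes).

```python
from typing import List


def utf8_sequence_length(lead_byte: int) -> int:
    """Return the length in bytes of the UTF-8 sequence starting with lead_byte."""
    if lead_byte < 0x80:
        return 1                      # plain ASCII
    if 0xC2 <= lead_byte <= 0xDF:
        return 2                      # U+0080 .. U+07FF (includes basic Greek)
    if 0xE0 <= lead_byte <= 0xEF:
        return 3                      # U+0800 .. U+FFFF (includes Greek Extended)
    if 0xF0 <= lead_byte <= 0xF4:
        return 4                      # U+10000 .. U+10FFFF
    raise ValueError("not a valid UTF-8 lead byte: 0x%02X" % lead_byte)


def split_utf8(data: bytes) -> List[bytes]:
    """Split UTF-8 encoded bytes into one chunk per encoded character."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunk = data[i:i + n]
        # Every byte after the lead byte must look like 10xxxxxx.
        if len(chunk) != n or any((b & 0xC0) != 0x80 for b in chunk[1:]):
            raise ValueError("truncated or malformed sequence at offset %d" % i)
        chunks.append(chunk)
        i += n
    return chunks


# "logos" in Greek letters; \u1f79 is omicron with oxia (Greek Extended block).
sample = "\u03bb\u1f79\u03b3\u03bf\u03c2".encode("utf-8")
parts = split_utf8(sample)
lengths = [len(p) for p in parts]
```

Once the text is cut into chunks like this, each chunk can be treated as one opaque "character" for lookup tables or transcoding, which is probably enough for reverse engineering marked-up texts until native Unicode support is available.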