Mailing List Archive: 49091 messages

[REBOL] Re: Enhancement - valid [scheme]? words

From: jeff:rebol at: 14-Feb-2001 14:06

Howdy, Holger:
> For instance if the last character of a URL before
> whitespace in an email is a "?" then this is most likely a
> real question mark, not part of the URL, regardless of
> whether a "?" at the end of a URL is valid. Same thing for
> ",", "." etc.

Hey, go check out this web site: It's great. Also, this one is neat, too:,

> That kind of determination is heuristic in nature though,
> and cannot be derived from URL grammar rules. The exact set
> of heuristics would depend on the context, e.g. the language
> of the surrounding text. This can actually get very
> complicated. Look at the two examples

I don't know that a heuristic approach would ever be really adequate; you really need a full natural language grammar to make solid distinctions.

> "Have a look at for a great time."
>
> "Have you seen Looks cool."
>
> In the first case the "?" appears to be part of the URL, in
> the second it does not.

You can detect the difference between the above two sentences. Looking at the first sentence, a decent natural language grammar won't allow the second PP as a complete sentence (but will recognize "have" as a main verb and thus complete the VP with the PP), whereas with the second sentence, the grammar will recognize "Have" as an auxiliary for "seen" and make a match (using a gap and fill scheme, for example) based on the fact that this is a wh-question equivalent for its declarative form (You have seen ), and therefore it will correctly determine is the end of the sentence and the question mark is the sentence terminator.

Which is to say, as you said, that it can get quite complicated, but it is also to say that heuristics may not be sufficient for a lot of cases. :-)

-jeff
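
For illustration only, here is a minimal sketch of the kind of trailing-punctuation heuristic discussed in this thread, written in Python rather than REBOL. The function name `extract_urls`, the punctuation set, and the example.com URLs are all assumptions made for this sketch; the original example URLs did not survive in the archive.

```python
import re

# Assumed set of characters that, at the very end of a URL token, are
# treated as sentence punctuation rather than as part of the URL.
TRAILING_PUNCTUATION = "?.,;:!"

# Deliberately crude: a URL is "http://" or "https://" plus everything
# up to the next whitespace. This is not a full URL grammar.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(text):
    """Return the URLs in text with trailing punctuation stripped."""
    return [
        match.group(0).rstrip(TRAILING_PUNCTUATION)
        for match in URL_PATTERN.finditer(text)
    ]


# Hypothetical sentences modelled on the two quoted examples.
question = "Have you seen http://example.com/page? Looks cool."
statement = "Have a look at http://example.com/search? for a great time."

print(extract_urls(question))   # the "?" is stripped, as intended
print(extract_urls(statement))  # the "?" is stripped here too
```

The sketch treats both sentences the same way and drops the "?" in each, even though in the second one the "?" was arguably meant as part of the URL. Nothing in the token itself separates the two cases, which is the limitation described above.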