Enhancement - parse-[scheme] word
[1/7] from: brett:codeconscious at: 13-Feb-2001 12:16
Hi Pekr,
> I don't understand the topic of making parsers/serializers/tokenizers at all
> yet, so I can be wrong here, but looking at 'import-email, 'xml-language, etc.,
> - what about having some kind of parse rules available, grouped in related
> areas? e.g. if some protocol is derived from XML, it will probably (in theory)
> contain some rules which comply to parsing XML itself, no? :-)
My understanding of xml is that it is a generic data format (the W3C has
referred to it as an interchange syntax). That is, it provides a foundation
for other stuff. So yes, considering the attention xml is receiving, having
tools in Rebol to deal with it and the stuff that is built on it would be
useful. Parse-xml is a simplistic way to interpret the xml at the most
basic level. It provides a way to represent the basic xml structure using
Rebol constructs.
To deal with something that is derived from xml, you have to know the
structure of the derivation. People have been using DTDs for this. So once
you have read the DTD you go away and write a program (parser) that uses
that knowledge to deal with information. A more sophisticated approach could
be to write a program that reads the DTD and generates the parse rules for
the original document automatically. Such parse rules might aim to read it
from scratch or instead parse the results from parse-xml.
I gather that there is a push to move away from DTDs and instead use schemas
(xml based of course :) ) to represent the structures, apparently because
dealing with DTDs is too complex for wide uptake.
> let's find some general ideas thru various protocols/parse-rules, of course ...
> if they do exist ....
So yes there is stuff that can be done. Maybe the experienced XML hands on
the list can point the way as to what Rebol tools should be written that
give most bang-for-buck.
Ultimately though once you've extracted your "information" out of these
formats you have to process it. This is why I find it amazing that there
seems to be so much hype about XML. XML doesn't solve an arbitrary problem; it
is merely aimed at making step 1 of computing easier: getting an
in-memory representation.
My AUD 0.05 worth.
In Oz 2c was taken out of circulation years ago - strangely enough probably
equals about USD 0.02 :)
Brett.
[2/7] from: al:bri:xtra at: 12-Feb-2001 16:53
What would really, really be nice would be parse-[scheme] words. For
example:
parse-pop
parse-http
parse-finger
parse-whois
which can take a URL or string and extract a well-formed scheme from the
string or URL, and return that value. So:
parse-http S: "http://www.pearl.com?"
returns:
http://www.pearl.com
and leaves the input string at the first character that doesn't fit the
scheme. So that:
first S
returns:
#"?"
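A minimal sketch of how such a word might look in Rebol 2, not a tested implementation. It assumes the caller passes the word that refers to the string (a function cannot otherwise move the caller's series position), so it is called as parse-http s rather than with the set-word form shown above. The url-chars set is an illustrative assumption, not the character set from the URL RFC.

```rebol
; Characters accepted after "http://" in this sketch (an assumption, not the RFC set).
url-chars: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" "-._~/:"]

parse-http: func [
    "Return the leading http URL of a string and advance the word past it."
    'var [word!] "Word that refers to the input string"
    /local start rest
][
    start: get var
    either parse/all start ["http://" some url-chars rest: to end] [
        set var rest                    ; leave the caller's word at the first non-matching character
        to url! copy/part start rest    ; the part that fitted the scheme
    ][
        none                            ; input did not start with a well-formed http URL
    ]
]

s: "http://www.pearl.com?"
print parse-http s    ; http://www.pearl.com
print mold first s    ; #"?"
```

Returning none on failure is a design choice of this sketch; the proposal does not say what should happen when the input does not fit the scheme.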
Andrew Martin
Really nice Rebolutionary...
ICQ: 26227169 http://members.nbci.com/AndrewMartin/
[3/7] from: al:bri:xtra at: 12-Feb-2001 17:29
Earlier, I wrote:
> What would really, really be nice would be parse-[scheme] words. For example:
> parse-pop
> parse-http
> parse-finger
> parse-whois
> which can take a URL or string and extract a well-formed scheme from the
> string or URL, and return that value. So:
> parse-http S: "http://www.pearl.com?"
> returns:
> http://www.pearl.com
> and leaves the input string at the first character that doesn't fit the
> scheme. So that:
> first S
> returns:
> #"?"
All these functions could reside in the scheme objects, one word and its
function per scheme.
Andrew Martin
ICQ: 26227169 http://members.nbci.com/AndrewMartin/
[4/7] from: robbo1mark:aol at: 12-Feb-2001 2:58
ANDREW,
Garold wrote a URL parser for REBOL based on the RFC; it is in the files
section at the OSCAR website, in the TOKENISER folder.
I'm sure this would be a good starting basis for a script / functions which
check individual 'scheme / protocol validity.
cheers,
Mark Dickson
[5/7] from: petr:krenzelok:trz:cz at: 12-Feb-2001 9:20
Andrew Martin wrote:
> Earlier, I wrote:
> > What would really, really be nice would be parse-[scheme] words. For
<<quoted lines omitted: 15>>
> All these functions could reside in the scheme objects, one word and its
> function per scheme.
Hi Andrew,
maybe we could think it through a little more deeply then. As I am looking at
various places and e-commerce products, it seems to me XML is gaining more and
more acceptance, especially in B2B area. Many protocols based upon XML are
coming, just check:
http://www.xml.org
http://www.oasis-open.org
http://www.w3c.org
http://www.xmlrpc.com
http://www.xmlsolutions.com
or even:
http://www.w3.org/2000/03/29-XML-protocol-matrix.html
Where does it take its place in Rebol? We currently have xml-parser available.
Maybe someone could enlighten me as to how to successfully use it :-)
Well, parse-email, parse-xml, parse-xml-rpc, parse-soap, etc, etc. ... is
there anything about them that could be generalized (kind of net-utils or
root-protocol for parsed stuff?)
Anyone?
-pekr-
[6/7] from: al:bri:xtra at: 12-Feb-2001 22:45
Petr wrote:
> Well, parse-email, parse-xml, parse-xml-rpc, parse-soap, etc, etc. ... is
> there anything about them that could be generalized (kind of net-utils or
> root-protocol for parsed stuff?)
XML as a pseudo scheme?
xml://________________
Don't know what this would do.
XML as a dialect of Rebol?
p "Hello" br list ["one" "two" "three" now 1 + 2]
Avoids having to write closing tags. Allows insertion of active Rebol
content.
Tag 'MyCustomTag
?
Andrew Martin
Simplistic Rebol...
ICQ: 26227169 http://members.nbci.com/AndrewMartin/
[7/7] from: petr:krenzelok:trz:cz at: 12-Feb-2001 11:08
Andrew Martin wrote:
> Petr wrote:
> > Well, parse-email, parse-xml, parse-xml-rpc, parse-soap, etc, etc. ... is
<<quoted lines omitted: 3>>
> xml://________________
> Don't know what this would do.
I don't know either ... but do we need to have it available as a scheme? Look at
'import-email, for example - it's not a scheme either, and it contains several
parse functions ...
->> source import-email
import-email: func [
    "Constructs an email object from an email message."
    data [string!] "The email message"
    /local content
][
    data: parse-header system/standard/email
        copy/part data content: any [find/tail data "^/^/" tail data]
    data/date: parse-header-date data/date
    data/from: parse-email-addrs data/from
    data/to: parse-email-addrs data/to
    data/reply-to: parse-email-addrs data/reply-to
    data/content: content
    data
]
->>
I don't understand the topic of making parsers/serializers/tokenizers at all
yet, so I can be wrong here, but looking at 'import-email, 'xml-language, etc.,
- what about having some kind of parse rules available, grouped in related
areas? e.g. if some protocol is derived from XML, it will probably (in theory)
contain some rules which comply to parsing XML itself, no? :-)
What's more - we really don't need to have it all available under schemes, as
a scheme is something else. Even your 'http scheme uses some kind of url parser.
Maybe system/parser-rules could be created, but it doesn't matter now :-) First
let's find some general ideas thru various protocols/parse-rules, of course ...
if they do exist ....
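One way the grouped rules could look, as a rough sketch for Rebol 2: a plain object holds generic XML-ish rules, and a derived object adds protocol-specific ones on top. The names parser-rules and xml-rpc-rules, and every rule word inside them, are hypothetical, and the XML name rule is heavily simplified.

```rebol
; Hypothetical shared rule library (illustrative names, simplified XML grammar).
parser-rules: context [
    alpha:     charset [#"a" - #"z" #"A" - #"Z"]
    digit:     charset "0123456789"
    name-char: union union alpha digit charset "-_.:"
    xml-name:  [alpha any name-char]    ; a letter followed by name characters
    start-tag: ["<" xml-name ">"]       ; no attributes in this sketch
    end-tag:   ["</" xml-name ">"]
]

; A protocol derived from XML reuses the generic rules and adds its own.
xml-rpc-rules: make parser-rules [
    int-value: [start-tag opt "-" some digit end-tag]
]

print parse/all "<int>-42</int>" xml-rpc-rules/int-value    ; true
```

Note that this sketch does not check that the closing tag name matches the opening one; it only shows how generic rules could be shared between related parsers.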
> XML as a dialect of Rebol?
> p "Hello" br list ["one" "two" "three" now 1 + 2]
> Avoids having to write closing tags. Allows insertion of active Rebol
> content.
> Tag 'MyCustomTag
> ?
>
1) Aren't parsing stuff and working with a particular protocol two slightly
different things?
2) I don't understand your Tag 'MyCustomTag (my-custom-tag in a rebol way ;-))
Cheers,
-pekr-
Notes
- Quoted lines have been omitted from some messages.