Mailing List Archive: 49091 messages

Enhancement - parse-[scheme] word

 [1/7] from: brett:codeconscious at: 13-Feb-2001 12:16


Hi Pekr,
> I don't understand the topic of making parsers/serializers/tokenizers at all
> yet, so I can be wrong here, but looking at 'import-email, 'xml-language, etc.,
> - what about having some kind of parse rules available, grouped in related
> areas? e.g. if some protocol is derived from XML, it will probably (in theory)
> contain some rules which comply to parsing XML itself, no? :-)
My understanding of XML is that it is a generic data format (the W3C has referred to it as an interchange syntax). That is, it provides a foundation for other stuff. So yes, considering the attention XML is receiving, having tools in Rebol to deal with it and the stuff that is built on it would be useful.

Parse-xml is a simplistic way to interpret XML at the most basic level. It provides a way to represent the basic XML structure using Rebol constructs.

To deal with something that is derived from XML, you have to know the structure of the derivation. People have been using DTDs for this. So once you have read the DTD, you go away and write a program (a parser) that uses that knowledge to deal with the information. A more sophisticated approach could be to write a program that reads the DTD and generates the parse rules for the original document automatically. Such parse rules might aim to read the document from scratch, or instead parse the results from parse-xml.

I gather that there is a push to move away from DTDs and instead use schemas (XML based, of course :) ) to represent the structures, apparently because dealing with DTDs is too complex for wide uptake.
> let's find some general ideas thru various protocols/parse-rules, of course ...
> if they do exist ....
So yes, there is stuff that can be done. Maybe the experienced XML hands on the list can point the way as to which Rebol tools should be written to give the most bang-for-buck. Ultimately though, once you've extracted your "information" out of these formats you have to process it. This is why I find it amazing that there seems to be so much hype about XML. XML doesn't solve an arbitrary problem; it is merely aimed at making step 1 of computing easier: getting an in-memory representation.

My AUD 0.05 worth. In Oz 2c was taken out of circulation years ago - strangely enough probably equals about USD 0.02 :)

Brett.

 [2/7] from: al:bri:xtra at: 12-Feb-2001 16:53


What would really, really be nice would be parse-[scheme] words. For example:

    parse-pop
    parse-http
    parse-finger
    parse-whois

which can take a URL or string and extract a well-formed scheme from the string or URL, and return that value. So:

    parse-http S: "http://www.pearl.com?"

returns:

    http://www.pearl.com

and leaves the input string at the first character that doesn't fit the scheme. So that:

    first S

returns:

    #"?"

Andrew Martin
Really nice Rebolutionary...
ICQ: 26227169
http://members.nbci.com/AndrewMartin/
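
A minimal sketch of how the proposed parse-http might behave, offered only as an illustration. The host-chars character set is an assumption that is much looser than the URL RFC (no ports, paths or queries), and removing the matched text from the input is just one way to get the "leaves the input string at the first character that doesn't fit" effect.

```rebol
REBOL [Title: "parse-http sketch (hypothetical)"]

; Assumption: a host is made of letters, digits, "-" and "." only.
host-chars: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" "-."]

parse-http: func [
    "Removes a leading http URL from INPUT and returns it, or none."
    input [string!] "Modified in place: the matched URL is removed"
    /local pos url
][
    ; pos: marks the first character that does not fit the rule
    either parse/all input ["http://" some host-chars pos: to end] [
        url: to-url copy/part input pos
        remove/part input pos   ; leave only the unmatched remainder
        url
    ][
        none                    ; input does not start with an http URL
    ]
]

s: copy "http://www.pearl.com?"
probe parse-http s
probe first s
```

With this approach the intent is that the first probe shows the extracted URL and the second shows the character that stopped the match. The copy keeps the literal string in the script from being modified.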

 [3/7] from: al:bri:xtra at: 12-Feb-2001 17:29


Earlier, I wrote:
> What would really, really be nice would be parse-[scheme] words. For example:
>     parse-pop
>     parse-http
>     parse-finger
>     parse-whois
> which can take a URL or string and extract a well-formed scheme from the
> string or URL, and return that value. So:
>     parse-http S: "http://www.pearl.com?"
> returns:
>     http://www.pearl.com
> and leaves the input string at the first character that doesn't fit the
> scheme. So that:
>     first S
> returns:
>     #"?"
All these functions could reside in the scheme objects, one word and its function per scheme.

Andrew Martin
ICQ: 26227169
http://members.nbci.com/AndrewMartin/

 [4/7] from: robbo1mark:aol at: 12-Feb-2001 2:58


ANDREW,

Garold wrote a URL parser for REBOL based on the RFC. It is in the files section at the OSCAR website, in the TOKENISER folder. I'm sure this would be a good starting point for a script / functions which check individual 'scheme / protocol validity.

cheers,
Mark Dickson

 [5/7] from: petr:krenzelok:trz:cz at: 12-Feb-2001 9:20


Andrew Martin wrote:
> Earlier, I wrote:
> > What would really, really be nice would be parse-[scheme] words. For
<<quoted lines omitted: 15>>
> All these functions could reside in the scheme objects, one word and its
> function per scheme.
Hi Andrew,

maybe we could think it through a little more deeply then. As I am looking at various places and e-commerce products, it seems to me XML is gaining more and more acceptance, especially in the B2B area. Many protocols based upon XML are coming, just check:

    http://www.xml.org
    http://www.oasis-open.org
    http://www.w3c.org
    http://www.xmlrpc.com
    http://www.xmlsolutions.com

or even:

    http://www.w3.org/2000/03/29-XML-protocol-matrix.html

Where does it take its place in Rebol? We currently have xml-parser available. Maybe someone could enlighten me on how to successfully use it :-)

Well, parse-email, parse-xml, parse-xml-rpc, parse-soap, etc, etc. ... is there anything about them that could be generalized (kind of net-utils or root-protocol for parsed stuff?)

Anyone?

-pekr-

 [6/7] from: al:bri:xtra at: 12-Feb-2001 22:45


Petr wrote:
> Well, parse-email, parse-xml, parse-xml-rpc, parse-soap, etc, etc. ... is
> there anything about them what could be generalized (kind of net-utils or
> root-protocol for parsed stuff?)

XML as a pseudo scheme?

    xml://________________

Don't know what this would do.

XML as a dialect of Rebol?

    p "Hello" br list ["one" "two" "three" now 1 + 2]

Avoids having to write closing tags. Allows insertion of active Rebol content.

    Tag 'MyCustomTag

?

Andrew Martin
Simplistic Rebol...
ICQ: 26227169
http://members.nbci.com/AndrewMartin/
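
The "XML as a dialect" idea can be sketched for a tiny subset of the example above. This is only an illustration: emit-markup is a hypothetical name, and the sketch handles just a word followed by a string (an element with text) or a bare word (an empty element). Nested list blocks and active Rebol content such as now are not handled.

```rebol
REBOL [Title: "Markup dialect sketch (hypothetical)"]

emit-markup: func [
    {Turns a block such as [p "Hello" br] into a tagged string.}
    dialect [block!]
    /local out tag text
][
    out: copy ""
    parse dialect [
        any [
            ; word followed by a string: element with text content
            set tag word! set text string! (
                append out rejoin ["<" tag ">" text "</" tag ">"]
            )
            ; bare word: empty element
            | set tag word! (append out rejoin ["<" tag " />"])
        ]
    ]
    out
]

print emit-markup [p "Hello" br]
```

The closing tags are generated by the function, which is the point made above about not having to write them by hand.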

 [7/7] from: petr:krenzelok:trz:cz at: 12-Feb-2001 11:08


Andrew Martin wrote:
> Petr wrote:
> > Well, parse-email, parse-xml, parse-xml-rpc, parse-soap, etc, etc. ... is
<<quoted lines omitted: 3>>
> xml://________________
> Don't know what this would do.
I don't know either ... but - do we need to have it available as a scheme? Look at 'import-email, for example - it's not a scheme either, and it contains several parse functions ...

    ->> source import-email
    import-email: func [
        "Constructs an email object from an email message."
        data [string!] "The email message"
        /local content
    ][
        data: parse-header system/standard/email copy/part data
            content: any [find/tail data "^/^/" tail data]
        data/date: parse-header-date data/date
        data/from: parse-email-addrs data/from
        data/to: parse-email-addrs data/to
        data/reply-to: parse-email-addrs data/reply-to
        data/content: content
        data
    ]
    ->>

I don't understand the topic of making parsers/serializers/tokenizers at all yet, so I can be wrong here, but looking at 'import-email, 'xml-language, etc., - what about having some kind of parse rules available, grouped in related areas? e.g. if some protocol is derived from XML, it will probably (in theory) contain some rules which comply to parsing XML itself, no? :-)

What's more - we really don't need to have it all available under schemes, as a scheme is something else. Even your 'http scheme uses some kind of url parser. Maybe system/parser-rules could be created, but it doesn't matter now :-) First let's find some general ideas thru various protocols/parse-rules, of course ... if they do exist ....
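
A rough sketch of what grouped, reusable parse rules could look like. Everything here is hypothetical: xml-rules is an invented name (nothing like system/parser-rules exists in the thread or, as far as this sketch assumes, in Rebol itself), and the rules only cover attribute-free tags.

```rebol
REBOL [Title: "Grouped parse rules sketch (hypothetical)"]

; Generic XML-ish rules kept together in one object
xml-rules: context [
    name-chars: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" "-_.:"]
    name:      [some name-chars]
    open-tag:  ["<" name ">"]     ; no attributes handled
    close-tag: ["</" name ">"]
]

; A derived format borrows the generic rules by word.
; The blocks keep their bindings, so NAME still refers to xml-rules.
open-tag:  xml-rules/open-tag
close-tag: xml-rules/close-tag

; One element: open tag, text up to the next "<", close tag
simple-element: [open-tag to "<" close-tag]

print parse/all "<b>hi</b>" simple-element
```

A rule set for a protocol built on XML could reuse open-tag and close-tag in the same way and add its own rules on top.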
> XML as a dialect of Rebol?
>     p "Hello" br list ["one" "two" "three" now 1 + 2]
> Avoids having to write closing tags. Allows insertion of active Rebol
> content.
>     Tag 'MyCustomTag
> ?
1) Aren't parsing stuff and working with a particular protocol two slightly different things?
2) I don't understand your Tag 'MyCustomTag (my-custom-tag in a rebol way ;-))

Cheers,
-pekr-

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted