World: r3wp

[XML] xml related conversations

BrianH
12-Apr-2005
[6]
How is your DOM tree implemented? REBOL doesn't currently have very 
good XML support by default as such. People tend to use text, blocks, 
objects or a combination of them.
Chris
28-Oct-2005
[7x2]
From 'Tech News' -- Discussion on implementing the DOM (or something 
similar)...
Mozilla reference -- http://www.mozilla.org/docs/dom/
Pekr
28-Oct-2005
[9x2]
the best work on an XML parser in REBOL so far, imo, is Gavin McKenzie's 
script ....
dunno if it is on rebol.org or not ...
Chris
28-Oct-2005
[11x4]
http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=xml-parse.r
So it tries to conform to SAX (Simple API for XML) instead of the 
DOM...
I have to admit, I'm awed by the size -- is this the least that it 
will take to get a reasonable XML implementation in Rebol?  And how 
to manipulate and store a SAX structure?
http://www.saxproject.org/
Pekr
28-Oct-2005
[15x2]
what does it mean to be "awed"? :-)
maybe we could contact Gavin to cooperate with us. I once asked 
him and he told me something like it does 80% of what ppl normally 
need ...
Chris
28-Oct-2005
[17]
The size would make me apprehensive about dropping it casually into 
a project.
Pekr
28-Oct-2005
[18x2]
but that is not the original script ... that was much smaller ...
and besides that - look at other XML libraries ... compress your 
script and the size is ok :-)
Chris
28-Oct-2005
[20]
Also, it appears not to work out of the box...
Volker
28-Oct-2005
[21]
How to get started with xml? I know the simple things, kind of object-tree, 
similar to what parse-xml does. What extras would be needed?
Pekr
28-Oct-2005
[22x2]
Gavin's scripts did work for me, did not try the updated version 
referred to above ...
Volker: use xml-parse+ instead of xml-parse ... you will receive 
block/object structure IIRC ...
Volker
28-Oct-2005
[24x2]
Maybe we could start with examples in xml and how they could look 
in rebol? with some dialect for the extras?
I can do parsing, but when i look at the xml docs, i do not know 
where to start. If someone could break that up for me..
Pekr
28-Oct-2005
[26]
dunno if much of an overhead, but maybe parse-xml could be used to 
parse even html?
Chris
28-Oct-2005
[27x2]
Volker, I'm not even sure that node objects need to be stored as 
a tree.
All access to them is through functions, such as -- element/append-child 
new-element
Volker
28-Oct-2005
[29x2]
Well, internally we have references. And externally we can use some 
dialect, 
  127 [name "tag-name" data "data"]
  [name "ref-name" reference 127]
Are these functions kind of standard? Then we could start with some 
"protocol" implementing the functionality.
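One way to read the dialect above is as a flat table of nodes keyed by number, where a node either carries its own data or points at another node. The following is only a sketch of that reading in Python (chosen because it is easy to run, not because the thread uses it). The id 128, the dictionary layout and the `resolve` helper are assumptions made for illustration and do not come from any REBOL script.

```python
# Hypothetical flat store: node id -> record, mirroring the dialect sketched above.
# Id 128 is invented here; the chat only gives 127.
nodes = {
    127: {"name": "tag-name", "data": "data"},
    128: {"name": "ref-name", "reference": 127},
}


def resolve(node_id, nodes):
    """Follow 'reference' links until a record without a reference is reached."""
    seen = set()
    while "reference" in nodes[node_id]:
        if node_id in seen:
            # Guard against records that point at each other.
            raise ValueError("reference cycle at node %r" % node_id)
        seen.add(node_id)
        node_id = nodes[node_id]["reference"]
    return nodes[node_id]


target = resolve(128, nodes)
print(target["name"], target["data"])
```

With a table like this the nodes do not have to be nested at all; the tree only exists through the references.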
Chris
28-Oct-2005
[31x3]
Yes the DOM is the standard.
From the Mozilla guide:
// document.getElementsByTagName("H1") returns a NodeList of the H1
// elements in the document, and the first is number 0:
var header = document.getElementsByTagName("H1").item(0);

// the firstChild of the header is a Text node, and the data
// property of the text node contains its text:
header.firstChild.data = "A dynamic document";
// now the header is "A dynamic document".

// Get the first P element in the document the same way:
var para = document.getElementsByTagName("P").item(0);
// and change its text too:
para.firstChild.data = "This is the first paragraph.";

// create a new Text node for the second paragraph

var newText = document.createTextNode("This is the second paragraph.");
// create a new Element to be the second paragraph
var newElement = document.createElement("P");
// put the text in the paragraph
newElement.appendChild(newText);

// and put the paragraph on the end of the document by appending it to
// the BODY (which is the parent of para)
para.parentNode.appendChild(newElement);
Volker
28-Oct-2005
[34]
So if we implement this api in rebol, we could use standard documentation? 
And browsers are based on DOM, we could map that to rebol-plugin 
and control browser?
Pekr
28-Oct-2005
[35]
it seems so ...
Chris
28-Oct-2005
[36]
; In REBOL?
header: first document/get-elements-by-tag-name <h1>
set in header/first-child 'data "A dynamic document"

para: first document/get-elements-by-tag-name <p>
set in para/first-child 'data "This is the first paragraph."


new-text: document/create-text-node "This is the second paragraph."
new-element: document/create-element <p>
new-element/append-child new-text

parent: para/parent-node
parent/append-child new-element
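For comparison, the same steps can be run today against an existing DOM implementation. This is a minimal sketch using Python's standard `xml.dom.minidom`, which follows the W3C method names; the input string is made up for the example, since the thread does not supply a document.

```python
from xml.dom.minidom import parseString

# Hypothetical input document; the chat does not supply one.
source = "<html><body><h1>Old header</h1><p>Old paragraph.</p></body></html>"
document = parseString(source)

# getElementsByTagName returns a NodeList; item(0) is the first match.
header = document.getElementsByTagName("h1").item(0)
header.firstChild.data = "A dynamic document"

para = document.getElementsByTagName("p").item(0)
para.firstChild.data = "This is the first paragraph."

# Build the second paragraph from a new text node and a new element.
new_text = document.createTextNode("This is the second paragraph.")
new_element = document.createElement("p")
new_element.appendChild(new_text)

# Append to the parent of the first paragraph (here: body).
para.parentNode.appendChild(new_element)

# Serialize the modified tree back to XML (XML -> DOM -> XML).
result = document.documentElement.toxml()
print(result)
```

Note that `item(0)` is zero-based as in the Mozilla sample, while REBOL's `first` is one-based, so a REBOL port would have to pick one convention.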
Pekr
28-Oct-2005
[37]
the zero based indexing might be a problem here, no? but who knows 
... such functions as "get-element-by-tag-name" etc. I do remember 
from Gabriele's Temple :-)
Volker
28-Oct-2005
[38]
Then we should start there. conversion to xml may then suddenly be 
simple.
Pekr
28-Oct-2005
[39]
So the DOM would be more useful than having SAX? (if that is the 
right way to ask the question)
Volker
28-Oct-2005
[40]
AFAIK yes. SAX is efficient, dom simple to use.
Chris
28-Oct-2005
[41]
Well, perhaps DOM will be efficient in its Rebol incarnation :o)
Volker
28-Oct-2005
[42]
SAX is like parse. [a-tag another-tag (do-something) /a-tag]. DOM 
works like load does. AFAIK.
Pekr
28-Oct-2005
[43]
so what is basically the difference between parsing an XML document 
using SAX and using DOM?
Volker
28-Oct-2005
[44]
So with DOM you need all in memory, with SAX you can stream.
Pekr
28-Oct-2005
[45]
then SAX is better, no?
Volker
28-Oct-2005
[46x3]
As a programmer, not AFAIK. with a DOM you can use path-notation. with 
SAX you build that tree yourself. I guess SAX makes sense when you 
convert data, like xml-make-doc. one tag, output something, another 
tag, output something else.
If it's more like a block of records, it would be DOM. parse <-> SAX, 
load <-> DOM.
That's what i understand from the overviews. Then comes how it works, 
and i am quickly back to real parse and load..
Pekr
28-Oct-2005
[49]
There are two major types of XML (or SGML) APIs:

Tree-based APIs

    These map an XML document into an internal tree structure, then allow 
    an application to navigate that tree. The Document Object Model (DOM) 
    working group at the World-Wide Web Consortium (W3C) maintains a 
    recommended tree-based API for XML and HTML documents, and there 
    are many such APIs from other sources. 

Event-based APIs

    An event-based API, on the other hand, reports parsing events (such 
    as the start and end of elements) directly to the application through 
    callbacks, and does not usually build an internal tree. The application 
    implements handlers to deal with the different events, much like 
    handling events in a graphical user interface. SAX is the best known 
    example of such an API.
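To make the event-based side concrete, here is a small sketch using Python's standard `xml.sax` module. The `TagCounter` class and the sample input are hypothetical; the point is only that the handler receives callbacks as the parser reads, and no tree is ever built.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts elements and collects text, building no tree."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.text = []

    def startElement(self, name, attrs):
        # Called once for every opening tag, in document order.
        self.counts[name] = self.counts.get(name, 0) + 1

    def characters(self, content):
        # May be called more than once per text run; keep non-blank pieces.
        if content.strip():
            self.text.append(content.strip())


source = b"<doc><item>one</item><item>two</item></doc>"
handler = TagCounter()
xml.sax.parseString(source, handler)
print(handler.counts)
print(handler.text)
```

Anything the application wants to remember has to be stored by the handler itself, which is the "you build that tree yourself" cost mentioned earlier in the thread.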
Chris
28-Oct-2005
[50]
It should work -- XML -> DOM -> XML -- with the DOM being a document 
structure and a collection of methods for manipulating itself.
Pekr
28-Oct-2005
[51]
taken from: http://www.saxproject.org/event.html
Chris
28-Oct-2005
[52]
If the internal representation is an object-based tree, what are the 
barriers to the 'get-elements-by-tag-name function?
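As a rough answer, a recursive walk is enough when every node holds a list of its children. The sketch below is in Python with a made-up `Node` type; it is not taken from any REBOL script, and it assumes the DOM behaviour of searching descendants only, in document order.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical element node: a tag name plus child nodes."""
    name: str
    children: list = field(default_factory=list)


def get_elements_by_tag_name(node, name):
    """Depth-first, document-order search of the descendants of node."""
    found = []
    for child in node.children:
        if child.name == name:
            found.append(child)
        # Keep descending even after a match, so nested matches are found too.
        found.extend(get_elements_by_tag_name(child, name))
    return found


tree = Node("html", [Node("body", [Node("h1"), Node("p"), Node("div", [Node("p")])])])
matches = get_elements_by_tag_name(tree, "p")
print(len(matches))
```

The walk visits every node on each call, so the main barrier is cost on large documents rather than difficulty.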
Volker
28-Oct-2005
[53]
Yes, load is our tree, parse our events. Think of parse as "Here 
comes the word 'file. Yuppa, and a real 'file! . Good, and a 'binary!. 
(fine, now i store that data in that file)"
Chris
28-Oct-2005
[54]
http://www.zvon.org/xxl/DOM2reference/Output/index.html
Pekr
28-Oct-2005
[55]
what would you find more useful when working with XML? DOM sounds 
good when working with a loaded document, all those find-element-by-name 
etc funcs sound useful. For streaming kinds of purposes (protocols), 
SAX sounds like the better option ...