World: r3wp

[XML] xml related conversations

Pekr 28-Oct-2005 [18x2]	but that is not the original script ... that was much smaller ...
Pekr 28-Oct-2005 [18x2]	and besides that - look at other XML libraries ... compress your script and the size is ok :-)
Chris 28-Oct-2005 [20]	Also, it appears not to work out the box...
Volker 28-Oct-2005 [21]	How to get started with xml? I know the simple things, kind of object-tree, similar to what parse-xml does. What extras would be needed?
Pekr 28-Oct-2005 [22x2]	Gavain's scripts did work for me, did not try above referred and updated version ...
Pekr 28-Oct-2005 [22x2]	Volker: use xml-parse+ instead of xml-parse ... you will receive block/object structure IIRC ...
Volker 28-Oct-2005 [24x2]	Maybe we could start with examples in xml and how they could look in rebol? with some dialect for the extras?
Volker 28-Oct-2005 [24x2]	Since i can do parsing, but when i look at xml-docu, i do not know where to start. If someone could break that up for me..
Pekr 28-Oct-2005 [26]	dunno if much of an overhead, but maybe parse-xml could be used to parse even html?
Chris 28-Oct-2005 [27x2]	Volker, I'm not even sure that node objects need to be stored as a tree.
Chris 28-Oct-2005 [27x2]	All access to them is through functions, such as -- element/append-child new-element
Volker 28-Oct-2005 [29x2]	Well, internally we have references. And externally we can use some dialect, 127 [name "tag-name" data "data"] [name "ref-name" reference 127]
Volker 28-Oct-2005 [29x2]	Are these functions kind of standard? Then we could start with some "protocol" implementing the functionality.
Chris 28-Oct-2005 [31x3]	Yes the DOM is the standard.
	From the Mozilla guide:
	// document.getElementsByTagName("H1") returns a NodeList of the H1
	// elements in the document, and the first is number 0:
	var header = document.getElementsByTagName("H1").item(0);
	// the firstChild of the header is a Text node, and the data
	// property of the text node contains its text:
	header.firstChild.data = "A dynamic document";
	// now the header is "A dynamic document".
	// Get the first P element in the document the same way:
	var para = document.getElementsByTagName("P").item(0);
	// and change its text too:
	para.firstChild.data = "This is the first paragraph.";
	// create a new Text node for the second paragraph
	var newText = document.createTextNode("This is the second paragraph.");
	// create a new Element to be the second paragraph
	var newElement = document.createElement("P");
	// put the text in the paragraph
	newElement.appendChild(newText);
	// and put the paragraph on the end of the document by appending it to
	// the BODY (which is the parent of para)
	para.parentNode.appendChild(newElement);
Volker 28-Oct-2005 [34]	So if we implement this api in rebol, we could use standard documentation? And browsers are based on DOM, we could map that to rebol-plugin and control browser?
Pekr 28-Oct-2005 [35]	it seems so ...
Chris 28-Oct-2005 [36]	; In REBOL?
	header: first document/get-elements-by-tag-name <h1>
	set in header/first-child 'data "A dynamic document"
	para: first document/get-elements-by-tag-name <p>
	set in para/first-child 'data "This is the first paragraph."
	new-text: document/create-text-node "This is the second paragraph."
	new-element: document/create-element <p>
	new-element/append-child new-text
	parent: para/parent-node
	parent/append-child new-element
Pekr 28-Oct-2005 [37]	the zero based indexing might be a problem here, no? but who knows ... such functions as "get-element-by-tag-name" etc. I do remember from Gabriele's Temple :-)
Volker 28-Oct-2005 [38]	Then we should start there. conversion to xml may then suddenly be simple.
Pekr 28-Oct-2005 [39]	So the DOM would be more useful than having SAX? (if that is a correctly put question)
Volker 28-Oct-2005 [40]	AFAIK yes. SAX is efficient, dom simple to use.
Chris 28-Oct-2005 [41]	Well, perhaps DOM will be efficient in its Rebol incarnation :o)
Volker 28-Oct-2005 [42]	SAX is like parse. [a-tag another-tag (do-something) /a-tag]. DOM works like load does. AFAIK.
Pekr 28-Oct-2005 [43]	so what is the difference basically in when you parse XML document using SAX and using DOM?
Volker 28-Oct-2005 [44]	So with DOM you need all in memory, with SAX you can stream.
Pekr 28-Oct-2005 [45]	then SAX is better, no?
Volker 28-Oct-2005 [46x3]	As a programmer, not AFAIK. With a DOM you can use path-notation. With SAX you build that tree yourself. I guess SAX makes sense when you convert data, like xml-make-doc: one tag, output something, another tag, output something else.
	If it's more like a block of records, it would be DOM. parse <-> SAX, load <-> DOM.
	That's what i understand from the overviews. Then comes how it works, and i am quickly back to real parse and load..
Pekr 28-Oct-2005 [49]	There are two major types of XML (or SGML) APIs:
	Tree-based APIs
	These map an XML document into an internal tree structure, then allow an application to navigate that tree. The Document Object Model (DOM) working group at the World-Wide Web Consortium (W3C) maintains a recommended tree-based API for XML and HTML documents, and there are many such APIs from other sources.
	Event-based APIs
	An event-based API, on the other hand, reports parsing events (such as the start and end of elements) directly to the application through callbacks, and does not usually build an internal tree. The application implements handlers to deal with the different events, much like handling events in a graphical user interface. SAX is the best known example of such an API.
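The tree-based versus event-based distinction quoted above can be sketched in a few lines. This is only a toy under stated assumptions: `scan`, `buildTree` and the handler names `startElement`, `endElement` and `text` are hypothetical, chosen for illustration, and the scanner handles only plain `<tag>`, `</tag>` and text (no attributes, comments or entities). It is not SAX itself and not any REBOL library.

```javascript
// Toy event-based scanner for a tiny XML subset: plain <tag>, </tag>
// and text only. No attributes, comments, entities or error handling.
function scan(xml, handler) {
  const token = /<(\/?)([A-Za-z][\w-]*)>|([^<]+)/g;
  let m;
  while ((m = token.exec(xml)) !== null) {
    if (m[3] !== undefined) {
      handler.text(m[3]);          // character data between tags
    } else if (m[1] === "/") {
      handler.endElement(m[2]);    // closing tag
    } else {
      handler.startElement(m[2]);  // opening tag
    }
  }
}

const doc = "<list><item>apple</item><item>pear</item></list>";

// 1) Event style: react to each event as it arrives and keep no tree.
//    The application builds exactly the structure it wants (a flat array).
const items = [];
let inItem = false;
scan(doc, {
  startElement(name) { if (name === "item") inItem = true; },
  endElement(name) { if (name === "item") inItem = false; },
  text(data) { if (inItem) items.push(data); },
});

// 2) Tree style: the same events are used to build a generic nested
//    structure first, which is then navigated afterwards.
function buildTree(xml) {
  const root = { name: "#root", children: [] };
  const stack = [root];
  scan(xml, {
    startElement(name) {
      const node = { name, children: [] };
      stack[stack.length - 1].children.push(node);
      stack.push(node);
    },
    endElement() { stack.pop(); },
    text(data) { stack[stack.length - 1].children.push(data); },
  });
  return root.children[0];
}

const tree = buildTree(doc);
const secondItem = tree.children[1].children[0];
console.log(items, secondItem);
```

The first use never holds more than the current event, which is the streaming case; the second needs the whole document in memory but allows path-like access afterwards.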
Chris 28-Oct-2005 [50]	It should work -- XML -> DOM -> XML -- with the DOM being a document structure and a collection of methods for manipulating itself.
Pekr 28-Oct-2005 [51]	taken from: http://www.saxproject.org/event.html
Chris 28-Oct-2005 [52]	If the internal representation is an object-based tree, what are the barriers to the 'get-elements-by-tag-name function?
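As a rough illustration of that question: over an object-based tree, a get-elements-by-tag-name lookup can be a plain depth-first walk. The sketch below is a toy, not a real DOM implementation; the classes `ElementNode` and `TextNode` and the `toXML` method are hypothetical names introduced here, while the method names `appendChild`, `getElementsByTagName`, `firstChild` and `parentNode` are borrowed from the DOM calls in the Mozilla example quoted earlier.

```javascript
// Toy text node: holds character data and a link to its parent.
class TextNode {
  constructor(data) {
    this.data = data;
    this.parentNode = null;
  }
  // Serialize with minimal escaping.
  toXML() {
    return this.data.replace(/&/g, "&amp;").replace(/</g, "&lt;");
  }
}

// Toy element node: a tag name plus an ordered list of children.
class ElementNode {
  constructor(tagName) {
    this.tagName = tagName;
    this.childNodes = [];
    this.parentNode = null;
  }
  get firstChild() {
    return this.childNodes.length > 0 ? this.childNodes[0] : null;
  }
  appendChild(node) {
    node.parentNode = this;
    this.childNodes.push(node);
    return node;
  }
  // Depth-first walk in document order, collecting matching descendants.
  getElementsByTagName(name) {
    const found = [];
    for (const child of this.childNodes) {
      if (child instanceof ElementNode) {
        if (child.tagName === name) found.push(child);
        found.push(...child.getElementsByTagName(name));
      }
    }
    return found;
  }
  // Tree back to markup: the "DOM -> XML" half of the round trip.
  toXML() {
    const inner = this.childNodes.map((c) => c.toXML()).join("");
    return "<" + this.tagName + ">" + inner + "</" + this.tagName + ">";
  }
}

// Build <body><h1>Old title</h1><p>Old text</p></body> by hand.
const body = new ElementNode("body");
const h1 = body.appendChild(new ElementNode("h1"));
h1.appendChild(new TextNode("Old title"));
const p = body.appendChild(new ElementNode("p"));
p.appendChild(new TextNode("Old text"));

// Steps modelled on the quoted Mozilla example.
const header = body.getElementsByTagName("h1")[0];
header.firstChild.data = "A dynamic document";
const para = body.getElementsByTagName("p")[0];
para.firstChild.data = "This is the first paragraph.";
const newElement = new ElementNode("p");
newElement.appendChild(new TextNode("This is the second paragraph."));
para.parentNode.appendChild(newElement);

const result = body.toXML();
console.log(result);
```

Because every node keeps a `parentNode` link, the tree can be edited through its methods and written out again, which is the XML -> DOM -> XML round trip in miniature.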
Volker 28-Oct-2005 [53]	Yes, load is our tree, parse our events. Think of parse as "Here comes the word 'file. Yuppa, and a real 'file! . Good, and a 'binary!. (fine, now i store that data in that file)"
Chris 28-Oct-2005 [54]	http://www.zvon.org/xxl/DOM2reference/Output/index.html
Pekr 28-Oct-2005 [55]	what would you find more useful when working with XML? DOM sounds good when working with a loaded document, all those find-element-by-name etc funcs sound useful. For streaming kind of purposes (protocols), SAX sounds like a better option ...
Volker 28-Oct-2005 [56]	Yes, that's what i understand too.
Chris 28-Oct-2005 [57]	What are the SAX methods for manipulating an XML document, and how easy is it to save the changes?
Pekr 28-Oct-2005 [58]	Chris - following is true imo which favors SAX with me: Tree-based APIs are useful for a wide range of applications, but they normally put a great strain on system resources, especially if the document is large. Furthermore, many applications need to build their own strongly typed data structures rather than using a generic tree corresponding to an XML document. It is inefficient to build a tree of parse nodes, only to map it onto a new data structure and then discard the original.
Chris 28-Oct-2005 [59]	You've lost me...
Pekr 28-Oct-2005 [60x2]	The thing is - result of DOM parsing is tree representation of document - in Rebol .... the question is, what if you need data organised otherwise? You will have to search that tree and build such structure which fits you anyway ....
Pekr 28-Oct-2005 [60x2]	yes, saving the changes - will have to think about it. .... it might be tricky, if even possible :-)
Chris 28-Oct-2005 [62]	Which is the point in my suggesting the DOM :o)
Pekr 28-Oct-2005 [63x2]	so we should have both :-)
Pekr 28-Oct-2005 [63x2]	or just kind of clever REBOL mixture :-)
Chris 28-Oct-2005 [65]	I don't think the DOM should be as complex as you suggest.
Pekr 28-Oct-2005 [66]	no one said we have to develop 1:1 solution ... let's develop one which fits the need best ...
Volker 28-Oct-2005 [67]	actually that description favors DOM. First, we don't want to save memory, we are scripters. We use load too.. Second, we are not strongly typed (they mean statically typed). So we can happily be generic.