World: r3wp

[XML] xml related conversations

Chris
3-Dec-2008
[593x10]
>> load-xml "<try>This</try>"
== [
    <try> "This"
]
response: context [
	status: name: value: none
]

example: {<rsp>
	<status>Good</status>
	<payload>
		<value name="one">two</value>
	</payload>
</rsp>}

probe make response [
	parse load-xml example [
		<rsp> into [
			<status> set status ["Good" | "Bad"]
			<payload> into [
				<value> into [
					/name set name string! # set value string!
				]
			]
		]
	]
]
All the 'into values are a bit of a pain, but the work can be broken 
up...
Note, this parser is destructive, i.e. flattening the result will only 
provide an approximation of the original XML string.
So YMMV depending on need.
; Next, Quick DOM:

do http://www.ross-gill.com/r/qdom.r
Only one method at the moment - get-by-tagname


Note, this is not an attempt to implement the W3C DOM.  Just a quick 
approximation for fast manipulation (hence the name).  It's object 
happy, and I'm not sure of the weight considerations as such.
do http://www.ross-gill.com/r/qdom.r

doc: load-dom {<some><xml>to try</xml></some>}
values: doc/get-by-tagname <xml>
values/1/value = "to try"
; You can still parse the tree too:
parse doc/tree [<some> into [<xml> "to try"]]
This is not an exercise in bloat, I plan to implement only a few 
key methods.  Though if anyone has any requests?
Chris
4-Dec-2008
[603]
Ok, another revision.  This has a few more methods.  I may strip them 
down to read-only, as I don't need to manipulate the object, though 
I've left them in for completeness.

>> do http://www.ross-gill.com/r/qdom.r
connecting to: www.ross-gill.com
Script: "QuickDOM" (none)
>> doc: load-dom {<some><xml id="foo">to try</xml></some>}
>> foo: doc/get-by-id "foo"
>> foo/name
== <xml>
>> foo/value
== [
    /id "foo" 
    # "to try"
]
>> kids: foo/children
== [make object! [
        name: #
        value: "to try"
        tree: [
            # "to try"
        ]
        position: [
   ...
>> kids/1/value
== "to try"
>> doc/tree/<some>/<xml>/(#)
== "to try"
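For readers without a REBOL console at hand, the two lookups in the session above (by tag name and by id) have rough counterparts in Python's standard `xml.etree.ElementTree`. This is only an analogy on the same sample document, not a description of how QuickDOM is implemented; the variable names are chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the session above.
doc = ET.fromstring('<some><xml id="foo">to try</xml></some>')

# Rough analogue of doc/get-by-tagname <xml>: every element named "xml".
by_tag = list(doc.iter("xml"))

# Rough analogue of doc/get-by-id "foo": first element whose id is "foo".
foo = doc.find(".//*[@id='foo']")

print(by_tag[0].text)       # text content of the first <xml> element
print(foo.tag, foo.attrib)  # element name and its attributes
```

Unlike the QuickDOM tree, ElementTree keeps attributes and text in separate slots (`attrib`, `text`) rather than as /id and # entries in one block.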
Geomol
2-Mar-2009
[604]
RebXML spec: http://www.fys.ku.dk/~niclasen/rebxml/rebxml-spec.html
Scripts are in the Library: http://www.rebol.org
Graham
22-Jun-2009
[605x4]
Has anyone written anything to format/index XML documents?
/indent .. not index
No matter .. it was easy enough.
Now has anyone written a recursive routine to turn a rebol object 
into XML?  I couldn't find anything like this on rebol.org yet it 
doesn't sound hard to do ...
Gregg
22-Jun-2009
[609]
There must be something, but I don't have anything here that turned 
up, and I don't remember doing one myself. If it helps, you could 
use the JSON converter in %json.r as a starting point.
Graham
22-Jun-2009
[610x5]
This seems to work for me ...

obj2xml: func [ obj [object!] out [string!]
	/local o 
][
	foreach element next first obj [
		repend out [ to-tag element newline ]
		either object? o: get in obj element [
			obj2xml o out
		][
			repend out [ o newline ]
		]		
		repend out [ to-tag join "/" element newline ]
	]
]
using this 

>> probe obj
make object! [
    a: "testing"
    b: "again"
    c: make object! [
        d: "testing2"
        e: "again2"
        f: make object! [
            g: "testing3"
            h: "again3"
        ]
    ]
    i: "finished"
]
gives this 

<a>
    testing
</a>
<b>
    again
</b>
<c>
    <d>
        testing2
    </d>
    <e>
        again2
    </e>
    <f>
        <g>
            testing3
        </g>
        <h>
            again3
        </h>
    </f>
</c>
<i>
    finished
</i>
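The same recursive idea, sketched in Python for comparison. This is not a port of obj2xml: it assumes a nested dict stands in for the REBOL object, folds the indentation into the recursion instead of using a separate script, and renders None as an empty string. The names `obj_to_xml`, `indent` and `step` are made up for the example.

```python
from xml.sax.saxutils import escape

def obj_to_xml(obj, indent=0, step=4):
    """Recursively render a nested dict as a list of indented XML lines."""
    lines = []
    pad = " " * indent
    for name, value in obj.items():
        lines.append(f"{pad}<{name}>")
        if isinstance(value, dict):
            # Nested object: recurse one level deeper.
            lines.extend(obj_to_xml(value, indent + step, step))
        else:
            # Leaf value: escape &, < and > so the output stays well-formed.
            text = "" if value is None else escape(str(value))
            lines.append(" " * (indent + step) + text)
        lines.append(f"{pad}</{name}>")
    return lines

# Mirrors the probed object above.
obj = {
    "a": "testing",
    "b": "again",
    "c": {
        "d": "testing2",
        "e": "again2",
        "f": {"g": "testing3", "h": "again3"},
    },
    "i": "finished",
}

xml_text = "\n".join(obj_to_xml(obj))
print(xml_text)
```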
Steeve
22-Jun-2009
[615]
Hmm..
Really, do you get the tabulations?
Graham
22-Jun-2009
[616x2]
Yes, separate script does the tabulations
probably should change line 
repend out [ o newline ]
to 
repend out [ any [ o copy "" ] newline ]
Steeve
22-Jun-2009
[618]
Then, for NONE! values, it will add an empty line.
Graham
22-Jun-2009
[619x3]
format-xml: func [ xml
    /local out spacer prev
][
    out: copy ""
    spacer: copy ""
    prev: copy </tag>
    foreach tag load/markup xml [
        either tag = find tag "/" [
            ; we have a close tag
            

            ; reduce the spacer by a tab unless the previous was an open tag
            either not tag? prev [
                ; not a tag
                remove/part spacer 4
            ][
                ; is a tag
                if prev = find prev "/" [
                    ; last was a closing tag
                    remove/part spacer 4
                ]
            ]
        ][ 
            either tag? tag [
                ; current is tag
                ; indent only if the prev is not a closing tag
                if not prev = find prev "/" [
                    insert/dup spacer " " 4
                ]
            ][
                ; is data
                insert/dup spacer " " 4 
            ]
        ]
        repend out [ spacer tag newline ]
        prev: copy tag
    ]
	view layout compose [ area (out) 400x400 ]
]

obj2xml: func [ obj [object!] out [string!]
	/local o 
][
	foreach element next first obj [
		repend out [ to-tag element ]
		either object? o: get in obj element [
			obj2xml o out
		][
			repend out any [ o copy "" ]
		]		
		repend out [ to-tag join "/" element ]
	]
]
remove the newlines to solve that issue :)
I was using rebelxml to construct xml ... but I came across some 
bugs.  So this way of doing it looks easier ....
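An indentation pass like format-xml can also be driven by a depth counter instead of comparing each token with the previous one. The Python sketch below shows that variant; it is not a port of the REBOL function. The name `format_xml`, the `step` parameter and the tokenizing regex are assumptions for the example, and it does not cope with a literal `<` or `>` inside comments or CDATA sections.

```python
import re

def format_xml(xml, step=4):
    """Put each tag and text run on its own line, indented by nesting depth."""
    # Split into tags (<...>) and the text runs between them.
    tokens = re.findall(r"<[^>]+>|[^<]+", xml)
    depth = 0
    lines = []
    for token in tokens:
        token = token.strip()
        if not token:
            continue  # whitespace between tags
        if token.startswith("</"):
            # Closing tag: step back out before printing it.
            depth -= 1
            lines.append(" " * (depth * step) + token)
        elif (token.startswith("<")
              and not token.startswith(("<?", "<!"))
              and not token.endswith("/>")):
            # Opening tag: print it, then indent what follows.
            lines.append(" " * (depth * step) + token)
            depth += 1
        else:
            # Text, self-closing tag, declaration or comment: no depth change.
            lines.append(" " * (depth * step) + token)
    return "\n".join(lines)

print(format_xml("<a>testing</a><c><d>testing2</d></c>"))
```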
Graham
23-Jun-2009
[622]
What are people using to construct large XML documents ... of 100s 
of lines?
Maxim
23-Jun-2009
[623x4]
A modified version of John's RebXML tools.  I changed the output 
structure to allow REBOL's path notation to be used to traverse the 
loaded XML.
I also replaced the use of url for the tag words, because they fail 
when using namespaced XML elements.
Building output objects instead would be simple, but the RAM/speed/symbol 
table implications of all the binding involved make this suboptimal.
(...objects instead of blocks...)
Graham
23-Jun-2009
[627]
So, you used blocks instead of objects?
Maxim
23-Jun-2009
[628x2]
Yes, all the time.  Accessing is exactly the same as for objects. 
It's actually much more flexible,
because you can have the same element several times, which is valid 
XML but invalid in contexts.
Graham
23-Jun-2009
[630]
True
Maxim
23-Jun-2009
[631]
And you can easily separate attributes from elements, just by assigning 
them to different types.
Graham
23-Jun-2009
[632x3]
Although the XML I'm dealing with doesn't have duplicate elements.
Or namespaces
Have you posted your modifications anywhere??
Maxim
23-Jun-2009
[635x3]
The most stable engine I built, which accepted all XML possibilities, 
ended up loading XML like so:

[ <element> [<subelement> [#attribute "attr-value" . "subelement content"]]]

The . is assigned the content of the element.

the above would result from the following XML:
<element>
	<subelement attribute="attr-value">
		subelement content
	</subelement>
</element>
And you can access it this way:

document/<element>/<subelement>/#attribute
document/<element>/<subelement>/.
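A rough Python sketch of the layout described above, assuming it is built from `xml.etree.ElementTree`: each block is a list of (key, value) pairs, with `#name` keys for attributes, `.` for text content and `<tag>` keys for child elements. The helpers `to_block`, `load_block` and `select` are hypothetical names, not part of any RebXML variant, and text that follows a child element (mixed content) is ignored here.

```python
import xml.etree.ElementTree as ET

def to_block(elem):
    """Return (key, value) pairs for one element's attributes, text and children."""
    block = [("#" + name, value) for name, value in elem.attrib.items()]
    text = (elem.text or "").strip()
    if text:
        block.append((".", text))
    for child in elem:
        # A list of pairs, unlike a dict or object, may repeat the same key,
        # so duplicate sibling elements are all kept.
        block.append((f"<{child.tag}>", to_block(child)))
    return block

def load_block(xml_string):
    root = ET.fromstring(xml_string)
    return [(f"<{root.tag}>", to_block(root))]

def select(block, *path):
    """Follow keys in order, taking the first match at each step."""
    for key in path:
        block = next(value for k, value in block if k == key)
    return block

document = load_block(
    '<element>\n'
    '\t<subelement attribute="attr-value">\n'
    '\t\tsubelement content\n'
    '\t</subelement>\n'
    '</element>'
)

print(select(document, "<element>", "<subelement>", "#attribute"))
print(select(document, "<element>", "<subelement>", "."))
```

Keeping attributes and elements apart by key prefix plays the role that distinct datatypes (issue!, tag!) play in the REBOL block.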
Graham
23-Jun-2009
[638x2]
I find working with objects much easier though ...
I guess the duplicate elements could be solved by using blocks for 
them
Maxim
23-Jun-2009
[640]
I've never posted that specific version because it was closed source 
for a client.  But I have my own new engine, which does the same 
but attacks the parse rules directly... it's probably faster.
I've not released it.
BrianH
23-Jun-2009
[641]
Really? I went positional:

["element" "namespace" ["attribute" "value"] ["subelement" ...] "text" ...]

with missing namespace or attribute block being #[none], so defaults 
can be done with ANY.
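The positional idea carries over to other languages as well. In the Python sketch below, a node is a list of name, namespace, attribute list, then content, with None standing in for #[none], and `or` supplies defaults much as ANY would. The sample node and the helper names are invented for illustration.

```python
# Hypothetical sample in the positional layout:
# [name, namespace, attributes, *content], with None for a missing slot.
node = ["element", None, ["attribute", "value"],
        ["subelement", None, None, "text"],
        "more text"]

def namespace(node, default=""):
    # `or` plays the role of ANY here: fall back when the slot is None.
    return node[1] or default

def attributes(node):
    # The attribute slot is a flat name/value list; pair it up into a dict.
    flat = node[2] or []
    return dict(zip(flat[0::2], flat[1::2]))

def content(node):
    # Everything after the three fixed slots: child nodes and text.
    return node[3:]

print(namespace(node, "urn:default"))
print(attributes(node))
print(attributes(node[3]))
```

One difference to keep in mind: Python's `or` also skips empty strings and empty lists, whereas ANY only skips none and false.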
Graham
23-Jun-2009
[642]
Using tags looks ugly :)