World: r3wp

[XML] xml related conversations

Maxim
23-Jun-2009
[626]
(...objects instead of  blocks ....)
Graham
23-Jun-2009
[627]
So, you used blocks instead of objects?
Maxim
23-Jun-2009
[628x2]
yes, all the time. accessing is exactly the same as for objects. 
it's actually much more flexible.
cause you can have the same element several times, which is valid 
XML, but invalid in contexts.
Graham
23-Jun-2009
[630]
True
Maxim
23-Jun-2009
[631]
and you can easily separate attributes from elements, just by assigning 
them different types.
Graham
23-Jun-2009
[632x3]
Although the XML I'm dealing with doesn't have duplicate elements.
Or namespaces
Have you posted your modifications anywhere??
Maxim
23-Jun-2009
[635x3]
the most stable engine I built, which accepted all XML possibilities, 
ended up loading XML like so:

[ <element> [<subelement> [#attribute "attr-value" . "subelement content"]]]
the . is assigned the value of the element.

the above would result from the following XML:
<element>
	<subelement attribute="attr-value">
		subelement content
	</subelement>
</element>
And you can access it this way:

document/<element>/<subelement>/#attribute
document/<element>/<subelement>/.
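
(A rough Python analogue of the layout above, for readers who don't know REBOL paths. This is only a sketch, not Maxim's engine: the helper names `select` and `follow` are made up for illustration, and tags, issues and the `.` word are all stood in for by plain strings.)

```python
# Sketch only: a REBOL-style block is modelled as a flat Python list of
# alternating keys and values. Keys are plain strings here (an assumption).

def select(block, key):
    """Return the value that follows the first occurrence of key, or None."""
    for i in range(len(block) - 1):
        if block[i] == key:
            return block[i + 1]
    return None

def follow(block, path):
    """Walk a slash-separated path, selecting one key per step."""
    for key in path.split("/"):
        if not isinstance(block, list):
            return None  # ran out of nested blocks before the path ended
        block = select(block, key)
    return block

document = [
    "<element>", [
        "<subelement>", [
            "#attribute", "attr-value",
            ".", "subelement content",
        ],
    ],
]

attribute = follow(document, "<element>/<subelement>/#attribute")
content = follow(document, "<element>/<subelement>/.")
```

Note that this lookup, like a plain key search, only ever returns the first match for a repeated key.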
Graham
23-Jun-2009
[638x2]
I find working with objects much easier though ...
I guess the duplicate elements could be solved by using blocks for 
them
Maxim
23-Jun-2009
[640]
I've never posted that specific version cause it was closed source 
for a client. but I have my own new engine, which does the same, 
but attacking the parse rules directly... it's probably faster.
I've not released it.
BrianH
23-Jun-2009
[641]
Really? I went positional: ["element" "namespace" ["attribute" "value"] 
["subelement" ...] "text" ...]

with missing namespace or attribute block being #[none], so defaults 
can be done with ANY.
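
(A rough Python analogue of the positional layout, as a sketch under assumptions rather than BrianH's actual code: `any_of` is a hypothetical stand-in for REBOL's ANY, `unpack` is a made-up helper, and Python's None plays the role of #[none].)

```python
# Sketch only: each node is [name, namespace, attributes, child, child, ...].
# Children are either nested nodes (lists) or text (strings), kept in order.

def any_of(*values):
    """Return the first value that is neither None nor False, like ANY."""
    for value in values:
        if value is not None and value is not False:
            return value
    return None

def unpack(node):
    """Split a positional node into its parts, filling in defaults."""
    name, namespace, attributes, *children = node
    return name, any_of(namespace, ""), any_of(attributes, []), children

node = [
    "element", None, None,
    ["subelement", None, ["attribute", "attr-value"], "subelement content"],
]

name, namespace, attributes, children = unpack(node)
sub_name, sub_namespace, sub_attributes, sub_children = unpack(children[0])
```

Because children stay in document order, repeated subelements and text between them are kept as they appeared.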
Graham
23-Jun-2009
[642]
Using tags looks ugly :)
Maxim
23-Jun-2009
[643]
note that in the above, you can replace types within so it could 
be words instead of tags.
BrianH
23-Jun-2009
[644]
My positional version handles multiple subelements of the same type, 
and using strings rather than words lets you use tags that don't 
match word syntax or are case-sensitive.
Maxim
23-Jun-2009
[645x2]
it's just that in my tests, you can't create, read or set some 
of the datatypes via path notation, so only string-based types allow 
full XML qualification.
hahaha
Graham
23-Jun-2009
[647]
XML can be case sensitive??
Maxim
23-Jun-2009
[648]
but Brian, how do you access it?
Graham
23-Jun-2009
[649]
by position!
BrianH
23-Jun-2009
[650x2]
XML *is* case-sensitive. Your paths can't access multiple subelements 
of the same type, or embedded text.
I wrote a simple xpath compiler too (but don't know where it is now).
Maxim
23-Jun-2009
[652]
I wanted direct access to all elements within rebol.
Graham
23-Jun-2009
[653]
Looks like we need an article on best practices here ...
Maxim
23-Jun-2009
[654]
a later version, using schema validation, understands multiple subelements 
and automatically converts them to blocks IIRC.

so you do document/element/3/subelement/#attribute.
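
(A rough Python sketch of the idea of turning repeated subelements into an indexable block. The schema-driven conversion itself is not shown anywhere in this thread, so `group_repeats` is a hypothetical helper and the sample data is made up.)

```python
# Sketch only: repeated names are collapsed into a list so that a numeric
# step can pick one of them, roughly like document/element/3/... above.

def group_repeats(pairs):
    """Map each name to its value, or to a list of values if it repeats."""
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(name, []).append(value)
    return {
        name: (values if len(values) > 1 else values[0])
        for name, values in grouped.items()
    }

parsed = [
    ("element", {"subelement": {"#attribute": "first"}}),
    ("element", {"subelement": {"#attribute": "second"}}),
    ("element", {"subelement": {"#attribute": "third"}}),
    ("title", "only one of these"),
]

document = group_repeats(parsed)

# REBOL paths count from 1, Python lists from 0, hence the "3 - 1".
third = document["element"][3 - 1]["subelement"]["#attribute"]
```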
BrianH
23-Jun-2009
[655]
Your paths can't access multiple subelements of the same type, or 
embedded text. It might have worked for that customer but not the 
general case. No namespace support either.
Maxim
23-Jun-2009
[656]
my paths.. namespace works... for sure. did you know you can have 
a colon in word names in R2! but I didn't use that, I just used 
tags directly. more obvious than strings, and the exact same effort 
and speed.
BrianH
23-Jun-2009
[657]
I was parsing xhtml and other XML of the like. Subelements of mixed 
types in order with text between them that mattered.
Maxim
23-Jun-2009
[658x3]
my engine does support embedded types, but ignored it by default... 
it was also byte reversible... a loaded xml block loaded through 
the engine was saved back exactly, byte for byte, checksum proofed.
oops.. embedded text I mean.
but all of that is not open source.
BrianH
23-Jun-2009
[661]
And mine is lost :(
Maxim
23-Jun-2009
[662x2]
my newer version doesn't have the schema validation process.... that 
is a very complex engine to build. schemas and Parse traversal 
do not follow the same algorithm... so it's a bitch to implement.
anyhow... the xml tools I currently have are not yet tested enough 
to be release ready.
Graham
23-Jun-2009
[664]
Ok, so we have established that everyone does it their own way :)
Maxim
23-Jun-2009
[665]
but the rebxml tools (on rebol.org) as-is are very useful, do some 
UTF-8 support and were less buggy than rebelXML in my previous tests.
Graham
23-Jun-2009
[666x4]
rebelxml gets confused if there is more than one path with the same 
name
it accepts the first one it finds
even if the full path is specified.
I suspect the author is no longer reboling
Maxim
23-Jun-2009
[670]
I remember it also tripping on some of the XML files I gave it... 
don't remember the problems... but its error handling/recovery was 
very shaky IIRC.
Graham
23-Jun-2009
[671x2]
You should add a comment to the discussion page if you remember.
as should I.
Maxim
23-Jun-2009
[673]
read above,  I don't  ;-)
Graham
23-Jun-2009
[674x2]
at this time .. it may come back to you
that's why I said "if" ...