World: r3wp
[XML] xml related conversations
Graham 2-Nov-2005 [227x3] | At present I fudge it by asking initiators to send a userid with each request so I can determine their rights. |
But if someone had the source, they can pretend to be an administrator, and start to damage my data :( | |
So, when are you guys going to complete LNS ?? | |
JaimeVargas 2-Nov-2005 [230x2] | Nope. There is a way around this. Without your fudge. |
Complete LNS when we have the final spec. | |
Graham 2-Nov-2005 [232] | Well, I would like to know as otherwise it's very insecure at present :( |
JaimeVargas 2-Nov-2005 [233] | (Continued in the !Beer group...) |
Pekr 5-Nov-2005 [234x2] | Look at Technews - E4X for JavaScript - I like how simply they integrated it into the language .... |
We should choose how we integrate it into REBOL - SAX, DOM, other ... | |
CarstenK 6-Nov-2005 [236x2] | Taking my first steps with REBOL, I tried to do something with XML (reading, eventually modifying, writing). I looked for some scripts to help me do this and found:
1. xml2rebxml/rebxml2xml: I ran into the following problems:
- missing/lost comments
- missing/lost elements (that's really serious)
My steps were:
my-doc: xml2rebxml read %simple.xml
write %simple2.xml rebxml2xml my-doc
The second document stops outputting elements after a comment block in the source XML doc.
2. xml-parse/xml-object: The versions I found in the REBOL library didn't work, so I used some older versions from rebXR-1.3.0. I've got my objects, but it would be nice to have a third module like xml-write to get the object tree back to XML. Is somebody developing something like this?
3. mt.r: I tried to figure out how it works. Basically I can write some XML based on a REBOL block, but I couldn't figure out how to define the rules about elements and attributes. Where can I find an example of writing, for instance, SVG with mt.r? What do the corresponding REBOL block and the rules for SVG look like?
Where can I find more about XML and REBOL? I think it would be very nice to have some REBOL scripts doing things like
some-elem: xml-create [ elem "foo" namespace "myns" attribs [ bar "something" xyz "123"] ]
xml-modify [ elem another-elem append some-elem ]
and finally
xml-write %mynewxml.xml my-doc
Is somebody developing something like this with REBOL? Some scripts giving me the same comfort in REBOL that XOM (http://www.xom.nu) gives for XML in Java, perhaps? Of course done with some nice REBOL dialects?
What is the above-mentioned "EasyXML"? Is it available for use/testing?
Thank you for any tips, carsten |
One more thing about XOM: E. R. Harold has collected a lot of test XML files covering many of the sophisticated cases that the XML 1.0 spec allows. | |
Geomol 6-Nov-2005 [238x2] | Carsten, xml2rebxml should be able to handle comments. Are you sure your simple.xml is valid XML? |
By "handle", I mean parse them, but comments ain't in the output. The script shouldn't stop for valid XML input. | |
CarstenK 6-Nov-2005 [240] | I played around with a shorter XML document to figure out how it works. My REBOL experience dates from last week, so maybe I'm doing something wrong. The comments are parsed and the block also looks complete, but during writing it stops after an element that is followed by some comments. As far as I can see, these comments are left out of the block, but there is a lot of whitespace between the last printed element and the next, missing element. |
Geomol 6-Nov-2005 [241] | Carsten, yes, I get the same problem here. I'll look into it. |
CarstenK 6-Nov-2005 [242] | cool, thank you for your time! |
Geomol 6-Nov-2005 [243x3] | Carsten, OK, I found a bug related to multiple consecutive comments. Get the fixed script here: http://home.tiscali.dk/john.niclasen/rebxml/xml2rebxml.r |
Carsten, the script still strips comments. Do you need the comments to be passed through to the output? (I'm in two minds about how it should work.) | |
I've uploaded the script to the library. | |
Pekr 7-Nov-2005 [246] | taken from ML - http://www.xom.nu |
CarstenK 7-Nov-2005 [247] | I will try the new xml2rebxml.r. I think it would be nice to preserve the comments: if somebody writes XML in a text editor and adds some annotations, it is nice if he gets these comments back after processing the files with some other (REBOL) tool. But this feature has lower priority. I found one more thing in xml2rebxml.r. Only these entities are replaced:
replace/all att-data "&gt;" #">"
replace/all att-data "&lt;" #"<"
replace/all att-data "&amp;" #"&"
The other two are missing, I think:
replace/all att-data "&quot;" #"^""
replace/all att-data "&apos;" #"'" |
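The entity replacement discussed in this message can be sketched as follows. This is an illustration in Python rather than REBOL, and `decode_entities` is a hypothetical helper, not code from xml2rebxml. It covers all five predefined XML entities and decodes `&amp;` last, because decoding it first would turn an escaped `&amp;quot;` into a quote character instead of the literal text `&quot;`.

```python
# Hypothetical helper (not part of xml2rebxml): decode the five
# predefined XML entities found in an attribute value.
PREDEFINED = [
    ("&lt;", "<"),
    ("&gt;", ">"),
    ("&quot;", '"'),
    ("&apos;", "'"),
    ("&amp;", "&"),  # decoded last so "&amp;quot;" is not decoded twice
]


def decode_entities(att_data: str) -> str:
    """Return att_data with the predefined entities replaced."""
    for entity, char in PREDEFINED:
        att_data = att_data.replace(entity, char)
    return att_data


print(decode_entities("a &lt; b &amp;&amp; c &gt; &quot;d&apos;s&quot;"))
```

The ordering only matters for input that contains an escaped ampersand directly in front of an entity name, but that is exactly the case a simple chain of replacements tends to get wrong.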
Pekr 7-Nov-2005 [248x2] | at xom.nu, you can find various articles too ... |
What is wrong with XML apis - http://www.artima.com/intv/xmlapis.html | |
Geomol 7-Nov-2005 [250] | Carsten, you're right about the &quot; and &apos;. As I read the DTD (http://www.w3.org/TR/2004/REC-xml-20040204/), those can only be found in attribute values (see [10] AttValue), not in character data (see [14] CharData). Is that correct? |
Pekr 7-Nov-2005 [251x3] | http://www.artima.com/intv/dom.html - The Good, the bad and the DOM - "a camel is a horse designed by committee" :-) |
I seem to like XOM, at least going by what the author says about it - of course, he might be biased towards his own work - http://www.artima.com/intv/xomdesign.html - if it is true that simplicity was his motivation, then we could look into XOM as a possible way to go ... | |
hmm, not so easy and small anyway ... probably the best approach will be to decide which direction we go and then start building a REBOL-oriented solution, not trying to port something. Looking at some of this stuff, it sometimes seems to me that it is designed to fit the target language, e.g. Java .... | |
MichaelB 7-Nov-2005 [254] | For sure we shouldn't try to simply port something. But maybe it's better anyway to see what Christophe (Coussement) or his team is doing. XOM as a base for ideas might not be bad, as it's well designed and based on some simple principles which I would at least subscribe to. But it's completely object oriented, so there might be a more REBOL-like way to go - don't know. What I would be interested to know is how Christophe is going to handle Unicode files. There are some scripts to help with converting UTF-8 and the like, but I can't foresee right now how well this will work. |
Pekr 7-Nov-2005 [255] | I liked the discussion Chris and Brian held here a week or so ago ... simply let's find a way to work with XML in REBOL - once we know what we want, we can start coding ... |
MichaelB 7-Nov-2005 [256x2] | As Christophe said on the mailing list, we actually need both SAX and DOM: if you have a large document and are only interested in a sequence of element occurrences, one at a time, you don't need DOM, but if you need information about the overall structure of a document you have to read in the whole document, and that's DOM. But if Christophe is doing DOM already (I don't know to what extent), this would be very nice and might be OK for now. |
Would it make sense to have XML files represented as a port, like xml://? This could make sense for DOM and for SAX. But please correct me if that's stupid. For SAX this would enable one to copy from the port and get events by copying; for DOM one could navigate with some dialect and position the cursor in the document. A copy would read the data at the current position, but then a block or something which represents an element could be returned. But I guess that's not well thought out. :-) | |
Geomol 7-Nov-2005 [258] | Carsten, I've added support for &quot; and &apos; in xml2rebxml. I've also added preservation of comments, if xml2rebxml is called with the /preserve refinement (just call it like: xml2rebxml/preserve <xml code>). I've uploaded the scripts to my page: http://home.tiscali.dk/john.niclasen/rebxml/ I think they need some testing before they go to the library at www.rebol.org. |
CarstenK 7-Nov-2005 [259x2] | John, I've downloaded it from your website - thank you! One more question from an inexperienced REBOL user: what is the most common way to extend a block I've got from xml2rebxml? The source is
<?xml version="1.0" encoding="iso-8859-1"?>
<chapter id="ch_testxml" name="Test XML">
  <title>A chapter with some xml tests</title>
  <sect1 id="sct_about" name="About my Tests">
    <title>What kind of tests I will do</title>
    <body>
      <para>Some simple paragraph.</para>
    </body>
  </sect1>
</chapter>
After reading in the file with
my-doc: xml2rebxml read %test.xml
I'd like to insert a second sect1 element into the block my-doc. What's the best way, just to avoid some stupid mistakes? |
To Michael: I'm not sure we need DOM and SAX. The problem is that the committee tried to develop language-independent interfaces, so both APIs have problems in the targeted programming language. DOM is inefficient, and you should avoid it. The best way seems to be:
1. have a parser like SAX with events
2. build the model in the best way for your language
3. provide an API for your language
Basically XOM does this very well for Java: E.R.H. uses a SAX parser and converts to his own object model, which is optimized for Java. For REBOL this should be something like a block, I think. (Are blocks the best way to store things in REBOL?) But that's the internal side of the tool and could be the rebxml block structure. As the API there should be a dialect, maybe one that uses a port (I know less about that and have to learn about it). | |
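The three-step approach described above (event parser, language-native model, thin API on top) can be sketched with Python's standard `xml.sax` module. Python stands in for REBOL here, and the nested-list layout `[name, attributes, child, ...]` is only an assumed stand-in for a block structure, not the actual rebxml format; `BlockBuilder` and `parse_to_blocks` are hypothetical names.

```python
import xml.sax


class BlockBuilder(xml.sax.ContentHandler):
    """Turn SAX events into nested lists: [name, attrs, child, ...]."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []  # chain of currently open elements

    def startElement(self, name, attrs):
        node = [name, dict(attrs.items())]
        if self.stack:
            self.stack[-1].append(node)
        else:
            self.root = node
        self.stack.append(node)

    def endElement(self, name):
        self.stack.pop()

    def characters(self, content):
        parent = self.stack[-1]
        if len(parent) > 2 and isinstance(parent[-1], str):
            parent[-1] += content   # parser may deliver text in chunks
        elif content.strip():
            parent.append(content)  # skip whitespace-only text


def parse_to_blocks(xml_text):
    handler = BlockBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root


doc = ('<chapter id="ch_testxml"><title>A chapter</title>'
       '<sect1 id="sct_about"><para>Some simple paragraph.</para></sect1>'
       '</chapter>')
model = parse_to_blocks(doc)
print(model)
```

The handler never holds more than the chain of open elements besides the model it is building, so the same event stream could feed a different, lighter model when the whole tree is not needed.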
Geomol 7-Nov-2005 [261] | Carsten, to insert a second sect1, do something like: append last my-doc [sect1 id "sct_about" name "Another about" [title "etc....."]] |
Pekr 7-Nov-2005 [262] | Thanks Carsten, that clarifies things for me .... I like the SAX approach more too .... IIRC Gavain's stuff was SAX-like too ... it just could not write back to XML ... |
Christophe 7-Nov-2005 [263x5] | Well this is a great place to learn ! |
Pekr: I do not know XOM, I will study it. Maybe it fits better than our idea of DOM. | |
MichaelB: about Unicode handling. That's a point we didn't think about, because we're working in iso-8859-1 (Western European) and not UTF-8 or UTF-16. So we have to see what the cost of it would be. If there are any suggestions about how to handle this, they are most welcome! (I handled a similar problem with a simple replace/all, but I don't know if it's the best approach.) About a port approach... What would the advantages be? | |
Geomol: you've done a great job with your rebxml. But we really need some kind of dialect to easily access nested data. Like XPath... I need to be able to say get-data [//*/bbb/ccc[@id='geek']] and get the info. I think XPath has a great notation for that (and it is a standard). So we have to find the format which best fits this dialect... | |
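For comparison, this kind of path query looks as follows in Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath. The sample document and the `get_data` wrapper are made up for illustration, and the path is simplified to `.//bbb/ccc[@id='geek']`.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for the query.
doc = ET.fromstring(
    "<aaa>"
    "<bbb><ccc id='geek'>found</ccc><ccc id='other'>skipped</ccc></bbb>"
    "<ddd><bbb><ccc id='geek'>nested</ccc></bbb></ddd>"
    "</aaa>"
)


def get_data(root, path):
    """Hypothetical wrapper: return the text of every matching element."""
    return [node.text for node in root.findall(path)]


print(get_data(doc, ".//bbb/ccc[@id='geek']"))  # ['found', 'nested']
```

Whatever internal format is chosen, a query like this needs cheap access to children by name and to attributes by key, which is worth keeping in mind when comparing object!, hash! and block!.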
I was fighting today to find the best internal data format. From the tests, object! seems the most performant for nested data structures, and hash! for non-nested ones. But the problem with object! is that we cannot have a recurring element in the structure, like: <aaa> <bbb>content</bbb> <bbb bbb_attrib="attrib1"></bbb> </aaa> because, of course, when evaluated the last definition of bbb overrides the others. So we are trying to work with hash!. We got a small reduction of the overhead compared to XML, but the processing time seems to be 10 to 20% more than with block!. I need some more tests of data retrieval in the structure to find the right combination. Any suggestion is welcome! | |
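The override problem described here is not specific to REBOL. A small Python sketch, assuming a dict as the stand-in for object! and a list of (name, value) pairs as the stand-in for block!, shows the same effect; `select` is a hypothetical helper.

```python
# Object-like model: one slot per element name, so the second <bbb>
# silently replaces the first.
as_object = {}
as_object["bbb"] = {"text": "content"}
as_object["bbb"] = {"attrs": {"bbb_attrib": "attrib1"}}

# Block-like model: an ordered sequence of (name, value) pairs keeps
# both <bbb> elements and their document order.
as_block = [
    ("bbb", {"text": "content"}),
    ("bbb", {"attrs": {"bbb_attrib": "attrib1"}}),
]


def select(block, name):
    """Return every value stored under the given element name."""
    return [value for key, value in block if key == name]


print(len(as_object))                # 1
print(len(select(as_block, "bbb")))  # 2
```

The price of the sequence form is a linear scan per lookup, which matches the kind of retrieval cost the tests above are trying to measure.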
Volker 7-Nov-2005 [268] | A rough idea: Maybe like vid does it? /color /colors ? it puts the first color in color if there is only one. if there are more, they are put in /colors-block . |
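One possible reading of this idea, sketched in Python with a dict standing in for object!: a single child stays under its own name, and as soon as a second child with the same name arrives, both move into a list under a plural key. The `add_child` helper and the plural-key naming are assumptions for illustration, not how VID actually implements it.

```python
def add_child(obj, name, value):
    """Store one value under `name`, or several under `name + 's'`."""
    plural = name + "s"
    if plural in obj:
        obj[plural].append(value)             # third and later values
    elif name in obj:
        obj[plural] = [obj.pop(name), value]  # second value: promote to list
    else:
        obj[name] = value                     # first value: plain slot
    return obj


single = add_child({}, "bbb", "content")
double = add_child(add_child({}, "bbb", "content"), "bbb", "attrib1")
print(single)  # {'bbb': 'content'}
print(double)  # {'bbbs': ['content', 'attrib1']}
```

The common single-child case keeps direct field access, while callers that expect repetition have to check for the plural key first.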
Christophe 7-Nov-2005 [269] | I do not get where you gain in performance? Or do i get it wrong ? |
Volker 7-Nov-2005 [270x3] | because you can use an object as long as there is only one value. But not sure if that helps. |
but 10-20% is not much anyway. | |
And with blocks there is a better chance to use rebcode? | |
BrianH 7-Nov-2005 [273] | Or for that matter, block parsing. |
Christophe 7-Nov-2005 [274x2] | Volker: i got your point. I don't know yet. I will study it tomorrow. |
rebcode could be an issue. But still under development .. | |
Gregg 7-Nov-2005 [276] | Should this group be web public? |